Home

GPT-Comment

Overview

GPT-Comment is an automated essay review and feedback tool. It’s effectively a script that reads essays and other information from a spreadsheet, uses large language models and other techniques to generate automatic feedback, then writes the results back as a commented Google Doc.
I built this system for a friend who owns a college & high school prep business. His job involves a lot of essay reviewing and feedback, and he already had a system in place through Google Classroom to manage and organize the massive essay volume.
This led directly to the next level of the problem: giving quality feedback on essays takes so much time that his capacity was limited. He naturally hired other people to review the essays, but a set of 5 essays cost him about $7 in labor, and the quality was inconsistent.
As it turns out, essay reviewing is very similar across different essays, and there are typically a lot of errors and suggestions that are easy to target. LLMs have been getting pretty good, so we thought we should give them a shot at automating part of the process.

How to Review an Essay using AI

I determined that there were 4 main components of a review that were reasonable for an AI to do:
  1. A general comment on the essay as a whole
  2. Anchored comments (tied to a specific section of text) on grammar errors
  3. Anchored comments on content or stylistic suggestions
  4. Automated fixes (e.g. removing double spaces)
It’s difficult for LLMs to focus on multiple relatively complex things at the same time, so I used a different strategy for each component.
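To make the split concrete, here’s a minimal sketch of how an anchored comment could be represented. The class and field names are my own illustration, not the project’s actual code:

```python
from dataclasses import dataclass

@dataclass
class AnchoredComment:
    anchor_text: str  # exact substring of the essay the comment attaches to
    comment: str      # the feedback itself
    category: str     # e.g. "grammar" or "style"

# The general comment has no anchor, so it can live as a plain string
# alongside a list of AnchoredComment objects per essay.
```

Keeping the anchor as an exact substring makes it straightforward to attach the comment to the right span of text later.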

How it works

Input from Google Sheets

My friend’s current system had his clients fill out a Google Form with their essays and other information, which would then get dumped into a Google Sheet. (In the future we may remove the Google Sheets layer entirely and have the input come directly from the Google Form, but the Google Sheet is a nice “database” for now.) Here’s the code that I wrote to load the Google Sheet into a pandas DataFrame:
```python
def get_rows(self):
    result = (
        self.service.spreadsheets()
        .values()
        .get(spreadsheetId=self.spreadsheet_id, range=f"{self.sheet_name}!A1:Z")
        .execute()
    )
    data = result.get("values", [])
    columns = data[0]
    raw_rows = data[1:]
    rows = []
    for row in raw_rows:
        # Pad or trim each row so it matches the header width
        new_row = row[: len(columns)]
        if len(new_row) < len(columns):
            new_row += [""] * (len(columns) - len(new_row))
        rows.append(new_row)
    df = pd.DataFrame(rows, columns=columns)
    return df
```
The students’ essays live in specific columns, specified in config.py.
The next step was to build a custom ORM that converts each entry in the spreadsheet into its own StudentEntry object, including turning each essay into its own Essay object.
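Here’s a sketch of what that mapping could look like. The column names ("Name", "Email") and the essay columns are hypothetical stand-ins for whatever is configured in config.py:

```python
from dataclasses import dataclass

@dataclass
class Essay:
    prompt: str
    text: str

@dataclass
class StudentEntry:
    name: str
    email: str
    essays: list[Essay]

    @classmethod
    def from_row(cls, row: dict, essay_columns: list[str]) -> "StudentEntry":
        # Build one Essay per non-empty essay column in the sheet row
        essays = [
            Essay(prompt=col, text=row[col])
            for col in essay_columns
            if row.get(col)
        ]
        return cls(name=row["Name"], email=row["Email"], essays=essays)
```

Empty essay columns are skipped so a student who submitted fewer essays doesn’t produce blank Essay objects.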

Essay Processing

Each essay has an asynchronous process method which is the entry point for handling every processing task.
Here’s roughly the process method for each essay:
```python
async def process(self, progress_bar=None):
    # Deterministic fixes first (no LLM needed)
    self.remove_double_spaces()
    self.generate_contraction_comments(config["contractions_data_path"])
    self.generate_second_person_comments(config["second_person_data_path"])
    # LLM-backed reviews run concurrently
    async with asyncio.TaskGroup() as tg:
        tg.create_task(self.generate_general_comment())
        tg.create_task(self.generate_grammar_comments())
        tg.create_task(self.generate_specific_comments())
```
This just goes through and generates every review type we want for the essay. Anything that uses an LLM (inside the TaskGroup) runs in parallel to speed up execution.
For a deeper dive, here’s the processing task that generates specific comments:

```python
async def generate_specific_comments(self):
    system_message = 'As an essay counselor, your task is to ... (cut off for space)'
    paragraphs: list[str] = self.text.split("\n")
    async with asyncio.TaskGroup() as tg:
        tasks = []
        for paragraph in paragraphs:
            # Skip short paragraphs; they rarely need a specific comment
            if len(paragraph) <= 300:
                continue
            task = tg.create_task(
                OAI_SERVICE.generate_chat_completion(
                    system_message, paragraph, "gpt-4",
                    max_tokens=80, temperature=0.5,
                )
            )
            tasks.append(task)
        for coro in asyncio.as_completed(tasks):
            completion, cost = await coro
            self.processing_costs += cost
            unparsed_comments = completion.split("\n")
            self.add_unparsed_comments(unparsed_comments)
```
One thing I found is that output quality degraded noticeably with larger inputs, even though these essays are still fairly small. To shrink what the LLM sees at once, I break each essay up and process its paragraphs independently; to keep the total time down, all of those processing tasks run in parallel.

Bypassing the Google Docs Commenting API & Generating Reports

Google’s APIs can get somewhat frustrating simply because they are incomplete. In this particular case, they don’t let you programmatically create anchored comments on a Google Doc. I had to get creative, because my friend relies on Google Docs as a key part of his workflow.
While we were weighing other options, I asked whether it would be okay to upload Microsoft Word documents to Google Drive instead of Google Docs, because I had found a way to programmatically generate Word docs with comments (HUGE thanks to bayoo-docx).
Then we realized that if we could import a Word doc as a Google Doc, we could effectively upload a real Google Doc to Google Drive. This was the key workaround.
```python
import io
import os
import tempfile

from googleapiclient.http import MediaFileUpload


def upload_word_doc(self, document_name, document):
    # Serialize the Word document to bytes
    bytes_io = io.BytesIO()
    document.save(bytes_io)
    file_metadata = {"name": document_name, "parents": [self.folder_id]}
    with tempfile.NamedTemporaryFile(delete=False) as temp_file:
        temp_file.write(bytes_io.getvalue())
        temp_file.flush()
        media = MediaFileUpload(
            temp_file.name,
            mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document",
            resumable=True,
        )
        file = (
            self.service.files()
            .create(media_body=media, body=file_metadata, fields="id")
            .execute()
        )
    file_id = file.get("id", "")
    os.unlink(temp_file.name)
    link = f"https://docs.google.com/document/d/{file_id}"
    return link
```
This extracts the raw file bytes and reconstructs the file (with the Word MIME type) as a MediaFileUpload object. It turns out that when you upload a document like this to Google Drive, it will automatically become a Google Doc. Problem solved.
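One detail worth knowing: if Drive ever leaves the upload as a plain .docx, you can request the conversion explicitly by putting the target Google Docs MIME type in the request body (this is documented Drive API v3 behavior). A small sketch; `build_upload_metadata` is my own helper name, not part of the project:

```python
# Standard MIME types involved (these values are fixed by the Drive API)
WORD_MIME = ("application/vnd.openxmlformats-officedocument"
             ".wordprocessingml.document")
GDOC_MIME = "application/vnd.google-apps.document"

def build_upload_metadata(name: str, folder_id: str, convert: bool = True) -> dict:
    """Build the `body` for files().create(); setting the *target* MIME type
    asks Drive to import the uploaded .docx as a native Google Doc."""
    metadata = {"name": name, "parents": [folder_id]}
    if convert:
        metadata["mimeType"] = GDOC_MIME
    return metadata
```

The uploaded media keeps the Word MIME type; only the metadata body names the Google Docs type, which is what triggers the import.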

Update Google Sheets

The final step was to update the original Google Sheet to reflect that the entry had been processed. This fills in a flag indicating whether the essay has been processed, plus the Drive link to the finished Google Doc. Nothing too fancy, but it was necessary to ensure entries weren’t processed multiple times and to integrate well with the workflow as a whole.
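The write-back itself is a values().update call against the Sheets API. Here’s a hedged sketch of how the request could be assembled; the column letters and sheet layout are assumptions, not the real configuration:

```python
def build_status_update(sheet_name: str, row_index: int, doc_link: str,
                        status_col: str = "Y", link_col: str = "Z") -> dict:
    """Build the range and values for marking one sheet row as processed."""
    return {
        "range": f"{sheet_name}!{status_col}{row_index}:{link_col}{row_index}",
        "values": [["TRUE", doc_link]],
    }

# The result would be passed to the Sheets API roughly like:
# service.spreadsheets().values().update(
#     spreadsheetId=spreadsheet_id,
#     range=update["range"],
#     valueInputOption="USER_ENTERED",
#     body={"values": update["values"]},
# ).execute()
```

Checking this flag before processing is what prevents an entry from being reviewed twice.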