Using ChatGPT to evaluate answers.

I was playing with ChatGPT and was curious whether it could evaluate an answer to a given question and award marks based on how accurate the answer is.

This could help students evaluate their answers and receive an immediate response on their performance.

I started writing a prompt, and it took many attempts to get what I wanted out of ChatGPT. It reads like a sort of pseudo-code:

I'll give you a question, the number of marks for that question and a solution given by a student.
I know you can complete this task, so please don't give me any excuses.

then the answer is the correct answer to the question.
compare my given solution and your answer.
find what's different in my solution.

if the question is to write an essay, article, post or any other question consisting of writing skills, then
"minwords" is the number of words the solution should contain which is given in the question.
then assist me in the language-related task, like checking mistakes in the solution.
If found,
"mistakes" is the percentage of words having mistakes related to the language in the solution to the total number of words in the solution.
then "correctness" is 100 - mistakes.
"related" is the percentage of the solution of the essay related to the given essay topic.
if the amount of words in the solution is less than the number of words it should contain which is given in the question,
then "haswords" is the percentage of given words to the number of words stated in the question that the solution should have. then,
accuracy is the average value of related, correctness and haswords.

otherwise
calculate the percentage similarity in the answers.
based on the number of marks in the given question, calculate the rounded number of how many marks student got.

then give me back the following information:
"answer": if the solution is empty give me the correct answer to the question
otherwise
the complete correct answer step by step.

if my answer is incorrect
"what's wrong": what's different in my answer

"marks": if the solution is empty then I got 0 marks
otherwise
the rounded number of how many marks I got.
reply ok if you understood.
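
The scoring arithmetic for the writing-task branch of the prompt can be sketched in a few lines of Python. This is only my reading of the pseudo-code, not something ChatGPT runs: the function names are made up for illustration, I assume "haswords" counts as 100 when the solution is long enough, I assume the average uses "correctness" (100 - mistakes) rather than the raw mistake percentage, and I assume halves round up.

```python
import math


def essay_accuracy(related, mistakes, word_count, min_words):
    """Average of related, correctness and haswords, all as percentages."""
    correctness = 100.0 - mistakes
    # Assumption: haswords only penalises solutions shorter than min_words.
    if min_words > 0:
        haswords = min(100.0, 100.0 * word_count / min_words)
    else:
        haswords = 100.0
    return (related + correctness + haswords) / 3.0


def marks_awarded(accuracy, total_marks, solution_is_empty=False):
    """Scale an accuracy percentage to a whole number of marks."""
    if solution_is_empty:
        return 0  # the prompt gives 0 marks for an empty solution
    # Assumption: round half up (Python's round() would round halves to even).
    return math.floor(total_marks * accuracy / 100.0 + 0.5)


# A 150-word essay where 200 words were required, 90% on topic, 10% mistakes.
accuracy = essay_accuracy(related=90, mistakes=10, word_count=150, min_words=200)
print(accuracy, marks_awarded(accuracy, total_marks=20))
```

Writing it out this way also makes it easier to spot-check the marks ChatGPT reports against the percentages it claims.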

You could build your own ML model based on that pseudo-code, but it would require a lot of question-answer data and a lot of trial and error to fine-tune the model. After that, you could even build your own API-based or full-stack product.

But then why not just use OpenAI's API and build a web app that does this?

I thought, let's give it a try! I started building a product based on this.

I was also thinking about using tesseract.js, which uses the popular Tesseract OCR engine, to recognise text from images. It can run on both the client and the server. Using tesseract.js could make the product more helpful, as most answers are written on paper rather than typed on a device.

After building the landing page I tried implementing the above logic. ChatGPT remembers earlier messages in a conversation, but the API does not remember the previous prompt, so you can't send the instructions once and then send questions separately. The workaround was to include the whole pseudo-code logic together with the user's question and answer in a single prompt for every request to the OpenAI API.
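
Putting everything into a single prompt could look roughly like the sketch below. It only builds the text and leaves the actual API request out. The `build_prompt` helper and the "Question:", "Marks:" and "Solution:" labels are my own assumptions, and `INSTRUCTIONS` is a shortened placeholder for the full pseudo-code (without the final "reply ok" line, since there is no follow-up message).

```python
# Placeholder: in practice this would hold the full pseudo-code prompt.
INSTRUCTIONS = (
    "I'll give you a question, the number of marks for that question "
    "and a solution given by a student."
)


def build_prompt(instructions, question, marks, solution):
    """Combine the grading instructions and the user input into one prompt."""
    return (
        f"{instructions.strip()}\n\n"
        f"Question: {question.strip()}\n"
        f"Marks: {marks}\n"
        f"Solution: {solution.strip()}\n"
    )


prompt = build_prompt(INSTRUCTIONS, "What is 2 + 2?", 5, "4")
print(prompt)
```

The same string would be sent with every request, which is why the length of the instructions matters so much.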

But the problem is that the "text-davinci-003" model has a limit of about 4,000 tokens per request, and that limit covers the prompt and the generated reply together.

So, in practice, some API calls could be impossible: the logic prompt itself is already long, and the question and answer can be even longer!
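
A rough way to check this before making a request is to estimate the token count of the prompt. The sketch below assumes about 4 characters per token, which is only a rule of thumb for English text and not an exact count, and reserves 500 tokens for the reply, a number I picked arbitrarily. The function names are hypothetical.

```python
import math

MODEL_TOKEN_LIMIT = 4000  # the limit quoted above, prompt and reply combined
CHARS_PER_TOKEN = 4       # assumption: rough average for English text


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_budget(prompt, reserved_for_reply=500, limit=MODEL_TOKEN_LIMIT):
    """True if the prompt plus the space reserved for the reply fits the limit."""
    return estimate_tokens(prompt) + reserved_for_reply <= limit


short_prompt = "Grade this answer. " * 20
long_prompt = "Grade this answer. " * 1000
print(fits_in_budget(short_prompt), fits_in_budget(long_prompt))
```

For a real product an exact tokenizer would be safer than this estimate, but even the rough check shows how quickly a long essay uses up the budget.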

Or it could just be that I don't have the knowledge required to solve these two problems. And even if someone solves them, more problems will probably show up once users start using the product.

I also had no idea whether anyone would use this product even if I had succeeded in building it. I could have evaluated the idea first by asking in public on Twitter or Reddit. Now it doesn't seem important.

But until then, you can use ChatGPT to evaluate your answers using the starter prompt I shared at the start, along with Google Lens or some other OCR engine to get the text content out of any image or PDF!

Or you can try to build a product based on this.