ChatGPT 4o as a grading assistant
Vincent C Schoots
Cognitive Neuroscientist | Instructor AI | Effective Altruism enthusiast
OK, so 3 days ago I wrote a post evaluating ChatGPT 4 as a grading assistant (https://www.dhirubhai.net/posts/vincentschoots_education-ai-chatgpt-activity-7195830142634528771-MS-V?utm_source=share&utm_medium=member_desktop).
The day after, ChatGPT 4o came out.
So I redid it.
1) The correlation between my grade and ChatGPT's grade dropped from 0.79 to 0.48. There was even one 7-point and one 8-point difference on the (French) 20-point scale. Uh-oh!
2) BUT unlike with ChatGPT 4, this time training improved performance! (In the graph below, you see how the disagreement (raw residual) shrinks over time.)
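For anyone who wants to run the same two checks on their own grades, here is a minimal sketch. The grade lists are made-up placeholders, not my actual data, and the function names are just illustrative. I am also assuming that "raw residual" means ChatGPT's grade minus my grade, and that "becomes less with time" can be checked by correlating the absolute residual with the grading order.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def raw_residuals(human, model):
    """Assumed definition: model grade minus human grade, per paper."""
    return [m - h for h, m in zip(human, model)]


# Placeholder grades on a 20-point scale, listed in grading order.
# These values are invented for illustration only.
human_grades = [12, 15, 9, 14, 11, 16, 13, 10]
model_grades = [18, 10, 13, 11, 13, 15, 14, 10]

residuals = raw_residuals(human_grades, model_grades)
abs_residuals = [abs(r) for r in residuals]

# Check 1: how well do the two sets of grades agree overall?
agreement = pearson(human_grades, model_grades)

# Check 2: does the disagreement shrink as grading goes on?
# A negative value means the absolute residual falls over time.
order = list(range(len(abs_residuals)))
trend = pearson(order, abs_residuals)

print(f"correlation between graders: {agreement:.2f}")
print(f"trend of |residual| over grading order: {trend:.2f}")
```

A low overall correlation and a clearly negative trend can coexist: a few large early misses drag the correlation down even if the later grades line up well.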
For fairness, I used the exact same prompts as last time.
What do you say? Is this better or worse than before?