How to Detect AI-Generated Answers in Student Assignments? A List of 11 Cues

Had a good session today with Dr. Waleed Akhtar, PhD (Comp Sci).

We went through some principles for detecting AI-generated answers in real student submissions.

In case you're interested, here's a list of the alarming cues we identified:

  1. no grammar mistakes ==> ChatGPT makes virtually no grammar mistakes. In contrast, grammar mistakes are very common in freely written student responses.
  2. use of peculiar words (e.g., "foster") ==> when you see these, you think, "there's no way the student even knows what this word means". There are certain expressions ChatGPT uses which you'll start to recognize after using it for some time.
  3. long sentences ==> ChatGPT tends to write long sentences. Not short ones.
  4. lists ==> ChatGPT often makes extensive lists of concepts, covering a broad range of topics. Students can also do this, but often not to the same extent.
  5. tautologies (redundant words) ==> ChatGPT repeats the same words, especially adjectives, several times (e.g., "This can lead to gentrification and displacement of lower-income residents"; here, "gentrification" and "displacement" mean the same thing). Again, students can also do this, but it's much more common in ChatGPT-generated text.
  6. empty rhetoric ==> these are sentences that sound good but are void of meaning. Again, students can write them, but they're very common in ChatGPT-generated answers (e.g., "In our pursuit of excellence, we consistently leverage innovative synergies and dynamic strategies to empower our stakeholders and enhance our global footprint, ensuring unparalleled success in the ever-evolving landscape of the future." ==> sounds good, means nothing).
  7. paragraph ending with a normative statement ==> for example, we ask the student to define fraudulent activities in platform business and give examples. The answer ends with a sentence like, "Platforms should also incorporate methods to identify and prevent market manipulation, such as monitoring trade activity, ensuring transparency, and educating users about the dangers involved." This is a huge red flag, as ChatGPT tends to take a moral disposition even when not asked to do so.
  8. positivity bias ==> trying to create an overly optimistic vision of the world, often expressed in normative statements (e.g., "Platforms need to go beyond legal requirements and consider the broader impact they have on society to ensure fairness and positive outcomes for everyone involved."). Once again, students can also have such pipe dreams, but a positivity bias (or a "trying to save the world" complex) is very typical of ChatGPT.
  9. off-topic / redundant information ==> for example, we may ask, "Explain market manipulation in platforms using cryptocurrency as an example." The answer: "Market manipulation on cryptocurrency platforms may have serious effects, including financial losses for uninformed traders, reputational damage to the coin, and governmental attention. As a result, traders must undertake extensive research, exercise prudence, and be aware of unexpected price increases or too aggressive marketing efforts. Platforms should also incorporate methods to identify and prevent market manipulation, such as monitoring trade activity, ensuring transparency, and educating users about the dangers involved." The answer is much broader than what we asked. Such answers, containing more information than needed to answer the question, are suspicious because ChatGPT tends to provide more than what was asked for.
  10. fixation on fixing ==> the AI always tries to tell you what should be done, even when we only ask about a given phenomenon. Like, you ask about a crime, and it starts telling you how to prevent crimes. (This is another variation of the aforementioned positivity bias.)
  11. lengthy ==> longer answers are more likely to be AI-generated (this is a residual effect of the off-topic cue above).

...one alarming cue may not be enough to raise suspicion to a level where you need to contact the student and ask for clarification on how they did the assignment, but when the cues accumulate, a clear pattern usually forms.
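Several of these cues are measurable directly from the text's surface. Below is a minimal sketch in Python of how cues 2, 3, 4, and 11 might be combined into an accumulating suspicion score. To be clear: the buzzword list and all thresholds are illustrative assumptions, not calibrated values from our session, and cue 1 (flawless grammar) is omitted because it would need a separate grammar checker.

```python
import re

# Illustrative AI-flavored buzzwords (cue 2); an assumption seeded from the
# examples in this post, not a validated lexicon.
BUZZWORDS = {"foster", "leverage", "synergies", "empower", "stakeholders",
             "delve", "realm", "landscape", "unparalleled", "ever-evolving"}

def cue_flags(text: str) -> dict:
    """Flag a few surface-level cues from the list above."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z'-]+", text.lower())
    avg_sentence_len = len(words) / max(len(sentences), 1)
    return {
        "peculiar_words": any(w in BUZZWORDS for w in words),    # cue 2
        "long_sentences": avg_sentence_len > 22,                 # cue 3 (threshold assumed)
        "list_heavy": len(re.findall(r"^\s*(?:[-*]|\d+\.)",
                                     text, re.MULTILINE)) >= 3,  # cue 4
        "lengthy": len(words) > 250,                             # cue 11 (threshold assumed)
    }

def suspicion_score(text: str) -> int:
    """One flag means little; accumulated flags form a pattern."""
    return sum(cue_flags(text).values())

if __name__ == "__main__":
    answer = ("In our pursuit of excellence, we consistently leverage "
              "innovative synergies and dynamic strategies to empower our "
              "stakeholders and enhance our global footprint, ensuring "
              "unparalleled success in the ever-evolving landscape of the future.")
    print(cue_flags(answer))  # two cues fire on the cue-6 example above
    print("suspicion score:", suspicion_score(answer))
```

The harder cues (tautology, empty rhetoric, normative endings, positivity bias) require semantic judgment and are better left to a human reader.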

What we also observed was that questions that require thinking are good separators (good thinking expressed in very good but average-sounding language is an indicator, since good thinkers also express themselves more uniquely). Furthermore, most students who use the AI don't appear to be editing their answers (or edit them very little)! They pretty much copy the AI-generated answer as is.

Overall, AI risks strongly diminishing students' own thinking. This carries a huge risk to society; it can lead us to raise robots that are even more easily controllable than people in the past. We can already see that the new generation lacks critical thinking in much the same way as the older generations (being "woke" is not critical thinking, by the way; it's just another form of following the pack). So, educators need to be awake here. Waleed and I take this very seriously in our upcoming courses and are now creating countermeasures (combining carrots and sticks).

Arash Sammander

CCO, Co-founder // Sr. Lecturer // Entrepreneur // “I help empower those around me, so that they can do the same.”

8 months

Thanks for this. They (AI) also love not only creating lists but adding subtitles to every line. One way I've worked around this is to make their first assignments about personal topics, like their life, passions, etc. Then I have a frame of reference for their voice and style, so when I'm in doubt, I can refer back to their original work. :p We have a very short time before these things won't work anymore. I've spent more time helping my students learn to use it as a tool and not a crutch.

Ausrine Silenskyte

Researcher-Program manager

9 months

Brilliant summary, Joni! Completely agree and have used similar 'tests', but you made it all clear and explicit in one place. Thanks!

Minna-Maarit Jaskari

Program Manager - University Lecturer - LegoSeriousPlay facilitator

9 months

My assignments allow AI use pretty liberally (thanks to you, Joni, I have become even more liberal). One thing to look for is how the apostrophe (') is marked: our system and AI mark it differently. But the strange words are actually typically not the clue. "Foster," "enhance," "delve into," "realm of": words that often seem AI-like are very typically used in qualitative and interpretative (consumer) research. As a teacher, I can easily point out text that is too good. But then again, if the text is not good enough, I ask why the student didn't use AI to improve the argumentation. And I teach them how it can be done. Students are, at this point in time, very scared of using generative AI. We really need to teach them better. You are making good efforts in this, Joni. Thanks!
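For anyone who wants to check the apostrophe cue automatically, here's a trivial sketch in Python that counts the two forms. Which form your LMS editor versus the AI produces varies by system, so that mapping is an assumption to verify locally:

```python
def apostrophe_profile(text: str) -> dict:
    """Count straight (') vs. typographic (\u2019) apostrophes. Which form
    your LMS editor or the AI emits is system-specific; verify locally."""
    return {
        "straight": text.count("'"),
        "typographic": text.count("\u2019"),
    }

# e.g., a submission mixing both forms is itself worth a second look
print(apostrophe_profile("It's fine, but the student\u2019s other texts differ."))
```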

Ahmed Zidane

Data Scientist @Zeal

9 个月

But what if we gave these tips to the model so it avoids them while generating answers?

Trang Xuan

Doctoral student @Uni Vaasa | Marketing

9 months

And write very long paragraphs without any references?
