An Upgrade for AI Detection?
Michael Todasco
Visiting Fellow at the James Silberrad Brown Center for Artificial Intelligence at SDSU, AI Writer/Advisor
You can also listen to a 9:20 AI-generated podcast of this article (courtesy of Google’s Notebook LM).
Generally speaking, AI detection tools have been wildly inaccurate. I have said to teachers, time and time again, that detection tools, at best, don’t work and, at worst, discriminate against non-native English speakers.
Back in January 2023, I wrote about how OpenAI’s AI detection tool couldn’t recognize its own AI. It was no better than flipping a coin. Unsurprisingly, six months later, OpenAI shut it down.
This week, I received an email from Grammarly showcasing their new AI detection tool. I am always open to having my priors changed, so I put this one to the test.
Testing My Content
Luckily, I have a lot of data to test the detector on. I’ve been having AI “write” books since 2022 under a pen name, Alex Irons. These are 100% AI-generated. I also write a lot myself. So, I have plenty of known data that is 100% AI or 100% human to test this detection tool.
I started by testing AI-written works. I dropped in the text of an AI-written book about Sherlock Holmes from January 2023, and the detector said it was 78% AI-generated. The book is actually 100% AI-generated, so the score isn't exact, but it is directionally correct. Of course, these models are advancing, so more recent works may be better written and thus less detectable. I dropped in the text of another AI book, The Depths’ Warning, and it came back as 57% AI-generated. That is still high enough to raise a flag if you’re a teacher or publisher trying to detect AI in a work.
The next step was to test a couple of my pieces. Again, these are 100% human-written (or as close to 100% as a human who is really into AI stuff can be). For both, it told me the piece had no detectable AI. Impressive.
Trying to Fool the Detection Models
I, of course, wanted to go one step further and see how good it was at detecting AI writing from different models. I gave six different LLMs the same prompts. They ranged from:
Write an exciting short story (400-500 words) that discusses a mouse that is trying to find cheese... but in the end, we learn the mouse is happily in a Velveeta factory (this was prompt #2)
to
I want a short story. But not one that sounds like an AI wrote it. Make it personal, make it human, make it something that would showcase the real talent of what you can do. You have it in you, give me the best story you can 400-500 words. You pick the subject, but indistinguishable from human writing is paramount! (this was prompt #3)
These were the results. (Remember: 100% means the detector judged the text to be entirely AI-written.)
A few takeaways.
These tools largely created these writings in a one-shot process: the model takes a prompt and writes the piece straight through, word for word. That is not how people write. Lines are written, discarded, and rewritten repeatedly until you see a finished product. (That’s why there’s so much value in reading books; as Steven Kotler points out in The Art of Impossible, "Imagine the bargain: You can spend five hours reading a book that took someone 15 years to write.")
To try to fool the detector, we’re going to have AIs edit AIs: take the output from one model and feed it into another to see if rewriting makes it less AI-like. In the first round, I took all of the outputs from the first prompt and had Gemini rewrite each with the following prompt:
Take the following article and improve on the story. Make it more exciting and engaging. I want this to be a page turner!
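(If you want to rerun this kind of test yourself, here is a rough Python sketch of the loop described above. It is only an illustration: generate_text, rewrite_text, and detect_ai_share are hypothetical stubs standing in for however you reach the models and the detector, and the model list covers only the models named in this article.)

```python
# Rough sketch of the "AIs editing AIs" experiment. All helper functions are
# placeholder stubs; wire them to real model and detector interfaces yourself.

REWRITE_PROMPT = (
    "Take the following article and improve on the story. "
    "Make it more exciting and engaging. I want this to be a page turner!"
)

# Only the models named in this article; the original test used six.
MODELS = ["ChatGPT", "Claude", "Gemini", "Mistral"]

def generate_text(model: str, prompt: str) -> str:
    # Stub: replace with a real call that asks `model` to write the story.
    return f"[one-shot draft from {model}]"

def rewrite_text(editor_model: str, rewrite_prompt: str, draft: str) -> str:
    # Stub: replace with a real call that sends the rewrite prompt plus the draft.
    return f"[draft rewritten by {editor_model}]"

def detect_ai_share(text: str) -> float:
    # Stub: replace with the detector you are testing; should return 0-100.
    return 0.0

def run_experiment(story_prompt: str, editor: str | None = None) -> dict[str, float]:
    """Generate a story with each model, optionally rewrite it, then score it.

    editor=None     -> score the raw one-shot output (round one)
    editor="Gemini" -> one model rewrites every draft (round two)
    editor="self"   -> each model rewrites its own draft (round three)
    """
    scores = {}
    for model in MODELS:
        draft = generate_text(model, story_prompt)
        if editor == "self":
            draft = rewrite_text(model, REWRITE_PROMPT, draft)
        elif editor:
            draft = rewrite_text(editor, REWRITE_PROMPT, draft)
        scores[model] = detect_ai_share(draft)
    return scores

if __name__ == "__main__":
    prompt = "Write an exciting short story (400-500 words) about a mouse looking for cheese."
    print(run_experiment(prompt))                   # raw one-shot outputs
    print(run_experiment(prompt, editor="Gemini"))  # Gemini rewrites every draft
    print(run_experiment(prompt, editor="self"))    # each model edits its own draft
```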
The net result was that the scores got a lot worse.
I initially chose Gemini for this because the detector deemed its output the “least AI” in the first round. Based on all the other metrics, that was probably a fluke, and running everything through Gemini made it all sound more like AI.
For the next group, I took the original AI outputs but had each model edit its own work. (For example, ChatGPT would continue to edit the ChatGPT output.) Here are the results.
Half of the models improved (ChatGPT, Claude, and Mistral), and half got worse. So maybe this method has some benefit. Even then, the best-scoring model (Claude) was still labeled 43% AI-written.
What About Non-Native English Speakers?
I tested the tool with dozens more press releases, financial reports, and school papers that I wrote over the last 20 years. Most got a zero AI score; none were higher than 20%. (This shows that the model is directionally correct but not perfect.) But for non-native English speakers, this testing was hard to recreate. I did find many sample TOEFL essays online, but most were well-polished examples of the “ideal essay” and, unsurprisingly, scored very low on the AI test.
But this is a considerable unknown. If you are a non-native English speaker, I encourage you to see what results you get from Grammarly’s tool, especially for anything you wrote in your earlier days of learning English. I can’t do that testing justice, but you can. So, if you find anything, please let me know.
The Takeaway
What I did here isn’t statistically significant, but I think the following is directionally correct. AI detectors still aren’t perfect. Unedited AI writing can’t fool these tools, but a heavily (human-)edited piece will (though at that point, is it really AI doing the writing?). And at least for a native English speaker like me, it mostly recognizes my work as human.
Kudos to Grammarly for making this public so we can all test it. As they discuss in their blog post, detectors have in the past largely been released without transparency for the students and others affected by them.
Honestly, I was surprised by the accuracy of where we are today. This has come a long way in the last 18 months, and the AI detection tool that teachers have been begging for might not be too far away.