I Caught 16 US Presidents Using ChatGPT
[Image: Generated by Author and AI]


This story is about AI-generated-text detectors and how accurate their scores really are.

While preparing slides, writing an article, or drafting an email, I occasionally use one AI-generation platform or another. I've always checked that the percentage of Ivan-generated text is over 90% - that's my bar. Surprisingly, I recently noticed that some of my human-written text got labeled as AI-written. Since I gave up on the idea of reaching the heights of AI intelligence long ago, I started to suspect something strange. I checked several other platforms, and almost half of them claimed the same. If working with LLMs were a medical procedure, I would've understood such side effects. Still, in this case, I decided to dig deeper.

This is the story of how ChatGPT texts get caught.

To confirm my suspicion, I needed text with virtually no chance of being AI-generated. Something traceable across time would be great. This is where the idea of using inaugural speeches came from.

I had to figure out if ChatGPT knew the text. So I gave George Washington a try:

[Image: Generated by Author]

Yep. ChatGPT knows. To reduce the bias, let's ask if it knows them all:

[Image: Generated by Author]

Yep, no need to deny it. I know you know every single word.

It is common knowledge how Large Language Models were trained - on large amounts of text data. From there, they've picked up the patterns of how people write.

There are two main metrics that are used to estimate the probability of text being AI-generated: perplexity and burstiness.

Perplexity measures how well AI language models can predict the next word based only on the words that came before it in the sentence. Imagine you're reading a story: the next word in a sentence should fit well with the context of the previous text.

If you think about it, GPT and other models must predict what word should appear next while generating text, so scoring an existing text this way is a kind of reverse engineering. If a text gets a high perplexity score, the model struggles to guess each next word - the writing is less predictable, which is typical of humans. A low perplexity score means the model finds the following words easy to predict, which is exactly what its own output looks like. That's why detectors treat low-perplexity text as likely AI-generated.
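To make this concrete, here's a minimal sketch of the textbook perplexity computation, using GPT-2 from Hugging Face's transformers library. This is the standard definition (exponentiated average next-token cross-entropy), not necessarily the exact formula any particular detector uses:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    # Score the text against the model's own next-token predictions
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels makes the model return the average
        # next-token cross-entropy over the whole sequence
        loss = model(enc.input_ids, labels=enc.input_ids).loss
    # Perplexity = exp(average cross-entropy); lower = more predictable
    return torch.exp(loss).item()

print(perplexity("Among the vicissitudes incident to life no event "
                 "could have filled me with greater anxieties."))
```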

Burstiness is related to the variation in sentence lengths. Think about how people sometimes use short, snappy sentences to convey simple ideas, while at other times they use longer, more complex sentences to explain intricate concepts. Human writing tends to show this natural variation; AI-generated text is often more uniform, so low burstiness is another red flag for detectors.
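There's no single agreed formula for burstiness; one simple proxy, assumed here purely for illustration, is the coefficient of variation of sentence lengths:

```python
import re
import statistics

def burstiness(text: str) -> float:
    # Naive sentence split on terminal punctuation
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    # Coefficient of variation of sentence length:
    # higher = more varied sentences = more "human-looking"
    return statistics.stdev(lengths) / statistics.mean(lengths)

print(burstiness("I came. I saw. After a long and arduous campaign, "
                 "I conquered at last."))
```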

OK, now that we're done with the theory, let's look at the data we have (the link to the code will be in the comments below). It is worth mentioning that the latest speech used is dated 2017 (Trump), to avoid possible bias.
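If you want to reproduce the setup before opening the linked code, one readily available source of the speeches is NLTK's inaugural corpus - an assumption on my part, not necessarily the source used for this article:

```python
import nltk

nltk.download("inaugural", quiet=True)
from nltk.corpus import inaugural

# File ids look like "1789-Washington.txt"
speeches = {fid: inaugural.raw(fid) for fid in inaugural.fileids()}
print(f"{len(speeches)} speeches loaded, first one: {sorted(speeches)[0]}")
```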

[Image: Generated by Author]

After some cleaning, we can start looking into different AI detectors. After browsing several pages of Google results, I found four tools that have an open API with no restrictions on the number of texts per unregistered user (a request sketch follows the list):

  • [zerogpt.com]: https://api.zerogpt.com/api/detect/detectText
  • [writer.com/ai-content-detector]: https://writer.com/wp-admin/admin-ajax.php
  • [contentatscale.ai/ai-content-detector]: https://contentatscale.ai/ai-content-detector/
  • [crossplag.com/ai-content-detector]: https://j1o8u6du62.execute-api.eu-central-1.amazonaws.com/production/detect
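Here's a sketch of what calling one of these endpoints could look like. The URL comes from the list above, but the payload field name is my assumption - inspect the site's network traffic to confirm the exact shape before relying on this:

```python
import requests

speech = "Fellow-Citizens of the Senate and of the House of Representatives: ..."

# NOTE: "input_text" is an assumed field name, not official documentation
resp = requests.post(
    "https://api.zerogpt.com/api/detect/detectText",
    json={"input_text": speech},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # typically includes an AI-generated percentage
```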

After formatting all the results, we can group them by Name:
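In pandas, that grouping step might look roughly like this - a minimal sketch assuming a hypothetical detector_results.csv with one row per (president, detector) pair:

```python
import pandas as pd

# Assumed layout: columns "Name", "detector", and "ai_score" (0-100)
df = pd.read_csv("detector_results.csv")

# One row per president, one column per detection tool
by_name = df.pivot_table(index="Name", columns="detector", values="ai_score")
print(by_name.head())
```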

[Image: Generated by Author]

Quite intriguing! Out of our four tools, not a single one identified all speeches as completely human-written. The "contentatscale" detector came closest, although it also stumbled over one of the US Presidents. If you're wondering why there are only 39 unique names, here's a fun fact: five US Presidents never gave an inaugural address.

As we are comparing US Presidents against one another, we can apply a ranking function to sort them by their "detection scores":
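A minimal sketch of such a ranking step, with invented numbers purely to show the mechanics (not the article's actual scores):

```python
import pandas as pd

# Toy stand-in for the per-president scores (values invented for illustration)
scores = pd.DataFrame(
    {
        "zerogpt": [60.0, 30.0, 1.0],
        "crossplag": [100.0, 45.0, 60.0],
    },
    index=["Richard Nixon", "George W. Bush", "Warren G. Harding"],
)

# Rank by the average score across tools: rank 1 = most human-looking
scores["avg"] = scores.mean(axis=1)
scores["rank"] = scores["avg"].rank(method="min").astype(int)
print(scores.sort_values("rank"))
```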

[Image: Generated by Author]

According to the data, Warren G. Harding is the lowest-ranked ChatGPT-using President. Despite his high "crossplag" score, the other three AI-detection tools found him innocent - unlike George W. Bush, whose inaugural speeches were found guilty at least three times.

[Image: Generated by Author]

Now let's decide on a threshold. While a soft-guilty threshold could be set at 10%, let's put the hard-guilty threshold at 25%. That means that if at least one tool's score is higher than that, we assume ChatGPT or a similar tool was used.
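In code, the hard-guilty rule might look like this (column layout and numbers are assumed for illustration):

```python
import pandas as pd

HARD_THRESHOLD = 25.0  # percent AI-generated

def flag_guilty(scores: pd.DataFrame, threshold: float = HARD_THRESHOLD) -> pd.Series:
    # Guilty if ANY detector's score exceeds the hard threshold
    return scores.max(axis=1) > threshold

# Invented numbers, purely to show the mechanics
toy = pd.DataFrame(
    {"zerogpt": [60.0, 4.0], "crossplag": [100.0, 8.0]},
    index=["president_a", "president_b"],
)
print(flag_guilty(toy))  # president_a: True, president_b: False
```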

In this case, 16 of 39 US Presidents used ChatGPT while preparing their inaugural addresses. Richard Milhous Nixon wasn't even hiding it that much! The "crossplag" tool gave him a 100% AI-generated score, and "zerogpt" estimated that only 40% of his speech was written by a human.

In conclusion, I want to underline that I'm not actually claiming any of the former Presidents are guilty of using ChatGPT. This article aimed to explain how generated-text detectors work and to show that their estimates are sometimes far from correct. If you're using Bard, ChatGPT, or another LLM, you might want to run your text through several detectors if you care about it being recognized as human-written.


I would appreciate your support if you've enjoyed the article. Until next time!

#chatgpt #gpt #ai #python #pandas
