I Caught 16 US Presidents Using ChatGPT
[Image: Generated by Author and AI]


This story is about AI-generated-text detectors and how accurate their scores really are.

While preparing slides, writing an article, or drafting an email, I occasionally use one AI-generation platform or another. I've always checked that the percentage of Ivan-generated text is over 90% - that's my bar. Surprisingly, I recently noticed that some of my human-written text got labeled as AI-written. Since I gave up on the idea of reaching the heights of AI intelligence long ago, I started to suspect something strange. I checked several other platforms, and almost half of them claimed the same. If working with LLMs were a medical procedure, I would've understood such side effects. Still, in this case, I decided to dig deeper.

This is the story of how ChatGPT texts get caught.

To confirm my suspicion, I needed text with virtually no chance of being AI-generated. Something traceable across time would be great. This is where the idea of using inaugural speeches came from.

I had to figure out if ChatGPT knew the text. So I gave George Washington a try:

[Image: Generated by Author]

Yep. ChatGPT knows. To reduce the bias, let's ask if it knows them all:

[Image: Generated by Author]

Yep, no need to deny it. I know you know every single word.

It is common knowledge how Large Language Models were trained - on large amounts of text data. From there, they've picked up the patterns of how people write.

There are two main metrics that are used to estimate the probability of text being AI-generated: perplexity and burstiness.

Perplexity measures how well AI language models can predict the next word based only on the words that came before it in the sentence. Imagine you're reading a story: the next word in a sentence should fit well with the context of the previous text.

If you think about it, GPT and other models must predict what word should appear next while generating text, so scoring an existing text this way is a kind of reverse engineering. If a text gets a high perplexity score, the model struggles to guess each next word - the writing is less predictable, which is typical of humans. A low perplexity score means the model finds the following words easy to predict, which is exactly what its own output looks like. That's why detectors treat low-perplexity text as likely AI-generated.
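To make this concrete, here's a minimal sketch of the textbook perplexity computation, using GPT-2 from Hugging Face's transformers library. This is the standard definition (exponentiated average next-token cross-entropy), not necessarily the exact formula any particular detector uses:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    # Score the text against the model's own next-token predictions
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels makes the model return the average
        # next-token cross-entropy over the whole sequence
        loss = model(enc.input_ids, labels=enc.input_ids).loss
    # Perplexity = exp(average cross-entropy); lower = more predictable
    return torch.exp(loss).item()

print(perplexity("Among the vicissitudes incident to life no event "
                 "could have filled me with greater anxieties."))
```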

Burstiness is related to the variation in sentence lengths. Think about how people sometimes use short, snappy sentences to convey simple ideas, while at other times they use longer, more complex sentences to explain intricate concepts. Human writing tends to show this natural variation; AI-generated text is often more uniform, so low burstiness is another red flag for detectors.
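There's no single agreed formula for burstiness; one simple proxy, assumed here purely for illustration, is the coefficient of variation of sentence lengths:

```python
import re
import statistics

def burstiness(text: str) -> float:
    # Naive sentence split on terminal punctuation
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    # Coefficient of variation of sentence length:
    # higher = more varied sentences = more "human-looking"
    return statistics.stdev(lengths) / statistics.mean(lengths)

print(burstiness("I came. I saw. After a long and arduous campaign, "
                 "I conquered at last."))
```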

OK, now that we're done with the theory, let's look at the data we have (the link to the code will be in the comments below). It is worth mentioning that the latest speech used is dated 2017 (Trump), to avoid possible bias.
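If you want to reproduce the setup before opening the linked code, one readily available source of the speeches is NLTK's inaugural corpus - an assumption on my part, not necessarily the source used for this article:

```python
import nltk

nltk.download("inaugural", quiet=True)
from nltk.corpus import inaugural

# File ids look like "1789-Washington.txt"
speeches = {fid: inaugural.raw(fid) for fid in inaugural.fileids()}
print(f"{len(speeches)} speeches loaded, first one: {sorted(speeches)[0]}")
```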

[Image: Generated by Author]

After some cleaning, we can start looking into different AI detectors. After browsing several pages of Google results, I found four tools that have an open API with no restrictions on the number of texts per unregistered user (a request sketch follows the list):

  • [zerogpt.com]: https://api.zerogpt.com/api/detect/detectText
  • [writer.com/ai-content-detector]: https://writer.com/wp-admin/admin-ajax.php
  • [contentatscale.ai/ai-content-detector]: https://contentatscale.ai/ai-content-detector/
  • [crossplag.com/ai-content-detector]: https://j1o8u6du62.execute-api.eu-central-1.amazonaws.com/production/detect
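Here's a sketch of what calling one of these endpoints could look like. The URL comes from the list above, but the payload field name is my assumption - inspect the site's network traffic to confirm the exact shape before relying on this:

```python
import requests

speech = "Fellow-Citizens of the Senate and of the House of Representatives: ..."

# NOTE: "input_text" is an assumed field name, not official documentation
resp = requests.post(
    "https://api.zerogpt.com/api/detect/detectText",
    json={"input_text": speech},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # typically includes an AI-generated percentage
```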

After formatting all the results, we can group them by Name:
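In pandas, that grouping step might look roughly like this - a minimal sketch assuming a hypothetical detector_results.csv with one row per (president, detector) pair:

```python
import pandas as pd

# Assumed layout: columns "Name", "detector", and "ai_score" (0-100)
df = pd.read_csv("detector_results.csv")

# One row per president, one column per detection tool
by_name = df.pivot_table(index="Name", columns="detector", values="ai_score")
print(by_name.head())
```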

[Image: Generated by Author]

Quite intriguing! Out of our four tools, not a single one identified all speeches as completely human-written. The "contentatscale" detector came closest, although it also stumbled over one of the US Presidents. If you're wondering why there are only 39 unique names, here's a fun fact: five US Presidents never gave an inaugural address.

As we are comparing US Presidents against one another, we can apply a ranking function to sort them by their "detection scores":
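A minimal sketch of such a ranking step, with invented numbers purely to show the mechanics (not the article's actual scores):

```python
import pandas as pd

# Toy stand-in for the per-president scores (values invented for illustration)
scores = pd.DataFrame(
    {
        "zerogpt": [60.0, 30.0, 1.0],
        "crossplag": [100.0, 45.0, 60.0],
    },
    index=["Richard Nixon", "George W. Bush", "Warren G. Harding"],
)

# Rank by the average score across tools: rank 1 = most human-looking
scores["avg"] = scores.mean(axis=1)
scores["rank"] = scores["avg"].rank(method="min").astype(int)
print(scores.sort_values("rank"))
```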

[Image: Generated by Author]

According to the data, Warren G. Harding is the lowest-ranked ChatGPT-using President. Despite his high "crossplag" score, the other three AI-detection tools found him innocent - unlike George W. Bush, whose inaugural speeches were found guilty at least three times.

[Image: Generated by Author]

Now let's decide on a threshold. While a soft-guilty threshold could be set at 10%, let's put the hard-guilty threshold at 25%. That means that if at least one tool's score is higher than that, we assume ChatGPT or a similar tool was used.
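In code, the hard-guilty rule might look like this (column layout and numbers are assumed for illustration):

```python
import pandas as pd

HARD_THRESHOLD = 25.0  # percent AI-generated

def flag_guilty(scores: pd.DataFrame, threshold: float = HARD_THRESHOLD) -> pd.Series:
    # Guilty if ANY detector's score exceeds the hard threshold
    return scores.max(axis=1) > threshold

# Invented numbers, purely to show the mechanics
toy = pd.DataFrame(
    {"zerogpt": [60.0, 4.0], "crossplag": [100.0, 8.0]},
    index=["president_a", "president_b"],
)
print(flag_guilty(toy))  # president_a: True, president_b: False
```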

In this case, 16 of 39 US Presidents used ChatGPT while preparing their inaugural addresses. Richard Milhous Nixon wasn't even hiding it that much! The "crossplag" tool gave him a 100% AI-generated score, and "zerogpt" estimated that only 40% of his speech was written by a human.

In conclusion, I want to underline that I'm not actually claiming any of the former Presidents are guilty of using ChatGPT. This article aimed to explain how generated-text detectors work and to show that their estimates are sometimes far from correct. If you're using Bard, ChatGPT, or another LLM, you might want to run your text through several detectors if you care about it being recognized as human-written.


I would appreciate your support if you've enjoyed the article. Until next time!

#chatgpt #gpt #ai #python #pandas
