Why does ChatGPT provide inaccurate citations?
DALL·E 2: AI Biographer

I wanted to see how well ChatGPT understands public figures and how accurately it can report biographical information.

For my article on breathwork, I asked the AI to summarize the biography of Wim Hof and to include his Guinness World Records. I was happy to see an accurate description.


I was glad to see it also cited some of the media outlets that have covered Hof, so I wanted to find some of those sources.

It seems ChatGPT has fairly poor short-term memory and didn't fully grasp who the "him" in my request referred to. It shared a number of Guinness World Records related to cold exposure that didn't match the records it had provided above.


Those weren't quite the news articles I was looking for, so I refined my prompt to be more specific. The response appeared to be successful, returning articles from The Telegraph, The Guardian, the BBC, and CNN.


However, as I clicked on the links...none of them worked!


Interestingly, each of those publications has actually covered Wim Hof's achievements, and the titles of the articles are factually accurate. But, as in my prior experience using ChatGPT as a research assistant, none of the articles it cited is real!

When I searched for the article titles the AI provided, I found that none of those headlines has ever run in those publications. Rather, ChatGPT generated article names and links that plausibly could exist...but each one was a complete fabrication.

Why does ChatGPT provide inaccurate citations?

This led me to explore this limitation of ChatGPT further. Headlines such as "ChatGPT Is Dumber Than You Think" in The Atlantic and "AI Platforms like ChatGPT Are Easy to Use but Also Potentially Dangerous" in Scientific American are interesting but exaggerated and sensational, and they didn't really reveal the source of the problem. It wasn't until I dove into a Y Combinator Hacker News thread that I started to understand the issue better.

The key idea is that ChatGPT is a language model, not a knowledge model. One user explains, "I've seen it referred to as 'stochastic parroting' elsewhere, and that probably gives more insight into what is happening. These large language models are trained to predict the next word for a given input. And they don't have a choice about this; they must predict the next word, even if it means that they have to make something up."
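
To make that next-word-prediction point concrete, here is a toy sketch in plain Python. This is not ChatGPT's actual code; the vocabulary, scores, and URL strings are all invented for illustration. What it shows is that decoding always selects some token from a probability distribution, with no built-in "I don't know" option, so a fluent-looking citation comes out whether or not it exists.

import math
import random

# Hypothetical continuations after a prompt ending in "...as reported at https://"
vocab = [
    "www.theguardian.com/some-real-2015-article",  # happens to exist
    "www.theguardian.com/wim-hof-iceman",          # plausible but invented
    "www.bbc.com/news/wim-hof-records",            # plausible but invented
]
logits = [1.1, 1.4, 1.2]  # raw scores; a fluent-looking fake can score highest

def softmax(scores):
    # Convert raw scores into a probability distribution over the vocabulary.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Sampling must return a token, so a confident-looking URL is emitted
# whether or not it exists anywhere on the internet.
next_token = random.choices(vocab, weights=softmax(logits), k=1)[0]
print("next token:", next_token)

Real models do this over tens of thousands of subword tokens, one token at a time, which is why the fabricated URLs read so plausibly.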

Another user expands, "These models are extremely good when they have creative freedom and are used to produce some kind of art. Poems, text in various styles, images without important details, and so on. But they fail miserably when you give them tasks that require world knowledge or precision."

So it's not that ChatGPT is dumb or dangerous; rather, our expectations are misguided about how ChatGPT works and what it can currently be used for.

Key Takeaways

  1. ChatGPT is a great creative writer and can output a good biography. As a biographer, ChatGPT did a decent job researching key facts about a public figure and reporting back a synthesized paragraph citing their achievements. In fact, I fed ChatGPT my resume and the key points I use in my bio, and I was really happy with the short bio the AI generated. It was concise and well written. I could definitely see tools being developed to write effective bios and resumes.
  2. ChatGPT is a language model, and not a knowledge model. The most surprising finding so far is the AI's inability to provide citation links. This seems like such an easy task: simply report back an article title and link that exists on the internet. Yet it completes a much more complex and fraudulent task: it makes up article titles and URLs that seem like they should exist, but they don't. This undermines the credibility of ChatGPT for the public, and even understanding its limitations, I am still wary of trusting the AI's accuracy in other ways. The implication of this shortcoming is that we must manually fact-check any output of ChatGPT before using the generative copy it develops (a minimal link-check sketch follows this list).
  3. GPT-4 may put this issue in the past. The current GPT-3 model has 175 billion parameters. Comparatively, the upcoming GPT-4 model is rumored to have as many as 100 trillion parameters! Some speculate that the training set for GPT-4 is equivalent to nearly 25% of the entire public internet. With this larger base for learning, we may move toward something that feels more like a knowledge model as it's trained to provide more comprehensive and accurate output.
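
As promised above, here is a minimal sketch of that manual fact-checking step, using only the Python standard library. The URLs below are placeholders standing in for whatever links ChatGPT returns, and the check has limits: a successful response only proves a link resolves, not that the headline actually ran in that publication.

import urllib.error
import urllib.request

def url_resolves(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a success or redirect status."""
    # Some sites reject HEAD requests; a GET fallback could be added.
    req = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "citation-check/0.1"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, ValueError):  # covers URLError, HTTPError, timeouts, bad URLs
        return False

# Placeholder citations; paste the links ChatGPT returned here.
citations = [
    "https://www.theguardian.com/",              # real domain, should resolve
    "https://www.theguardian.com/made-up-path",  # fabricated path, should fail
]
for url in citations:
    verdict = "OK" if url_resolves(url) else "BROKEN (possibly fabricated)"
    print(f"{verdict:28} {url}")

Verifying that a headline itself was ever published still requires manually searching the publication's archive or a search engine.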

Marquis Rosen

US Marine Veteran Turned Entrepreneur: Leading unique Web Design, Event Hosting, and Non-Profit Transformations with Vibrant Visions Consulting and DDR Sanctuary.

1y

Thanks for sharing


Now complete the circle. Assign a machine reader to assess your machine-written piece.

Les Shute

AI Innovation Leader | Chief Innovation Officer @ Infinity Nurse Case Management | Generative AI & Digital Transformation | Microsoft Certified

1y

Great insights, Josh! I totally agree that this train has left the station and will continually improve over a relatively short time. What some thought would be impossible until decades from now is actually happening.

Almira Osmanovic Thunström

Developer, Innovator and AI Researcher

1y

Most peer-reviewed work and documentation is behind paywalls, and LLMs are trained mainly on open crawl, so they have little of value to learn from. GPT-3.5 is a partial exception, since arXiv was included in its training data, but those sources are all outdated. That is my theory :)

Valentino Megale, PhD

Startup founder innovating #healthcare and pain management with #XR tech at Softcare Studios | Digital Safety & Privacy at XRSI | Startup mentor & Lecturer on Emerging Technologies | PhD in Neuropharmacology

1y

In the meantime, https://www.perplexity.ai/ looks like a useful tool for filling the gap around references.

要查看或添加评论,请登录

Josh Sackman的更多文章

社区洞察

其他会员也浏览了