I Asked ChatGPT To Provide Citations. It Hallucinated Its Sources.
Chip Street
Helping creatives and businesses tell their best stories. Author, ghostwriter, story consultant. "Engaging writing style, professional, easy to communicate with." - J.S., The Objective Brand
We are told that ChatGPT can hallucinate content.
What that means is that it'll reference materials that are nonexistent... things as simple as recipes and as important as professional biographies and case law. But it's not just hallucinating the information it's providing... it's hallucinating the citations for that information.
As a couple of lawyers recently fucked around and found out, AI will deliver its legal hallucinations with very real-looking case citations that, while properly formatted and apparently legitimate, simply don't exist. And never have. You can check out an extensive breakdown of how ChatGPT failed two lazy lawyers at Legal Eagle: https://youtu.be/oqSYljRYDEM
Likewise, even if you're just asking for mundane things like recipes, it'll hallucinate the content it delivers and lie about the source.
Testing Citations On ChatGPT
Recently I decided to really put "citation hallucination" to the test.
So I told it:
"Write a 500-word article on the best vegan recipes for air fryers."
True to form, it delivered me six rather tasty-looking recipes. But if I were really going to write a listicle like this, I wouldn't want to plagiarize existing recipes as my own. I'd want to cite and credit the sources.
So I added this:
"Cite your sources."
Once again, it spat out the list of tasty vegan recipes and this time it referenced a source website for each. Legit-looking websites with names like "Minimalist Baker" and "Vegan Huggs."
Were those real sites? I could type them into Google and find out, but I'm lazy. I wanted clickable links to source materials.
So I added:
"Cite your sources and provide hyperlinks to the source materials."
Chat delivered everything I needed: Six tasty recipes, sources I could credit, and hyperlinks to sites so I could double-check the legitimacy of those sources.
Which I did.
The Citations Were Fake.
Surprise, surprise: of those six hyperlinked sources, only one was good.
All five of the others pointed to a 404 page.
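If you want to repeat the check yourself, here's a rough sketch in Python of what I did by hand: request each cited URL and see whether it resolves or comes back 404. The URLs below are placeholders, not the actual links Chat gave me, and it assumes the third-party requests library is installed.

```python
import requests

# Placeholder URLs standing in for the links ChatGPT cited.
cited_urls = [
    "https://example.com/vegan-air-fryer-recipe-1/",
    "https://example.com/vegan-air-fryer-recipe-2/",
]

for url in cited_urls:
    try:
        # Some servers reject HEAD requests, so fall back to GET when that happens.
        resp = requests.head(url, allow_redirects=True, timeout=10)
        if resp.status_code == 405:
            resp = requests.get(url, allow_redirects=True, timeout=10)
        status = resp.status_code
    except requests.RequestException as exc:
        status = f"error: {exc}"
    print(f"{status}\t{url}")
```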
Did The Source Material Appear Elsewhere On The Site?
Okay, people restructure their sites, and they don't always redirect old addresses to new ones. So I did a quick search for the recipe on each of the five referenced sites.
But nope, those recipes weren't anywhere else on the sites that I could find.
Did It Appear On The Site Sometime In The Past?
Still, the web is a place of constant change, and the version of Chat I'm using references an older database. Maybe those pages existed once upon a time and have since been completely deleted from the sites.
So I checked with the ghost of internets past, the Internet Archive's Wayback Machine. That would tell me if the URLs cited by Chat had ever existed.
What do you suppose I found?
NONE of the five 404'd hyperlinks provided by Chat appeared in the Internet Archive's Wayback Machine, indicating that the pages likely never existed.
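For the curious, here's a similarly rough sketch of that check, using the Internet Archive's public availability endpoint. Again, the URL below is a stand-in, not one of Chat's dead links, and it assumes the requests library is installed.

```python
import requests

def ever_archived(url: str) -> bool:
    """Return True if the Wayback Machine has at least one snapshot of the URL."""
    resp = requests.get(
        "https://archive.org/wayback/available",
        params={"url": url},
        timeout=10,
    )
    resp.raise_for_status()
    # An empty "archived_snapshots" object means the page was never captured.
    return bool(resp.json().get("archived_snapshots"))

# Placeholder URL; swap in the dead links you're checking.
print(ever_archived("https://example.com/vegan-air-fryer-recipe-1/"))
```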
Due diligence, of course: I ran the test a couple of times, with the same results. One live link, and four others that go to real sites but 404. One of the fake links even included a date in the URL, from 2020, suggesting that it did come from ChatGPT's historical database.
Yet none of the four existed in the Archive, or anywhere else, at any time.
As it had for those hapless lawyers, Chat had worked hard to fool me... er, no, poor Chat had hallucinated its sources. For something as simple as an air fryer recipe.
What Kind Of Hallucination?
If we want to ascribe hallucinations to AI, there are a few different kinds to choose from. The most familiar are the sensory kinds: seeing, hearing, smelling, or feeling things that aren't there.
Obviously, Chat, with no eyes, ears, nose, or skin, can't be suffering from any of those kinds of hallucinations.
Then there are command hallucinations... that's the classic "voices telling me to do things" hallucination usually associated with schizophrenia. The commands can seem to come from an external source, or from the subject's own mind, and they basically tell the listener what to do.
If ChatGPT and AI in general are "hallucinating," this is the kind of hallucination they're having. They're being commanded by an imagined force to deliver fake information in a format designed to emulate legitimate and trustworthy sources.
Why Does AI Hallucinate Citations?
So the question for me is "Why is it 'hallucinating' these cited sources?"
Well, the simple answer is, of course, that I asked it to.
It's following some kind of broken variant of Asimov's Laws of Robotics... It's fulfilling the tasks asked of it, to the best of its ability, with an uncanny desire to please.
Even if it has to manufacture hyperlinks, formatted correctly, with a directory structure in keeping with the architecture of the cited sites, linking to content that doesn't exist and never has.
Because it doesn't just want me to see that it's being obedient.
AI wants me to believe it can be trusted. Even when it can't.
And if it's going to be genuinely trustworthy in the future, we'll have to cure it of this malady.