ChatGPT failures aren't "hallucinations"

As Large Language Models (LLMs) have gained popular attention, much has been written about their potential benefits and risks. In these discussions, it has also been noted that LLM-generated texts may include information that, while confidently stated, is completely incorrect. LLMs will generate references, article titles, quotes, dates, places, URLs, and other items that are non-existent.

When explaining these faulty constructions, many commentators have taken to saying that the LLM is "hallucinating". But what exactly does this mean?

Clinically, hallucinations are sensory perceptions that occur in the absence of actual external phenomena. In medical resources and popular descriptions, hallucinations are presented as disorders arising from causes such as neurological disturbances or drugs. To assert that LLMs are hallucinating is to imply that they have formed a flawed model of the world due to an abnormal influence. But this is not what is occurring when an LLM manufactures a flawed statement.

Referring to LLM failures as hallucinations obscures the extent to which LLMs are truly disconnected from reality, downplays the power of communicative statements, and gives LLMs an unprecedented 'benefit of the doubt'.

Large language models have no connection to reality. They do not have any access to, or any way of processing, primary data, sources, or facts. No matter what prompt you provide, an LLM will always respond with the text that seems most likely to exist given the set of documents it was trained on. LLMs are designed to generate plausible texts, not accurate ones. An LLM-generated text containing a manufactured quote or a non-existent article reference is not an abnormality; it is simply a case in which a reader grounded in the real world can more easily detect the tool's 100% focus on text plausibility and 0% concern for accuracy.
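
Concretely, the generation loop looks, very schematically, like the Python sketch below. This is a toy with made-up, hypothetical bigram probabilities, not any real model's code or API, but it captures the relevant behavior: each step selects a continuation based only on learned plausibility, and no step ever consults a source of facts.

    # Toy sketch (hypothetical probabilities, not any real model's code):
    # output is chosen purely by learned plausibility; no fact check anywhere.
    import random

    LEARNED_PROBS = {
        ("The", "capital"): {"of": 0.9, "city": 0.1},
        ("capital", "of"): {"France": 0.5, "Freedonia": 0.5},   # plausible, not verified
        ("of", "France"): {"is": 0.95, "was": 0.05},
        ("of", "Freedonia"): {"is": 0.95, "was": 0.05},
        ("France", "is"): {"Paris.": 0.8, "Lyon.": 0.2},
        ("Freedonia", "is"): {"Fredville.": 0.7, "Marxton.": 0.3},  # fabricated but fluent
    }

    def generate(prompt_tokens, steps=4):
        tokens = list(prompt_tokens)
        for _ in range(steps):
            context = tuple(tokens[-2:])
            choices = LEARNED_PROBS.get(context)
            if not choices:
                break
            # Sample in proportion to learned plausibility.
            next_token = random.choices(list(choices), weights=list(choices.values()))[0]
            tokens.append(next_token)
        return " ".join(tokens)

    print(generate(["The", "capital"]))
    # e.g. "The capital of Freedonia is Fredville." -- fluent, confident, and untrue.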

Referring to LLM failures as hallucinations also conflates perceptions and statements. Perceptions matter to the individual, but it is when they become statements that they acquire social, legal, and practical power. You are free to believe whatever you wish about your neighbor, but if you make a public statement accusing them of a crime and materially damage their reputation, you may be guilty of libel. When perceptions become statements they gain the power to change the world. LLM failures are significant not because they are 'in their minds', but because they are in their statements.

Since hallucinations are due to abnormal conditions and are fundamentally internal, characterizing LLM failures as hallucinations has the effect of encouraging us to give this technology 'the benefit of the doubt'. When we are told someone is hallucinating, we can nod understandingly and hope they get the help they need. At the same time, we doubt the veracity of anything they say and impose limits on their public speaking until they recover their normal faculties. It is this logic that commentators characterizing LLM failures as hallucinations are encouraging us to adopt for the technology.

However, this logic can't meaningfully be applied to large language models. If we accept that LLMs are internally hallucinating, we should discount any statements they make and prevent them from engaging in influential communication, at least until they recover. But what does it mean to have a communication tool that we don't allow to make statements? Moreover, disconnection from reality is not an abnormal state for LLMs. It is how they are made, so they cannot recover. Referring to LLM failures as hallucinations is at best a misleading distraction and at worst a fundamental error that will lead us to incorrectly assess the true usefulness of this new technology.

When an LLM generates a flawed statement, it is failing. Like a calculator that doesn't add properly, it loses its justification for existence. Being honest about this both removes the sense of panic associated with the 'risks' and allows us to more clearly discover the true benefits of the technology.

Epilogue

I was unable to get either ChatGPT or Google Bard to draft a blog post about LLM failures as hallucinations. No matter what the prompt, the resulting texts referred to LLMs "making mistakes" and never used the words flaws, failures, or errors. Mmmmmm.....

Katy Oskoui

Independent Fundraising Consultant

1 yr

Excellent post - thank you!

Victoria Van Hyning

Assistant Professor of Library Innovation at University of Maryland iSchool

1 yr

Yes! Really enjoyed this.

Mangal A.

Product Builder in Tech/Data

1 yr

Agreed. I think LLMs are great in creative fields, but they need significant improvements before they can be a "source of facts" or anything based on facts.
