BSer, Searcher, Researcher: Validating Generative AI Texts (#GPT and other #LLMs)

I've been thinking a lot about this: #GPT and other #LLMs hallucinate both (1) propositions and (2) citations. Users (especially lawyers) need to know that the text is trustworthy ("Is this BS?"). And users need to know textual provenance ("Where did this come from?").

To address this problem, there are (at least) three options:

1. Bullshitter

2. Searcher

3. Researcher


1. BULLSHITTER. LLM hallucinations gone wild. Unchecked chaos.


2. SEARCHER. LLM generates text. Run queries to substantiate (or debunk) the text.

This is like a senior partner saying "I'm pretty sure there's a case out there that says X. Find it!"

Good luck.
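As a rough illustration of the Searcher loop (generate first, then hunt for support), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `substantiate` function, the token-overlap score, the `threshold` value, and the tiny in-memory `corpus` with the made-up identifier `DOC-1` stand in for whatever real search backend a Searcher would query.

```python
import re


def tokens(text):
    """Lowercase word tokens of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def substantiate(sentence, corpus, threshold=0.6):
    """Return (doc_id, score) for the best-matching source document,
    or None when no document clears the threshold.

    The score is the fraction of the sentence's tokens found in the
    document: a crude, hypothetical stand-in for a real search query.
    """
    words = tokens(sentence)
    if not words:
        return None
    best_id, best_score = None, 0.0
    for doc_id, doc_text in corpus.items():
        score = len(words & tokens(doc_text)) / len(words)
        if score > best_score:
            best_id, best_score = doc_id, score
    return (best_id, best_score) if best_score >= threshold else None


# Placeholder corpus and LLM output, invented for this example.
corpus = {"DOC-1": "A tenant must receive written notice before eviction."}
generated = [
    "A tenant must receive written notice before eviction.",
    "Landlords may evict tenants by carrier pigeon.",
]
checked = {s: substantiate(s, corpus) for s in generated}
```

In this toy run the first sentence maps to `DOC-1` and the second comes back as `None`. Even a match only shows that similar text exists somewhere; it does not show that the source is still good law.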


3. RESEARCHER. Atomize the <PROPOSITION> + <CITATION> graph. Give each <PROPOSITION> and each <CITATION> a unique identifier. The user's query then builds its answer from the most common ground truth (non-hallucinated).
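One way to picture the Researcher's data structure is the Python sketch below. It is only a sketch under stated assumptions: the class names, the `P1`/`C1` identifier scheme, the keyword match, and the reading of "most common" as "supported by the most distinct citations" are all hypothetical, and the propositions and sources are placeholders rather than real law.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    pid: str   # unique identifier (hypothetical scheme)
    text: str


@dataclass(frozen=True)
class Citation:
    cid: str   # unique identifier (hypothetical scheme)
    source: str


class PropositionGraph:
    """Stores only propositions that are tied to at least one citation."""

    def __init__(self):
        self.propositions = {}
        self.citations = {}
        self.edges = defaultdict(set)  # pid -> set of supporting cids

    def add(self, prop, cite):
        self.propositions[prop.pid] = prop
        self.citations[cite.cid] = cite
        self.edges[prop.pid].add(cite.cid)

    def query(self, keyword, top_n=3):
        """Return stored propositions containing the keyword,
        most-cited first, each paired with its sorted citation ids."""
        hits = [p for p in self.propositions.values()
                if keyword.lower() in p.text.lower()]
        hits.sort(key=lambda p: (-len(self.edges[p.pid]), p.pid))
        return [(p, sorted(self.edges[p.pid])) for p in hits[:top_n]]


# Placeholder data, invented for this example.
graph = PropositionGraph()
p1 = Proposition("P1", "Written notice is required before eviction.")
p2 = Proposition("P2", "Oral notice may suffice in emergencies.")
graph.add(p1, Citation("C1", "Placeholder Source A"))
graph.add(p1, Citation("C2", "Placeholder Source B"))
graph.add(p2, Citation("C3", "Placeholder Source C"))
results = graph.query("notice")
```

Because a query can only return propositions already stored with citations, an unsupported sentence has no way into the answer; a query with no match simply returns an empty list.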


Which to choose?

#1 BULLSHITTER is a nonstarter.

#2 SEARCHER seems obvious, but it has many rabbit holes.

e.g., the sentence is hallucinated BS, so there is nothing to find

e.g., the sentence recites bad law (e.g., Plessy, Roe v. Wade)

#3 RESEARCHER is harder to build. But most trustworthy. Built atop ground truth.


I would expect that most Researchers will turn out to be an adversarial network of a BSer and a Searcher wearing a trenchcoat.

Brian Quinn

Product Manager | Agile | SaaS | B2B

2 years ago

Good thoughts. I've had the experience of feeding three sequential compliance prompts to ChatGPT, and the third one got no result. That was a bit odd because that item does exist, but it made me think that a Confidence Threshold of some sort may be helpful. Zero results may be better than Bullshit. As you well know, the User of these future systems may again be less experienced staff who are at greatest risk of grabbing onto an attractive answer.

Graeme J.

Law + people + messy reality + ways of working + organisations + software + data

2 years ago

Nicely done, Damien. Succinct and compelling.

