
BSer, Searcher, Researcher: Validating Generative AI Texts (#GPT and other #LLMs)

Damien Riehl

Lawyer + Speaker + Writer + Builder + Mediocre Coder + Musician + VP Solutions Champion

Published: February 1, 2023

I've been thinking a lot about this: #GPT and other #LLMs hallucinate both (1) propositions and (2) citations. Users (especially lawyers) need to know whether the text is trustworthy. ("Is this BS?") And users need to know the text's provenance. ("Where did this come from?")

To address this problem, there are (at least) three options:

1. Bullshitter

2. Searcher

3. Researcher

1. BULLSHITTER. LLM hallucinations gone wild. Unchecked chaos.

2. SEARCHER. LLM generates text. Run queries to substantiate (or debunk) the text.

This is like a senior partner saying "I'm pretty sure there's a case out there that says X. Find it!"

Good luck.

3. RESEARCHER. Atomize the <PROPOSITION> + <CITATION> graph. Give each <PROPOSITION> and each <CITATION> a unique identifier. The user's query then builds its answer from the most common propositions in that graph: ground truth (non-hallucinated).
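One way to picture the RESEARCHER option is as a small graph in which every proposition and every citation carries its own identifier. The sketch below is only an illustration under stated assumptions: the class names, the `link` and `most_common` methods, the keyword matching, and all sample propositions and case names are hypothetical placeholders, not an actual system described in this post.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Citation:
    citation_id: str  # hypothetical unique identifier
    source: str       # e.g., a case name


@dataclass(frozen=True)
class Proposition:
    proposition_id: str  # hypothetical unique identifier
    text: str


@dataclass
class PropositionGraph:
    propositions: dict[str, Proposition] = field(default_factory=dict)
    citations: dict[str, Citation] = field(default_factory=dict)
    # Each edge is a (proposition_id, citation_id) pair.
    edges: set[tuple[str, str]] = field(default_factory=set)

    def link(self, prop: Proposition, cite: Citation) -> None:
        """Record that `cite` supports `prop`."""
        self.propositions[prop.proposition_id] = prop
        self.citations[cite.citation_id] = cite
        self.edges.add((prop.proposition_id, cite.citation_id))

    def most_common(self, keyword: str, n: int = 3):
        """Return stored propositions matching `keyword`, most-cited first.

        Only propositions already in the graph can be returned, so nothing
        here is generated (or hallucinated) at query time.
        """
        counts = Counter(
            pid for pid, _ in self.edges
            if keyword.lower() in self.propositions[pid].text.lower()
        )
        return [
            (self.propositions[pid],
             sorted(cid for p, cid in self.edges if p == pid))
            for pid, _ in counts.most_common(n)
        ]


# Placeholder data, invented purely for illustration.
graph = PropositionGraph()
p1 = Proposition("P1", "Written notice is required before termination.")
p2 = Proposition("P2", "Oral notice may suffice in an emergency.")
graph.link(p1, Citation("C1", "Placeholder Case A"))
graph.link(p1, Citation("C2", "Placeholder Case B"))
graph.link(p2, Citation("C3", "Placeholder Case C"))

for prop, cite_ids in graph.most_common("notice"):
    print(prop.proposition_id, prop.text, cite_ids)
```

A query that matches nothing returns an empty list rather than invented text, which is the point of building atop stored propositions.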

Which to choose?

#1 BULLSHITTER is a nonstarter.

#2 SEARCHER seems obvious. But it has many rabbit holes.

e.g., the sentence is hallucinated BS, so no query will ever substantiate it

e.g., the sentence recites bad law, meaning overruled precedent (e.g., Plessy, Roe v. Wade)

#3 RESEARCHER is harder to build. But most trustworthy. Built atop ground truth.
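The two rabbit holes listed under #2 SEARCHER can be made concrete with a toy checker. This is a minimal sketch, assuming a hypothetical exact-match index (`TOY_INDEX`) in place of real search queries; the `Verdict` names, `normalize`, and `check_sentence` are illustrative inventions, and the "good law" entry is a placeholder.

```python
from enum import Enum


class Verdict(Enum):
    SUPPORTED = "supported"
    BAD_LAW = "bad law"          # a source exists, but it is no longer good law
    UNSUPPORTED = "unsupported"  # no source found; possibly hallucinated


def normalize(sentence: str) -> str:
    """Crude normalization so trivially different strings still match."""
    return sentence.strip().rstrip(".").lower()


# Hypothetical index: normalized proposition -> (citation, still_good_law).
# A real Searcher would run retrieval queries instead of a dict lookup.
TOY_INDEX = {
    normalize("Separate but equal facilities are constitutional."):
        ("Plessy v. Ferguson", False),
    normalize("Placeholder proposition that remains good law."):
        ("Placeholder Case A", True),
}


def check_sentence(sentence: str, index: dict) -> tuple:
    """Classify one LLM-generated sentence against the index."""
    hit = index.get(normalize(sentence))
    if hit is None:
        return Verdict.UNSUPPORTED, None
    citation, still_good_law = hit
    if not still_good_law:
        return Verdict.BAD_LAW, citation
    return Verdict.SUPPORTED, citation


for text in (
    "Separate but equal facilities are constitutional.",
    "Placeholder proposition that remains good law.",
    "A holding the model simply made up.",
):
    verdict, citation = check_sentence(text, TOY_INDEX)
    print(verdict.value, citation)
```

Note that an UNSUPPORTED verdict only means the search found nothing; it cannot prove the sentence is false, which is part of why the Searcher approach stays open-ended.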

Jumi Kassim

2 years ago

I would expect that most Researchers will turn out to be an adversarial network of a BSer and a Searcher wearing a trenchcoat.


Brian Quinn

Product Manager | Agile | SaaS | B2B

2 years ago

Good thoughts. I've had the experience of feeding three sequential compliance prompts to ChatGPT, and the third one got no result. That was a bit odd because that item does exist, but it made me think that a Confidence Threshold of some sort may be helpful. Zero results may be better than Bullshit. As you well know, the Users of these future systems may again be less experienced staff, who are at greatest risk of grabbing onto an attractive answer.


Graeme J.

Law + people + messy reality + ways of working + organisations + software + data

2 years ago

Nicely done, Damien. Succinct and compelling.


