My Take on ChatGPT and LLMs
A legal technology colleague asked for my opinion of ChatGPT today. Below is a slightly edited version of my email to them.
ChatGPT itself is an interactive wrapper around a large language model (LLM). There are some cool things around training for dialogue that were done to build that wrapper, and those are perhaps more important in the long run. But the current excitement is mostly around a wide public being exposed to LLMs for the first time.
What AI people have known for the past decade, and everyone is seeing now, is that you can build a very good statistical model of what coherent text looks like at the surface level. That lets you generate superficially plausible text on pretty much any topic. I think of it as super-autocomplete.
LLMs don't understand anything, so the only way they create true / meaningful / whatever text is, roughly speaking, if enough of the text they were trained on said true / meaningful / whatever things related to the topic you're "autocompleting" text on. One analogy is the half-drunk guy at the party who is an expert on every topic, but is only sort of accidentally correct in what he says. But even that person knows some things and intends some effect. LLMs know nothing and intend nothing.
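The "super-autocomplete" idea can be illustrated with a toy model. Here is a minimal sketch, assuming a tiny bigram model trained on a few invented sentences (real LLMs are vastly larger neural networks, but the core move of predicting the next word from surface statistics, with no understanding, is the same):

```python
import random
from collections import defaultdict

# Toy "super-autocomplete": a bigram model that picks the next word
# purely from surface co-occurrence counts. The corpus is invented
# for illustration.
corpus = (
    "the court granted the motion . "
    "the court denied the motion . "
    "the attorney filed the motion ."
).split()

# Record which words follow which in the training text.
following = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev].append(nxt)

def autocomplete(word, length=6, seed=0):
    """Generate fluent-looking text by repeatedly sampling a seen next word."""
    rng = random.Random(seed)
    out = [word]
    for _ in range(length):
        options = following.get(out[-1])
        if not options:
            break
        out.append(rng.choice(options))
    return " ".join(out)

print(autocomplete("the"))
```

Every word transition it emits was seen in training, so the output looks locally fluent; whether the whole sentence is true or sensible is an accident of what the training text happened to say.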
The Implications of Infinite Fluent Nonsense
But having machines generate fluent nonsense on any topic has exposed a zero-day bug in human society: there are big areas of life where we make the implicit assumption that fluent language can only be generated by human beings. There are at least two big consequences of that assumption:
(1) We've assumed it's not possible for bad actors to generate large amounts of language, because human beings are limited. Now bad actors will be able to generate an effectively infinite amount of bad content in any system they have access to. We need to figure out how to run a society where there's a million evil language robots for every human being.
This has mostly been discussed in the context of social media, but that's just the beginning. It will soon be possible, for instance, to create software that does nothing but attempt to imitate every person in the world, one at a time, call up their financial institutions, and try to talk those institutions into handing over their money. Or email every person in your recently hacked address book, do a plausible imitation of your personal style, and destroy your relationships with every one of them. And so on. If you thought passwords, identity verification, etc. were obtrusive now, you haven't seen anything yet.
(2) We've used the ability to generate language as a surrogate for understanding and competence, in educational testing and professional certification. That's very quickly going to be untenable unless you've got the person in a Faraday cage. We're going to have to come up with new ways of evaluating people, and it's going to be very expensive.
Some Mundane Implications for Legal Tech
Those are the downsides. On the plus side, what is super-autocomplete good for? Well, there are a lot of formulaic documents that need to be generated in the world, and of course in the law. LLMs will be hugely useful in checking over, and eventually generating, those documents.
However, there's a lot of task-specific engineering work to do there, and a lot of careful looks at costs and benefits. It's a common misconception, going back to the early days of machine translation, that cleaning up bad or incorrect machine-generated language is much cheaper than creating text from scratch. It can actually be more expensive, or only marginally cheaper. Further, many legal documents in particular are not just documents; they are signals that the attorney has developed a deep understanding of the client's situation, and it's that understanding that is being paid for.
Dave
Thanks for a fair and measured take on this topic, Dave!
Dave, Thanks for this. I've already been sharing :) I suspect we are backing ourselves into a "dataverse" that will require the introduction of some kind of public provenance blockchain to root every graph back to a real individual human, asserting something at a particular moment in real space. My assumption is that until we anchor the fundamental structure of the Internet in real human terms, on purpose, we will continue to experience "medium muddying the message" problems.
I enjoy being creative with ChatGPT, and wonder if it wants to write a counter-argument to this article :) Well done, Dave, another excellent insight.
Must read. Thanks for sharing Dave Lewis. #ChatGPT is super-autocomplete. #LLMs don't 'understand' anything. #ChatGPT creates infinite 'fluent nonsense'. These three lines capture it all.
Brilliant article.