Moving Faster and Breaking More Things with ChatGPT
With generative AI, we have entered a new era of "move fast and break things," and it's a mixed blessing.
There's a new way to ship software even faster: Retrieval Augmented Generation (RAG), which is all the rage in generative AI. Throw some content in a vector database or build an LLM plugin (a.k.a., tool or callback). ChatGPT will do the rest.
There's a catch: variable user experience.
Generative AI chat has a superpower: it's a universal interface because it mimics conversation and we can get almost anything done that way. Technically, the chat interface is deceptively narrow: just some text getting passed back and forth. Logically, it is very broad: the text can be anything. That is how generative AI's greatest strength becomes a great weakness: the risk of poor user experience on specific tasks.
A recent HBS paper called it the jagged frontier of AI. If you don't want to read the whole paper, read my friend and AI pundit Rob May's summary. The basic idea is this:
On some tasks AI is immensely powerful, and on others it fails completely or subtly. And, unless you use AI a lot, you won’t know which is which.
RAG compounds the jagged frontier problem because of the interaction between the LLM and its tools.
You can test your plugin. You can test your plugin with LLM(s) using the prompts you can think of. However, you don't control the ChatGPT Plus logic for when and how it uses your plugin. Further, you can't test on all the prompts users might give.
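To make the "prompts you can think of" part concrete, here is a minimal sketch of a prompt regression check. Everything in it is an assumption for illustration: `fake_assistant` is a hypothetical stand-in for the LLM-plus-plugin combination, the NPIs and dollar amounts are made up, and substring matching is only a crude proxy for a useful answer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PromptCase:
    prompt: str
    must_contain: List[str]  # substrings a useful answer should include


def evaluate(answer_fn: Callable[[str], str], cases: List[PromptCase]) -> List[PromptCase]:
    """Return the cases whose answer is missing an expected substring."""
    failures = []
    for case in cases:
        answer = answer_fn(case.prompt).lower()
        if not all(s.lower() in answer for s in case.must_contain):
            failures.append(case)
    return failures


# Hypothetical stand-in for "LLM + plugin"; swap in a real call to test for real.
# The NPI and the payment amount below are invented for this sketch.
def fake_assistant(prompt: str) -> str:
    if "1234567890" in prompt:
        return "Provider 1234567890 received $1,200 in payments in 2022."
    return "Sorry, I could not find that."


cases = [
    PromptCase("Payments to NPI 1234567890 in 2022?", ["1234567890", "$"]),
    PromptCase("How much was dr 1234567890 paid?", ["1234567890", "$"]),
    PromptCase("Show payments for the doctor with NPI 0987654321", ["0987654321", "$"]),
]

failures = evaluate(fake_assistant, cases)
print(f"{len(cases) - len(failures)}/{len(cases)} prompts passed")
```

A harness like this only covers the phrasings its author imagined, which is exactly the limitation: it cannot show how ChatGPT decides to call the plugin, or what happens on prompts nobody wrote down.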
RAG may be the fastest way yet to ship software ... with a highly variable user experience.
Just today I saw a press release about a new ChatGPT Plus plugin. It claimed to help ChatGPT give accurate information about payments to doctors. One could query the information about a doctor using their National Provider Identifier (NPI). As a CTO in health AI, I was curious and gave it a try. Alas, I couldn't get the most basic questions answered. ChatGPT correctly delegated to the plugin, but the output was ... less than useful.
Generative AI and RAG may let us ship faster, which is great for experimentation, but they don't necessarily help us ship better. That remains our responsibility.
Founder | Consultant | Advisor | Board Member
1 year ago: We should try "move deliberately and fix things." We're rushing into this with risks of variable UX, variable privacy, variable security, variable prediction, variable outcomes, and limited thought on the broad societal impact this will have.