How ChatGPT is Accelerating AI
(Image: Midjourney. AI-powered software will create feedback loops between people and machines that radically accelerate AI.)


A little over 6 months ago, I kicked off this newsletter with a short post about how language understanding was about to transform software.

In subsequent posts, I did my best to articulate a concrete, compelling vision of what software built on language looks like.

I talked about how language can be used to compose many different types of functionality. I talked about how AI-powered, language-aware software will go well beyond just chat. Language will be used to power GUIs. Language models will be able to see, understand and generate multimedia content.

Most recently, in early November, I highlighted a growing ecosystem of tools (e.g. LangChain) that were making it easier for engineers and entrepreneurs to implement the design ideas I and others were experimenting with.

Last year, much of what I wrote was speculative. Since my last post on November 5th, however, speculation is no longer required. We now have clear evidence that AI-powered language understanding is changing the trajectory of software.

The evidence is ChatGPT.

GPT enters the Chat

OpenAI announced GPT-3 in May 2020 and eliminated its waitlist-gated signup in late 2021. Also in 2021, OpenAI released a modified version of GPT-3, Codex, focused on code completion. Codex soon went on to power GitHub Copilot, firming up OpenAI's deepening partnership with Microsoft.

In early 2022, GPT-3 got a significant performance improvement with the release of InstructGPT, a model that was trained to better follow user issued instructions. In mid-2022, GPT-3 and Codex were updated several more times with further performance enhancements.

For 2 1/2 years GPT-3 got better and better at following instructions, but its primary interface remained unchanged. It was an API that could easily be called from Python, as well as used via OpenAI's web-based "playground" interface, which let developers easily type and test prompts.
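For the curious, a call to that API was just a small JSON payload POSTed to a completions endpoint. The sketch below is illustrative, not official sample code: the helper function, default parameters, and prompt are my own choices, and the request is only actually sent if an OPENAI_API_KEY environment variable is present.

```python
import json
import os

# The completions endpoint GPT-3 exposed in this era.
API_URL = "https://api.openai.com/v1/completions"

def build_completion_request(prompt, model="text-davinci-003",
                             max_tokens=64, temperature=0.7):
    """Assemble the JSON payload the completions endpoint expects."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

payload = build_completion_request("Write a haiku about feedback loops.")
print(json.dumps(payload, indent=2))

# Actually sending requires an API key; skipped otherwise so the sketch
# stays runnable on its own.
if os.environ.get("OPENAI_API_KEY"):
    import urllib.request
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["text"])
```

A few lines of Python, in other words. Friendly for developers, but nothing an ordinary person would ever touch.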

By machine learning standards, GPT-3 was user-friendly, but by consumer software standards, it was not. Ordinary people were not going to use a developer playground as an interface. Web apps that did provide better interfaces were constrained by both costs and terms of service. Since calls to GPT were too costly to be ad supported, apps built on top of it largely targeted commercial copywriters.

Compared to publicly available text-to-image models (e.g. DALL-E and Stable Diffusion), GPT-3's API usage did not appear to embody Silicon Valley's beloved "hockey stick" growth curve. Growth was steady, but slow... until it wasn't.

On November 30th, OpenAI released ChatGPT, free to use. Conceptually, ChatGPT was simply a chat-based interface that let users interact directly with an updated version of GPT-3. In practice, the interface made all the difference.

Suddenly ordinary people could talk to GPT-3 as easily as they could text a friend. Remarkably, when you asked ChatGPT a question, or issued it a feasible request, ChatGPT would cheerfully comply. It would rapidly type out a clear, well-formatted response, and if you wanted edits or follow-up, all you had to do was ask.

As word of ChatGPT rapidly spread, over 1M unique users signed up in 5 days. This was arguably the fastest acquisition of 1M users in software history.


Users of ChatGPT quickly discovered that it could flexibly respond to a wide range of requests. For example, in this @aifunhouse Twitter thread, I had a great time prompting it to help me kick off an entire online business. I asked it for product advice, a marketing strategy, and had I gone further, I could have even asked it to help me generate code for my hypothetical website.

In addition to performing tasks (e.g. generating and editing text) AI-powered, interactive chat turns out to be a wonderful way to learn about a topic. Unlike with Google search, you don't ever need to go hunting around for an answer. Instead, ChatGPT (when it works) tells you whatever you want to know right in the chat.

The main shortcoming of ChatGPT is that it isn't currently able to search the internet on your behalf. Instead, it just reads the current conversation and produces a response from its gargantuan memory. GPT-3 is able to internally store an impressive breadth of knowledge, but it also has a troublesome tendency to make up information. It absolutely can't be trusted to point you to specific products, websites, articles, etc...

GPT-3's lack of external knowledge is a temporary setback. Well before the release of ChatGPT, OpenAI released a paper on WebGPT. This modified version of GPT-3 first generates web queries which, in turn, are used to pull relevant text from the internet before generating a final response. WebGPT is basically just a different way of using GPT-3. It's a simple adaptation that ChatGPT can and will copy.
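To make the WebGPT pattern concrete, here's a minimal sketch of the retrieve-then-answer loop it describes: propose queries, pull text from the web, then generate a grounded response. All three functions are hypothetical stand-ins; a real system would use the language model itself to propose queries and write the final answer, and a search API to fetch text.

```python
# Minimal sketch of the WebGPT pattern: query generation, retrieval,
# then answer generation conditioned on the retrieved text.

def propose_queries(question):
    # Stand-in: a real system would ask the language model for queries.
    return [question, f"{question} site:wikipedia.org"]

def search_web(query):
    # Stand-in for a search API; returns snippets of page text.
    return [f"[snippet for: {query}]"]

def answer_with_context(question, snippets):
    # Stand-in: a real system would prompt the model with
    # the question plus the retrieved snippets.
    context = "\n".join(snippets)
    return f"Answer to {question!r}, grounded in:\n{context}"

def webgpt_style_answer(question):
    snippets = []
    for q in propose_queries(question):
        snippets.extend(search_web(q))
    return answer_with_context(question, snippets)

print(webgpt_style_answer("Who released ChatGPT?"))
```

The key design point: the model never has to "remember" the web, it only has to read what retrieval hands it, which is exactly what curbs the made-up facts described above.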

From 1 million to 1 billion

Two months after its release, ChatGPT appears to have hit 100M monthly users. Almost immediately, it was being called a "code red" for Google, prompting rapid pivots inside the company. Google is no slouch when it comes to AI. Its research on large language models is world class, but the company has been slow to integrate these models into actual products. From a pure business perspective, ChatGPT's release is rapidly accelerating the deployment of AI within big tech.

Microsoft's multi-billion dollar investment in OpenAI has resulted in a tight partnership culminating in yesterday's big announcement that:

  1. Bing is getting a ChatGPT-style interface. As with WebGPT, user queries result in search results which GPT then uses to generate responses (including citations!).
  2. Microsoft's Edge browser is getting a ChatGPT-style assistant. The assistant allows you to ask questions, and in some cases perform actions, based on the webpage you are currently viewing.

Yesterday's announcement is both a first step, and a very big deal. All of a sudden GPT, and language-powered software more generally, has a clear path to 1 billion users. Even if you're a Bing-hating skeptic, Google is now also scrambling to launch its own AI-powered search interface called Bard (but I mean, if it's actually good, won't we end up just calling it Google?).

Between Bing and Google, the odds are high that a year from now, over 1 billion people will be regularly chatting with AI. These conversational interactions won't be limited to search, and they will significantly accelerate AI's already fast pace of progress.

Faster Feedback, Accelerating AI

In 2022, it became clear that generative AI models (e.g. GPT-3, Copilot, DALL-E, Stable Diffusion) were now capable of turning words into useful outputs (e.g. text, code, images, etc...). As a result, software UIs were poised to transition from "command-driven" to "intent-driven". Tell software what you want, then get it.

For almost all of 2022, even those of us actively using language-powered AI were communicating intent with "one-shot" UIs. We'd type out a prompt, we'd get an output. If we didn't like the output, we'd try again. ChatGPT's UX isn't one-shot, however; it's a dialogue.

Users make requests, then refine these requests through follow-up interactions. In doing so, users provide both implicit and explicit feedback. ChatGPT's adoption is conditioning us to talk to computers in a way we never have. This is a big deal. It's not just a way for users to get the results they want, it's also a way for AI to learn.
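One way to see why dialogue beats one-shot prompting: each turn is appended to a running transcript, so a follow-up like "make it shorter" arrives with all prior context attached. A toy sketch of that loop (the model call is a stub, not a real API):

```python
# Each turn is appended to a growing transcript, so follow-up requests
# implicitly carry the whole conversation as context.

def model_reply(transcript):
    # Stand-in for a chat model; reports how much context it can see.
    return f"(reply informed by {len(transcript)} prior messages)"

transcript = []

def chat(user_message):
    transcript.append({"role": "user", "content": user_message})
    reply = model_reply(transcript)
    transcript.append({"role": "assistant", "content": reply})
    return reply

chat("Draft a product description for a desk lamp.")
chat("Make it shorter and funnier.")  # implicit feedback on the draft
print(len(transcript))  # 4 messages: two user turns, two assistant turns
```

That second message, "make it shorter and funnier", is simultaneously a request and a judgment on the first output. Every refinement is a feedback signal.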

Chat-style interactions are about to be built into lots of different software. In addition to browsers, AI-powered chat is already starting to power image editors and coding tools.

AI researchers don't know whether today's AI architectures (i.e. transformers and diffusion models) are fundamental to AI. In 5 years, our current crop of breakthrough model architectures could plausibly be replaced. Language-powered interaction, however, will remain. Language is fundamental to understanding our world. It's fundamental to how both people and AI learn.

Wide adoption of ChatGPT-like interfaces will radically accelerate the feedback loop between people and software. For the first time, users will find themselves actively explaining to software what they want, and how to do better. By 2024, language-powered feedback will be fueling AI progress.

Even in the case of ChatGPT, OpenAI has openly stated that its release wasn't enabled by a better architecture or more training data. It was enabled through large amounts of human-in-the-loop feedback. Learning from human feedback will be baked into all AI-powered products. The products and companies that succeed will be those that leverage this feedback most effectively.
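A hypothetical sketch of what "leveraging feedback" can mean in practice: every thumbs-up or thumbs-down on a response becomes a labeled (prompt, response) record, the raw material for preference-based fine-tuning. The schema below is illustrative, not any product's actual format.

```python
# Turning everyday chat feedback into a preference dataset.

feedback_log = []

def record_feedback(prompt, response, thumbs_up):
    """Log one user judgment as a labeled training record."""
    feedback_log.append({
        "prompt": prompt,
        "response": response,
        "label": "preferred" if thumbs_up else "rejected",
    })

record_feedback("Summarize this article", "Here's a summary...", True)
record_feedback("Summarize this article", "I cannot do that.", False)

preferred = [r for r in feedback_log if r["label"] == "preferred"]
print(len(preferred))  # 1
```

Multiply that log by a billion users, and the product with the best feedback loop is also the product with the best training data.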

Independent of whatever technical breakthroughs lie ahead, 2023 is the year when language-powered UXs begin fueling AI-powered software at a global scale.

Once again, awesome post. I've been doing some R&D, using LangChain, OpenAI, embeddings, text to speech, speech to text. It's a great time to be alive - and a little scary

Krista Barron

Possibility Creator: Birth-to-5, UPK and Head Start systems | P-12 Policy, governance and portfolio management | School design | Portrait of a Learner development and strategic planning | Adult learning

1 yr

I honestly didn't know what you were talking about back in the summer. A bunch of techy, future mumbo-jumbo. And then it wasn't. And now it's everywhere as though it'd always been there. I guess that's what we can deem a culture-changing breakthrough.

Crystal Valentine

CEO at Valentine Ventures

1 yr

Really great perspective, Eric Rachlin. Enjoyed reading. How many interactions with ChatGPT did it take for you to produce this piece? Jk. (And thank you for omitting the oh-so-common “the images in this blog were generated with Stable Diffusion”…..it’s been done.)

Eric Rachlin: Awesome! Thanks for sharing!
