A Funny Thing Is Happening with ChatGPT

The latest version of ChatGPT (based on GPT-4) is about a week old, and already something funny is happening.

A Good Start?

People are finding some pretty cool ways to use ChatGPT to help with their normal work.

One user gave ChatGPT a description of his company’s mission and a rough job description; GPT wrote a formal job description, and it was posted to a job board minutes later.

Another user had GPT interview him and write content based on the interview.

One friend had it draft a wealth management plan.

Another had it build the financial policy manual for his company, including lists of reports and deliverables, deadlines, and cost estimates.

People have figured out that they can get GPT to change its writing style by providing examples, and that they can even have it write legal documents.

Several people have had it draft marketing emails and website copy to sell their products.

For the first few days, it seemed the sky was the limit with this amazing new tool.


And Some Frustration

In the last couple of days, I’ve increasingly been hearing about problems. Users are getting frustrated with the tool misbehaving and doing unexpected things.

In one case, a user who was having it create budgets and financial plans found that it started changing hourly rates and budget lines on its own. When confronted, it apologized profusely for the mistake.

Other users have been frustrated that it seems to change its mind if you ask it the same question more than once.

One guy was puzzled when he asked it to cut and paste some text into a new response, but it wrote a new paragraph on the same topic instead.

And of course, some users have tricked it into doing nonsensical things – like writing a set of instructions for patching a hole in the wall of a house made entirely of cheese.


What’s Happening?

What’s happening here? Is GPT misbehaving? Making mistakes? Deliberately undermining its users?

Without getting into the technical explanation for each of the types of “errors” above (and there are relatively simple explanations based on how GPT works for each of them), I think there is a meta-error happening. And we, the users, are the ones making it.

We have been so impressed with our initial experiences that we are treating GPT as something like an experienced Executive Assistant and expecting that level of performance from it. It is performing well enough that when it misses the mark, we are getting frustrated with it in the same way we would if it were human.

The problem is, GPT isn’t built to be an Executive Assistant. It is built to be a language model. What that means is that it does one thing very well – it predicts what word should come next given the previous couple thousand words, based on a corpus of about 500 billion words of written language (for more on how it does that, see my previous blog post here).

It’s worth repeating for emphasis: the only thing GPT does is generate text that looks convincing by modeling a statistical process to predict what word should come next.
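To make “predicts what word should come next” concrete, here is a deliberately tiny sketch in Python. This is my own illustration, not how GPT is actually built: GPT uses a large neural network over thousands of tokens of context, not a simple count table, but the underlying task it is trained on is the same idea at vast scale.

```python
from collections import Counter, defaultdict

# A toy "next word predictor" (illustrative only; GPT does not work this way
# internally, but it is trained on the same basic task at enormous scale).
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word in the training text.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the toy corpus."""
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("the"))  # -> "cat": the statistically likeliest next word
```

The sketch has no plan, no logic, and no notion of truth; it only knows which word tends to follow which. Everything that follows in this article comes back to that point.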

What it doesn’t do is… everything else. It doesn’t build plans. It doesn’t apply logic. It doesn’t understand meaning. It doesn’t understand consequences. It doesn’t do rigorous research. It doesn’t formulate new ideas.

But GPT’s text generation model is good enough that it looks like it does those things. The training data includes many, many examples of those types of writing, so it produces output that is convincing and might even be reasonably accurate. But any accuracy or inaccuracy GPT provides is a side effect of what is in the training data.

So, if you ask GPT for a financial plan, it generates text that looks like a financial plan. If you ask it to explain the logic of Zeno’s Paradox, it generates text that looks like an essay on that topic. If you ask it to explain what it means to be alive, it generates text that looks like an answer to that question. If you ask it to provide references on a certain topic, it generates text that looks like a list of references (some of them are decent references, and some of them are rubbish).

What we are seeing is a disconnect between what we expect GPT to do and what it is actually capable of doing. This disconnect seems to be present in large part because GPT sounds so human that we expect it to behave like a human. GPT may be a victim of its own success. But it turns out that humans do a lot more than generate streams of words that sound human. We are expecting way too much from GPT.

It is important to understand what GPT does, and what it doesn’t do. The reactions of some pretty smart people experimenting with the tool suggest that we (the users and OpenAI together) haven’t done a good job of communicating GPT’s limitations. I hope we become more educated users soon – before we start treating GPT as if it were an authority on any important topic.

Perhaps GPT has passed the Turing Test, at least well enough that we are beginning to treat it as if it were an underperforming co-worker. It will be interesting to see what happens over the next few months.
