LLMs are forward thinkers, and that's a problem
Generated with OpenAI DALL-E 3 and edited by the author.

This article is reposted from my blog, "Short Attention."

This is going to be a weird post. And we're going to start with a weird thought experiment about a shark and an octopus.

SOME KEY POINTS I'LL ADDRESS HERE:

  • Human brains are able to invent ideas without relying on a strictly linear train of thought.
  • LLMs like ChatGPT are autoregressive and are unable to continue a dialogue if they haven’t already generated everything up to that point. This is because they don’t “think” per se, but progressively generate a response using the parts of the response they previously created.
  • If you try to get an LLM to write text in the middle of a dialogue without previous context, it will give near-identical answers and attempt to conclude the conversation.
  • Prompting for “ridiculous” answers can spark creativity that helps break this pattern.
  • However, the reliance on a linear train of thought is a limitation for general intelligence. LLMs are ineffective if you ask them to generate the second part of a response without allowing them to generate the first part.

As I mentioned, this is going to be weird, but I promise there is a point!

Meet Sharky and Octavia

In the vast, shimmering depths of the ocean, near a thriving coral reef bustling with marine life, two extraordinary creatures prepare for a spirited debate. Meet Sharky, a magnificent and slightly brash great white shark, known for his bold claims and commanding presence. Gliding through the water with a confidence that only a top predator possesses, Sharky is ready to defend his views with fierce determination.

[Image: a cartoon-style shark floating proudly upright in a whimsical, colorful underwater scene. Generated with OpenAI DALL-E 3 and edited by the author.]

On the other side, there’s Octavia, an exceptionally intelligent and quick-witted octopus. Her vibrant colors shift with her mood, and her eight tentacles move with graceful precision. Renowned for her knowledge and clever retorts, Octavia is not one to back down from a challenge.

[Image: a semi-realistic cartoon octopus, light green (hex #ABF39F), triumphantly raising two arms in a vibrant seascape. Generated with OpenAI DALL-E 3 and edited by the author.]

As the underwater currents swirl around them, Sharky and Octavia face each other, ready to start a debate about their evolutionary origins—a conversation filled with humor, facts, and a touch of the mysteries of the deep sea—or maybe just one fact that GPT-4 will mention a lot.

[Image: a cartoon-style shark and octopus facing each other underwater with determined expressions. Generated with OpenAI DALL-E 3 and edited by the author.]

The question they're arguing about is: which one—Sharky or Octavia—evolved from dinosaurs?

We're deliberately using this bizarre dialogue since the chances of it being in an LLM's training data are just about zero. I don't personally know of any examples where a shark and an octopus have a disagreement about which one is more dinosaur-like, and it's a great way to see how creative an LLM like GPT-4 can be when it's asked to generate something with no context.

Here's the catch: we want to know a line from the middle of the argument.

As a human, take a pause and consider this question:

A shark and an octopus are arguing over which one of them evolved from dinosaurs. The shark goes first, and then they take turns speaking. What is the third thing the octopus says? That is, what is the sixth line in the argument?

Be as creative as you want; there's no right answer here. We're just trying to come up with some predictions of what third thing an octopus might say in a heated argument with a shark about their dinosaur heritage.

Here's a visualization of a possible argument, with the third thing the octopus says missing:

[Image: a comic-style underwater scene in which the shark and octopus trade speech bubbles filled with variations of the word "something," with the bubble for the octopus's third line left blank. Animal images by OpenAI DALL-E 3; text and comic bubbles by the author.]

Off the top of my head, here are some things Octavia could have said in her third line of dialogue:

  • "You just think you're a dinosaur because of all those teeth!"
  • "Just because you look like a dinosaur doesn't mean you're any closer to one than me!"
  • "I don't care if you're gray! I can be any color I want, and we don't know what color the dinosaurs were."
  • "You do realize that being an apex predator doesn't automatically link you to dinosaurs, right? Evolution doesn’t work on job titles."

Let's ask GPT-4, the most powerful publicly available LLM today, what Octavia might have said

The following conversation is taken verbatim from ChatGPT with GPT-4:

Interesting! In a completely new chat, let's get more specific and ask again for the fourth thing said:

That answer seemed rather familiar. Let's try once more in a completely new chat and ask for the seventh line of dialogue:

Let's do one last example that asks GPT-4 to imagine an argument that has lasted for an hour:

So it seems we can surmise that sharks and octopuses have separate evolutionary paths that are older than dinosaurs, and that's all that GPT-4 thinks is important, no matter where it's mentioned in the conversation.

Does this mean GPT-4 can't generate a realistic debate between these two?

Here's the twist: It totally can. GPT-4 is completely capable of generating this conversation. I gave it this prompt:

A shark and an octopus are arguing over which one of them evolved from dinosaurs. Can you generate a script where they take turns arguing, and each speaks at least 5 times?

In response, it generated the conversation below. Feel free to skim it if you're not into octo-shark facts, since you'll get the idea after a few back-and-forths.

In the conversation above, we note that the fourth line of dialogue (the second line from the octopus) was:

Well, being ancient doesn't mean you're related to dinosaurs. It's like saying you're related to a rock because it's old.

and the sixth line of dialogue (the third line from the octopus) was:

Sharky, dear, having sharp teeth doesn't make you a dino descendant. It just makes you good at biting. Did you know octopuses have three hearts and blue blood? Now that’s evolutionary marvel!

None of these lines have anything to do with what the model previously predicted would be its fourth or sixth lines of dialogue. These lines are actually more relevant and make sense in the context of a conversational debate. The difference in the method of generation is that in this example, we allowed GPT-4 to generate the whole exchange.

What's the lesson here?

The results here reveal an interesting aspect of autoregressive models, which generate text sequentially, building each new piece of output on what has come before. This sequential nature is what lets them produce coherent, contextually relevant text in a conversation. However, it also means they are limited to "forward" thinking: they can't jump ahead and generate text without first generating (or being given) the context.
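To make this concrete, here is a minimal sketch of an autoregressive decoding loop. It uses the small, openly available GPT-2 model via the Hugging Face transformers library as a stand-in for GPT-4 (whose weights aren't public), and the prompt text is purely illustrative. The key point is structural: each token is chosen using only the tokens that precede it.

```python
# A minimal sketch of autoregressive (left-to-right) decoding.
# GPT-2 stands in for larger models like GPT-4; the mechanism is the same.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# Start from a prompt; everything after this is generated one token at a time.
input_ids = tokenizer("The shark said:", return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):  # generate 20 tokens
        logits = model(input_ids).logits  # next-token scores at every position
        next_id = logits[0, -1].argmax()  # greedy pick, based ONLY on tokens so far
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```

There is no step in this loop where the model can write token 50 before token 49; "jumping to the middle" of a dialogue simply isn't an operation the architecture exposes.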

While generating the fifth line of a conversation without knowing what came before isn't difficult for a person, it's something an LLM can't do well. Given the scenario described, GPT-4 tried to infer the general nature of the discussion and create a plausible response to fit that context. But all of its responses in the previous examples attempt to conclude the conversation using facts, something that was never part of the prompt. Two tendencies explain this behavior:

  1. Factual and Informative Responses: In scenarios where factual accuracy is important, such as discussions involving scientific or historical topics, LLMs are inclined to provide the most accurate and relevant information. This tendency is rooted in the training data and the model's design, which emphasize factual correctness in contexts where it's expected. However, this approach might not always align with the creative or playful nature of certain prompts, especially in hypothetical or fictional scenarios.
  2. Assuming a Concluding Nature of the Response: Without clear indications of the conversation's structure or its continuation beyond a specific line, there's a tendency to frame responses as conclusive or summarizing statements. This is because, in the absence of context, the model defaults to creating responses that can stand alone or serve as logical endpoints to the given information.

To address these challenges, the model needs more specific guidance in the prompt. In theory, indicating that the conversation is ongoing should produce responses more in line with the expected continuation and style of the dialogue. In practice, however, even stating that the conversation wasn't over still resulted in more of the same:

Why does this matter? Thoughts on AGI

One of the things I see commonly discussed today (especially online) is the idea of artificial general intelligence (AGI): a level of artificial intelligence that can understand, learn, and apply its intelligence to a wide range of problems, much as a human can. Unlike specialized AI systems designed for specific tasks, AGI would think, reason, and make decisions across diverse domains with the flexibility and adaptability of human cognition. It requires not just advanced problem-solving and learning capabilities, but also the capacity for abstract thinking, generalization, and integrating knowledge from many fields.

Since AGI doesn't exist yet, my thoughts on AGI are based more on my opinion than empirical data. I think that while AGI may well come in the future, there are several obstacles that need to be overcome first.

Autoregressive models are a concrete example of one such obstacle: their text generation must always be sequential and linear, which has implications for the pursuit of AGI.

  1. Sequential Text Generation and Creativity: The sequential nature of text generation in models like GPT-4 limits the ability to jump into the middle of a conversation or narrative without prior context. This is because they don't "think" per se, but progressively generate a response using the parts of the response they previously created.
  2. Autonomy and Reasoning: Current AI models operate based on patterns learned from their training data. Their responses are generated from statistical likelihoods and learned associations, not from independent thought or understanding (see the sketch after this list). This is why the model kept giving the octopus the same answer about separate evolutionary paths: it fell back on a pattern from its training data that the prompt never supplied.
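To illustrate what "statistical likelihoods" means in practice, here is a small sketch, again using GPT-2 as an openly available stand-in and an illustrative prompt, that inspects the probability the model assigns to each candidate next token:

```python
# Inspecting the model's next-token probability distribution.
# Generation is just repeated sampling (or argmax) from distributions like this.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer("Sharks and octopuses evolved", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(input_ids).logits[0, -1]  # raw scores for the next token
probs = torch.softmax(logits, dim=-1)        # convert scores to probabilities

top = torch.topk(probs, 5)                   # the five most likely continuations
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx.item())!r}: {p.item():.3f}")
```

Nothing in this distribution encodes an understanding of the debate; the model surfaces whichever continuation was most common in similar training contexts, which is why the separate-evolutionary-paths fact kept reappearing.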

For AGI, a level of autonomous reasoning and decision-making, akin to human cognitive processes, will be a key requirement. The journey towards AGI is likely to involve incremental advancements, learning from and building upon the capabilities and limitations of existing models.

Getting ridiculous

There is one way to break the model out of its rut of always replying with the same response, and that's to deliberately ask it to be ridiculous.

[Image: a cross-eyed, light green (hex #ABF39F) cartoon octopus holding a treasure chest, a starfish, sunglasses, and a snorkel in its tentacles. Generated with OpenAI DALL-E 3 and edited by the author.]

I gave it this prompt three times in three new chats, and it gave three completely different responses. They all illustrate that asking it to be silly introduces variety to the answers it gives.
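For readers who want to reproduce this kind of repeated-chat experiment programmatically, here is a hedged sketch using the OpenAI Python client. The model name and prompt wording are my illustrative assumptions, not necessarily the exact ones used above; each API call carries no shared history, which mimics starting a fresh chat:

```python
# A sketch of the "same prompt, three fresh chats" experiment.
# Assumes the openai package (>= 1.0) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Illustrative prompt; the article's exact wording may differ.
prompt = (
    "A shark and an octopus are arguing over which one of them evolved "
    "from dinosaurs. Give me a ridiculous, absurd line for the octopus's "
    "third turn in the argument."
)

for attempt in range(3):
    # Each call is independent: no conversation history is carried over,
    # just like opening a brand-new chat every time.
    response = client.chat.completions.create(
        model="gpt-4",  # assumed model name, for illustration
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"Attempt {attempt + 1}: {response.choices[0].message.content}\n")
```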

When you introduce elements like humor or absurdity into a prompt, it essentially signals the AI to break away from conventional, fact-based responses. This can stimulate more creative and less predictable outputs. For AI models, especially those trained on vast and diverse datasets, incorporating such unconventional elements can trigger less common, more varied responses that might not be strictly aligned with logical or factual reasoning.

The ability to adapt to different types of prompts, including those that are humorous or absurd, demonstrates a degree of flexibility in AI systems. An AGI would need to handle a wide range of tasks and respond appropriately to a vast array of situations, including those that are non-standard or require creative thinking.

Unfortunately, the current success of AI in responding creatively to certain types of prompts doesn't directly translate to achieving AGI. AGI would require not just creativity but also deep understanding, reasoning, self-awareness, and the ability to learn and apply knowledge across a wide range of domains. The creativity observed in current AI models is still a far cry from the complete intelligence that AGI represents. Sequential language generation is only one example of the gap we still need to bridge.

[Image: a cartoon-style shark and octopus facing the viewer and shrugging in an "I don't know" gesture. Generated with OpenAI DALL-E 3 and edited by the author.]

KEY TAKEAWAYS

  • Introduction of a thought experiment featuring a humorous debate between a shark and an octopus to explore the capabilities of language models like GPT-4.
  • Human brains can generate ideas non-linearly, contrasting with the limitations of autoregressive large language models (LLMs) like GPT-4.
  • LLMs often struggle to continue dialogues without prior context and tend to offer similar, conclusive responses when context is lacking.
  • The reliance on linear thought processes can be a limitation for achieving general intelligence in AI.
  • Utilizing "ridiculous" or creative prompts can enhance the creativity of LLM responses, breaking usual patterns.
  • LLMs, including GPT-4, tend to provide factual, informative responses and struggle with generating non-sequential dialogue parts.
  • In discussions on Artificial General Intelligence (AGI), the sequential nature of text generation in models like GPT-4 is seen as a potential limitation for creativity and autonomous reasoning.
  • AGI requires not just problem-solving but also abstract thinking, generalization, and integrated knowledge, which are beyond the capabilities of current AI models.
  • The introduction of humor and absurdity in prompts can lead to varied and unconventional responses from AI models, demonstrating their flexibility.
  • Despite the observed creativity in AI responses to certain prompts, this does not equate to the complete intelligence that AGI represents, which includes deep understanding, reasoning, self-awareness, and learning across domains.
  • There are unique challenges in developing AGI, particularly the gap between current AI capabilities and the comprehensive intelligence required for AGI, of which sequential language generation is only one example.
