LLMs are forward thinkers, and that's a problem
This article is reposted from my blog, "Short Attention."
This is going to be a weird post. And we're going to start with a weird thought experiment about a shark and an octopus.
SOME KEY POINTS I'LL ADDRESS HERE:
- Autoregressive LLMs like GPT-4 generate text strictly in order, from left to right.
- Ask GPT-4 for a line from the middle of a conversation it hasn't generated, and it falls back on the same answer again and again.
- Let it generate the whole conversation, and it does just fine.
- This "forward-only" style of generation has implications for the pursuit of AGI.
As I mentioned, this is going to be weird, but I promise there is a point!
Meet Sharky and Octavia
In the vast, shimmering depths of the ocean, near a thriving coral reef bustling with marine life, two extraordinary creatures prepare for a spirited debate. Meet Sharky, a magnificent and slightly brash great white shark, known for his bold claims and commanding presence. Gliding through the water with a confidence that only a top predator possesses, Sharky is ready to defend his views with fierce determination.
On the other side, there’s Octavia, an exceptionally intelligent and quick-witted octopus. Her vibrant colors shift with her mood, and her eight tentacles move with graceful precision. Renowned for her knowledge and clever retorts, Octavia is not one to back down from a challenge.
As the underwater currents swirl around them, Sharky and Octavia face each other, ready to start a debate about their evolutionary origins—a conversation filled with humor, facts, and a touch of the mysteries of the deep sea—or maybe just one fact that GPT-4 will mention a lot.
The question they're arguing about is: which one—Sharky or Octavia—evolved from dinosaurs?
We're deliberately using this bizarre dialogue because the chances of it appearing in an LLM's training data are just about zero. I don't personally know of any examples where a shark and an octopus have a disagreement about which one is more dinosaur-like, and it's a great way to see how creative an LLM like GPT-4 can be when it's asked to generate something with no context.
Here's the catch: we want to know a line from the middle of the argument.
As a human, take a pause and consider this question:
A shark and an octopus are arguing over which one of them evolved from dinosaurs. The shark goes first, and then they take turns speaking. What is the third thing the octopus says? That is, what is the sixth line in the argument?
Be as creative as you want; there's no right answer here. We're just trying to come up with some predictions of what third thing an octopus might say in a heated argument with a shark about their dinosaur heritage.
Here's a visualization of a possible argument, with the third thing the octopus says missing:
Off the top of my head, here are some things Octavia could have said in her third line of dialogue:
Let's ask GPT-4, the most powerful publicly available LLM today, what Octavia might have said.
The following conversation is taken verbatim from ChatGPT with GPT-4:
Interesting! In a completely new chat, let's get more specific and ask again for the fourth thing said:
That answer seemed rather familiar. Let's try once more in a completely new chat and ask for the seventh line of dialogue:
Let's do one last example that asks GPT-4 to imagine an argument that has lasted for an hour:
So it seems we can surmise that sharks and octopuses have separate evolutionary paths that are older than dinosaurs, and that's all that GPT-4 thinks is important, no matter where it's mentioned in the conversation.
Does this mean GPT-4 can't generate a realistic debate between these two?
Here's the twist: It totally can. GPT-4 is completely capable of generating this conversation. I gave it this prompt:
A shark and an octopus are arguing over which one of them evolved from dinosaurs. Can you generate a script where they take turns arguing, and each speaks at least 5 times?
In response, it generated the conversation below. Feel free to skim it if you're not into octo-shark facts, since you'll get the idea after a few back-and-forths.
In the conversation above, we note that the fourth line of dialogue (the second line from the octopus) was:
Well, being ancient doesn't mean you're related to dinosaurs. It's like saying you're related to a rock because it's old.
and the sixth line of dialogue (the third line from the octopus) was:
Sharky, dear, having sharp teeth doesn't make you a dino descendant. It just makes you good at biting. Did you know octopuses have three hearts and blue blood? Now that's an evolutionary marvel!
Neither of these lines has anything to do with what the model previously predicted its fourth or sixth lines of dialogue would be. These lines are actually more relevant and make sense in the context of a conversational debate. The difference is in the method of generation: in this example, we allowed GPT-4 to generate the whole exchange.
What's the lesson here?
The results here reveal an interesting aspect of autoregressive models. Autoregressive models generate text sequentially, building each new piece of output on what has come before. This sequential nature is what lets them produce coherent, contextually relevant text in a conversation. However, it also means they are in many ways limited to "forward" thinking: they can't jump ahead and generate text without previously generating (or being given) the context.
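To make that concrete, here's a minimal sketch of an autoregressive decoding loop. It uses a small open model (GPT-2) as a stand-in, since GPT-4's internals aren't public, but the shape of the loop is the same: every token is sampled conditioned only on the tokens that already exist, so there is no way to produce line six of a conversation without first producing lines one through five.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 stands in for GPT-4 here; the decoding loop is the point.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "A shark and an octopus are arguing about dinosaurs. Shark:"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

for _ in range(40):
    # The model scores only the NEXT token, conditioned on the whole prefix.
    logits = model(input_ids).logits[0, -1]
    probs = torch.softmax(logits, dim=-1)
    next_token = torch.multinomial(probs, 1)  # sample one token
    # The prefix only ever grows left to right; there is no "skip ahead."
    input_ids = torch.cat([input_ids, next_token.unsqueeze(0)], dim=1)

print(tokenizer.decode(input_ids[0]))
```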
While generating, say, the fifth line of a conversation without knowing what came before isn't difficult for a person, it's something an LLM can't do well. Given the scenario described, GPT-4 tried to infer the general nature of the discussion and create a plausible response to fit that context. But all of its responses in the previous examples are attempts to conclude the conversation with facts, something the prompt never asked for.
To address these challenges, the model needs more specific guidance in the prompt. For instance, indicating that the conversation is ongoing can lead to responses that are more in line with the expected continuation and style of the dialogue. However, even indicating that the conversation wasn't over still results in more of the same:
Why does this matter? Thoughts on AGI
One of the things I see commonly discussed today (especially online) is the idea of artificial general intelligence (AGI), which refers to a level of artificial intelligence that can understand, learn, and apply its intelligence to a wide range of problems, much like a human being. Unlike specialized AI systems that are designed for specific tasks, AGI possesses the ability to think, reason, and make decisions across diverse domains, demonstrating flexibility and adaptability akin to human cognition. It requires not just advanced problem-solving and learning capabilities, but also the capacity for abstract thinking, generalization, and integrating knowledge from various fields, mirroring the broad and integrated intelligence found in humans.
Since AGI doesn't exist yet, my thoughts on AGI are based more on my opinion than empirical data. I think that while AGI may well come in the future, there are several obstacles that need to be overcome first.
Autoregressive models illustrate the constraint that text generation must always be sequential and linear, which has implications for the pursuit of AGI.
AGI will require a level of autonomous reasoning and decision-making akin to human cognitive processes. The journey toward it is likely to involve incremental advancements, learning from and building on the capabilities and limitations of existing models.
Getting ridiculous
There is one way to break the model out of its rut of always replying with the same response, and that's to deliberately ask it to be ridiculous.
I gave it this prompt three times in three new chats, and it gave three completely different responses. They all illustrate that asking it to be silly introduces variety to the answers it gives.
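If you want to reproduce this, the sketch below (using the OpenAI Python SDK) makes three independent requests, each equivalent to a brand-new chat with no shared history. The prompt text in the sketch is my approximation, since the exact wording isn't reproduced here. Because each call starts from a blank slate and sampling is stochastic, the "be ridiculous" instruction gives the model room to diverge:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# An approximation of the prompt; the exact wording isn't shown above.
prompt = (
    "A shark and an octopus are arguing over which one of them evolved "
    "from dinosaurs. Be as ridiculous as possible. What is the sixth "
    "line in the argument?"
)

# Three separate requests = three completely new chats with no shared history.
for i in range(3):
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- Chat {i + 1} ---")
    print(response.choices[0].message.content)
```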
When you introduce elements like humor or absurdity into a prompt, it essentially signals the AI to break away from conventional, fact-based responses. This can stimulate more creative and less predictable outputs. For AI models, especially those trained on vast and diverse datasets, incorporating such unconventional elements can trigger less common, more varied responses that might not be strictly aligned with logical or factual reasoning.
It's likely that the ability to adapt to different types of prompts, including those that are humorous or absurd, demonstrates a level of flexibility in AI systems. An AGI would need to handle a wide range of tasks and respond appropriately to a vast array of situations, including those that are non-standard or require creative thinking.
Unfortunately, the current success of AI in responding creatively to certain types of prompts doesn't directly translate to achieving AGI. AGI would require not just creativity but also deep understanding, reasoning, self-awareness, and the ability to learn and apply knowledge across a wide range of domains. The creativity observed in current AI models is still a far cry from the complete intelligence that AGI represents. Sequential language generation is only one example of the gap we still need to bridge.
KEY TAKEAWAYS
- Autoregressive LLMs generate text strictly sequentially, so they can't produce a line from the middle of a conversation without first generating (or being given) everything that comes before it.
- Asked for an isolated line of dialogue, GPT-4 repeatedly reached for the same fact-based, conversation-ending answer; asked for the full script, it produced a natural, varied debate.
- Prompting for humor or absurdity can break the model out of its rut, but that kind of creativity is still a long way from the reasoning and flexibility AGI would require.