Reasoning AI: The Real Game-Changer Behind Large Language Models Is Not Content Generation
Vlad Larichev
Industrial AI Lead @Accenture Industry X | Software Engineer | Keynote Speaker | Industrial GenAI | Passionate about connecting people to drive innovation
1. The Overlooked Capabilities of Large Language Models
When I talk to customers, colleagues, and experts about Generative AI and Large Language Models (LLMs), I find that there are generally three groups:
1) The skeptical group: They talk about the upcoming AI winter and describe Generative AI as just "another AI hype cycle".
2) The enthusiastic admirers: These individuals are impressed by the remarkable abilities of these models in generating content, such as code, text, and video.
3) Convinced supporters: This group sees LLMs and the broader suite of technologies we now call "Generative AI" as having a real transformative power on our society and industries.
However, I have realized that we, as members of this third group, have not yet succeeded in communicating exactly what this potential entails: our discussions rarely center on content generation, but rather on other capabilities of LLMs.
Reflecting on these discussions, I understood that there is a crucial aspect of LLMs and multimodal models that the term "Generative AI" does not capture well, and that is what I would like to discuss with you in this article.
This aspect is a key driver of the future development of AI and is crucial to understanding our enthusiasm for this technology: the ability of large language models to reason.
We have essentially taught our machines to think and to engage in logical processing. Let me show you how this works and what effects it has.
2. World Model vs. Stochastic Parrots
Some skeptics refer to LLMs as "stochastic parrots", critiquing them for just predicting the next token rather than understanding context or meaning.
However, the other group is convinced that through extended training on large amounts of material, LLMs learn connections between different topics in our world, forming what we call a "world model".
With the right setup and guidance, they can use this world model to not only respond to inputs, but also predict and evaluate potential outcomes and scenarios, allowing for a deeper understanding than simply predicting tokens. But to use these skills, we need special techniques, and we need to help LLMs to think in a structured way.
3. How We Can Help LLMs Organize Their Thinking Process
At the moment, there are no built-in mechanisms for advanced reasoning in LLM tools like ChatGPT, but we can create them with simple commands or prompts that help LLMs organize their thinking process.
Two of the most prominent approaches for helping LLMs organize their thinking process and enhancing their reasoning capabilities are Chain-of-Thought prompting and the ReAct framework. Both help large language models self-reflect, reason better, and generate more accurate answers.
Chain of Thought: This method encourages the model to think step by step, similar to how humans solve problems. Instead of jumping straight to an answer, the model breaks the question down into smaller parts and solves each one sequentially. This makes the reasoning process more transparent and often more accurate, because it mimics logical human thinking. A minimal prompt sketch follows below.
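Here is what that can look like in practice. This is a minimal zero-shot sketch: `call_llm` is a hypothetical placeholder for whichever completion API you use (ChatGPT, Bedrock, a local model), and the question is just an example.

```python
def call_llm(prompt: str) -> str:
    """Placeholder: wire this up to your LLM of choice and return its text."""
    return "(model response would appear here)"

question = (
    "A factory produces 120 parts per hour. "
    "How many parts does it produce in a 7.5-hour shift?"
)

# The trigger phrase nudges the model to lay out intermediate steps
# before committing to a final answer.
cot_prompt = (
    f"{question}\n\n"
    "Let's think step by step, then state the final answer on the last line."
)

print(call_llm(cot_prompt))
```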
ReAct Framework: This framework is designed to improve how models handle complex tasks by using a structured loop. ReAct stands for "Reason + Act": the model alternates between reasoning steps (thinking about what to do next), actions (such as calling a tool or running a search), and observations (taking in the results of those actions). It gathers information, reflects on what it found, and iterates until it can construct a well-grounded answer. This structured approach helps the model be more systematic and thorough in its reasoning; a small sketch of the loop follows below.
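A minimal version of that loop, reusing the hypothetical `call_llm` placeholder from the sketch above plus one toy tool. Real implementations add robust output parsing, several tools, and retries.

```python
def call_llm(prompt: str) -> str:
    """Placeholder from the sketch above: returns canned text."""
    return "(model response would appear here)"

def search(query: str) -> str:
    """Toy tool: pretend to look something up and return a snippet."""
    return f"(search results for '{query}' would appear here)"

REACT_PROMPT = """Answer the question by interleaving steps of the form:
Thought: <reasoning about what to do next>
Action: search[<query>]
Observation: <result of the action>
Finish with: Final Answer: <answer>

Question: {question}
{scratchpad}"""

def react(question: str, max_steps: int = 5) -> str:
    scratchpad = ""
    for _ in range(max_steps):
        step = call_llm(REACT_PROMPT.format(question=question, scratchpad=scratchpad))
        if "Final Answer:" in step:
            return step.split("Final Answer:")[-1].strip()
        if "Action: search[" in step:
            # Crude action parsing, enough for a sketch: run the tool and
            # feed the observation back into the next prompt.
            query = step.split("Action: search[")[-1].split("]")[0]
            scratchpad += f"{step}\nObservation: {search(query)}\n"
    return "(no final answer within the step budget)"
```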
Both Chain of Thought and ReAct are examples of how we can enhance the way AI models think and solve problems. By breaking down problems and building up answers, they lead to more reliable and understandable outputs, and they give our bots (let's call them "agents" from here on) a framework to use tools, access data, and take action.
But as in our daily lives, we need a team for complex projects. Here too, forming a team of agents is becoming the best practice for solving complex problems. Let's see how.
4. Introduction to Agent-Based Architecture and AI Teams
Understanding agent-based architecture is straightforward when compared to managing complex projects in the real world. Typically, a project manager is not an expert in every subject but instead creates a task list, tracks the progress and delegates each task to the appropriate experts.
To understand why this can't be done efficiently by a single LLM, we first need to understand how our dialogue with an LLM works. Currently, LLMs can only maintain a limited amount of information in their immediate context window (similar to short-term memory), and today this might encompass millions of words or tokens. However, each interaction or query generates new tokens, rapidly filling up and mixing the available context.
What we experience as a dialogue with an LLM is, in reality, the model passing the entire conversation back and forth, maintaining the illusion of a continuous discussion.
Each time you respond, the AI must re-evaluate the entire conversation plus your latest input, risking the loss of subtle nuances and of the main goal of the session. This leads to a snowball effect that rapidly inflates the context.
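To make the snowball concrete, here is a sketch of what every chat turn actually sends. The message format follows the common OpenAI-style chat convention, and `call_llm_chat` is a hypothetical placeholder; other chat APIs behave the same way.

```python
def call_llm_chat(history: list[dict]) -> str:
    """Placeholder: send the whole history to a chat model, return its reply."""
    return f"(reply based on {len(history)} messages of context)"

messages = [{"role": "system", "content": "You are a helpful assistant."}]

def chat_turn(user_input: str) -> str:
    messages.append({"role": "user", "content": user_input})
    reply = call_llm_chat(messages)  # the ENTIRE history goes out every turn
    messages.append({"role": "assistant", "content": reply})
    return reply

chat_turn("Summarize our project goals.")
chat_turn("Now draft an email about them.")  # context has already snowballed
```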
Furthermore, if an LLM needs to perform a web search, the volume of parsed text and links might be overwhelming and not necessarily relevant to the task at hand. The concept of learning to "forget" for LLMs is promising but is still in the research phase (Selective Forgetting Can Help AI Learn Better).
In this context, focus becomes critical. More efficient systems should allow the "Manager AI" to concentrate solely on the main goal, delegating subtasks to smaller, specialized agents characterized by three attributes: access to specific data, specialized tools, and optimization for task complexity.
Similarly, managing a large and complex project effectively often involves breaking it down into smaller, manageable goals, assigning these to specialists, and then coordinating and reviewing the results to ensure alignment with the overall objective.
This is precisely how agent-based architectures function in LLMs.
Larger models are not necessarily the solution to all problems: while they may enhance general reasoning capabilities, they also increase the proportion of irrelevant data, slow down response times, and incur higher costs. For simpler tasks, like text summarization or data conversion, a less powerful LLM might suffice. The best use for a powerful LLM might be in tasks that require complex reasoning and self-reflection.
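As a sketch of this manager/specialist pattern: a strong model plans subtasks, and a toy router assigns each one to an agent sized for the job. The agent names, model names, and routing rule are illustrative assumptions, not any specific product's API.

```python
from dataclasses import dataclass, field

def call_llm(prompt: str) -> str:
    """Placeholder from the earlier sketches: returns canned text."""
    return "(model response would appear here)"

@dataclass
class Agent:
    name: str
    model: str                      # small/fast vs. large/reasoning model
    tools: list = field(default_factory=list)

AGENTS = {
    "summarize": Agent("summarizer", model="small-fast-model"),
    "research": Agent("researcher", model="large-reasoning-model", tools=["web_search"]),
}

def plan(goal: str) -> list[str]:
    """Manager step: a strong model breaks the goal into short subtasks."""
    return call_llm(f"Break this goal into short subtasks, one per line:\n{goal}").splitlines()

def route(subtask: str) -> Agent:
    """Toy router: pick the cheapest agent that can still handle the subtask."""
    return AGENTS["summarize"] if "summar" in subtask.lower() else AGENTS["research"]

for task in plan("Prepare a competitor analysis for product X"):
    agent = route(task)
    print(f"{agent.name} ({agent.model}) <- {task}")
```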
5. Reasoning AI as the Next Evolutionary Step
LLMs don't need to be perfect; they just need to be good at reasoning and have access to the right tools, agents, and data.
Studies show that efficient agent-based systems can be more precise than larger, more powerful LLMs: with a simple sampling-and-voting method, the performance of large language models scales with the number of agents, stagnating after around 10.
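The sampling-and-voting idea itself is simple enough to sketch in a few lines: sample several independent answers to the same question and keep the majority. Real setups sample with temperature above zero and normalize answers (e.g. extracting just the final number) before voting; `call_llm` is the same hypothetical placeholder as before.

```python
from collections import Counter

def call_llm(prompt: str) -> str:
    """Placeholder: a real call would sample with temperature > 0."""
    return "(model response would appear here)"

def sample_and_vote(question: str, n_agents: int = 10) -> str:
    answers = [call_llm(question) for _ in range(n_agents)]  # independent samples
    return Counter(answers).most_common(1)[0][0]             # majority answer
```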
Imagine an expert who must always provide the correct answer versus a team of researchers with unlimited internet access and the resources to develop hypotheses, test them, and refine their approaches until they find the correct answers. This comparison illustrates the potential and power of "Reasoning AI", as I will call it from here on, since there is no official name for this capability yet.
A team of intelligent, specialized AI colleagues working together, breaking tasks down into small pieces just as humans do, with access to data, working around the clock at a speed of thought far beyond ours, leaves many experts wondering which tasks CANNOT be supported by these solutions.
And the best thing is that the building blocks and frameworks for these solutions already exist: services like AWS Bedrock already offer everything you would need to build a solution like this. We are already using this approach to enhance processes in R&D, engineering, and manufacturing, as well as in our work with requirements and new product development.
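As a starting point, a minimal Bedrock call via boto3's Converse API can look like the sketch below. Treat it as a sketch: verify the model ID, region, and response shape against the current AWS documentation before relying on it.

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # example model ID
    messages=[{
        "role": "user",
        "content": [{"text": "Break this goal into short subtasks: launch product X"}],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```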
If you want to see a practical example, you can take a look at my post from over a year ago, where such a system organizes and delegates tasks between agents: Link to the post.
There are also further great frameworks already available for you to get started and learn this new approach and way of thinking:
6. Conclusion and Outlook
I hope you now agree with me that the term "Generative AI" does not capture these capabilities at all, even though they are perhaps more transformational than "just" content generation. I would be happy if you brought #ReasoningAI up in your next internal discussions and talked about how it represents the next step in the development of AI.
Let's move the Reasoning AI topic forward together. If you enjoyed this article, I would love to discuss it with you further.
If this topic resonates with you, I plan to write a series of articles about the implementation of "Reasoning AI" on AWS Bedrock, how this approach might change the way we approach problems in software development, and real use cases of Reasoning AI implementation in Engineering and Manufacturing!
#ReasoningAI, #LLM, #Engineering, #Bedrock
Excited to drive this topic further with Jonathan Tipper, Alexander Herttrich, Kathrin Schwan, Felix Klemm, Liam Friel, Pankaj Sodhi, and many others.
☆ Helping businesses grow through Digital Transformation. ☆ Certified Technology Advisory Partner.
1 week ago: https://machinelearning.apple.com/research/gsm-symbolic
Industrial AI Consultant, Moderator and Podcaster
5 months ago: Mmmhhh... Reasoning is a loaded term, which you will get a lot of pushback for... I believe in combining subsymbolic and symbolic approaches: provide LLMs with models of the world around them (rather than having them form a world model); more specifically, knowledge-graph-based LLMs that behave as agents, helping us do our job by interacting with other agents.
AI/ML Engineer | NLP Engineer | Conversational AI Engineer
5 months ago: This touches upon an interesting point and highlights our own incorrect expectations of large language models. The idea that by merely fine-tuning, or by providing some context through RAG, we encode reasoning into the model is false. Reasoning about the context and then generating seems like the stepwise process we should move towards.
Principal Director at Accenture
6 months ago: It's been nearly a year since I published this video on the reasoning capabilities of OpenAI's GPT-4 Code Interpreter (their first Python agentic approach to simple programming-based analysis of data files provided alongside a prompt). Since then we've seen multi-modality, better models, and advanced prompting frameworks to improve reasoning capabilities. I agree with your conclusion though. Multi-agent systems are the future. Whether they solve reasoning on their own, I'm unsure. I think they solve complex workflows and other single-model Q&A limitations though. Would love your view on what re-making this video now would look like! 1 year in the world of AI feels like a decade. https://youtu.be/jZDmz2jN7Ws?si=oIoeOJjSKCnCvasb
Principal Director @ Accenture | Data & AI
6 months ago: Good start. I agree it is worth investigating further.