Plan and Execute Agents with Chain of Reasoning: An Improved Approach to Agentic Systems

#langchain #openai #anthropic #langgraph #agents #agenticsystems

In a previous post, we looked at a simple agentic system. For a quick refresher, check out this LinkedIn post.

Even though this agent works for simple use cases, it has serious limitations:

  1. We're letting the Large Language Model (LLM) do all the reasoning to come up with a plan.
  2. The LLM is also in control of how this plan should be executed.
  3. There's no human intervention during the process; once the agent is fired up, we just wait for the end result.

So, how could we address these limitations? Let's tackle them one step at a time.

The Reasoning Debate

Let's start with point 1 - can LLMs really reason and plan? This is a hotly debated topic with two schools of thought:

  • One camp argues that LLMs aren't really reasoning; they're just regurgitating from memory in a way that looks like a plan.
  • The second camp believes LLMs can indeed reason.

Honestly, I don't know the definitive answer. But for our purposes, we don't need to worry about it as long as we get a solid plan, whether it's through reasoning or not.

Taking Control with LangGraph

The second part is easier to address - we as developers can take the driver's seat and control the execution of the plan to a great extent. This is where LangGraph comes in. LangGraph is a library that allows us to engineer our process flow, which is super handy for creating and managing complex workflows in language models. It gives us fine-grained control over how our agent operates.
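
To make that concrete, here's a minimal sketch of what a LangGraph workflow looks like: we define a shared state, add nodes (plain Python functions), wire them together with edges, and compile the graph. The state fields and node names are illustrative, not the exact ones from my project.

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class AgentState(TypedDict):
    user_request: str
    plan: list[str]


def make_plan(state: AgentState) -> dict:
    # In the real system this node would call an LLM to draft the plan.
    return {"plan": [f"Draft a plan for: {state['user_request']}"]}


builder = StateGraph(AgentState)
builder.add_node("make_plan", make_plan)
builder.add_edge(START, "make_plan")
builder.add_edge("make_plan", END)

graph = builder.compile()
result = graph.invoke({"user_request": "Buy a lens for my camera", "plan": []})
```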

Enabling Human Intervention

Now onto the third point - human intervention. LangGraph helps us intervene before or after the execution of a node (think of a node as a function in your code). This means we can check and adjust the process at various stages, ensuring the agent stays on track.
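
Here's a hedged sketch of how such an intervention point can be set up: we compile the graph with a checkpointer and ask LangGraph to pause before a given node. The node names, state, and plan contents below are illustrative.

```python
from typing import TypedDict

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END


class State(TypedDict):
    plan: list[str]


def create_plan(state: State) -> dict:
    # Placeholder: a real node would ask an LLM for the plan.
    return {"plan": ["research reviews", "shortlist lenses", "compare prices"]}


def execute_plan(state: State) -> dict:
    # Placeholder: a real node would run each step with tools.
    return {"plan": state["plan"]}


builder = StateGraph(State)
builder.add_node("create_plan", create_plan)
builder.add_node("execute_plan", execute_plan)
builder.add_edge(START, "create_plan")
builder.add_edge("create_plan", "execute_plan")
builder.add_edge("execute_plan", END)

graph = builder.compile(
    checkpointer=MemorySaver(),
    interrupt_before=["execute_plan"],  # pause here so a human can review the plan
)

config = {"configurable": {"thread_id": "session-1"}}
graph.invoke({"plan": []}, config)       # runs create_plan, then pauses
print(graph.get_state(config).values)    # inspect (or update_state to edit) the plan
graph.invoke(None, config)               # resume into execute_plan
```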

The Challenge of Planning

If you've dabbled with agentic systems, you've probably noticed that planning is much more difficult than execution. Let me explain with an example:

Let's say I want an agentic system to come up with a plan to buy a lens for my camera. At a high level, this involves creating a list of steps (let's call it the Plan) and then executing each step until we get the final outcome.

Here's where it gets tricky. The LLM has to factor in a lot of things:

  • The steps need to be in a logical, sequential order (e.g., research reviews before finalizing a lens).
  • It needs to understand the context of what I want to do with my new lens (Am I planning to use this lens handheld or on a tripod?).
  • And many more factors...

It's not as easy as it might sound at first. We can't simply assume that the LLM will magically come up with a plan that's perfectly personalized for me.

The Solution: A Two-Pronged Approach

If you've read this far, two things should be clear:

  1. We need a mechanism for the LLM to come up with a solid plan.
  2. We also need a way to interact with the LLM to give our inputs along the way.

I've put this approach into action, and I'm super impressed with the results. The plans it generates are so comprehensive that, even as a photographer myself, I wouldn't have considered all those aspects if someone had asked me the same question.

How It Works: The Plan Generation Process

The approach to generating the Plan is heavily inspired by Professor Synapse's method. It's a novel way of prompting that you can check out here.

With this approach, we're essentially forcing the LLM to engage in "System 2 thinking" - slow, deliberate, and conscious. We design our prompts to make the LLM do things it might not do otherwise.

In this example, I asked the LLM to generate the following outputs:

  • Main Objective
  • User Preferences
  • Research Focus
  • Potential Outcomes
  • Strategy Adjustments
  • Required Expertise
  • High Level Strategy
  • The Plan

Here's the kicker: I don't actually use any of these outputs except the Plan. But by forcing the LLM to generate all of this additional information, I end up with a much better plan. It's as if the process of considering all these factors leads to a more thorough and well-thought-out plan.
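
One simple way to force these intermediate outputs is to request a structured response whose fields mirror the list above, and then keep only the plan. The sketch below illustrates the idea; the Pydantic model, prompt wording, and model name are my own illustrative choices rather than the exact prompt used.

```python
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI


class ReasonedPlan(BaseModel):
    main_objective: str
    user_preferences: str
    research_focus: str
    potential_outcomes: str
    strategy_adjustments: str
    required_expertise: str
    high_level_strategy: str
    plan: list[str] = Field(description="Ordered, concrete steps to execute")


planner = ChatOpenAI(model="gpt-4o").with_structured_output(ReasonedPlan)

reasoned = planner.invoke(
    "User requirements:\n<gathered requirements go here>\n\n"
    "Work through every field carefully before writing the final plan."
)

steps = reasoned.plan  # the only output we actually carry forward
```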

We can then iterate over this plan with the user and only proceed to execution if the user approves the final plan.

A Neat Trick: Complement the Plan with an Execution Approach

Here's another clever technique we can use: along with the plan, we ask the LLM to come up with an approach for executing each step.

For example, let's say one step in our plan is "Find whether the shortlisted lenses have weather sealing". The execution approach for this step might be "Do a Google search with keywords 'Weather sealing' or 'extreme weather handling' in reputable photography forums only".

Is this necessary? Not always. But remember, wherever we can intervene and take control from the LLM, it's generally a good thing. These 'approaches' supporting the steps in a plan will be useful for another LLM that's going to execute these steps down the line. This second LLM doesn't have to worry about coming up with search queries - it's already been done.
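
A simple way to carry this pairing around is to model each step together with its approach. The schema below is an illustrative sketch, not the exact structure from my implementation:

```python
from pydantic import BaseModel


class PlanStep(BaseModel):
    step: str      # what to do
    approach: str  # how the executor should do it (search query, sources, etc.)


class ExecutablePlan(BaseModel):
    steps: list[PlanStep]


example = PlanStep(
    step="Find whether the shortlisted lenses have weather sealing",
    approach=(
        "Do a Google search with keywords 'weather sealing' or "
        "'extreme weather handling' in reputable photography forums only"
    ),
)
```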

This setup is particularly useful when we want to use a powerful LLM like GPT-4o or Claude Sonnet to come up with the Plan, and then let a less powerful (and less expensive) model like GPT-4o-mini execute the plan. As a bonus, this approach can also help reduce costs.
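
In code, the split can be as simple as instantiating two models and feeding the executor both the step and its prepared approach. The model names here are current OpenAI identifiers; swap in whatever planner/executor pair you prefer, and note that in the full system the executor is a tool-using agent rather than a bare chat call.

```python
from langchain_openai import ChatOpenAI

planner_llm = ChatOpenAI(model="gpt-4o")        # writes the plan and the approaches
executor_llm = ChatOpenAI(model="gpt-4o-mini")  # cheaper model that follows them

result = executor_llm.invoke(
    "Step: Find whether the shortlisted lenses have weather sealing\n"
    "Approach: Search reputable photography forums for 'weather sealing' "
    "or 'extreme weather handling'\n"
    "Carry out this step and report what you find."
)
```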

User Intervention: Making It Personal

Now, let's talk about user intervention. The session starts with a requirement gathering process that continues in a loop until the user types "quit" or "exit". This is incredibly valuable as it gives the LLM lots of context about what exactly the user is trying to achieve.
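
Stripped of the LLM follow-up questions, the gathering loop is essentially the following. This is a minimal console sketch; in the real system the exchange happens over a websocket, as described later.

```python
requirements: list[str] = []

while True:
    user_msg = input("Tell me more about what you need ('quit' or 'exit' to finish): ")
    if user_msg.strip().lower() in {"quit", "exit"}:
        break
    requirements.append(user_msg)
    # In the full system, the LLM would ask a clarifying follow-up question here.

gathered_context = "\n".join(requirements)  # passed on to the planning step
```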

There's another aspect of human intervention too - during plan execution, there could be steps that need human input. In our lens buying example, a critical input might be "Do you want a third-party tripod collar?" This kind of specific question might not have been covered during the initial requirement gathering process.

Key Technical Elements

To implement this Chain of Reasoning approach with human intervention, we use the following technical components:

LangGraph: We create a 4-node graph for:

  • Requirement gathering
  • Creating the plan
  • Executing the plan
  • Generating the final output

This structure allows us to control the flow of our agentic system, enabling the step-by-step process we described earlier.
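
Here's roughly what that 4-node graph looks like when wired up. The node bodies are placeholders and the state fields are illustrative.

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class WorkflowState(TypedDict):
    requirements: str
    plan: list[str]
    step_results: list[str]
    final_output: str


def gather_requirements(state: WorkflowState) -> dict:
    return {"requirements": "collected via the chat loop"}              # placeholder


def create_plan(state: WorkflowState) -> dict:
    return {"plan": ["step 1", "step 2"]}                               # placeholder


def execute_plan(state: WorkflowState) -> dict:
    return {"step_results": [f"result of {s}" for s in state["plan"]]}  # placeholder


def generate_output(state: WorkflowState) -> dict:
    return {"final_output": "\n".join(state["step_results"])}


builder = StateGraph(WorkflowState)
builder.add_node("gather_requirements", gather_requirements)
builder.add_node("create_plan", create_plan)
builder.add_node("execute_plan", execute_plan)
builder.add_node("generate_output", generate_output)
builder.add_edge(START, "gather_requirements")
builder.add_edge("gather_requirements", "create_plan")
builder.add_edge("create_plan", "execute_plan")
builder.add_edge("execute_plan", "generate_output")
builder.add_edge("generate_output", END)

workflow = builder.compile()
```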

Create React Agent: We use LangGraph's prebuilt create_react_agent to execute each step of our plan. It gives us an autonomous ReAct-style agent, ideal for steps that involve web searches or other tools.
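
A hedged sketch of wiring that up; the search tool, model, and prompt are illustrative choices rather than the exact ones used in the project.

```python
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

executor_agent = create_react_agent(
    ChatOpenAI(model="gpt-4o-mini"),   # the cheaper executor model
    tools=[DuckDuckGoSearchRun()],     # any web-search tool works here
)

response = executor_agent.invoke({
    "messages": [(
        "user",
        "Step: Find whether the shortlisted lenses have weather sealing.\n"
        "Approach: Search reputable photography forums for 'weather sealing'.",
    )]
})
```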

FastAPI WebSocket: This handles communication between the front end and the backend, facilitating real-time human intervention. When the system needs user input, it can send a request through the websocket and receive the user's response in real time, allowing human intelligence to be woven seamlessly into the process.
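
A minimal sketch of that channel is shown below. The endpoint path and message shape are illustrative; in the real app the question comes from the paused LangGraph run rather than being hard-coded.

```python
from fastapi import FastAPI, WebSocket

app = FastAPI()


@app.websocket("/ws")
async def agent_session(websocket: WebSocket):
    await websocket.accept()

    # Example: relay a question raised mid-execution and wait for the answer.
    await websocket.send_json({
        "type": "question",
        "text": "Do you want a third-party tripod collar?",
    })
    answer = await websocket.receive_text()

    # The answer would be written back into the graph state before resuming.
    await websocket.send_json({"type": "ack", "received": answer})
    await websocket.close()
```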

React Front End: Our user interface is built with React, providing a responsive and interactive experience for users to input requirements, review plans, and provide necessary interventions throughout the process.

Wrapping Up

Based on my experiments with different Plan and Execute methods, I've gotten very good results with this Chain of Reasoning approach combined with human intervention. I hope this motivates you to try it out yourself!

Remember, the key takeaways are:

  1. A structured planning process that forces the LLM to consider multiple factors
  2. A powerful LLM for planning, paired with a less powerful (and cheaper) LLM for execution
  3. An execution approach generated for each step to assist the less powerful LLM
  4. Multiple points for human intervention and feedback

By combining these elements, we can create agentic systems that are more robust, more personalized, and ultimately more useful. Happy experimenting!



