Boosting AI Performance: How a Two-Agent System Outshines OpenAI's Latest Model
The latest OpenAI model, o1-preview-2024-09-12, has just become available. According to OpenAI, it is designed to spend more time thinking before responding, which should improve its ability to solve complex science, coding, and math problems.
Since I have been working with multi-agent frameworks that similarly improve LLM response quality, I wanted to test this new o1 Preview model against my own multi-agent framework, which uses GPT-4o.
Both the o1 Preview and my two-agent framework received the same prompt: "What is Claro Analytics?". This article shows how adding reflection to a multi-agent workflow yields a noticeably better-quality result than using the o1 Preview alone.
The Two-Agent Setup: Breaking Down the Task
I created two specialized agents, each with a unique job, to help solve complex queries more thoughtfully. What I like about customized AI frameworks is that you can easily tailor agents for various roles, divide tasks, and use reflection and collaboration to get superior results.
Agent 1: Decomposer/Planner
The first agent's job is to take a big, complicated question and break it down into smaller, manageable parts. The user can monitor exactly what the agent is doing as the process runs, which is a huge benefit because it lets us audit the agent's thought process.
Agent 2: Solver/Refiner
Once Agent 1 has finished breaking down the question, Agent 2 steps in to solve each part. After that, it reviews the entire solution to make sure it is cohesive and clear. As a user, you can adjust the agent's goal and backstory to suit your needs, and you can equip the agents with whatever tools a given task requires.
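As a rough sketch of what these two roles look like in plain Python: everything below is illustrative, not my actual framework code. The prompts and function names are assumptions, and the `llm` parameter stands in for whatever GPT-4o client call you use.

```python
from typing import Callable, List

# Any function that maps a prompt string to a completion string,
# e.g. a thin wrapper around a GPT-4o chat completion call.
Llm = Callable[[str], str]

def decompose(llm: Llm, question: str) -> List[str]:
    """Agent 1: break a big question into smaller, manageable sub-questions."""
    prompt = (
        "Break the following question into 3-5 smaller sub-questions, "
        f"one per line:\n{question}"
    )
    # One sub-question per non-empty line of the model's reply.
    return [line.strip() for line in llm(prompt).splitlines() if line.strip()]

def solve_and_refine(llm: Llm, question: str, sub_questions: List[str]) -> str:
    """Agent 2: answer each sub-question, then merge into one cohesive answer."""
    partials = [llm(f"Answer concisely: {sq}") for sq in sub_questions]
    merge_prompt = (
        f"Combine these partial answers into one cohesive answer to "
        f"'{question}':\n" + "\n".join(partials)
    )
    return llm(merge_prompt)
```

Because `llm` is just a callable, you can swap in any model (or a stub for testing) without touching the agent logic, and you can log every intermediate prompt and reply to audit the agents' thought process.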
Reflection Loop: Getting It Just Right
This framework uses a reflection loop: once Agent 2 generates a response, it reflects on it, checking that everything is accurate and well organized. This reflective process makes the final answer higher quality and more reliable.
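A minimal sketch of that loop, under stated assumptions: the prompts and the "OK" acceptance signal are my own illustrative choices, and `llm` is any prompt-to-text callable (such as a GPT-4o wrapper).

```python
from typing import Callable

def reflection_loop(llm: Callable[[str], str], draft: str, max_rounds: int = 2) -> str:
    """Have the model critique its own draft and revise until the critique passes."""
    answer = draft
    for _ in range(max_rounds):
        critique = llm(
            "Critique this answer for accuracy, structure, and clarity. "
            f"Reply 'OK' if no changes are needed:\n{answer}"
        )
        if critique.strip().startswith("OK"):  # illustrative acceptance signal
            break
        answer = llm(
            f"Revise the answer to address this critique:\n{critique}\n\nAnswer:\n{answer}"
        )
    return answer
```

Capping the loop with `max_rounds` keeps the extra inference time bounded; each round trades latency for a chance to catch exactly the kind of factual error discussed below.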
How It Outperformed o1
To test this setup, I asked both the two-agent system and the o1 Preview the same question: "What is Claro Analytics?". For reference, I also put the question to the GPT-4o model; its response was rather short and basic, with a response time of 2.85 seconds.
The o1 Preview's response was much more structured and fairly good compared to GPT-4o's. However, it wasn't flawless: there was a clear factual error about the company that had acquired Claro Analytics. Getting erroneous information from an LLM is a major red flag. The o1 Preview's response time was 18.9 seconds, noticeably longer than GPT-4o's, yet the quality just wasn't there, unfortunately.
My two-agent system gave the best and most impressive answer. By breaking the task down with Agent 1 and reflecting on the response with Agent 2, the system produced a more coherent and accurate answer, with no errors. It even included when the company was founded and mentioned the founder, Michael Beygelman. The response was extensive and better categorized, covering how the platform can be used in different scenarios as well as the competitive landscape. My setup took the longest to respond: 43.5 seconds. However, since the point here is reflection and better answers, speed isn't the primary concern; I will always choose a better answer over a faster inference time.
Conclusion
Using a multi-agent system improves the quality of an AI model's responses by adding a layer of reflection and refinement that can be customized to your needs. It also lets you use a plethora of other LLMs that might be more cost-effective yet produce equal or better results. In this case, my multi-agent framework produced a better answer to this specific question than the o1 Preview did.