Mixture of Agents (II)

Two weeks ago, we started discussing the Mixture of Agents. We tackled the systems of thinking and how we'd like to translate them into LLM operations.

We've also highlighted the Mixture of Experts approach as a prelude to how we want to achieve long-inference models.


If you're not familiar with the topic, here's part one.


Today's chapter will be focused on the actual Mixture of Agents approach.


Without further ado.

What is Mixture of Agents


A good way to use an LLM as a productivity booster is to treat it like a rookie assistant.

A rookie with broad knowledge but unable to channel it unless asked specifically. Eager to help, but without supervision you have no idea what quality the outcome will be. The LLM, on the other hand, will be very confident that it did the best possible job.


That's why prompt engineering is "the thing," as it's our main source of control and guidance over what the model produces.

If you want the output to be good, you must iterate with the LLM.


However, we want the model to perform the task autonomously: completing iterations according to a plan while keeping track of the goal and of what has already been done.

The approach that chains models, making them collaborate as a team and use the collective expertise of different LLMs, is called the Mixture of Agents.


How Mixture of Agents works



As you see in the illustration, we have a framework that goes as follows:

  1. An input prompt is fed into the first layer - the proposers
  2. The first layer consists of three models (A1,1; A1,2; A1,3)
  3. Each of the models produces an output and feeds it forward to the next layer along with the initial prompt
  4. The process repeats on layers two and three - also proposers
  5. In the final layer - the aggregator - we've got one LLM that aggregates what has been done and produces a final output


Each agent in each layer receives the same input prompt to answer, plus additional information (let's call it extra context) produced by the previous layer. A minimal sketch of this flow follows.
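
To make that flow concrete, here is a minimal Python sketch of the loop, assuming a generic chat-completion client. The `call_llm` helper, the model names, and the prompt wording are placeholders of mine, not the paper's implementation:

```python
# Minimal sketch of the layered MoA flow: proposer layers feed an aggregator.

def call_llm(model: str, prompt: str) -> str:
    """Placeholder for a real chat-completion API call."""
    raise NotImplementedError

def mixture_of_agents(user_prompt: str,
                      proposer_layers: list[list[str]],
                      aggregator: str) -> str:
    previous_outputs: list[str] = []
    for layer in proposer_layers:
        # Every proposer in a layer sees the original prompt plus the
        # outputs of the previous layer (the "extra context").
        context = "\n\n".join(previous_outputs) or "(none yet)"
        layer_prompt = (f"Task:\n{user_prompt}\n\n"
                        f"Responses from the previous layer:\n{context}")
        previous_outputs = [call_llm(model, layer_prompt) for model in layer]

    # The final layer is a single aggregator that synthesises the proposals.
    final_prompt = (f"Task:\n{user_prompt}\n\n"
                    "Candidate responses:\n" + "\n\n".join(previous_outputs) +
                    "\n\nCombine these into a single, high-quality answer.")
    return call_llm(aggregator, final_prompt)

# Example wiring: three proposer layers of three models each, one aggregator.
# answer = mixture_of_agents(prompt,
#                            [["model-a", "model-b", "model-c"]] * 3,
#                            aggregator="model-d")
```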


The actual inspiration for this is the Mixture of Experts method highlighted in the previous chapter, but here the experts within a single model are replaced by full-fledged LLMs.


The paper that introduced the MoA approach presents how it enhances the output quality in the following illustration:



Additionally, MoA scored 65.1% on AlpacaEval 2.0 using only open-source LLMs, compared to GPT-4 Omni’s 57.5%.


Now, the first thought that pops up is, "alright, the performance is slightly better, but we have to run multiple LLMs to get the output." That's true, but benchmarks show it's actually more cost-efficient than frontier models, and the quality-to-cost ratio is much better.


Orchestrating agentic systems

Okay, so that was the theoretical part. Let's review a quick example of how such a system can be built.


First, what do we want it to do?

Complete a goal.


What does it need to complete a goal?

Agents that have the following qualities:

- a profile or a purpose that is well-defined so the other agents can call it whenever necessary

- a memory of the messages passed to it and a collective memory of the goal and the progress made toward it

- an ability to plan both the steps and the approach to achieve them

- a toolkit and a set of actions that can be performed with it (a minimal sketch of such an agent follows this list)
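
Here is a minimal sketch of what such an agent could look like as a data structure. All names are hypothetical and the planning step is stubbed out; real frameworks such as LangGraph or AutoGen offer richer versions of the same ideas:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    # Profile / purpose: lets other agents decide when this one should be called.
    name: str
    purpose: str
    # Memory: messages passed to this agent plus shared notes on goal and progress.
    messages: list[str] = field(default_factory=list)
    shared_memory: dict[str, str] = field(default_factory=dict)
    # Toolkit: named actions the agent is allowed to take.
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def plan(self, goal: str) -> list[str]:
        """Ask the underlying LLM to break the goal into steps (stubbed here)."""
        raise NotImplementedError

    def act(self, tool_name: str, argument: str) -> str:
        """Execute one of the agent's tools and remember the result."""
        result = self.tools[tool_name](argument)
        self.messages.append(f"{tool_name}({argument}) -> {result}")
        return result
```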


How can we execute it?

We can use one of the two most common patterns:

- centralized, where a "supervisor LLM" decides on steps and delegates tasks and tools based on the analysis of progress

- decentralized, where each agent can decide what to do, when to call other agents and when to end its part of the execution


For each pattern, three core elements most affect system performance: how the agents are trained, how decisions are made (the controller), and how performance is evaluated (the critic).


Centralized pattern approach


Centralized Training involves training all agents using a shared set of data and a central controller that has access to all agents' global state and actions.

Centralized Controller is responsible for making decisions and distributing tasks to the agents. It collects information from all agents, processes it, and then issues commands.

Centralized Critic is a global value function that evaluates the overall system performance. The centralized critic updates the agents’ policies based on the joint actions and observations, leading to a coherent strategy across agents.
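
Training and the critic concern how the agents learn, but the controller part can be sketched in a few lines: a supervisor LLM looks at global progress, picks the next agent, and delegates. The model name, prompt wording, and `call_llm` helper below are assumptions, not a reference implementation:

```python
from typing import Callable

AgentFn = Callable[[str, list[str]], str]  # (goal, progress so far) -> output

def call_llm(model: str, prompt: str) -> str:
    """Placeholder for a real chat-completion API call."""
    raise NotImplementedError

def supervisor_step(goal: str, progress: list[str], agents: dict[str, AgentFn]) -> str:
    """Ask the supervisor which agent should act next, or DONE to stop."""
    prompt = "\n".join([
        f"Goal: {goal}",
        "Progress so far:",
        *(progress or ["(nothing yet)"]),
        f"Available agents: {', '.join(agents)}",
        "Reply with only the name of the agent that should act next, or DONE.",
    ])
    return call_llm("supervisor-model", prompt).strip()

def run_centralized(goal: str, agents: dict[str, AgentFn], max_steps: int = 10) -> list[str]:
    progress: list[str] = []
    for _ in range(max_steps):
        choice = supervisor_step(goal, progress, agents)
        if choice == "DONE" or choice not in agents:
            break
        # The controller delegates the step and keeps the global view of all outputs.
        progress.append(agents[choice](goal, progress))
    return progress
```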


Decentralized pattern approach


Decentralized Training occurs when each agent is trained independently with its local data and observations. Agents do not have access to the global state but rely on their perceptions and interactions with the environment. This setup is more scalable and can effectively handle dynamic and large-scale environments.

Decentralized Decision-Making is when each agent makes decisions autonomously based on its local information and objectives. This method reduces the communication overhead and increases the system's robustness since there is no single point of failure.

Decentralized Critic means that each agent has its own critic that evaluates its performance. This allows for a more flexible and adaptable system where agents can learn and adapt individually to their local conditions.
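
A matching sketch of the decentralized pattern: there is no supervisor, each agent works from its own local log, produces a contribution, and either names a peer to hand off to or ends the run. The hand-off protocol, model names, and `call_llm` helper are again hypothetical:

```python
def call_llm(model: str, prompt: str) -> str:
    """Placeholder for a real chat-completion API call."""
    raise NotImplementedError

def agent_turn(name: str, goal: str, local_log: list[str],
               peers: list[str]) -> tuple[str, str]:
    """One autonomous turn: return (contribution, next agent or 'END')."""
    prompt = "\n".join([
        f"You are agent '{name}'. Goal: {goal}",
        "Your local notes:",
        *(local_log or ["(none)"]),
        f"Peers you may hand off to: {', '.join(peers)}",
        "Write your contribution, then end with HANDOFF:<peer> or HANDOFF:END.",
    ])
    reply = call_llm(f"{name}-model", prompt)
    if "HANDOFF:" in reply:
        body, _, target = reply.rpartition("HANDOFF:")
        return body.strip(), target.strip()
    return reply.strip(), "END"

def run_decentralized(goal: str, agent_names: list[str],
                      max_steps: int = 10) -> dict[str, list[str]]:
    # No global state: each agent keeps only its own log.
    logs: dict[str, list[str]] = {name: [] for name in agent_names}
    current = agent_names[0]
    for _ in range(max_steps):
        peers = [n for n in agent_names if n != current]
        output, next_agent = agent_turn(current, goal, logs[current], peers)
        logs[current].append(output)
        if next_agent == "END" or next_agent not in logs:
            break
        # A hand-off is a direct message to the chosen peer, not a global broadcast.
        logs[next_agent].append(f"Hand-off from {current}: {output}")
        current = next_agent
    return logs
```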


Final thoughts

The Mixture of Agents approach is very promising and is likely to be adopted by many organizations.

The way I like to think about agentic systems is that they will govern other AI systems, forming an ecosystem. Since AI is much more than GenAI, various agents sitting on top of a data governance platform could aid upper-level management in:

- aggregating and redistributing data coming from various sources,

- feeding other parts of the system (e.g., Computer Vision systems),

- confronting the data with predictions (e.g., predictive maintenance systems connected with AI Quality Control systems and IoT readings),

- or decomposing data silos.




For more ML and AI insights, subscribe or follow Sparkbit on LinkedIn.

If you're looking to start an AI project, you can book a free consultation with our CTO here: https://calendly.com/jedrek_sparkbit/ai-consultation

