How to Agentify Wisely

By: Babak Hodjat

I recently used a multi-agent system to help me build a tool that can be pointed at a GitHub repository to agentify it. I used AutoGen's AutoBuild, but there are any number of similar multi-agent frameworks, mostly tuned to software development projects, that I could have used. I gave the system the following prompt:

> I would like to create an OpenAI assistant that would be pointed at a public GitHub repository and create an OpenAI assistant to represent every python class in that repository. The code should have unit tests and we should make sure they all pass. Make sure the code is reviewed before execution. Check to make sure the full software package is completed and working before terminating.

It then generated a team of agents to achieve this. All the agents wrap the same LLM (GPT-4o), but each has a different LLM-generated system prompt defining its role:

> ==> Generating agents...
> ['repository_analyzer', 'python_programmer', 'unit_test_engineer', 'code_reviewer', 'quality_assurance_specialist'] are generated.
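
For readers who want to reproduce this step, the sketch below shows roughly how AutoBuild is driven. It assumes AutoGen 0.2's `AgentBuilder` API and an `OAI_CONFIG_LIST` credentials file; exact parameter names may differ between versions:

```python
from autogen.agentchat.contrib.agent_builder import AgentBuilder

# The build prompt quoted above, describing the team to assemble.
building_task = (
    "I would like to create an OpenAI assistant that would be pointed at a "
    "public GitHub repository and create an OpenAI assistant to represent "
    "every python class in that repository. ..."  # full prompt as quoted above
)

# AgentBuilder asks the builder LLM to propose agent roles and write each
# agent's system prompt, then returns the instantiated agents.
builder = AgentBuilder(
    config_file_or_env="OAI_CONFIG_LIST",  # assumed credentials file
    builder_model="gpt-4o",
    agent_model="gpt-4o",
)
agent_list, agent_configs = builder.build(
    building_task=building_task,
    default_llm_config={"temperature": 0},
)
```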

I then tasked this team of agents with the following:

> I would like to create an OpenAI assistant that would be pointed at a public GitHub repository, and that would in turn create an OpenAI assistant to represent every python class in that repository. Each such assistant would be able to call on other assistants representing classes being used by the class it represents. In other words, I am building an agentification assistant, which builds a multi-agent system on top of an existing code base. Write this for me using the latest version of OpenAI Assistants API. Make sure the code runs and completes all tasks successfully. No simulation. Add unit tests and make sure they all pass with reasonable code coverage. Make sure there is an assistant that reviews each class and generates the relevant system prompts for the assistant that will represent that class. Make sure the code does not have any syntax or runtime errors. Do not terminate until the full software package is completed.        
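
Handing the task to the generated team then amounts to putting the agents in a group chat, along the lines of the AutoBuild examples. Again a hedged sketch, not the exact code I ran; `max_round` and the choice of initiating agent are arbitrary here:

```python
import autogen

def start_task(execution_task: str, agent_list: list) -> None:
    # Round-robin group chat moderated by a GroupChatManager (another
    # GPT-4o wrapper) that routes messages between the generated agents.
    group_chat = autogen.GroupChat(agents=agent_list, messages=[], max_round=20)
    manager = autogen.GroupChatManager(
        groupchat=group_chat,
        llm_config={"config_list": autogen.config_list_from_json("OAI_CONFIG_LIST")},
    )
    agent_list[0].initiate_chat(manager, message=execution_task)

# task_prompt holds the task prompt quoted above.
start_task(execution_task=task_prompt, agent_list=agent_list)
```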

The resulting code, while impressive, required a lot of tweaking and refactoring on my part to make it work. And once I got it to work, I realized that my goal, like that of the multi-agent frameworks I used to build it, was overambitious and impractical.

Perhaps with better LLMs, much of the manual tweaking would become unnecessary. But that is not the main reason I think the approach I took was impractical. Unlike many of my peers in the AI world, I believe a fully autonomous agent-based goal, while noble, is in most cases impractical for both technological and social reasons.

To start with, there currently cannot be a one-to-one correspondence between software objects and AI agents. That would be overkill in processing capacity and cost, it would slow things down, and in languages like Python, object orientation is incomplete and leaves much to be desired in any case. The granularity of the software and app units to be agentified is therefore an important consideration, one I naively glossed over in my project.

We should therefore start from the top, considering the organization as a whole and working down from there to decide what can be agentified. Agentifying an organization means outlining the roles and communications of the various players in its workflows, and then exploring how those players could be augmented with LLM-based agents.

There's a subtle difference between this approach and completely rethinking an organization's structure and processes to make it entirely autonomous and as efficient as possible. While that is a noble goal, it is impractical, not least because it is risky and too big a jump from the status quo; such a jump is unlikely to be easily absorbed by enterprises.

We should be mindful that agentification happens to an organization that is already operational, so easing agents in incrementally is critical. The human organization has to endorse, trust, and adopt agents in its day-to-day work; if we have done our job well, the organization should gradually become more efficient, its employees happier, and the impact measurable in the same KPIs by which the organization is already measured.

At the most basic level, we can augment each human operator in the organization with a corresponding agent, giving the agent a role and persona analogous to the job description of the human operator currently playing that role.
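
As an illustration, here is a minimal sketch of that pairing using the OpenAI chat completions API. The `JOB_DESCRIPTION` text and the `shadow_agent` helper are hypothetical, not part of any framework:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical job description lifted from the org chart.
JOB_DESCRIPTION = """
Accounts-payable clerk: validates supplier invoices against purchase
orders, flags mismatches, and schedules approved invoices for payment.
"""

def shadow_agent(task: str) -> str:
    """Run one task through an agent whose persona mirrors the human role."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": f"You act in the following organizational role:\n{JOB_DESCRIPTION}"},
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content
```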

Furthermore, there must be an elaborate process for deciding when a task can be processed autonomously and when a human must be included in the loop.

Here is a crude outline of how such a process could work (a code sketch of the resulting decision gate follows the list):

1. Each agent's system prompt should clearly define the agent's responsibilities and non-responsibilities, delineating what it can do autonomously and what needs to be delegated to or approved by humans; that approval and delegation path should itself be defined and facilitated.

2. Each agent should be paired with a moderator agent that checks its input, thought process, and output for ethics and governance issues. If this safeguard agent detects anything that requires human intervention, it should be able to block autonomous processing and force human oversight.

3. The point uncertainty of an agent's operation should be assessed by an uncertainty model trained on the LLM the agent uses. Such a model can raise a flag when its corresponding agent is operating on input it has rarely seen before, or is producing unfamiliar behavior [see: https://arxiv.org/abs/2405.13845].

4. Finally, I am a proponent of incorporating a "Disengage Button" into all agent-based systems. In an emergency or malfunction, the human overseers of an agent-based organization should be able to disengage all agents and stop autonomous operations, having the system fall back to an entirely manual or fully predictable mode of operation (e.g., governed by reliable and consistent machinery and rules).

5. If none of the cases above apply, then the system should be allowed to operate autonomously.
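
Putting the five rules together, the routing logic might look like the sketch below. Everything here is hypothetical scaffolding: `in_autonomous_scope`, `moderator_flags`, `is_out_of_distribution`, and the `DISENGAGED` switch stand in for the agent charter, the moderator agent, the uncertainty model, and the Disengage Button described above:

```python
from enum import Enum

class Route(Enum):
    AUTONOMOUS = "autonomous"            # rule 5: no safeguard fired
    HUMAN_REVIEW = "human_review"        # rules 1-3: a human must approve or take over
    MANUAL_FALLBACK = "manual_fallback"  # rule 4: agents disengaged entirely

DISENGAGED = False  # the "Disengage Button"; flipped to True by human overseers

def route_task(task, agent) -> Route:
    """Decide whether a task may be processed autonomously."""
    if DISENGAGED:                           # rule 4: emergency stop overrides everything
        return Route.MANUAL_FALLBACK
    if not agent.in_autonomous_scope(task):  # rule 1: outside the agent's charter
        return Route.HUMAN_REVIEW
    if agent.moderator_flags(task):          # rule 2: moderator found an ethics/governance issue
        return Route.HUMAN_REVIEW
    if agent.is_out_of_distribution(task):   # rule 3: unfamiliar input or behavior
        return Route.HUMAN_REVIEW
    return Route.AUTONOMOUS                  # rule 5: proceed autonomously
```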


The future of agent-based enterprises seems inevitable, and the potential advantages are myriad and compelling. It pays, however, to apply a holistic approach and perspective to this migration rather than accepting it as it comes and suffering the pitfalls that otherwise seem just as inevitable.

