Next-level Gen AI: From ChatGPT to Multi-agent Workflows
One of the godfathers of AI, Andrew Ng, recently released an open-source implementation of a sample application demonstrating an agentic workflow. It's a "weekend project," not meant to be used as production code; nevertheless it effectively demonstrates a major shift that's coming for Generative AI applications.
The example here translates text from one language to another, optionally tailored for country-specific usage. The input text is broken up into multiple chunks if it is too long. The system makes it easy to modify the output style, use idiomatic terminology and apply regional dialects.
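The chunking step can be sketched roughly as follows. This is a minimal illustration, not the repository's actual implementation (which splits on token counts rather than characters); the `max_chars` threshold and the sentence-splitting heuristic are assumptions made here for clarity.

```python
import re

def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars,
    breaking on sentence boundaaries where possible.

    A simplified sketch: the real translation-agent code
    splits on token counts, not character counts.
    """
    if len(text) <= max_chars:
        return [text]

    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk is then translated independently, which keeps every individual prompt within the model's context window.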
Unlike typical ChatGPT-style usage, where you simply ask the LLM for a translation and receive the generated output, this approach uses a different architecture: three cooperating LLM agents with different roles (and correspondingly different prompts) work together in a multi-step fashion to produce a higher-quality translation, as follows:
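Per the translation-agent repository, the three steps are roughly: (1) produce an initial translation, (2) reflect on that translation and suggest concrete improvements, and (3) edit the translation by applying those suggestions. A minimal sketch of the orchestration follows; the prompt wording and the pluggable `llm` callable are illustrative assumptions, not the repository's exact code.

```python
from typing import Callable

def agentic_translate(
    llm: Callable[[str], str],
    text: str,
    source_lang: str,
    target_lang: str,
) -> str:
    """Three-agent translation: draft, reflect, improve.

    llm: any function taking a prompt string and returning the
    model's reply; in practice it would wrap a chat-completion API.
    """
    # Step 1: a "translator" agent produces a first draft.
    draft = llm(
        f"Translate this {source_lang} text to {target_lang}:\n\n{text}"
    )

    # Step 2: a "reviewer" agent reflects on the draft and suggests
    # specific improvements (accuracy, style, terminology).
    critique = llm(
        f"Review this {target_lang} translation of a {source_lang} text "
        f"and list specific improvements:\n\n"
        f"Source:\n{text}\n\nTranslation:\n{draft}"
    )

    # Step 3: an "editor" agent rewrites the draft using the critique.
    improved = llm(
        f"Rewrite this {target_lang} translation, applying the "
        f"suggestions below.\n\nTranslation:\n{draft}\n\n"
        f"Suggestions:\n{critique}"
    )
    return improved
```

Each role gets its own prompt; giving each role its own model or system prompt instead of sharing one `llm` callable is a natural extension.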
This is a simple example, but it represents a more sophisticated way to use LLMs than simply "calling ChatGPT". By assigning different roles to various agents and getting them to cooperate, you are basically mimicking a team of humans working together to achieve a common goal.
Why is this important? As Andrew Ng says, no one writes an essay or a story by simply putting down one word after another. The way humans work in real life looks more like this:
We work this way because it produces much better output: multiple passes, coupled with reflection and editing, let us separate the different tasks. First we pay attention(!) to the larger thread of the document and the cadence and flow of the words; only then do we focus on optimizing individual phrases, words and terminology.
Early applications of GPT and other LLMs have so far focused on straightforward chat completions: the user provides a question or statement, and the LLM responds. Those responses are already startlingly effective, especially when combined with buffer memory and RAG recall, and the technology is catching on like wildfire.
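The "buffer memory" mentioned above simply means replaying the conversation so far on every call, so the model sees its own earlier answers. A minimal sketch: the `llm` callable and the role/content message format mirror common chat APIs but are stand-ins here, not a specific SDK.

```python
def chat_with_memory(llm, user_messages):
    """Single-agent chat completion with buffer memory:
    the full conversation history is replayed on every turn.

    llm: function taking a list of {"role", "content"} dicts and
    returning the assistant's reply string (an illustrative stand-in
    for a real chat-completion API call).
    """
    history = []
    replies = []
    for msg in user_messages:
        history.append({"role": "user", "content": msg})
        reply = llm(history)  # model sees the whole buffer
        history.append({"role": "assistant", "content": reply})
        replies.append(reply)
    return replies
```

Multi-agent workflows build on exactly this primitive: each agent is one such loop, with its own role-specific prompt.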
But that is only scratching the surface. As we move into more sophisticated usage, building agents that can reason independently, communicate with one another and cooperate toward a common goal, these applications will increasingly act autonomously on behalf of the user. Rather than simply asking the LLM a question, users will be able to assign a complex task to an agent; that agent will in turn work with other agents acting in various roles to fulfill the task and return a completed solution to the user.
The example code base above is a concrete implementation of this more complex approach.
As always in technology, this is just the beginning ...
For the Gen AI software geeks, some observations about the code: