Build your first app using Autogen with VSCode and Docker
A step-by-step guide for beginners, no coding experience required!
Microsoft released Autogen in September 2023. Autogen is a multi-agent conversation framework for building autonomous AI agents.
So, why do we need multiple AI agents to collaborate with each other to solve a problem when they are all powered by the same Large Language Model (LLM)?
One might think that we could simply give the same task to ChatGPT, prompting it step by step with human input until we arrive at a solution. However, it's essential to recognize that while LLMs are incredibly capable, they require boundaries. Without proper constraints, LLMs can produce unpredictable and incorrect information. In essence, the current LLMs do not possess human-like reasoning abilities, at least not for the foreseeable future.
This is where multiple conversational autonomous agents come into play. Each AI bot is assigned a specific role and set of responsibilities, making them more specialized in a particular area. These grounding rules allow them to focus on their designated tasks without getting distracted by their "large" language model capabilities. Furthermore, these agents are autonomous, enabling them to work as a team, collaboratively pursuing the final goal.
Autogen has simplified the process of building AI applications and offers the flexibility to load different LLM models for each agent.
Valentina Alto has written a much better introduction to the technology than I could! Please follow her and read her blog post.
Now, I am going to show you step by step how we could set this up on Windows with VSCode and Docker to build our first application with Autogen.
You can watch my YouTube video for a quick follow along too.
Requirements:
To run Autogen locally, I am using VSCode on Windows. Install the following components. I have listed the version I am using.
After VSCode is installed, please go to Extensions and install the Python and Docker extensions for VSCode.
When you launch VSCode for the first time after installing Docker Desktop, VSCode will also prompt you to install the Windows Subsystem for Linux (WSL). If it doesn't, simply run "wsl --install" in PowerShell.
When installing Docker Desktop, make sure to select the WSL2.
Run Docker Desktop for the first time after installation.
I am using the default settings.
And you can also use it without creating an account.
The Docker engine needs to be running for our code. You can check that from the Docker icon in system tray.
You can also see the Windows Subsystem for Linux is installed automatically.
When installing Python for Windows, be sure to select "Add python.exe to PATH".
Virtual Environment
First, let's create a new folder on the hard drive, then open the folder inside VSCode. This folder will be our work directory that saves all the files.
Next, run "python -m venv name" to create a new virtual environment, replacing "name" with whatever you want to call it.
In a couple of seconds, you can see the new virtual environment is created with some subfolders.
Navigate to the "Scripts" folder to activate the virtual environment.
The green environment name in front of the command prompt indicates we are in the newly created environment. Next, pip install the Autogen library and Docker.
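Put together, the steps above look something like this when run from the work directory's terminal (assuming an environment named "env"; note that the Autogen package on PyPI is called "pyautogen"):

```shell
# Create a virtual environment named "env"
python -m venv env

# Activate it (Windows PowerShell)
.\env\Scripts\Activate.ps1

# Install the Autogen library and the Docker package
pip install pyautogen docker
```

On macOS or Linux the activation command would instead be "source env/bin/activate".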
The Code
I started by creating a blank file and saving it in the Scripts folder.
Make sure you also select a Python interpreter by hitting Ctrl+Shift+P and picking the version you just installed. It should be highlighted as the new virtual environment.
Config List
We start by loading the LLM configuration from an environment variable or file called "OAI_CONFIG_LIST" with the "autogen.config_list_from_json" function.
import autogen

# Load the model
config_list = autogen.config_list_from_json(
    env_or_file="OAI_CONFIG_LIST",
    filter_dict={
        "model": {
            "gpt-4",
            "gpt-3.5-turbo-16k",
        },
    },
)
OAI_CONFIG_LIST is a JSON file with as many models as you may need.
[
    {
        "model": "gpt-4",
        "api_key": "sk-APIKey"
    },
    {
        "model": "gpt-3.5-turbo-16k",
        "api_key": "sk-APIKey"
    },
    {
        "model": "gpt-4-32k-0314",
        "api_key": "<your Azure OpenAI API key here>",
        "api_base": "<your Azure OpenAI API base here>",
        "api_type": "azure",
        "api_version": "2023-06-01-preview"
    }
]
Then the filter_dict can further control the models available to use within the app.
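To see what the filter actually does, here is a minimal sketch (not Autogen's implementation, just an illustration with a hypothetical config list) of filtering config entries by model name:

```python
# A hypothetical config list in the same shape as OAI_CONFIG_LIST
configs = [
    {"model": "gpt-4", "api_key": "sk-APIKey"},
    {"model": "gpt-3.5-turbo-16k", "api_key": "sk-APIKey"},
    {"model": "gpt-4-32k-0314", "api_key": "sk-AzureKey"},
]

# filter_dict keeps only the entries whose field value is in the allowed set
allowed = {"gpt-4", "gpt-3.5-turbo-16k"}
filtered = [c for c in configs if c["model"] in allowed]

print([c["model"] for c in filtered])  # ['gpt-4', 'gpt-3.5-turbo-16k']
```

So even if the JSON file lists every endpoint you own, the app only ever sees the models you allow through the filter.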
I have uploaded the OAI_CONFIG_LIST under the root folder so it could be shared by multiple virtual environments.
Config LLM
Next we can configure the LLM.
# Configure LLM
llm_config = {
    "request_timeout": 600,
    "seed": 42,
    "config_list": config_list,
    "temperature": 0,
}
Since the agents will work with each other, passing information back and forth, you will need to adjust the timeout depending on how complex the task is. You also need to consider how fast the endpoints will respond to the API calls.
The seed controls Autogen's cache. The numeric value can be any number and defaults to 42. Autogen always checks the cache for a stored answer under that seed before making external API calls. The next time you run Autogen, simply change the seed to a different value and the cached answers will be ignored.
You can also see below that cache is just a folder created by Autogen. You could delete it if required.
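Conceptually, the cache behaves like a dictionary keyed by seed and request. Here is a minimal sketch (not Autogen's actual cache, which is stored on disk) of that behavior:

```python
cache = {}

def cached_call(seed, prompt, make_api_call):
    """Return a cached answer when one exists for this seed and prompt,
    otherwise call the API and store the result."""
    key = (seed, prompt)
    if key not in cache:
        cache[key] = make_api_call(prompt)
    return cache[key]

calls = []
def fake_api(prompt):
    # Stand-in for a real LLM endpoint; records each call it receives
    calls.append(prompt)
    return f"answer to {prompt}"

cached_call(42, "hi", fake_api)  # cache miss: hits the API
cached_call(42, "hi", fake_api)  # cache hit: served from cache
cached_call(7, "hi", fake_api)   # new seed: hits the API again
print(len(calls))  # 2
```

This is why changing the seed forces fresh responses: a new seed means new cache keys, so nothing matches.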
The "config_list" key receives the config_list variable we loaded earlier.
And lastly, a temperature of 0 is a grounding method: it minimizes randomness in the output, helping the LLM stick to the facts.
Construct Agents
With Autogen, we always start by constructing the agents. Imagine you are Nick Fury assembling a team of Avengers!
Autogen needs a minimum of two agents.
You, the real human, assign a task to the User Proxy Agent, which then passes it on to the Assistant Agents. The User Proxy can be configured to consult the human at each step of the way or to work autonomously with the other agents.
# create an AssistantAgent instance named "assistant"
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
)

# create a UserProxyAgent instance named "user_proxy"
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="TERMINATE",
    max_consecutive_auto_reply=10,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
    code_execution_config={"work_dir": "web"},
    llm_config=llm_config,
    system_message="""Reply TERMINATE if the task has been solved at full satisfaction.
Otherwise, reply CONTINUE, or the reason why the task is not solved yet.""",
)
In the above example, we have defined an assistant agent with no system message (so it uses the default) and a user proxy agent with human_input_mode set to "TERMINATE". In this mode, the user proxy agent won't ask the human for feedback until it thinks the job is done, and only then asks for final feedback.
max_consecutive_auto_reply=10 defines how many rounds of automatic replies the agents can exchange before control returns to the human.
All generated code is saved to a new folder called "web", as set by code_execution_config.
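The is_termination_msg lambda above simply checks whether a message's content ends with the word TERMINATE. A few examples of how it evaluates:

```python
# Same lambda as in the UserProxyAgent definition above
is_termination_msg = lambda x: x.get("content", "").rstrip().endswith("TERMINATE")

print(is_termination_msg({"content": "All done. TERMINATE"}))  # True
print(is_termination_msg({"content": "TERMINATE  \n"}))        # True (trailing whitespace is stripped)
print(is_termination_msg({"content": "CONTINUE"}))             # False
print(is_termination_msg({}))                                  # False (missing content defaults to "")
```

This is how the system message and the termination check work together: the assistant is instructed to say TERMINATE when it's satisfied, and the user proxy watches for that word to end the conversation.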
Assign Task
Now we are ready to use Autogen to do something. In this example, we will ask the Autogen agents to create a simple game. We will need to invoke the "user_proxy.initiate_chat" with the actual task embedded under "message".
You can replace this task with anything in the message.
# Assign task
user_proxy.initiate_chat(
    assistant,
    message="""
Build a classic space invader game.
""",
)
Once you run the app, you can see the user proxy agent passing human's requirement and the AI assistant working away, breaking down the requirement.
I have also changed the human input mode from "TERMINATE" to "ALWAYS" so I could prompt the bots along the way to fine tune the result.
The first version of the game looked like this: the enemies were too large and didn't increase in number over time.
The final version looks like this, with new enemies added every 5 seconds.
Outro
This is just a beginner's guide on getting started with Autogen. Obviously, Autogen is a very powerful tool, and we didn't even explore the multi-agent (3+) scenario, which is where Autogen truly shines.
In the upcoming blogs, we will set up a group chat with Autogen for more complex tasks.