How to Build an AI Agent Without Using Any Libraries: A Step-by-Step Guide


Video Tutorial

If you prefer video tutorials, a video version of this guide is available: The Adaptive Engineer - How to Build an AI Agent.

The Context

Ask ChatGPT or Claude a question like "What is the current time in New York?" or "What is the weather in Bengaluru right now?" and you will get a response along the lines of "I don't have access to real-time information."

We know LLMs cannot access real-time information - so let's use this as a problem statement for our basic AI Agent.

Before we begin - let's understand the fundamental building blocks of an agent.

The Adaptive Engineer - Primary components of an Agent

An agent essentially comprises three primary components - the Model, the Tools, and the Reasoning Loop.

The model refers to the Large Language Model (LLM) at the core of the AI agent. This is the foundation of the agent's intelligence and capabilities. The LLM is trained on vast amounts of text data, allowing it to understand and generate human-like text. It provides the agent with knowledge, language understanding, and the ability to process complex instructions.

Tools are specific functions or capabilities that the AI agent can use to interact with its environment or perform tasks.

The reasoning loop is the iterative process that allows the agent to make decisions, solve problems, and accomplish tasks.
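Conceptually, the reasoning loop looks something like this. This is a minimal sketch with illustrative names (reasoning_loop, decide and tools are placeholders introduced here for explanation); the actual agent built below implements the same pattern with its own methods:

# Minimal sketch of a reasoning loop (illustrative names, not the final agent code).
def reasoning_loop(decide, tools, user_input):
    observation = user_input
    while True:
        decision = decide(observation)               # ask the LLM what to do next
        if decision["action"] == "respond_to_user":  # the LLM has a final answer
            return decision["args"]
        tool = tools[decision["action"]]             # otherwise look up the named tool...
        observation = tool(decision["args"])         # ...run it and feed its result back in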

In the context of our use case, the three components work together as follows: the LLM (model) reads the user's question and decides which tool to call, the tool (a time or weather lookup) fetches the real-time data, and the reasoning loop feeds the tool's result back to the LLM until it can answer the user.

In this tutorial, we'll create a more advanced LLM agent that uses the concept of "tools" to perform various tasks. This approach allows for greater flexibility and easier expansion of the agent's capabilities.

Prerequisites

  • Basic Python programming knowledge
  • Understanding of LLMs and their capabilities
  • Python 3.9 or higher (the zoneinfo module used below is in the standard library from 3.9 onwards)

Implementing the Tool-based Agent

Let's start by creating the tool interface. This abstract base class defines the methods that every tool must implement.

import datetime
import requests
from zoneinfo import ZoneInfo
from abc import ABC, abstractmethod

class Tool(ABC):
    @abstractmethod
    def name(self) -> str:
        pass

    @abstractmethod
    def description(self) -> str:
        pass

    @abstractmethod
    def use(self, *args, **kwargs):
        pass
        

Now, let's implement some tools:

class TimeTool(Tool):
    def name(self):
        return "Time Tool"

    def description(self):
        return "Provides the current time for a given city's timezone like Asia/Kolkata, America/New_York etc. If no timezone is provided, it returns the local time."

    def use(self, *args, **kwargs):
        format = "%Y-%m-%d %H:%M:%S %Z%z"
        current_time = datetime.datetime.now()
        # The agent passes the tool argument positionally; guard against an empty call.
        input_timezone = args[0] if args else None
        if input_timezone:
            print("TimeZone", input_timezone)
            current_time = current_time.astimezone(ZoneInfo(input_timezone))
        return f"The current time is {current_time.strftime(format)}."

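As a quick check, the tool can be called directly once the class above is defined:

# Quick manual check of the time tool in isolation.
clock = TimeTool()
print(clock.use("Asia/Kolkata"))  # e.g. "The current time is 2025-... IST+0530."
print(clock.use())                # falls back to the local time (no timezone suffix)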

class WeatherTool(Tool):
    def name(self):
        return "Weather Tool"

    def description(self):
        return "Provides weather information for a given location"

    def use(self, *args, **kwargs):
        # The agent may pass the raw user text, so strip a leading "weather in " if present.
        location = args[0].split("weather in ")[-1]
        # userdata is Google Colab's secrets store; it is imported in the main script below.
        api_key = userdata.get("OPENWEATHERMAP_API_KEY")
        url = f"https://api.openweathermap.org/data/2.5/weather?q={location}&appid={api_key}&units=metric"
        response = requests.get(url)
        data = response.json()
        if data["cod"] == 200:
            temp = data["main"]["temp"]
            description = data["weather"][0]["description"]
            return f"The weather in {location} is currently {description} with a temperature of {temp}°C."
        else:
            return f"Sorry, I couldn't find weather information for {location}."
        
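You can sanity-check this tool on its own as well. This assumes the OPENWEATHERMAP_API_KEY secret is configured and that from google.colab import userdata has already been run (it is part of the main script further below):

# Quick manual check of the weather tool in isolation.
weather = WeatherTool()
print(weather.name())                    # Weather Tool
print(weather.use("weather in London"))  # e.g. "The weather in London is currently ... °C."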

Now let's implement the all-important agent.

import requests
import json
import ast

class Agent:
    def __init__(self):        
        self.tools = []
        self.memory = []
        self.max_memory = 10

    def add_tool(self, tool: Tool):
        self.tools.append(tool)

    def json_parser(self, input_string):
        # The LLM often returns a Python-style dict (single quotes), so parse it with
        # ast.literal_eval and round-trip it through json to normalise it.
        python_dict = ast.literal_eval(input_string)
        json_string = json.dumps(python_dict)
        json_dict = json.loads(json_string)

        if isinstance(json_dict, dict):
            return json_dict

        raise ValueError("Invalid JSON response")


    def process_input(self, user_input):
        self.memory.append(f"User: {user_input}")
        self.memory = self.memory[-self.max_memory:]

        context = "\n".join(self.memory)
        tool_descriptions = "\n".join([f"- {tool.name()}: {tool.description()}" for tool in self.tools])
        response_format = {"action":"", "args":""}

        prompt = f"""Context:
        {context}

        Available tools:
        {tool_descriptions}

        Based on the user's input and context, decide if you should use a tool or respond directly.
        Sometimes you might have to use multiple tools to solve the user's request. You have to do that in a loop.
        If you identify an action, respond with the tool name and the arguments for the tool.
        If you decide to respond directly to the user, make the action "respond_to_user" with args as your response, in the following format.

        Response Format: 
        {response_format}

        """

        response = self.query_llm(prompt)
        self.memory.append(f"Agent: {response}")

        response_dict = self.json_parser(response)

        # Check if any tool can handle the input
        for tool in self.tools:
            if tool.name().lower() == response_dict["action"].lower():
                return tool.use(response_dict["args"])

        return response_dict

    def query_llm(self, prompt):        
        api_key = userdata.get("OPENAI_API_KEY")
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}"
        }
        data = {
            "model": "gpt-4o-mini-2024-07-18",
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 150
        }
        response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, data=json.dumps(data))
        final_response = response.json()['choices'][0]['message']['content'].strip()
        print("LLM Response ", final_response)
        return final_response

    def run(self):
        print("LLM Agent: Hello! How can I assist you today?")
        user_input = input("You: ")

        while True:
            if user_input.lower() in ["exit", "bye", "close"]:
                print("See you later!")
                break

            response = self.process_input(user_input)
            if isinstance(response, dict) and response["action"] == "respond_to_user":
                print("Response from Agent: ", response["args"])
                user_input = input("You: ")
            else:
                # A tool was used; feed its result back into the agent as the next input
                # so the LLM can turn it into a reply for the user.
                user_input = response
        

Finally, let's create the main script to run our tool-based agent:

from google.colab import userdata

def main():    
    agent = Agent()

    # Add tools to the agent
    agent.add_tool(TimeTool())    
    agent.add_tool(WeatherTool())
    agent.run()   

if __name__ == "__main__":
    main()
        
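Note that userdata is Google Colab's secrets store. If you want to run the same script outside Colab, one option (my assumption, not part of the original code) is to replace that import with a tiny shim backed by environment variables:

# Drop-in stand-in for google.colab.userdata when running outside Colab.
import os

class userdata:
    @staticmethod
    def get(key):
        # Expects OPENAI_API_KEY and OPENWEATHERMAP_API_KEY to be exported in the shell.
        return os.environ[key]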

How It Works

  1. The Agent class manages the overall operation: it keeps a short conversation memory, builds the prompt, and routes actions to tools.
  2. Each tool is implemented as a separate class inheriting from the Tool abstract base class.
  3. On every turn, the agent sends the LLM the conversation context, the descriptions of the available tools, and the required response format.
  4. The LLM replies with an action dictionary: either the name of a tool to use (with its arguments) or "respond_to_user" with a direct answer.
  5. If the action names a tool, the agent runs it and feeds the result back into the loop; otherwise it prints the answer to the user. An example of such an action dictionary is shown below.
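For instance, a time question might produce an LLM reply like the string below. This is a made-up, illustrative reply in the requested format, not captured output; json_parser turns it into the dict the agent routes on:

# A made-up LLM reply in the format the prompt asks for.
raw_reply = "{'action': 'Time Tool', 'args': 'Asia/Kolkata'}"

agent = Agent()
parsed = agent.json_parser(raw_reply)   # -> {'action': 'Time Tool', 'args': 'Asia/Kolkata'}
print(parsed["action"], parsed["args"])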

Adding New Tools

To add a new tool to the agent, simply create a new class that inherits from Tool and implement the required methods. Then, add an instance of your new tool to the agent using the add_tool method.
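For example, a hypothetical calculator tool (not part of the original repository) could look like this:

class CalculatorTool(Tool):
    def name(self):
        return "Calculator Tool"

    def description(self):
        return "Evaluates a basic arithmetic expression like '2 + 3 * 4' and returns the result."

    def use(self, *args, **kwargs):
        expression = args[0] if args else ""
        try:
            # Simplified sketch: eval with empty builtins. Do not evaluate untrusted input like this in production.
            result = eval(expression, {"__builtins__": {}}, {})
            return f"The result of {expression} is {result}."
        except Exception:
            return f"Sorry, I couldn't evaluate '{expression}'."

# Then register it inside main():
# agent.add_tool(CalculatorTool())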

Conclusion

You can further enhance this agent by:

  • Implementing more sophisticated tools
  • Adding error handling and input validation
  • Improving the LLM prompts for better tool selection
  • Implementing a more advanced memory system (a minimal sketch of this idea follows)
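As one illustration of the last point, the list-plus-slicing memory inside Agent could be replaced with a small dedicated class. The sketch below is my own illustration of that idea, not code from the repository:

# Minimal sketch of a bounded conversation memory.
from collections import deque

class Memory:
    def __init__(self, max_items=10):
        self.items = deque(maxlen=max_items)  # old entries fall off automatically

    def add(self, entry):
        self.items.append(entry)

    def context(self):
        return "\n".join(self.items)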

Code Repository

GitHub - zahere-dev/basic-ai-agent: Experimental basic AI Agent without using any libraries.



My previous articles

https://www.dhirubhai.net/newsletters/7162380998746234880/

