Building a Technical Content Writing Agent Using Swarm: A Step-by-Step Guide

Zahiruddin Tavargere

Senior Principal Software Engineer@Dell | Opinions are my own

Published: October 21, 2024

In this blog post, we’ll delve into a new multi-agent framework called Swarm, developed by OpenAI, which has recently garnered significant attention within the AI community.

What we are building

The goal is to explore how to use Swarm by building a simple agent that takes a user query, performs detailed research, and generates a structured blog post based on the findings.

This will help you understand the capabilities of Swarm, especially in orchestrating tasks among multiple agents.

What is Swarm?

Swarm is a multi-agent orchestration framework introduced by OpenAI. Although OpenAI labels it experimental and educational, it has quickly gained attention among developers as a lightweight way to build and coordinate multi-agent systems.

Key Concepts of the Swarm Framework

Swarm's design is centered around two fundamental components:

Agents: These are autonomous units that operate with a set of instructions and make decisions independently.

Each agent can specialize in a particular task, such as gathering data or writing content.

from swarm import Agent

agent = Agent(
   instructions="You are a helpful agent."
)

Handoffs: This mechanism allows one agent to transfer control of a task or conversation to another agent.

It enables seamless collaboration between agents, ensuring smooth execution of complex workflows.

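from swarm import Swarm, Agent

client = Swarm()  # uses the OPENAI_API_KEY environment variable

sales_agent = Agent(name="Sales Agent")

def transfer_to_sales():
   # Returning an Agent from a function hands the conversation off to it
   return sales_agent

agent = Agent(functions=[transfer_to_sales])

response = client.run(agent, [{"role":"user", "content":"Transfer me to sales."}])
print(response.agent.name)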

By leveraging these components, Swarm helps developers create systems where tasks can be broken down into smaller, specialized sub-tasks, handled by different agents.
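To make the handoff idea concrete without calling any model, here is a minimal sketch of the mechanism. It is not Swarm's implementation: `MiniAgent` and `run_scripted` are hypothetical names, and the list of tool calls stands in for the choices a language model would make. The only point it shows is that a function returning an agent switches the active agent.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MiniAgent:
    """Hypothetical stand-in for an agent: a name plus callable tools."""
    name: str
    functions: List[Callable] = field(default_factory=list)


def run_scripted(agent: MiniAgent, calls: List[str]) -> MiniAgent:
    """Execute the named tool calls in order and follow any handoffs.

    `calls` plays the role of the model's tool choices. If a tool
    returns a MiniAgent, that agent becomes the active one.
    """
    active = agent
    for call in calls:
        tools = {fn.__name__: fn for fn in active.functions}
        result = tools[call]()
        if isinstance(result, MiniAgent):
            active = result  # handoff: control moves to the returned agent
    return active


sales_agent = MiniAgent(name="Sales Agent")


def transfer_to_sales() -> MiniAgent:
    """Hand the conversation to the sales agent."""
    return sales_agent


triage_agent = MiniAgent(name="Triage Agent", functions=[transfer_to_sales])

final_agent = run_scripted(triage_agent, ["transfer_to_sales"])
print(final_agent.name)
```

With no tool calls, the starting agent simply stays active, which is why a handoff has to be an explicit function the agent can choose to call.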

Use Case: Creating a Technical Content Writing Agent

Code: https://colab.research.google.com/drive/1phDFUasrZxjChabWo_oWwWuNKwfjuJ_B?authuser=1#scrollTo=zyxmU_W9ZCxp

To illustrate the potential of Swarm, let’s build a multi-agent system that generates a blog post based on user input. The system comprises three agents:

Interface Agent: Interacts with the user, refines the query if needed, and passes the query to the Researcher Agent.
Researcher Agent: Conducts detailed research on the query and prepares a research report.
Blogger Agent: Uses the research report to create a structured blog post.

The workflow is simple: the user provides a theme or query, such as "Top 5 Technical Skills for 2025." The Interface Agent refines the query if necessary, then hands it over to the Researcher Agent, which gathers and analyzes relevant information. Finally, the Blogger Agent compiles the research into a well-written blog post.

How the Swarm Framework Works

Illustration Overview

Below is a high-level overview of the agent workflow:

User Input: The user provides a theme, e.g., "Why is R fast?".
Interface Agent: Takes the user’s input, asks clarifying questions if needed, and then passes the refined query to the Researcher Agent.
Researcher Agent: Gathers data from search engines, scrapes content from top results, and analyzes the information to produce a research report.
Blogger Agent: Converts the research report into a blog post and returns it to the Interface Agent.
Interface Agent: Delivers the completed blog post to the user.

This setup demonstrates how Swarm’s agent-based architecture simplifies the process of delegating tasks and orchestrating their execution.
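The data flow in the overview above can be sketched as a chain of plain functions. This is an illustration only: the three step functions are hypothetical placeholders that return canned strings, not the real agents, and the trace records the order in which control would pass.

```python
from typing import Callable, List, Tuple


def interface_refine(user_input: str) -> str:
    """Interface Agent step: tidy the raw user input into a query."""
    return " ".join(user_input.split())


def researcher_report(query: str) -> str:
    """Researcher Agent step: placeholder for search, scraping, and analysis."""
    return f"Research report on: {query}"


def blogger_post(report: str) -> str:
    """Blogger Agent step: placeholder for turning a report into a post."""
    return f"Blog post based on [{report}]"


PIPELINE: List[Tuple[str, Callable[[str], str]]] = [
    ("Interface Agent", interface_refine),
    ("Researcher Agent", researcher_report),
    ("Blogger Agent", blogger_post),
]


def run_pipeline(user_input: str) -> Tuple[str, List[str]]:
    """Pass the payload through each stage and record who handled it."""
    payload = user_input
    trace: List[str] = []
    for name, step in PIPELINE:
        payload = step(payload)
        trace.append(name)
    # The finished post returns to the Interface Agent for delivery.
    trace.append("Interface Agent")
    return payload, trace


post, trace = run_pipeline("  Top 5   Technical Skills for 2025 ")
print(" -> ".join(trace))
print(post)
```

In the real system the order is not hard-coded like this; each agent decides when to hand off, guided by its instructions.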

Building the Swarm Agents

Let’s dive into the code, which consists of the three agents and their respective functions.

1. Agent Class Structure

Each agent is an instance of Swarm's Agent class, configured with parameters such as name, model, instructions, and functions. The functions are plain Python functions that the agent can call to interact with APIs, databases, and other agents. A key one is the handoff function, which returns another agent and thereby passes control to it.

2. Researcher Agent

The Researcher Agent uses a package called GPT Researcher to gather and analyze content from various search engines.

This agent performs a deep dive into the topic provided by the user.

import asyncio
import os

import nest_asyncio  # required for notebooks, where an event loop is already running
from gpt_researcher import GPTResearcher
from openai import OpenAI
from swarm import Agent

nest_asyncio.apply()

# OpenAI client, used later by the Blogger Agent's generate_completion()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

async def get_report(query: str) -> dict:
    """Run GPT Researcher on the query and return the written report."""
    report_type = "research_report"
    researcher = GPTResearcher(query, report_type)
    research_result = await researcher.conduct_research()
    report = await researcher.write_report()

    # Get additional information (not used below; kept for inspection)
    research_context = researcher.get_research_context()
    research_costs = researcher.get_costs()
    research_images = researcher.get_research_images()
    research_sources = researcher.get_research_sources()

    return {'report': report}

def research_topic(query: str) -> dict:
    """Generate research report"""
    return asyncio.run(get_report(query))

def handoff_to_researcher():
    """Hand off the user query to the researcher agent."""
    print("Handing off to Researcher Agent")
    return researcher_agent

# handoff_to_blogger is defined in the Blogger Agent section below;
# define it before creating this agent, or this raises a NameError.
researcher_agent = Agent(
    name="Researcher Agent",
    model="gpt-4o-mini",
    instructions="You are a researcher agent specialized in researching. If you are satisfied with the research, handoff the report to blogger",
    functions=[research_topic, handoff_to_blogger],
)

The handoff_to_blogger function, defined with the Blogger Agent below, transfers control to the Blogger Agent once the research is complete.
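The research function above is synchronous but wraps asynchronous calls. The following sketch isolates that pattern, assuming a plain Python script rather than a notebook. `fake_research` is a hypothetical stand-in for the awaitable GPT Researcher calls, so no network access or API key is needed.

```python
import asyncio


async def fake_research(query: str) -> dict:
    """Hypothetical stand-in for the awaitable research calls."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return {"report": f"Findings for: {query}"}


def research_topic_sync(query: str) -> dict:
    """Synchronous entry point that drives the coroutine to completion."""
    return asyncio.run(fake_research(query))


result = research_topic_sync("Why is R fast?")
print(result["report"])
```

In a notebook an event loop is already running, so `asyncio.run` would normally refuse to start another one; that is why the article's code applies `nest_asyncio` first.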

3. Blogger Agent

The Blogger Agent creates a blog post based on the research report. It uses the language model to generate a well-structured article:

def generate_completion(role, task, content):
    """Generate a completion using OpenAI."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"You are a {role}. {task}"},
            {"role": "user", "content": content}
        ]
    )
    return response.choices[0].message.content

def handoff_to_blogger():
    """Hand off the research report to the blogger agent."""
    print("Handing off to Blogger Agent")
    return blogger_agent

def generate_blog_content(research_data):
    """Generate technical blog content on research report using OpenAI."""
    content = generate_completion(
        "Technical Content Creator",
        "Create compelling technical content for a blog based on the following research report.",
        research_data
    )
    return {"content": content}


blogger_agent = Agent(
    name="Blogger Agent",
    model="gpt-4o-mini",
    instructions="You are a top technical blogger agent specialized in creating compelling technical content for blogs based on research report. Be concise.",
    functions=[generate_blog_content],
)
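To see what generate_completion sends to the model, the message payload can be built on its own. This is a small sketch, not part of the original notebook: `build_messages` is a hypothetical helper that mirrors the two-message layout used above, and the report text is a placeholder.

```python
from typing import Dict, List


def build_messages(role: str, task: str, content: str) -> List[Dict[str, str]]:
    """Return the chat payload: a system message with persona and task,
    followed by a user message carrying the content to work on."""
    return [
        {"role": "system", "content": f"You are a {role}. {task}"},
        {"role": "user", "content": content},
    ]


messages = build_messages(
    "Technical Content Creator",
    "Create compelling technical content for a blog based on the following research report.",
    "Research report text goes here.",
)
for message in messages:
    print(message["role"], "->", message["content"][:60])
```

Keeping the persona and task in the system message and the research report in the user message means the same helper can be reused for other roles by changing only the first two arguments.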

4. Interface Agent

The Interface Agent communicates with the user, refines the query, and then initiates the workflow by handing control over to the Researcher Agent:

user_interface_agent = Agent(
    name="User Interface Agent",
    model="gpt-4o-mini",
    instructions="You are a user interface agent that handles all interactions with the user. You need to always start with a theme or topic that the user wants to research. Ask clarification questions if needed. Be concise.",
    functions=[handoff_to_researcher],
)

Running the System

With the agents defined, you can now run the system:

Install the necessary packages (see the Colab notebook linked above for the exact commands).
Initialize the agents and run the user input loop.
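A user input loop of this kind can be sketched as follows. It assumes the client exposes `run(agent=..., messages=...)` and returns an object with `messages` and `agent` attributes, which is what the `client.run` call and `response.agent` usage earlier in this post suggest. `chat_loop` and `EchoClient` are hypothetical names; the fake client only echoes input so the loop can run without network access.

```python
from types import SimpleNamespace
from typing import Iterable, List


def chat_loop(client, starting_agent, user_inputs: Iterable[str]) -> List[str]:
    """Feed each user input to client.run and carry state across turns."""
    messages: List[dict] = []
    agent = starting_agent
    replies: List[str] = []
    for text in user_inputs:
        messages.append({"role": "user", "content": text})
        response = client.run(agent=agent, messages=messages)
        messages.extend(response.messages)  # keep the full history
        agent = response.agent  # continue with whichever agent ended the turn
        replies.append(response.messages[-1]["content"])
    return replies


class EchoClient:
    """Hypothetical fake client so the loop runs without an API key."""

    def run(self, agent, messages):
        reply = {
            "role": "assistant",
            "content": f"[{agent}] {messages[-1]['content']}",
        }
        return SimpleNamespace(messages=[reply], agent=agent)


replies = chat_loop(
    EchoClient(),
    "User Interface Agent",
    ["Top 5 Technical Skills for 2025"],
)
print(replies[0])
```

To use it interactively, the list of inputs could be replaced with a generator that reads from `input()`, and the fake client with a real Swarm client and the user_interface_agent defined above.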

For the theme "Top 5 Technical Skills for 2025," the system will generate a structured blog post by conducting research and formatting the findings into readable content.

Example Output: Blog Post Generated by the Agent

Here's an example of what the generated blog post might look like:

Essential Skills for Software Developers: Staying Competitive in 2025

As we approach 2025, the pace of technological advancement presents both opportunities and challenges for software developers. The landscape is evolving with new innovations in artificial intelligence, cloud computing, and data analytics, among others. To remain competitive and relevant in the job market, developers must cultivate certain technical skills that are poised to dominate the industry. Here’s a detailed look at the top five technical skills every software developer should focus on in the coming years.

1. Mastering Artificial Intelligence and Machine Learning

Artificial Intelligence (AI) and Machine Learning (ML) are no longer just buzzwords; they are core components of modern software applications. Industries from healthcare to finance are leveraging AI to automate processes, analyze data, and enhance customer experiences.

What to Learn:

Programming languages: Python is the go-to language for AI and ML, alongside familiarity with frameworks like TensorFlow and PyTorch.
Key areas: Focus on natural language processing (NLP) and computer vision, as they will continue to grow in demand.
Real-world application: Engage in projects that allow you to implement algorithms and create systems that learn and adapt based on data.

2. Embracing Cloud Computing and Serverless Architecture

The migration to cloud computing is reshaping how businesses operate. Proficiency in cloud platforms such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform is becoming a mandatory skill for developers. Furthermore, an understanding of serverless architecture, which eliminates the need to manage servers, is a differentiator that can elevate your career.

What to Learn:

Cloud platforms: Gain hands-on experience with major cloud services and their offerings.
Serverless frameworks: Explore platforms like AWS Lambda to understand how to deploy applications more swiftly.
Collaboration: Familiarize yourself with DevOps practices to enhance collaboration in software development and operations.

3. Building Cybersecurity Fundamentals

With the surge in cyber threats, understanding cybersecurity is critical for developers. Knowledge of secure coding practices, data encryption, and vulnerability assessment is essential for protecting applications and sensitive user information.

What to Learn:

Security principles: Study secure coding standards and tools for threat modeling and incident response.
Sandbox testing: Experiment with penetration testing frameworks to gain practical knowledge.
Developing secure code: Integrate security considerations into your day-to-day coding practices and workflow.

4. Capitalizing on Low-Code and No-Code Development

Low-code and no-code platforms are transforming application development by making it accessible to non-developers. These platforms enable rapid application development, which can markedly accelerate a company’s digital transformation efforts.

What to Learn:

Tools and platforms: Explore tools like OutSystems, Mendix, or Bubble to understand how they work and their capabilities.
Integration skills: Learn how to integrate low-code solutions with traditional back-end systems.
Process optimization: Understand how these platforms can streamline workflows and provide rapid iterations.

5. Diving Into Data Science and Analytics

In our data-centric world, the ability to analyze and interpret data is invaluable. Understanding data science principles can empower developers to turn insights into actionable strategies.

What to Learn:

Data tools and languages: Gain proficiency in SQL, R, and Python for data manipulation and visualization.
Statistical analysis: Understand key concepts of statistics and machine learning to derive meaningful insights from data.
Predictive analysis: Use machine learning algorithms to forecast trends and contribute to data-driven decision-making processes.

Conclusion

As technology continues to change at a rapid pace, developers must engage in continuous learning to stay up to date. The top five skills outlined above—AI and ML, cloud computing and serverless architecture, cybersecurity fundamentals, low-code/no-code development, and data science—are critical for any developer aiming to thrive in the industry by 2025.

Investing in these skills will not only enhance your employability but also equip you to contribute effectively to innovative projects that shape the technology landscape. Embrace this opportunity to upgrade your expertise and solidify your place in the future of software development.

References

Hadalgi, N. (2024). The Most In-Demand Programming Skills for 2025: Staying Ahead in a Rapidly Evolving Tech Landscape. LinkedIn.
Teal HQ. (2024). Top Skills for Software Developers in 2024 (+Most Underrated Skills). Teal HQ.
Educative. (2023). Top Software Developer Skills To Learn in 2024. Educative.

Conclusion

The Swarm framework simplifies the creation of complex multi-agent systems, allowing developers to break down intricate tasks into manageable components.

By combining specialized agents for tasks like research and content creation, developers can automate time-consuming processes, such as generating technical blog posts from user queries.

With this guide, you now have a foundational understanding of Swarm’s capabilities. The next step is to explore more advanced use cases, such as orchestrating agents for more complex workflows.

Stay tuned for future posts where we dive deeper into Swarm and its applications in AI development!

Swarm repository: https://github.com/openai/swarm/tree/main
