Introducing Agentic DevOps
A fully autonomous, AI-powered DevOps platform for managing cloud infrastructure across multiple providers, with AWS and GitHub integration, powered by OpenAI's Agents SDK.
Agentic DevOps represents the next step in infrastructure management: a fully autonomous system that does not just assist with DevOps tasks but can independently plan, execute, and optimize your entire infrastructure lifecycle.
Built on OpenAI's Agents SDK, the platform goes beyond traditional scripted automation by letting AI agents decide which actions to take.
Try it here: agentic-devops.fly.dev
GitHub repo: https://github.com/agenticsorg/devops
Support the Agentics Foundation: https://agentics.org/memberships
The system can autonomously:
Agentic DevOps serves as an intelligent co-pilot for your infrastructure, or even as a fully autonomous operator, understanding complex requirements, executing precise commands, adapting to changing conditions, and providing valuable insights across your entire DevOps workflow. Whether you're managing AWS resources, working with GitHub repositories, or orchestrating complex deployments, Agentic DevOps provides a unified, intelligent interface that simplifies these tasks while maintaining security and best practices.
Overview
Agentic DevOps is designed to transform cloud infrastructure management through autonomous operation and intelligent decision-making. It provides a consistent interface for working with various cloud providers and services while adding a layer of AI-driven automation that can operate independently when needed.
Key benefits include:
Features & Core Capabilities
Autonomous Infrastructure Management: AI-driven management of cloud resources
AI-Powered Assistance: Leverage OpenAI's capabilities
Multi-Cloud Support: Consistent interface across providers
Security and Compliance
Observability and Monitoring
Deployment Automation
Disaster Recovery
Installation
# Clone the repository
git clone https://github.com/agenticsorg/devops.git
cd devops
# Install dependencies
pip install -r requirements.txt
# Configure credentials
cp env.example .env
# Edit .env with your AWS, GitHub, and OpenAI credentials
Configuration
The DevOps Agent supports multiple configuration methods:
Example configuration file (config.yaml):
aws:
  region: us-west-2
  profile: devops-agent
  default_vpc: vpc-1234567890abcdef0
github:
  organization: your-organization
  default_branch: main
openai:
  model: gpt-4o
  temperature: 0.2
logging:
  level: INFO
  file: devops-agent.log
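When several configuration sources are combined, a layered merge is one common approach. The sketch below is not the project's loader: it assumes a precedence of defaults, then file values, then environment variables, and the `DEVOPS_` prefix and `SECTION__KEY` naming are hypothetical. It takes the file values as an already-parsed dictionary so that it runs with the standard library alone.

```python
import os
from copy import deepcopy

# Hypothetical defaults that mirror some keys of the config.yaml example.
DEFAULTS = {
    "aws": {"region": "us-west-2", "profile": "default"},
    "openai": {"model": "gpt-4o", "temperature": 0.2},
    "logging": {"level": "INFO"},
}


def deep_merge(base, override):
    """Return a new dict in which values from `override` win, recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def env_overrides(environ, prefix="DEVOPS_"):
    """Map DEVOPS_AWS__REGION=eu-west-1 to {"aws": {"region": "eu-west-1"}}.

    Values stay strings; converting them to numbers is left to the caller.
    """
    result = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        section, _, key = name[len(prefix):].lower().partition("__")
        if section and key:
            result.setdefault(section, {})[key] = value
    return result


def load_config(file_values=None, environ=None):
    """Merge defaults, parsed file values, and environment overrides."""
    environ = os.environ if environ is None else environ
    config = deep_merge(DEFAULTS, file_values or {})
    return deep_merge(config, env_overrides(environ))


if __name__ == "__main__":
    parsed_file = {"aws": {"profile": "devops-agent"}, "logging": {"level": "DEBUG"}}
    print(load_config(parsed_file, {"DEVOPS_AWS__REGION": "eu-west-1"}))
```

Keeping the merge free of side effects (the defaults are never mutated) makes the precedence order easy to test in isolation.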
Usage
Python API
from devops_agent.aws.ec2 import EC2Service
from devops_agent.aws.s3 import S3Service
from devops_agent.github import GitHubService
from devops_agent.core.context import DevOpsContext

# Initialize context
context = DevOpsContext(
    user_id="user123",
    aws_region="us-west-2",
    github_org="your-organization"
)

# Initialize services
ec2 = EC2Service(context=context)
s3 = S3Service(context=context)
github = GitHubService(context=context)

# List EC2 instances
instances = ec2.list_instances(
    filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)
print(f"Found {len(instances)} running EC2 instances")

# Create S3 bucket with encryption
bucket = s3.create_bucket(
    name="my-secure-bucket",
    region="us-west-2",
    encryption={"algorithm": "AES256"},
    versioning=True
)

# Deploy from GitHub to EC2
ec2.deploy_from_github(
    instance_id="i-1234567890abcdef0",
    repository="your-org/your-repo",
    branch="main",
    deploy_path="/var/www/html",
    setup_script="scripts/setup.sh",
    environment_variables={"ENV": "production"}
)
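The `filters` argument in the listing call above uses the Name/Values structure of the EC2 API. Writing those dictionaries by hand gets verbose, so a small helper can build them. The functions below are hypothetical conveniences, not part of `devops_agent`; they only produce plain data.

```python
def build_filters(**criteria):
    """Convert keyword criteria into an EC2-style Name/Values filter list.

    Underscores in keyword names become hyphens, and a single value is
    wrapped in a list, so instance_state_name="running" becomes
    {"Name": "instance-state-name", "Values": ["running"]}.
    """
    filters = []
    for name, value in criteria.items():
        values = list(value) if isinstance(value, (list, tuple)) else [value]
        filters.append(
            {"Name": name.replace("_", "-"), "Values": [str(v) for v in values]}
        )
    return filters


def tag_filter(key, *values):
    """Build a filter that matches resources by tag, using the tag:<key> name."""
    return {"Name": f"tag:{key}", "Values": list(values)}


if __name__ == "__main__":
    filters = build_filters(instance_state_name="running")
    filters.append(tag_filter("Project", "WebApp"))
    print(filters)
```

The resulting list could then be passed as the `filters` argument shown earlier, assuming the service forwards it unchanged to the EC2 API.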
CLI Usage
The DevOps Agent provides a powerful command-line interface with rich output formatting:
# List EC2 instances with filtering and formatting
devops ec2 list-instances --state running --region us-west-2 --output table

# Create an EC2 instance with detailed configuration
devops ec2 create-instance \
  --name "web-server" \
  --type t3.medium \
  --ami-id ami-0c55b159cbfafe1f0 \
  --subnet-id subnet-1234567890abcdef0 \
  --security-group-ids sg-1234567890abcdef0 \
  --key-name my-key \
  --user-data-file startup-script.sh \
  --tags "Environment=Production,Project=Website" \
  --wait

# Get GitHub repository details with specific information
devops github get-repo your-org/your-repo --output json

# Create a GitHub issue with labels and assignees
devops github create-issue \
  --repo your-org/your-repo \
  --title "Update dependencies" \
  --body "We need to update all dependencies to the latest versions." \
  --labels "maintenance,dependencies" \
  --assignees "username1,username2"

# Deploy from GitHub to EC2 with advanced options
devops deploy github-to-ec2 \
  --repo your-org/your-repo \
  --instance-id i-1234567890abcdef0 \
  --branch develop \
  --path /var/www/html \
  --setup-script scripts/setup.sh \
  --env-file .env.production \
  --post-deploy-hook scripts/notify.sh
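The `--tags` option above takes a comma-separated `Key=Value` string, while the AWS APIs expect tags as a list of Key/Value dictionaries. How the CLI converts one into the other is not shown here, so the following is only a sketch of one plausible parser; the `parse_tags` name and its error handling are assumptions.

```python
def parse_tags(raw):
    """Parse "Key=Value,Key2=Value2" into [{"Key": ..., "Value": ...}, ...].

    Raises ValueError for entries without "=", for empty keys, and for
    duplicate keys. Empty values such as "Owner=" are accepted.
    """
    tags = []
    seen = set()
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate trailing commas
        key, sep, value = pair.partition("=")
        key = key.strip()
        if not sep or not key:
            raise ValueError(f"Expected Key=Value, got {pair!r}")
        if key in seen:
            raise ValueError(f"Duplicate tag key: {key!r}")
        seen.add(key)
        tags.append({"Key": key, "Value": value.strip()})
    return tags


if __name__ == "__main__":
    print(parse_tags("Environment=Production,Project=Website"))
```

Rejecting duplicate keys up front avoids a silent overwrite, since a resource can carry only one value per tag key.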
OpenAI Agents Integration
The DevOps Agent leverages OpenAI's Agents SDK to provide powerful AI-driven infrastructure management capabilities. This integration enables natural language interactions with your cloud resources, intelligent automation, and context-aware assistance.
Key Benefits of OpenAI Agents Integration
Agent Architecture
The DevOps Agent uses a modular architecture with specialized agents for different domains:
Each agent is equipped with domain-specific tools and knowledge, allowing for deep expertise in their respective areas while maintaining a unified interface for the user.
Basic Usage Example
from agents import Agent, Runner
from devops_agent.agents.tools import (
    list_ec2_instances,
    start_ec2_instances,
    stop_ec2_instances,
    create_ec2_instance
)
from devops_agent.core.context import DevOpsContext

# Create a context with user information
context = DevOpsContext(
    user_id="user123",
    aws_region="us-west-2",
    github_org="your-organization"
)

# Create an EC2-focused agent
ec2_agent = Agent(
    name="EC2 Assistant",
    instructions="""
    You are an EC2 management assistant that helps users manage their AWS EC2 instances.
    You can list, start, stop, and create EC2 instances based on user requests.
    Always confirm important actions before executing them and provide clear explanations.
    """,
    tools=[
        list_ec2_instances,
        start_ec2_instances,
        stop_ec2_instances,
        create_ec2_instance
    ],
    model="gpt-4o"
)

# Run the agent with a user query
result = Runner.run_sync(
    ec2_agent,
    "I need to launch 3 t2.micro instances for a web application in us-west-2. They should have the tag 'Project=WebApp'.",
    context=context
)
print(result.final_output)
Advanced Agent Orchestration
For more complex workflows, you can use agent orchestration to coordinate between specialized agents:
from agents import Agent, Runner, handoff
from devops_agent.agents.tools import (
    # EC2 tools
    list_ec2_instances,
    start_ec2_instances,
    stop_ec2_instances,
    create_ec2_instance,
    # S3 tools
    list_s3_buckets,
    create_s3_bucket,
    # GitHub tools
    get_github_repository,
    list_github_issues,
    create_github_issue,
    # Deployment tools
    deploy_to_ec2
)

# Create specialized agents
ec2_agent = Agent(
    name="EC2 Agent",
    instructions="You are an EC2 management specialist...",
    tools=[list_ec2_instances, start_ec2_instances, stop_ec2_instances, create_ec2_instance],
    model="gpt-4o-mini"
)
s3_agent = Agent(
    name="S3 Agent",
    instructions="You are an S3 management specialist...",
    tools=[list_s3_buckets, create_s3_bucket],
    model="gpt-4o-mini"
)
github_agent = Agent(
    name="GitHub Agent",
    instructions="You are a GitHub management specialist...",
    tools=[get_github_repository, list_github_issues, create_github_issue],
    model="gpt-4o-mini"
)
deployment_agent = Agent(
    name="Deployment Agent",
    instructions="You are a deployment specialist...",
    tools=[deploy_to_ec2],
    model="gpt-4o-mini"
)

# Create an orchestrator agent that can delegate to specialized agents.
# handoff() wraps each agent; tool_description_override tells the model
# when to pick that handoff.
orchestrator = Agent(
    name="DevOps Orchestrator",
    instructions="""
    You are a DevOps orchestrator that helps users manage their cloud infrastructure and code repositories.
    You can delegate tasks to specialized agents for EC2, S3, GitHub, and deployments.
    Determine which specialized agent is best suited for each user request and hand off accordingly.
    """,
    handoffs=[
        handoff(agent=ec2_agent, tool_description_override="Handles EC2 instance management tasks"),
        handoff(agent=s3_agent, tool_description_override="Handles S3 bucket operations"),
        handoff(agent=github_agent, tool_description_override="Handles GitHub repository management"),
        handoff(agent=deployment_agent, tool_description_override="Handles deployment workflows")
    ],
    model="gpt-4o-mini"
)

# Run the orchestrator with a complex query.
# `context` is the DevOpsContext created in the basic usage example.
result = Runner.run_sync(
    orchestrator,
    """
    I need to set up a new web application deployment:
    1. Create 2 t2.micro EC2 instances with the tag 'Project=WebApp'
    2. Create an S3 bucket for static assets with versioning enabled
    3. Clone our 'company/webapp' GitHub repository to the EC2 instances
    4. Create a GitHub issue to track this deployment
    """,
    context=context
)
print(result.final_output)
Asynchronous Agent Execution
For high-performance applications, you can use asynchronous execution:
import asyncio
from agents import Runner

async def run_agent_async():
    # `ec2_agent` and `context` are defined in the basic usage example
    result = await Runner.run(
        ec2_agent,
        "List all my EC2 instances in us-west-2 and show their status",
        context=context
    )
    return result.final_output

# Run the agent asynchronously
response = asyncio.run(run_agent_async())
print(response)
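The benefit of asynchronous execution shows up mainly when several requests run at once. The sketch below illustrates that pattern with `asyncio.gather` and a semaphore that caps concurrency. To stay runnable without API keys it uses `fake_agent_run`, a hypothetical stand-in for an awaitable agent call; in real use you would presumably await the agent runner there instead.

```python
import asyncio


async def fake_agent_run(query, delay=0.01):
    """Stand-in for an awaitable agent call; it echoes the query after a short sleep."""
    await asyncio.sleep(delay)
    return f"done: {query}"


async def run_many(queries, limit=2):
    """Run queries concurrently with at most `limit` in flight.

    asyncio.gather returns results in the order of the inputs,
    regardless of which call finishes first.
    """
    semaphore = asyncio.Semaphore(limit)

    async def run_one(query):
        async with semaphore:
            return await fake_agent_run(query)

    return await asyncio.gather(*(run_one(q) for q in queries))


if __name__ == "__main__":
    results = asyncio.run(run_many(["list instances", "list buckets", "list issues"]))
    for line in results:
        print(line)
```

Capping concurrency is a deliberate choice here: model and cloud APIs are usually rate limited, so an unbounded fan-out tends to trade latency for throttling errors.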
Security Guardrails
The DevOps Agent includes built-in security guardrails to prevent destructive operations:
from devops_agent.core.guardrails import (
    security_guardrail,
    sensitive_info_guardrail
)

# Apply security guardrail to check for potentially harmful operations
@security_guardrail
def perform_operation(operation_details):
    # Implementation
    pass

# Apply sensitive information guardrail to prevent leaking credentials
@sensitive_info_guardrail
def generate_response(user_query, system_data):
    # Implementation
    pass
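The internals of these guardrails are not shown above. As a rough illustration of the decorator pattern they suggest, the sketch below blocks a few destructive actions unless the caller has confirmed them. Everything in it is hypothetical, including the `destructive_operation_guard` name, the `action` and `confirmed` keys, and the list of destructive actions; it is not the implementation in `devops_agent.core.guardrails`.

```python
import functools

# Hypothetical set of action names treated as destructive.
DESTRUCTIVE_ACTIONS = {"terminate_instances", "delete_bucket", "delete_repository"}


class GuardrailViolation(Exception):
    """Raised when the guard refuses to run an operation."""


def destructive_operation_guard(func):
    """Refuse destructive actions unless operation_details["confirmed"] is true."""

    @functools.wraps(func)
    def wrapper(operation_details, *args, **kwargs):
        action = operation_details.get("action")
        confirmed = operation_details.get("confirmed", False)
        if action in DESTRUCTIVE_ACTIONS and not confirmed:
            raise GuardrailViolation(f"'{action}' requires explicit confirmation")
        return func(operation_details, *args, **kwargs)

    return wrapper


@destructive_operation_guard
def perform_operation(operation_details):
    """Pretend to execute the requested action and report it."""
    return f"executed {operation_details['action']}"


if __name__ == "__main__":
    print(perform_operation({"action": "list_instances"}))
    try:
        perform_operation({"action": "terminate_instances"})
    except GuardrailViolation as exc:
        print(f"blocked: {exc}")
```

Raising an exception rather than returning a status value means a blocked operation cannot be mistaken for a successful one by the calling agent.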
Tracing and Debugging
For debugging and monitoring agent behavior, you can use the tracing functionality:
from agents.tracing import set_tracing_enabled, get_trace

# Enable tracing
set_tracing_enabled(True)

# Run the agent
result = Runner.run_sync(ec2_agent, "List my EC2 instances", context=context)

# Get the trace for analysis
trace = get_trace()
print(f"Agent took {len(trace.steps)} steps to complete the task")
for step in trace.steps:
    print(f"Step: {step.type}, Duration: {step.duration}ms")
Advanced Configuration
Credential Management
The DevOps Agent provides multiple secure options for credential management:
Example keyring setup:
from devops_agent.core.credentials import CredentialManager

# Store credentials securely
cred_manager = CredentialManager()
cred_manager.store_aws_credentials(
    access_key="YOUR_ACCESS_KEY",
    secret_key="YOUR_SECRET_KEY",
    region="us-west-2",
    profile_name="production"
)
cred_manager.store_github_credentials(
    token="YOUR_GITHUB_TOKEN",
    username="your-username"
)

# Retrieve credentials securely
aws_creds = cred_manager.get_aws_credentials(profile_name="production")
github_creds = cred_manager.get_github_credentials()
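With more than one credential source available, it helps to make the lookup order explicit and to keep secrets out of log output. The sketch below is an assumption about how such a fallback could work, not the behavior of `CredentialManager`: it prefers explicitly passed values, then falls back to the standard AWS environment variables, and the `resolve_aws_credentials` and `mask` helpers are hypothetical.

```python
import os


def resolve_aws_credentials(explicit=None, environ=None):
    """Return credentials from explicit values first, then the environment.

    Returns None when neither source provides both keys.
    """
    environ = os.environ if environ is None else environ
    if explicit and explicit.get("access_key") and explicit.get("secret_key"):
        return {**explicit, "source": "explicit"}
    access_key = environ.get("AWS_ACCESS_KEY_ID")
    secret_key = environ.get("AWS_SECRET_ACCESS_KEY")
    if access_key and secret_key:
        return {
            "access_key": access_key,
            "secret_key": secret_key,
            "region": environ.get("AWS_DEFAULT_REGION"),
            "source": "environment",
        }
    return None


def mask(secret, visible=4):
    """Hide all but the last `visible` characters, for safe logging."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


if __name__ == "__main__":
    creds = resolve_aws_credentials()
    if creds is None:
        print("No AWS credentials found")
    else:
        print(f"Using {creds['source']} credentials ending in {mask(creds['access_key'])}")
```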
Error Handling and Logging
The DevOps Agent provides comprehensive error handling with actionable suggestions:
from devops_agent.core.logging import setup_logging
from devops_agent.aws.base import AWSServiceError, ResourceNotFoundError

# Setup logging
logger = setup_logging(level="INFO", log_file="devops-agent.log")

try:
    # Attempt to perform an operation
    ec2.start_instance(instance_id="i-nonexistentid")
except ResourceNotFoundError as e:
    # Handle specific error with context
    logger.error(f"Could not find instance: {e}")
    logger.info(f"Suggestion: {e.suggestion}")
    # Take remedial action
except AWSServiceError as e:
    # Handle general AWS errors
    logger.error(f"AWS operation failed: {e}")
    logger.info(f"Suggestion: {e.suggestion}")
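The handler order above matters only if `ResourceNotFoundError` is a subclass of `AWSServiceError`, which the example implies but does not show. The sketch below re-creates a hierarchy of that shape to illustrate how a `suggestion` attribute can travel with the exception. These classes are stand-ins that reuse the names for readability; their constructors and messages are assumptions, not the definitions in `devops_agent.aws.base`.

```python
class AWSServiceError(Exception):
    """Base error that carries a human-readable suggestion."""

    def __init__(self, message, suggestion=None):
        super().__init__(message)
        self.suggestion = suggestion or (
            "Check the AWS credentials, region, and request parameters."
        )


class ResourceNotFoundError(AWSServiceError):
    """Raised when a referenced resource does not exist."""

    def __init__(self, resource_type, resource_id):
        super().__init__(
            f"{resource_type} '{resource_id}' was not found",
            suggestion=(
                f"Verify the {resource_type} ID and confirm it exists "
                "in the selected region."
            ),
        )
        self.resource_type = resource_type
        self.resource_id = resource_id


def start_instance(instance_id, known_ids):
    """Toy operation: fail for unknown IDs, otherwise report a pending start."""
    if instance_id not in known_ids:
        raise ResourceNotFoundError("instance", instance_id)
    return {"instance_id": instance_id, "state": "pending"}


if __name__ == "__main__":
    try:
        start_instance("i-nonexistentid", known_ids={"i-1234567890abcdef0"})
    except ResourceNotFoundError as exc:  # the more specific handler goes first
        print(f"Could not find instance: {exc}")
        print(f"Suggestion: {exc.suggestion}")
    except AWSServiceError as exc:
        print(f"AWS operation failed: {exc}")
```

If the general handler were listed first, it would also catch the not-found case and the more specific branch would never run.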
Extensibility
The DevOps Agent is designed to be easily extended with new services and providers:
Example of creating a custom service:
from devops_agent.aws.base import AWSBaseService

class CustomService(AWSBaseService):
    """Custom service implementation."""

    SERVICE_NAME = "custom-service"

    def __init__(self, credentials=None, region=None):
        super().__init__(credentials, region)
        # Initialize service-specific resources

    def custom_operation(self, param1, param2):
        """Implement custom operation."""
        try:
            # Implement operation logic
            result = self._client.some_operation(
                Param1=param1,
                Param2=param2
            )
            return self._format_response(result)
        except Exception as e:
            # Handle and transform errors
            self.handle_error(e, "custom_operation")
Creating Custom Agent Tools
You can extend the agent's capabilities by creating custom tools:
from agents import RunContextWrapper, function_tool
from pydantic import BaseModel, Field
from devops_agent.core.context import DevOpsContext

# Define the input schema for your tool
class CustomOperationInput(BaseModel):
    resource_id: str = Field(..., description="The ID of the resource to operate on")
    operation_type: str = Field(..., description="The type of operation to perform")
    parameters: dict = Field(default_factory=dict, description="Additional parameters for the operation")

# Create a function tool
@function_tool()
async def custom_operation(
    wrapper: RunContextWrapper[DevOpsContext],
    input_data: CustomOperationInput
) -> dict:
    """
    Perform a custom operation on a specified resource.

    Args:
        input_data: The operation input, with these fields:
            resource_id: The ID of the resource to operate on
            operation_type: The type of operation to perform (e.g., "analyze", "optimize", "backup")
            parameters: Additional parameters specific to the operation type

    Returns:
        A dictionary containing the operation results
    """
    # Access the context
    context = wrapper.context

    # Implement your custom logic
    result = {
        "resource_id": input_data.resource_id,
        "operation_type": input_data.operation_type,
        "status": "completed",
        "details": {
            "timestamp": "2023-01-01T00:00:00Z",  # placeholder value
            "user": context.user_id,
            "region": context.aws_region,
            "parameters": input_data.parameters
        }
    }
    return result
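The result dictionary above uses a fixed timestamp as a placeholder. In a real tool the value would normally be generated at call time. The sketch below shows one way to do that with the standard library; `utc_timestamp` and `build_result` are hypothetical helpers, and the dictionary layout simply mirrors the example.

```python
from datetime import datetime, timezone


def utc_timestamp(now=None):
    """Return an ISO 8601 UTC timestamp such as 2023-01-01T00:00:00Z.

    `now` must be a timezone-aware datetime; it defaults to the current time.
    """
    now = now or datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_result(resource_id, operation_type, user_id, region, parameters=None, now=None):
    """Assemble a result dictionary in the same layout as the tool example."""
    return {
        "resource_id": resource_id,
        "operation_type": operation_type,
        "status": "completed",
        "details": {
            "timestamp": utc_timestamp(now),
            "user": user_id,
            "region": region,
            "parameters": dict(parameters or {}),
        },
    }


if __name__ == "__main__":
    print(build_result("i-1234567890abcdef0", "backup", "user123", "us-west-2"))
```

Accepting `now` as a parameter keeps the helper deterministic in tests while defaulting to the real clock elsewhere.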
Testing
The DevOps Agent includes comprehensive testing capabilities:
# Run all tests
python run_all_tests.py
# Run specific test categories
python -m pytest tests/aws/
python -m pytest tests/github/
python -m pytest tests/test_cli.py
# Run tests with specific markers
python -m pytest -m "aws"
python -m pytest -m "integration"
python -m pytest -m "unit"
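Unit-level tests typically avoid real AWS calls by substituting a fake client. The sketch below shows that idea with `unittest.mock`; it is not taken from the project's test suite. `list_running_instance_ids` is a hypothetical function under test, while the `describe_instances` call and its `Reservations`/`Instances` response layout follow the EC2 API as exposed by boto3.

```python
from unittest.mock import Mock


def list_running_instance_ids(ec2_client):
    """Collect instance IDs from a describe_instances response.

    Pagination is ignored to keep the sketch short.
    """
    response = ec2_client.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    return [
        instance["InstanceId"]
        for reservation in response.get("Reservations", [])
        for instance in reservation.get("Instances", [])
    ]


def make_fake_client():
    """Build a Mock that answers describe_instances with canned data."""
    client = Mock()
    client.describe_instances.return_value = {
        "Reservations": [
            {"Instances": [{"InstanceId": "i-0001"}, {"InstanceId": "i-0002"}]},
            {"Instances": [{"InstanceId": "i-0003"}]},
        ]
    }
    return client


def test_list_running_instance_ids():
    client = make_fake_client()
    assert list_running_instance_ids(client) == ["i-0001", "i-0002", "i-0003"]
    # The function should have asked only for running instances.
    client.describe_instances.assert_called_once_with(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )


if __name__ == "__main__":
    test_list_running_instance_ids()
    print("ok")
```

Because the test function follows the `test_` naming convention, pytest would collect it as written; attaching a marker such as `unit` would be an additional step.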