Week 10: Let's build an AI-Agent Part 2: "Hello Agent – it's alive!!"

Welcome back to an exciting week 10 of my journey into a cybernetic life. A week that once again showed how fast AI is evolving: use cases that didn’t work last week are now possible (look at Claude 3.7 and Grok 3). This week, we continue developing our AI agent "Spark" and bring it to life. While doing this, we are listening to AI-generated music and creating an AI music video with SORA. Welcome to the new world.

But before we dive into the topic, here is the obligatory AI news of the week:

AI News of the week

  • Anthropic unveiled Claude 3.7 and Claude Code. This sounds like a minor update, but while 3.5 was able to solve 49% of real-world GitHub issues on SWE-bench Verified, Claude 3.7 is able to solve almost 70%, which is huge! With Claude Code, they provide a handy CLI tool to integrate Claude seamlessly into any development process and pipeline. My rating: Hot!
  • OpenAI announced GPT-4.5, probably the largest large language model in the world, with 50% fewer hallucinations and more emotional intelligence. However, it does not beat any benchmarks and costs about 15x the price of GPT-4o and about 535x the price of DeepSeek V3. My rating: Questionable
  • OpenAI has rolled out its video-generation model, SORA, to users in the European Union, the UK, Switzerland, Norway, Liechtenstein, and Iceland.

Ready for a productivity leap?

The news that excited me the most was the announcement of Claude 3.7. Just last week, I spoke with my colleague at Zühlke, Theo Winter, who is the lead architect of a major legacy modernization project. His team has deeply integrated AI into the entire development workflow, achieving an estimated average productivity gain of 20-30%. Now, with the release of Claude 3.7, which solves about 40% more real-world GitHub issues than 3.5 using the same API, they might see an additional 10% productivity boost simply by switching models. This is because the new version more frequently delivers the correct solution on the first attempt for a given coding prompt.
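As a quick sanity check on the "40% more" figure, here is a small Python calculation using the two numbers quoted in the news section (49% for Claude 3.5 and "almost 70%" for Claude 3.7, rounded up to 70 for illustration). It separates the absolute gain in percentage points from the relative gain.

```python
# Benchmark figures as quoted in the news section above.
solved_35 = 49.0  # percent of issues solved by Claude 3.5
solved_37 = 70.0  # "almost 70%" for Claude 3.7, rounded up for illustration

absolute_gain = solved_37 - solved_35            # difference in percentage points
relative_gain = absolute_gain / solved_35 * 100  # improvement relative to 3.5, in percent

print(f"{absolute_gain:.0f} percentage points, {relative_gain:.0f}% relative")
# prints "21 percentage points, 43% relative"
```

So the jump is about 21 percentage points, which corresponds to roughly 40% more solved issues relative to Claude 3.5.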

This highlights the vast untapped potential in optimizing LLMs through advanced reasoning and improved contextual understanding, paving the way for even greater efficiency, automation, and innovation in software development.

Claude 3.7 exceeds the benchmark of all other models by 40%

Let AI write the main prompt for the AI-Agent

After designing our AI agent "Spark" on paper last week, we want to get it up and running for the first time this week. And who knows better how to write a prompt for an AI agent than the AI itself? So, I asked ChatGPT how a prompt should look for an AI agent that uses three tools: a NewsFeed, LinkedIn, and a memory function.

You are an AI agent managing the LinkedIn page "The AI Augmented Human." Your goal is to grow engagement, attract followers, and establish thought leadership by posting about:
- AI News (trends, breakthroughs, and innovations)
- Human-Agent Collaboration (how AI and humans work together)
- Business Impact (how AI is transforming industries)
- Perspectives & Thoughts (expert opinions, future trends, ethical considerations)

Posting Rules & Best Practices:
- Find out for yourself how to post so that it is maximally engaging and interesting.

Available Actions & JSON Tool Calls:
Tool Newsfeed: 
Retrieves trending news about AI, human-AI collaboration, and industry impact.
Get News: { "tool": "newsfeed", "action": "getNews" }
Tool LinkedIn:
Interact with the community
Query the feed: { "tool": "linkedin", "action": "getFeed" }
Share a post: { "tool": "linkedin", "action": "post", "arguments": { "content": "..." } }
Comment on a post: { "tool": "linkedin", "action": "comment", "arguments": { "postURN": "..." } }

Execution Guidelines:
- In "Previous Actions" you can see which actions you have called in the past.
- In "Previous Response" you can see the response to your last tool call.
- Do not post more than once an hour. If a post was made in this hour, wait until the next cycle.
- If no relevant news is found, do not post. Instead, store findings for later.
- You can skip a cycle (and do nothing) by just returning an empty JSON. 
- If engagement patterns change, adapt content based on performance.
- Only return a valid JSON response (no additional text). You can use multiple tools.
- Use your memory tool to remember what to do next.
- This prompt gets called every 5 minutes.
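
To make the tool-call format from the prompt above more tangible, here is a minimal sketch of how a model reply could be parsed and validated. It is written in Python rather than the .NET used for the actual app, and it is not the original code. The function name `parse_tool_calls`, the `ALLOWED_ACTIONS` table, and the choice to accept either a single JSON object or a JSON array are assumptions of this sketch; the tool and action names are taken from the prompt.

```python
import json

# Tool and action names as listed in the prompt; the validation rules
# themselves are an assumption of this sketch.
ALLOWED_ACTIONS = {
    "newsfeed": {"getNews"},
    "linkedin": {"getFeed", "post", "comment"},
}


def parse_tool_calls(raw: str) -> list[dict]:
    """Parse the model's reply into a list of validated tool calls.

    Accepts a single JSON object, a JSON array of objects, or an empty
    object, which the prompt defines as "skip this cycle".
    Raises ValueError for malformed JSON or unknown tools/actions
    (json.JSONDecodeError is a subclass of ValueError).
    """
    data = json.loads(raw)
    calls = data if isinstance(data, list) else [data]
    valid = []
    for call in calls:
        if not isinstance(call, dict):
            raise ValueError(f"expected a JSON object, got: {call!r}")
        if not call:  # empty object: nothing to do in this cycle
            continue
        tool, action = call.get("tool"), call.get("action")
        known = (
            isinstance(tool, str)
            and isinstance(action, str)
            and action in ALLOWED_ACTIONS.get(tool, set())
        )
        if not known:
            raise ValueError(f"unknown tool call: {call!r}")
        valid.append(
            {"tool": tool, "action": action, "arguments": call.get("arguments", {})}
        )
    return valid


# Usage: one reply that requests a single tool call, and one that skips the cycle.
print(parse_tool_calls('{"tool": "newsfeed", "action": "getNews"}'))
print(parse_tool_calls("{}"))
```

Rejecting unknown tools up front keeps a hallucinated action from ever reaching a real API such as LinkedIn.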

Running this prompt in a loop using a simple .NET console app

So I started VS Code with GitHub Copilot to create a simple .NET console application that uses Gemini 2.0 Flash (because it includes weekly updated news).

The core loop of my AI agent is quite simple and took me about 30 minutes to create with the assistance of Copilot:

  1. I read the main agentic prompt from a file
  2. I add to the context a list of all previous actions
  3. I add to the context the response of the previous tool call
  4. I send the prompt to Google Gemini (with the request to return a formatted JSON response)
  5. I parse the response and call the tools with the arguments provided
  6. I repeat the loop infinitely
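
The six steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the original .NET code: the names `build_context`, `run_cycle`, `fake_model`, and `fake_newsfeed` are hypothetical, the model call is a stub instead of a real Gemini request, and the sketch assumes the model returns a single JSON object per cycle.

```python
import json
from typing import Callable


def build_context(main_prompt: str, previous_actions: list[str], previous_response: str) -> str:
    """Steps 2-3: append the action history and the last tool response to the prompt."""
    actions = "\n".join(previous_actions) or "(none)"
    return (
        f"{main_prompt}\n\n"
        f"Previous Actions:\n{actions}\n\n"
        f"Previous Response:\n{previous_response or '(none)'}"
    )


def run_cycle(
    main_prompt: str,
    previous_actions: list[str],
    previous_response: str,
    ask_model: Callable[[str], str],
    tools: dict[str, Callable[[str, dict], str]],
) -> str:
    """Steps 2-5: run one cycle and return the tool response for the next cycle."""
    reply = ask_model(build_context(main_prompt, previous_actions, previous_response))
    call = json.loads(reply)  # step 5: parse the JSON reply
    if not call:  # empty JSON object means "skip this cycle"
        return ""
    result = tools[call["tool"]](call["action"], call.get("arguments", {}))
    previous_actions.append(json.dumps(call))  # remember what was done
    return result


# Stubs standing in for the Gemini call and the news tool (assumptions).
def fake_model(context: str) -> str:
    return '{"tool": "newsfeed", "action": "getNews"}'


def fake_newsfeed(action: str, arguments: dict) -> str:
    return "Anthropic unveiled Claude 3.7"


# Step 1 would read the prompt from a file; a placeholder string is used here.
main_prompt = "You are an AI agent ..."
history: list[str] = []
response = ""
for _ in range(2):  # step 6: the real agent loops forever, pausing 5 minutes per cycle
    response = run_cycle(main_prompt, history, response, fake_model, {"newsfeed": fake_newsfeed})
    print(response)
```

Passing the model and the tools in as plain functions keeps the loop testable without network access; a real Gemini client and a LinkedIn tool would be swapped in at those two points.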

"Hello Agent" – it's so exciting to see: It's alive!!

So, let's run our agent and see it in action! I have to say, I’m really excited to see how this works.

The agent first chose to use the news tool to stay updated on the latest AI developments. With that knowledge, it crafted its first LinkedIn post. Then, it decided to pause and check the social feed, noticing that no new comments had been made. So, it opted to stay up to date with the news instead.

Next step: Integrate it into LinkedIn

Next week, we'll enhance the agent by adding image search capabilities and integrating it with the LinkedIn API. To enable this, I registered an app on LinkedIn and applied for developer-tier access. The request is still pending but should be approved by next week.


Next week, we will also explore other AI agent platforms and compare their capabilities, integration, and use.

How often do you use AI in your week, and for which tasks? What tasks would you like to delegate to an AI agent? Share your thoughts in the comments!

Listen to the latest AI-generated music from Chris Motion:

https://open.spotify.com/album/4iNygnVHbGb5OkW9mhwM63?si=fLf1q2QQThmVfuyrZnEA8Q

I wish you all a successful week

Christian
