AI Agents or Agentic Systems
Mahtab Syed
Data and AI Leader | AI Solutions | Cloud Architecture(Azure, GCP, AWS) | Data Engineering, Generative AI, Artificial Intelligence, Machine Learning and MLOps Programs | Coding and Kaggle
In 2025, everyone seems to be talking about “Agents”, or agent-like systems called “Agentic Systems”.
I recently read a post titled “SaaS apps are dying, and Agents will take over”. This is just more clickbait, and I don’t agree with it. Agents are not ready for this yet (as of Mar 2025), but that will change.
So, let’s describe Agents and their production readiness in simple words.
What’s an AI Agent?
An AI agent is a system that perceives its environment, processes information, and takes actions to achieve specific goals autonomously or semi-autonomously. For example:
1. My Personal Travel Agent, to which I give a chat command such as “Can you book me a 1-week vacation to Italy in Sep 2025?”. This can invoke the AI Agent for these tasks:
a. Search for the best flights in any week of Sep 2025
b. Search for the best hotel accommodation for the days between the flights
c. Find the best places to visit and book tickets for them
d. Send an email to my manager at work to approve the holiday, and wait for confirmation
e. On confirmation, book the flights, hotel and places of interest after confirming with me in a chat (so it’s a semi-autonomous agent where I am the Human Reviewer). Note that this involves financial transactions, which I have authorised this Agent to make
f. Block my work calendar and set up an out-of-office message
g. Keep an eye on any flight or hotel changes and cancellations, and update the bookings after confirming with me
h. Etc.
2. My Software Engineer Agent, which takes a chat command like “In the GitHub repository “TheBestCasuals - Retail for casual clothes”, please replace the current search feature with a new search feature based on the latest Semantic Search technology using RAG and a Vector DB”. This can be another semi-autonomous Agent, which will:
a. Get the repository “TheBestCasuals - Retail for casual clothes” and understand the full code
b. Find the current fixed search implementation
c. Replace it with the new Semantic Search logic, suggesting the best LLM and Vector DB to use and writing all Dev and Test code
d. Write the best evaluation metrics
e. Create all the changes as multiple GitHub Pull Requests and send them for my review
f. After I review, the code is published to the QA environment
g. After I approve, the code is published to the Prod environment
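The “semi-autonomous” part of both examples can be sketched as an approval gate: the agent runs routine steps on its own and pauses for the Human Reviewer before sensitive ones. This is only a minimal sketch. The plan, the step names and the `run_plan` and `approve` names are hypothetical and do not come from any agent framework.

```python
from typing import Callable

# Hypothetical plan for the travel example: (step, needs_human_approval).
PLAN: list[tuple[str, bool]] = [
    ("search flights", False),
    ("search hotels", False),
    ("book flights and hotel", True),  # financial transaction, so ask first
    ("block work calendar", False),
]


def run_plan(plan: list[tuple[str, bool]], approve: Callable[[str], bool]) -> list[str]:
    """Run steps in order; stop at the first step the reviewer rejects."""
    log: list[str] = []
    for step, needs_approval in plan:
        if needs_approval and not approve(step):
            log.append(f"halted: {step}")
            break
        log.append(f"done: {step}")
    return log


# The reviewer is a plain callback here; in a chat app it would wait for my reply.
print(run_plan(PLAN, approve=lambda step: True))
print(run_plan(PLAN, approve=lambda step: False))
```

With a reviewer that rejects the booking, the run stops before any money is spent and before the calendar is blocked.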
What does an Agent comprise?
Agents or Agentic Systems have an Agent Runtime, which is the environment where an agent operates. It includes Orchestration, a Model, Tools, and Memory (short-term and long-term). It manages the execution of the agent, including how it takes in information, reasons about it, and uses tools to interact with the world.
Agent Runtime
1. Orchestration - a cyclical process that governs how the agent takes in information, performs some internal reasoning, and uses that reasoning to inform its next action or decision.
2. Model - the centralised decision maker for agent processes, such as one of the latest reasoning-based LLMs.
3. Tools - access to external data and services via Functions, Extensions, Data Stores, Plugins, REST APIs etc.
4. Memory - Agents manage memory using two types: short-term and long-term.
a. Short-term memory: This is immediate and context-driven, like remembering the current stage of a conversation or the last tool used.
b. Long-term memory: This stores broader knowledge and learned patterns, similar to how a model's training data provides it with a foundation of understanding. Long-term memory enables an agent to draw on past experiences and apply them to new situations, enhancing its reasoning and decision-making.
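To make the four parts concrete, here is a toy sketch of an Agent Runtime. Everything in it is an assumption made for illustration: `toy_model` is a rule-based stand-in for an LLM, `search_flights` is a fake tool that calls no real API, and the `Memory` class and `run_agent` loop are hypothetical names rather than part of any framework.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Memory:
    short_term: list[str] = field(default_factory=list)      # context of the current task
    long_term: dict[str, str] = field(default_factory=dict)  # kept across tasks


def search_flights(query: str) -> str:
    """Hypothetical tool; a real one would call an external service."""
    return f"3 flights found for '{query}'"


TOOLS: dict[str, Callable[[str], str]] = {"search_flights": search_flights}


def toy_model(goal: str, memory: Memory) -> tuple[str, str]:
    """Stand-in for an LLM: decides the next (action, argument)."""
    if not memory.short_term:
        return ("search_flights", goal)
    return ("finish", memory.short_term[-1])


def run_agent(goal: str, memory: Memory, max_steps: int = 5) -> str:
    """Orchestration: a reason -> act -> observe cycle with a step limit."""
    for _ in range(max_steps):
        action, argument = toy_model(goal, memory)  # Model reasons
        if action == "finish":
            memory.long_term[goal] = argument       # remember the outcome
            return argument
        observation = TOOLS[action](argument)       # Tool acts
        memory.short_term.append(observation)       # Memory records the result
    return "stopped: step limit reached"


memory = Memory()
print(run_agent("Italy, Sep 2025", memory))
```

The step limit is a design choice in this sketch: it keeps a confused model from looping forever.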
Now the most important question :)
If Agents are so good, what’s stopping us from using them for most day-to-day tasks?
As Chip Huyen explains in her talk, there is a “Curse of complexity”: as the number of steps in a task increases, the probability that every step succeeds falls exponentially, so the probability of failure climbs quickly.
The probability of failure of an AI Agent rises with the number of steps it involves
1. A relatively simple task using an AI Agent comprises at least 5-10 steps. As of Mar 2025, LLMs can fail with a probability of approximately 2% for a 1-step task, 18% for a 10-step task and 87% for a 100-step task. These figures are consistent with a 2% failure rate per step compounding across steps: 1 - 0.98^10 ≈ 18% and 1 - 0.98^100 ≈ 87%.
2. An 18% probability of failure for a 10-step task is not acceptable for any production-grade task. The “My Personal Travel Agent” example above has more than 10 steps, which means its probability of failure is about 20% or more. And this Agent involves finances, booking tickets and booking hotels, so we are not ready yet.
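The percentages above can be reproduced with a few lines of Python. This is a sketch under one assumption that I am adding: every step fails independently with the same 2% probability. The function name `failure_probability` is mine.

```python
def failure_probability(per_step_failure: float, steps: int) -> float:
    """Chance that at least one of `steps` independent steps fails."""
    return 1.0 - (1.0 - per_step_failure) ** steps


# Assumed per-step failure rate of 2%, matching the figures quoted above.
for n in (1, 10, 100):
    print(f"{n:>3} steps: {failure_probability(0.02, n):.0%}")
```

This prints 2%, 18% and 87% for 1, 10 and 100 steps. Real agent steps are unlikely to be perfectly independent or equally reliable, so treat this as a rough model rather than a measurement.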
The good news is that LLMs are evolving:
1. LLMs are getting better at planning and reasoning,
2. They can use reliable tools, and
3. They are getting much larger context windows.
So, it’s only a matter of time before Agents or Agentic Systems can be used to solve day-to-day production-grade tasks that are slow and costly today.
References and acknowledgements:
1 - Agents Google Whitepaper by Julia Wiesinger, Patrick Marlow and Vladimir Vuskovic - https://www.kaggle.com/whitepaper-agents
2 - Chip Huyen
- Mahtab Syed, 10 Mar 2025, Melbourne