AI agents explained
Artificial Intelligence (AI) is slowly transitioning from being a futuristic concept to an integral part of modern enterprises. At the vanguard of this development are AI agents – autonomous software systems capable of performing complex tasks without continuous human guidance. The AI hype train is guilty of setting unrealistic expectations of AI agents in the short term, overlooking the complexities of real-world deployment, particularly in regulated industries. Nonetheless, beneath the hype there is substance, and AI agents are set to make a significant impact in the coming years.
Agents leverage a combination of AI models and standard algorithms to interpret information, make decisions, and execute tasks on behalf of their human operators. They extend the capabilities of existing technologies, such as applications, databases and automated processes, by enabling them to work with unstructured information and reason under uncertainty. As described in a July 2024 McKinsey report, this gives them the potential to automate a wide range of manual tasks that are beyond the reach of conventional technologies. In September 2024, Salesforce unveiled a suite of agents designed to handle customer tasks across a variety of functions and set a bold ambition to deploy one billion agents by the end of 2025. Just this week, Microsoft announced new autonomous agent capabilities in Copilot Studio and its Dynamics 365 suite, promising the transformation of business processes with AI.
Recent advances in generative AI (GenAI) have been a significant step forward in the viability of AI agents, unlocking powerful new communication, reasoning, and creativity capabilities. These models are now being embedded into new technology frameworks for building autonomous agents, with future advances expected to introduce increasing levels of sophistication in agent behaviours. While agent technologies are nascent, they are maturing rapidly and it is important to understand their underlying principles, capabilities and limitations, so that you can both take advantage of them and manage their risks.
Key principles
There are three essential qualities possessed by AI agents, which, when combined, distinguish them from other types of software and AI models. As we shall see, each of these qualities is a continuum, and the resulting degree of agency can vary widely. There is a vast difference between the agents of today (in most cases, customised chatbots) and our robot colleagues of the future!
Autonomy
Autonomy refers to the ability to operate independently without continuous supervision and is a term that is synonymous with agents (i.e., “autonomous agents”). Autonomy empowers agents to function self-sufficiently within their environments, reacting to events as they occur, deciding which actions to take, performing those actions and handling errors.
The limits of an agent’s autonomy are determined by its faculties for interacting with the environment it is deployed in – i.e., the sensory information available to it and the actions it is able to perform. These limits are sometimes described as the interaction space or action space that the agent operates in. The nature of the interaction might vary from receiving and processing data through an API, to reading and responding to chat messages, all the way to driving an autonomous vehicle or performing a job.
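To make the idea of an interaction space concrete, here is a minimal sketch of the classic sense–decide–act loop, written in Python. Everything here is illustrative: the `Observation` type, the `decide` logic and the hard-coded action space are hypothetical stand-ins, not a real agent framework.

```python
from dataclasses import dataclass

# The action space: the explicit set of operations this agent is
# permitted to perform; nothing outside it is possible.
ACTION_SPACE = {"reply_to_message", "query_database", "escalate_to_human"}

@dataclass
class Observation:
    channel: str   # where the event arrived from (e.g. a chat API)
    content: str   # the raw, possibly unstructured, payload

def decide(obs: Observation) -> str:
    """Placeholder decision logic; in an AI agent, a model would
    choose the action based on the observation."""
    return "escalate_to_human" if "urgent" in obs.content else "reply_to_message"

def act(action: str, obs: Observation) -> None:
    if action not in ACTION_SPACE:
        raise ValueError(f"action '{action}' is outside the agent's action space")
    print(f"Performing {action} in response to {obs.content!r}")

# The agent loop: sense an event, decide what to do, act, handle errors.
events = [Observation("chat", "urgent: server down"), Observation("chat", "hello")]
for event in events:
    try:
        act(decide(event), event)
    except ValueError as err:
        print(f"Refused: {err}")
```

The same loop structure applies whether the interaction space is an API, a chat channel or a vehicle's sensors and actuators; only the observations and actions change.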
Goal-orientation
Goal-orientation is a fundamental characteristic that propels agents to act purposefully within their environments. It enables agents to make decisions and plan actions in pursuit of their assigned objectives. In artificial agents, goal-orientation is crucial for achieving autonomy. It allows AI systems to operate independently, make informed decisions without constant human intervention and adapt to new situations while staying aligned with their goals.
For AI agents, goals can be explicitly programmed, assigned through policy, or learned through interactions with their environment. In modern chatbots, goals are grounded during the training process and refined through a policy known as the system prompt. In reinforcement learning, agents have pre-programmed goals that they learn to achieve by maximising cumulative rewards through trial and error. This learning process involves evaluating the outcomes of actions and adjusting strategies to improve future performance, showcasing a dynamic form of goal pursuit.
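As an illustration of grounding a goal through policy, the sketch below shows a system prompt assigning a chatbot's objective and constraints. The `call_llm` function is a stub standing in for whichever chat-completion API is actually used, and the prompt wording is hypothetical.

```python
# Stub standing in for a real chat-completion API call.
def call_llm(messages: list[dict]) -> str:
    return "Stubbed model reply"  # replace with a real client call

# The system prompt acts as policy: it assigns the agent's goal and
# the constraints it must stay aligned with while pursuing it.
SYSTEM_PROMPT = (
    "You are a customer support agent for Acme Ltd. "
    "Goal: resolve billing questions using only verified account data. "
    "Constraint: never promise refunds above 100 GBP without human approval."
)

def answer(user_message: str) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_message},
    ]
    return call_llm(messages)

print(answer("Why was I charged twice this month?"))
```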
Skill
It is important to understand that autonomy and goal-orientation are necessary but not sufficient conditions for AI agency. As an example, software agents are long-running computer programmes that automate the flow and processing of information; they have been in use for decades and are the backbone of the modern IT enterprise. Software agents are autonomous and arguably goal-oriented, but they are not intelligent.
The difference between AI agents and other types of software is that their decision logic – how they choose to act in any given situation – is at least partially driven by AI models. This translates to an ability to take informed actions under uncertainty. A more intuitive way of understanding this is that agents need to have skill – whether that skill is in acquiring knowledge, answering questions, solving problems, creating a project plan or driving a car. The levels of autonomy and impact that are achievable by artificial agents are predicated on the skills that are available to them and, as we will see, there is still a long way to go!
AI agents have major limitations, at least for now
Large Language Models have unlocked some significant breakthroughs that are fuelling the excitement around AI agents. Specifically, they have given computers the ability to interpret and generate different types of unstructured information (text, vision, sound) in remarkably powerful ways and use verbal reasoning to make decisions. The advantages that these faculties might give to an artificial agent are obvious.
Despite these advances and the accompanying hype, the capabilities of foundation models are still very limited and lack many basic faculties that human agents take for granted. For example, they have no working memory, they cannot make complex plans and, arguably, they struggle with abstract reasoning. Put more fundamentally, they have relatively few skills and no ability to learn new ones on the fly, and this inhibits their ability to interact with the world like a human agent would.
That said, many of these limitations are being actively addressed by AI researchers and technology companies and we are going to see a steady growth in the potential of AI agents over time. Earlier this week, AI lab Anthropic released an experimental feature that allows its LLM to use a computer. This gives a taste of things to come, although, as Meta’s Yann LeCun likes to say, it is likely to take years for agents to be as smart as his cat. In the meantime, practitioners are combining AI models with traditional software programmes to work around some of the limitations and deploy agents that can perform useful work.
Agentic architecture
The term agentic architecture is used to describe the integration of AI models, in particular LLMs, into proprietary business applications. Using AI models as components enables developers to equip their systems with more advanced reasoning capabilities and skills, increasing the agency of business process automation and enabling entirely new types of IT system to be created. As these architectures mature, LLMs are gradually being added to the standard IT toolbox alongside traditional software, middleware and databases. This is necessitating a shift in skills, development processes and risk management approaches. So what are some of the applications of agentic architecture?
Enterprise chatbots
In an enterprise setting, most of the potential of chatbots lies in their ability to work with proprietary data. Integrating this data involves techniques like Retrieval Augmented Generation (RAG) and Prompt Engineering to retrieve relevant information—such as documents, emails, or application data—and construct prompts that enable LLMs to answer specific queries. This process often requires multiple LLM interactions behind the scenes, orchestrating different models alongside application logic, database calls, and API calls to address a single query.
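The sketch below illustrates the RAG pattern just described, with a toy bag-of-words "embedding" and an in-memory document list standing in for a real embedding model and vector database; `call_llm` is again a stub.

```python
import math

def embed(text: str) -> dict[str, float]:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    words = text.lower().split()
    return {w: float(words.count(w)) for w in set(words)}

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

DOCUMENTS = [
    "Refund policy: refunds are issued within 14 days of purchase.",
    "Shipping policy: orders are dispatched within 2 business days.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Retrieval step: rank proprietary documents by similarity to the query."""
    q = embed(query)
    return sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def call_llm(prompt: str) -> str:
    return "Stubbed answer"  # placeholder for a real LLM call

def answer(query: str) -> str:
    # Augmentation step: construct a prompt that grounds the model
    # in the retrieved context before generation.
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return call_llm(prompt)

print(answer("How long do refunds take?"))
```

Real deployments add chunking, re-ranking, answer validation and several further LLM calls around this skeleton, which is where much of the hidden orchestration complexity lies.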
These techniques are enabling businesses to deploy agents that drive proprietary business value – for example by enhancing product support for customers or enabling staff to efficiently query an internal knowledge base. However, getting them to work reliably and safely is proving more difficult than many people anticipate, not only because of the technical complexities, but also due to the new testing, compliance and risk management demands they introduce.
From chatbots to business processes
Machine learning models are now widely used to automate business processes like fraud detection, medical imaging, document processing, sentiment analysis and biometrics. Typically, a specialised data science team engineers and trains a model to excel at a specific task, then deploys it to perform that task repeatedly (so-called narrow AI). What has changed with foundation models is the ability to deploy them for a wide range of tasks without the need for retraining, sometimes replacing custom ML models that took years to develop.
As AI technologies mature and businesses gain confidence, we are seeing LLMs being lifted out of the chatbot arena and integrated into business processes. In doing so, they offer general-purpose abilities to work with multi-modal content and use verbal reasoning to make decisions. This is useful for business processes involving uncertainty and unstructured information, such as facilitating communication, processing documents, managing complex workflows and building knowledge bases.
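As a sketch of what lifting an LLM into a business process can look like, here is a hypothetical document-triage step embedded in an otherwise conventional workflow; the prompt, categories and confidence threshold are illustrative, and `call_llm` is a stub.

```python
import json

def call_llm(prompt: str) -> str:
    # Stub; a real call would return the model's completion.
    return json.dumps({"category": "invoice", "confidence": 0.93})

def triage_document(text: str) -> dict:
    """An LLM step inside a traditional pipeline: classify an
    unstructured document so downstream systems can route it."""
    prompt = (
        "Classify this document as one of: invoice, contract, complaint. "
        'Respond as JSON: {"category": ..., "confidence": ...}\n\n' + text
    )
    result = json.loads(call_llm(prompt))
    # Guardrail: low-confidence classifications fall back to a human queue.
    if result["confidence"] < 0.8:
        result["category"] = "human_review"
    return result

print(triage_document("Invoice #4711: amount due 250 GBP ..."))
```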
Technology firms are recognising this potential and offering tools to assist businesses in building agents for process automation. Platforms such as Amazon’s Agents for Bedrock, Microsoft’s Copilot Studio and Salesforce’s Agentforce allow users to create agents that integrate LLMs with user communications and proprietary data and logic using custom workflows. Nonetheless, concerns about reliability, compliance and risk management still apply, especially in regulated industries.
Multi-agent synergies
The parallels between human and artificial agents are inspiring new agentic architectures that unlock synergies in groups of agents working together. The first theme is division of labour, where agents developed or customised for a specific set of subtasks are combined, creating economies of scale. A good example of the concept in action comes from MIT researchers, who created a system for scientific discovery by combining multiple artificial agents performing roles such as Ontologist, Planner, Critic, Assistant and Scientist. Even for chatbots, it is common to have different agents performing specialised roles in the question answering process, such as formulating the right question, retrieving the most relevant information, constructing an answer and validating its veracity.
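A minimal sketch of division of labour follows, loosely inspired by the role-based systems mentioned above: each "agent" is just a role-specific prompt wrapped around a stubbed `call_llm`, and each agent's output becomes the next one's input.

```python
def call_llm(role_prompt: str, task: str) -> str:
    # Stub; each role would normally invoke an LLM with its own prompt.
    return f"[{role_prompt.split(':')[0]}] output for: {task}"

# Role-specialised agents, applied in sequence.
ROLES = {
    "Planner": "Planner: break the task into ordered steps.",
    "Assistant": "Assistant: carry out each step of the plan.",
    "Critic": "Critic: review the result and list any flaws.",
}

def run_pipeline(task: str) -> str:
    artefact = task
    for role, prompt in ROLES.items():
        artefact = call_llm(prompt, artefact)  # hand over to the next role
        print(f"{role} -> {artefact}")
    return artefact

run_pipeline("Summarise last quarter's support tickets")
```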
The second theme is the wisdom of the crowd. Here, the idea is that aggregating the knowledge of multiple agents creates higher performance than a single agent working alone, by incorporating diverse perspectives and compensating for individuals' mistakes. To this end, the use of model populations is already a common tool in AI (for example, ensemble methods, swarm intelligence and genetic algorithms) and is now also being applied to LLMs in agentic architectures.
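The simplest aggregation scheme is majority voting over several independently sampled answers (in the LLM literature this is closely related to self-consistency decoding). The sketch below uses a stubbed `sample_llm` that returns canned answers; a real system would draw samples from one or more models at non-zero temperature.

```python
import random
from collections import Counter

def sample_llm(question: str) -> str:
    # Stub simulating independent, occasionally wrong, agent answers.
    return random.choice(["42", "42", "42", "41"])

def majority_vote(question: str, n: int = 5) -> str:
    """Aggregate multiple agents' answers; the most common answer wins,
    compensating for individual mistakes."""
    votes = Counter(sample_llm(question) for _ in range(n))
    answer, _ = votes.most_common(1)[0]
    print(f"Votes: {dict(votes)} -> chose {answer!r}")
    return answer

majority_vote("What is 6 x 7?")
```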
There is a special risk associated with multi-agent architectures that is important to mention here: anthropomorphising AI agents such that IT systems are built to resemble human social structures, resulting in architectures that are extremely difficult to predict and very expensive to run. The science of collective intelligence is not developed enough to identify when multi-agency is optimal and, in the absence of definitive guidelines, there can be a tendency for developers to remodel what they see around them.
How are agents controlled?
Managing the risks of AI models and agents is a rapidly evolving field attracting increasing regulatory attention, particularly in the EU and regulated industries. Even for organisations outside its purview, the EU AI Act offers useful guidance on the kinds of controls needed for responsible AI use, especially for systems that could cause social or economic harm. These controls include data governance, documentation, transparency, human oversight, and robustness. They aim to ensure developers and deployers of AI systems, including agents, act diligently and responsibly without overly constraining system design or usage. Many companies are adopting AI risk management and compliance regimes along these lines, sometimes complementing them with ethical policies that align AI use with their values and mitigate reputational risk.
At a technical level, controlling an agent's interaction space—such as the datasets, APIs, applications, communication systems, and people it can access—is crucial. Agents should be deployed with well-defined access permissions and privileges reviewed by human operators. Policies can set goals and constraints for agent behaviour, but various forms of testing are necessary to validate alignment. Human-in-the-loop processes supervise agent behaviours and ensure decision-making accountability, which is particularly important when dealing with immature or poorly understood technologies. Continuous monitoring and post-deployment audits help detect and correct unintended behaviours over time, enhancing safety and compliance.
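At the risk of oversimplifying, the sketch below shows what such controls can look like in code: a tool registry that bounds the agent's interaction space, a human-approval gate on privileged actions and an audit trail for post-deployment review. The tool names and policy are hypothetical.

```python
# The agent's interaction space: only registered tools can be invoked,
# and privileged ones require human sign-off.
TOOLS = {
    "read_knowledge_base": {"privileged": False},
    "issue_refund": {"privileged": True},
}

def human_approves(tool: str, args: dict) -> bool:
    """Stand-in for a real human-in-the-loop workflow (an approval
    ticket, a UI prompt, etc.); stubbed here to always refuse."""
    return False

def invoke(tool: str, args: dict) -> str:
    if tool not in TOOLS:
        raise PermissionError(f"tool '{tool}' is outside the agent's permissions")
    if TOOLS[tool]["privileged"] and not human_approves(tool, args):
        return f"{tool} blocked pending human approval"
    print(f"AUDIT: {tool} called with {args}")  # supports monitoring and audits
    return f"{tool} executed"

print(invoke("read_knowledge_base", {"query": "refund policy"}))
print(invoke("issue_refund", {"amount": 250}))
```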
Will AI agents take over human jobs?
I recently participated in a conference panel during which an audience member asked whether AI agents would replace human workers. One of the panel members replied by confidently reassuring the audience that AI would never take their jobs and would only ever be a tool to augment their productivity. I was quietly horrified! Digital automation has been disrupting the workforce for 50 years – creating jobs, certainly, but also displacing many people into lower-paid work[i]. AI is likely to continue this trend and may even accelerate it.
However, it will take time, as integrating AI into the fabric of the economy is likely to be constrained not by the technology itself, but by the pace at which society can absorb it. As Sam Altman himself said, AI will “change the world much less than we all think and it will change jobs much less than we all think”. Hopefully, AI agents can be focussed on important roles that humans tend to dislike performing or are not very good at. Perhaps AI agents can be used to liberate the human workforce from bureaucracy. Let’s hope so! In the meantime, as outlined above, there are serious challenges to overcome, both technical and organisational, before AI agents are capable of replacing human workers at scale.
[i] Acemoglu, Daron, and Pascual Restrepo. "Tasks, automation, and the rise in US wage inequality." Econometrica 90.5 (2022): 1973–2016.