Agentic Systems : A Business Perspective
Created with Midjourney

Author’s Note: This article was jointly developed and written with Erik Goelz, a leading expert on marketing AI products. Erik and I have worked together at several prominent AI companies and at several technology inflection points over the past decade, including the rise of both RPA and large language models as cornerstone enterprise technologies. We would love to hear from you, so please feel free to contact us both directly with thoughts and feedback.

Generative AI has entered its next phase: the era of AI Agents.

For enterprise CIOs and AI leaders, this means new pressure and urgency from their boards and CEOs to implement these new AI tools ahead of their competition. I often advise these executives on AI strategy and trends, and lately I hear the same two questions in every discussion.

  • What exactly is an “AI Agent”? Depending on who you ask, everyone seems to have a different definition.
  • Should enterprises buy or build AI Agents?

These are complex questions. This article answers them with a practical framework that enterprise leaders can use to evaluate and implement these tools to improve productivity.

TL;DR: There are multiple types of AI agents, but the biggest productivity potential comes from an emerging category of AI tools that can automate complex, multi-step, outcome-based workflows. These ‘Agentic AI systems’ enable the autonomous completion of complex tasks and workflows, providing orders of magnitude more productivity potential compared to legacy agents such as task automation and generative AI chatbots. However, these tools also introduce significant technical and operational complexity for enterprises. Building them requires expertise and optimization across every layer of the AI stack, and hundreds of enterprises will compete to hire the rare, in-demand AI leaders who have that expertise. Additionally, the cost to build these tools is soaring, with some estimates approaching $200M for end-to-end solutions. These factors essentially require enterprises to become state-of-the-art AI product companies, and most will be better served buying off-the-shelf solutions rather than trying to build these capabilities internally.

What is an AI Agent?

To cut through the noise, sometimes it’s best to refer to the classics. “Artificial Intelligence: A Modern Approach”, one of the AI industry’s foundational textbooks, defines an agent as “anything that can be viewed as perceiving its environment through sensors and acting on its environment through actuators”.

Put differently, an AI agent is software that:

  • Takes an input (such as text or visual data)
  • Makes a decision (using an algorithm such as a large language model)
  • Takes an action (such as analyzing data, clicking a file, or generating content such as code)
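The three steps above can be sketched as a very small loop. This is an illustration only, not how any particular product is built: the `Agent` class, the `keyword_policy` decision rule, and the two actions are hypothetical names invented for this example, and a real agent would call a model such as an LLM where the sketch uses a simple rule.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    """Toy agent: take an input, make a decision, take an action."""
    decide: Callable[[str], str]               # stand-in for a model such as an LLM
    actions: dict[str, Callable[[str], str]]   # named actions the agent may take

    def step(self, observation: str) -> str:
        choice = self.decide(observation)      # make a decision about the input
        act = self.actions[choice]             # look up the chosen action
        return act(observation)                # take the action


def keyword_policy(observation: str) -> str:
    """Hypothetical decision rule: long inputs get summarized, short ones echoed."""
    return "summarize" if len(observation.split()) > 8 else "echo"


agent = Agent(
    decide=keyword_policy,
    actions={
        "summarize": lambda text: " ".join(text.split()[:5]) + " ...",
        "echo": lambda text: text,
    },
)

short_result = agent.step("Invoice received")
long_result = agent.step(
    "The quarterly report shows revenue grew in every region except one"
)
print(short_result)
print(long_result)
```

By this definition, swapping `keyword_policy` for a more capable decision-maker changes how intelligent the agent is, but not the input, decision, action shape of the loop.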

So, what is not an AI Agent? Smart people can debate where to draw the line, but from our perspective, here are a few things that don’t meet the criteria. Standalone machine learning models, many of which are highly sophisticated but have no means to gather input and take action, are not AI Agents. Additionally, traditional software and SaaS tools, which take input from users and act based on their code but lack any sophisticated intelligence for making decisions, are also not AI Agents.

Using this as a guiding framework, we define three distinct types of AI Agents that have very different applications for enterprise productivity.

AI Agent Type #1: Task Automation for reducing costs

The first type of AI Agent is task automation. Task automation technologies, such as RPA, work by automating basic tasks across a computer’s UI, such as mouse clicks, copy and paste, extracting or adding data into specific software fields, or identifying information from documents that have been scanned and digitized.

Task automation technologies gained popularity in the late 2010s, growing to an estimated $4.3B market. One common use case is high-volume processing of documents with a standardized format, such as an invoice. In this example, the relevant data, such as the account number and invoice amount, is always in the same location and can easily be extracted and added into the relevant AP system. Another use case is the repetitive copy and paste of specific data from Excel into an ERP system, which often needs to be done as part of a larger workflow. Task automation really shines when automating these highly repetitive tasks that some enterprises need to perform thousands or even millions of times a year. By automating them, task automation focuses on enabling enterprises to reduce costs and headcount by up to 30%.

Task automation is purpose-built to automate at the UI layer, and excels at UI-based input and actions. Where it falls short is its decision framework, as task automation tools lack true intelligence to handle complexity or uncertainty. This is because task automation is implemented using deterministic, step-by-step, rules-based workflows that must be built manually by developers. In the real world, digital work and processes are full of complexity and permutations, which limits what task automation tools such as RPA can handle.
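To make the limitation concrete, here is a minimal sketch of a deterministic, rules-based extraction step. It is not taken from any RPA product: the function name, the field rules, and the sample invoices are all hypothetical. Each field is read from a hard-coded position, so the standardized invoice works and a reordered one fails.

```python
def extract_invoice_fields(lines: list[str]) -> dict[str, str]:
    """Rules-based extraction: each field is read from a fixed line and prefix."""
    rules = {
        "account_number": (0, "Account: "),   # always expected on the first line
        "amount": (1, "Amount: "),            # always expected on the second line
    }
    fields = {}
    for name, (line_no, prefix) in rules.items():
        line = lines[line_no]
        if not line.startswith(prefix):
            # No fallback logic: the rule either matches or the step fails.
            raise ValueError(f"expected {prefix!r} on line {line_no}, got {line!r}")
        fields[name] = line[len(prefix):].strip()
    return fields


standard = ["Account: 12345", "Amount: 980.00"]
reordered = ["Amount: 980.00", "Account: 12345"]   # same data, different layout

ok = extract_invoice_fields(standard)

try:
    extract_invoice_fields(reordered)
    failed = False
except ValueError:
    failed = True

print(ok, failed)
```

Handling the second layout means a developer has to write another rule by hand, which is the maintenance burden described above.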

AI Agent Type #2: Generative AI chatbots for improving employee productivity

The second type of AI agent is generative AI chatbots, such as OpenAI’s ChatGPT and Anthropic’s Claude. These generative AI chatbots rose to prominence following the release of ChatGPT, famously the fastest adopted software in history. The key innovation of these chatbots is large language models (LLMs) that can understand the context of language (and increasingly other modalities) and handle a wide range of language tasks. Generative AI chatbots are also notable for being able to understand and act on natural language instructions from users. In fact, state-of-the-art LLMs have demonstrated performance matching or even surpassing human baselines on advanced language understanding benchmarks such as the Massive Multitask Language Understanding (MMLU) benchmark.

Figure 1 - MMLU: average accuracy over time. Image source: Artificial Intelligence Index Report 2024 from Stanford University HAI

Despite some initial hype for broader applications, use cases for these generative AI chatbots have focused on employee-driven ‘information work’ tasks: creating draft content and documents; personalizing content and creating images for marketing; knowledge management, including analyzing, extracting insights from, or summarizing text, video, or image data; assisting with customer service tasks; and coding.

Figure 2 - Enterprise use cases for generative AI chatbots. Image source:


Figure 3 - Enterprise use cases for generative AI chatbots. Image source: Artificial Intelligence Index Report 2024 from Stanford University HAI

The productivity potential for generative AI chatbots is tremendous: according to a 2023 study from Microsoft, these generative AI chatbots can improve employee productivity on these tasks by 20-70%. Additionally, according to a Harvard Business School study, consultants using GPT-4 improved productivity by 12% and quality by 40%.

This impressive productivity increase has led to an estimated $36B market for generative AI chatbots, less than two years after the release of ChatGPT. At the same time, several limitations of these tools have also become apparent.

Performance increases on advanced benchmarks have slowed, suggesting diminishing returns on improving productivity simply by training a better model.

Additionally, while generative AI chatbots have demonstrated huge productivity increases on information work tasks, they have not demonstrated the same success automating more complex workflows involving multiple steps and actions. Many generative AI chatbots have introduced new capabilities to address this gap, such as API connections and the ability to use function calls to integrate with and act on external software and tools. However, results have not been promising. In several recent papers (GAIA, OSWorld, WorkArena), state-of-the-art large language models achieved only around a 15% success rate on more complex, outcome-based tasks. These studies cite several different causes for the low success rate: some point to reasoning and planning challenges, others to gaps in the models’ ability to interface with the UI. When enterprises have attempted to automate these more complex workflows with generative AI chatbots, they have sometimes been met with negative or undesirable results.

Ultimately, these challenges and limitations have resulted in lower-than-expected adoption of generative AI chatbots in enterprises. Despite 2023’s initial hype wave, only 33% of enterprises have adopted generative AI, according to the 2024 Artificial Intelligence Index Report from Stanford HAI.

Figure 4 - adoption of generative AI in enterprises. Image source: Artificial Intelligence Index Report 2024 from Stanford University HAI

AI Agent Type #3: Agentic AI systems for autonomous completion of complex, outcome-based tasks

The third type of AI agent, which we will call Agentic AI Systems, can autonomously complete full processes or workflows by focusing on the outcome and not just individual tasks.

To explain this further, let’s compare task automation and generative AI chatbots to these Agentic AI Systems.

Task automation and generative AI chatbots work by addressing individual tasks, such as copying and pasting from Excel to an ERP system, or summarizing a document. They do this based on very specific instructions from the user, defined in a rules-based workflow or through natural language in a chat interface.

Agentic AI systems, on the other hand, don’t focus on any individual task, but rather on the goal the user wants to achieve. This often involves multiple steps in a workflow and non-deterministic methods to achieve the goal.

To illustrate this further, let’s consider an example of an enterprise seller who wants to create an account plan. The seller needs to do external research on websites and financial documents, review internal documents and data such as communication history and other CRM data points, and ultimately combine everything into a specific template for review and discussion with leadership.

This is unlikely to be achievable with a task automation system simply because of the complexity and ambiguity involved. Completing it with a generative AI chatbot would require the user to manually work through multiple different tasks and then combine the outputs. It would save some time, but still require heavy effort from the user.

However, for agentic AI systems that can coordinate across these tasks to achieve the user’s intended goal and outcome, the productivity potential for these more complex tasks is immense.

Back to the account planning example: let’s say each account plan takes a seller 5 hours. Done quarterly for 30 different accounts, that is 120 plans and a total of 600 hours each year spent on account planning. Against a full-time, 40-hour-a-week job (roughly 2,080 hours a year), that is nearly 30% of the seller’s time!

Now imagine if you could automate the heavy workload of research and planning for each of these account plans, requiring the seller to spend only 5 minutes starting the workflow with the Agentic AI System plus another 25 minutes reviewing and adjusting the plan.

This equates to a 90% reduction in time and effort, saving the enterprise seller 540 hours per year that could be spent on meeting with customers, closing deals, and driving revenue.

Based on the time saved alone, this represents a 10x productivity multiplier, more than an order of magnitude more impactful than the 20-70% or so productivity improvement from generative AI chatbots. Doing some back-of-the-napkin math, all of a sudden the multi-billion-dollar market for generative AI chatbots becomes a nearly $1T market for agentic AI systems.
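The account planning arithmetic can be checked in a few lines. The inputs are the figures from the example; the 2,080-hour work year (40 hours for 52 weeks) is an assumption added here for the share-of-year calculation, and under that assumption the share comes out closer to 29% of a working year.

```python
HOURS_PER_PLAN = 5
PLANS_PER_YEAR = 30 * 4            # 30 accounts, planned quarterly
WORK_HOURS_PER_YEAR = 40 * 52      # assumption: 52 working weeks of 40 hours

# Manual effort today
manual_hours = HOURS_PER_PLAN * PLANS_PER_YEAR
share_of_year = manual_hours / WORK_HOURS_PER_YEAR

# Effort with an agentic system: 5 minutes to start plus 25 minutes to review
assisted_minutes_per_plan = 5 + 25
assisted_hours = assisted_minutes_per_plan / 60 * PLANS_PER_YEAR

hours_saved = manual_hours - assisted_hours
reduction = hours_saved / manual_hours
multiplier = manual_hours / assisted_hours

print(f"manual hours per year:   {manual_hours}")
print(f"share of a working year: {share_of_year:.1%}")
print(f"assisted hours per year: {assisted_hours:.0f}")
print(f"hours saved:             {hours_saved:.0f}")
print(f"reduction in effort:     {reduction:.0%}")
print(f"productivity multiplier: {multiplier:.0f}x")
```

The 10x multiplier depends entirely on the review step staying at 25 minutes; if reviewing and adjusting a plan took an hour, the same formula would give a multiplier below 5x.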

So, what is preventing enterprises from seizing this $1T market opportunity?

There are both technical and operational challenges with implementing these Agentic AI Systems.

Technical Challenges with Agentic AI Systems

From a technical perspective, solving for outcomes instead of tasks introduces significant complexity. Back to our account planning example: an Agentic AI System needs to determine and execute a step-by-step plan, analyze multiple types of complex data sources such as websites and CRM systems to find relevant and specific data, and interface across multiple different UIs, such as websites and operating systems. Then, it ultimately needs to package all of its analysis into a specified format.

Building these Agentic AI Systems is significantly more complex than building generative AI chatbots, which can be built by combining a large language model API with a fairly simple UI. Agentic AI Systems require expertise and optimization at every layer of the AI stack:

  • LLMs: Agentic AI Systems often require many different large language models, each focused on different tasks or parts of a workflow.
  • Model Optimization and Tuning: As noted previously, generative AI chatbots and LLMs only deliver about a 15% success rate on complex agentic tasks. To overcome this and LLMs’ inherent gaps in planning, reasoning, and coordination, each of the LLMs in an Agentic AI System needs to be manually tuned and optimized. In a recent presentation at Sequoia Capital, prominent AI researcher Andrew Ng outlined a multitude of techniques that need to be applied in these scenarios to close those gaps.
  • Multi-agent orchestration: Agentic AI Systems require an orchestration layer to coordinate the different models that make up the system, an area of ongoing and open research within the AI community.
  • Retrieval-Augmented Generation (RAG): Agentic AI Systems need to incorporate Retrieval-Augmented Generation (RAG). While RAG is not new, it is quickly becoming an increasingly complex field of AI, with dozens of different emerging techniques and approaches.

Figure 5 - Survey of Retrieval-Augmented Generation Techniques. Image source: “Retrieval-Augmented Generation for Large Language Models: A Survey”, March 2024


  • Semantic Layer: Agentic AI systems require a dedicated semantic layer to ensure the LLMs not only understand the context of the data, but retain a shared context across each of the different models within the agentic AI system.
  • Infrastructure optimization: Agentic AI Systems also need to be optimized at the infrastructure and hardware layer. This requires making a set of highly complex tradeoffs across factors such as tokens per second, cost, batch size, concurrency, and more, depending on the use case.
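To show how a few of these layers fit together, here is a deliberately simplified sketch of an orchestration layer for the account planning example. Everything in it is hypothetical: the worker functions are plain Python stand-ins for separately tuned models, the shared `Context` dictionary is a crude stand-in for a semantic layer, and the plan is fixed by hand, whereas a real system would need a model to generate and revise the plan.

```python
from typing import Callable

# Shared context: every step reads from it and writes its output back to it,
# so later steps can build on earlier results.
Context = dict[str, str]
Worker = Callable[[Context], str]


def research_worker(ctx: Context) -> str:
    """Stand-in for a model that researches external sources."""
    return f"public research on {ctx['account']}"


def crm_worker(ctx: Context) -> str:
    """Stand-in for a model that retrieves internal CRM data."""
    return f"CRM history for {ctx['account']}"


def writer_worker(ctx: Context) -> str:
    """Stand-in for a model that packages earlier outputs into a template."""
    return f"Account plan: {ctx['research']}; {ctx['crm']}"


def orchestrate(goal: Context, plan: list[tuple[str, Worker]]) -> Context:
    """Run each planned step in order, storing its output under the step name."""
    ctx = dict(goal)                     # copy so the caller's goal is not mutated
    for step_name, worker in plan:
        ctx[step_name] = worker(ctx)
    return ctx


plan = [
    ("research", research_worker),
    ("crm", crm_worker),
    ("plan", writer_worker),
]
result = orchestrate({"account": "Acme Corp"}, plan)
print(result["plan"])
```

Even in this toy version, the final step only works because earlier steps wrote their results into a context every worker understands. Keeping that shared context reliable across many real models is a large part of the difficulty described above.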

Operational Challenges with Agentic AI Systems

These technical challenges introduce several operational challenges for enterprises that are considering building agentic AI systems internally.

  • Complexity: As previously stated, the complexity of building a full-stack AI infrastructure with optimizations across each layer of the stack is immense, requiring multiple different tools, technologies, and expertise beyond just an LLM API.
  • Time-to-value: The harsh reality is that most organizations will require 12-18 months to hire the relevant team and build agentic AI solutions. This assumes enterprises can hire the necessary talent in the first place. Enterprises need to hire not just developers and AI engineers, but expert AI leaders with experience building enterprise-scale AI systems. To do this, each enterprise will be competing with hundreds of other enterprises to hire from a small pool of qualified experts.
  • DIY Cost: One major learning for enterprises over the past 18 months has been just how expensive building state-of-the-art AI systems is, even when using pre-trained models. According to a recent analysis by A16Z, expected 2024 spend on LLMs per enterprise is $18M. According to that same analysis, the cost of the LLMs accounts for only 25% of the total cost of deploying a use case. A recent McKinsey analysis puts this share even lower, estimating that LLM costs account for only 10-15% of the total cost of the solution. In summary, after factoring in the cost of the full solution, Agentic AI Systems could end up costing between $72M and $180M to build internally.
  • Risk: Even enterprises with the time, patience, and capital to invest in building full-stack AI solutions will effectively need to become state-of-the-art AI product companies. Most enterprises are not equipped to do this.
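The DIY cost range above follows from dividing the cited LLM spend by the share of total cost that LLMs represent. A short sketch of that arithmetic, using only the figures quoted in the list (the function name is just for illustration):

```python
LLM_SPEND_MUSD = 18  # expected 2024 LLM spend per enterprise, in millions of dollars


def total_cost(llm_spend: float, llm_share: float) -> float:
    """Total solution cost implied when LLM spend is a given share of the total."""
    return llm_spend / llm_share


at_25_percent = total_cost(LLM_SPEND_MUSD, 0.25)   # LLMs are 25% of total cost
at_15_percent = total_cost(LLM_SPEND_MUSD, 0.15)   # upper end of the 10-15% range
at_10_percent = total_cost(LLM_SPEND_MUSD, 0.10)   # lower end of the 10-15% range

for share, cost in [(25, at_25_percent), (15, at_15_percent), (10, at_10_percent)]:
    print(f"LLMs at {share}% of total cost -> about ${round(cost)}M overall")
```

The $72M figure corresponds to the 25% share and the $180M figure to the 10% share; the 15% share falls in between at about $120M.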

Simply stated, because of the complexity, time, cost, and risk involved with developing Agentic AI Systems, the vast majority of enterprises are better off buying these tools and partnering with an external product company.

As we prepare to unveil our agentic automation and orchestration platform at Neonomic, we will dig deeper into the user experience, toolchain, components, and the 30X productivity use cases in our subsequent posts.


Tushar Sonak

Director of Engineering | Build Engaged High-Performing Teams | Technology Innovator | Healthcare

5 months ago

PD, this is an excellent summary of the state of Gen AI. I have been experimenting with Agentic AI tools like CrewAI and Autogen 2.0 and agree that getting agents to perform actions is complex, especially when the task requires making changes in external systems. RPA tools, when used in conjunction with Agentic AI, can provide the building blocks of 'micro-actions' that the orchestrating agent (manager) can use to build workflows with helpful business outcomes. I look forward to checking out the Neonomic platform when it is unveiled. Wishing you all the best, Tushar

Nic Surpatanu

Smarter, faster business decisions and operations: transform your data—automating tasks to turn months into minutes—to deliver the insights your business needs.

5 months ago

Well said, my friend!

Shail Khiyara

Top AI Voice | Founder, CEO | Author | Board Member | Gartner Peer Ambassador | Speaker | Bridge Builder

5 months ago

One of the most comprehensive descriptions on this topic, Prabhdeep (PD) Singh. Two thoughts not addressed here. In my humble opinion, botarchitecture is the critical weakness in developing effective and efficient AgentAI, barely permitting any progress. And the topic of agenticAI governance is the least discussed out there. While there’s incredible promise, adoption hinges, imho, on governance. RPA hype plays leave a landmine of lessons learned that can be applied effectively to successfully and responsibly drive adoption of AgenticAI.
