Stateful and Responsible AI Agents
Debmalya Biswas
AI/Analytics @ Wipro | x- Nokia, SAP, Oracle | 50+ patents | PhD - INRIA
Introduction to AI Agents
The discussion around ChatGPT has now evolved into AutoGPT. While ChatGPT is primarily a chatbot that can generate text responses, AutoGPT is a more powerful and autonomous AI agent that can execute complex tasks, e.g., make a sale, plan a trip, book a flight, hire a contractor for a house job, or order a pizza.
Bill Gates recently envisioned a future where we would have an AI agent that is able to process and respond to natural language and accomplish a number of different tasks. Gates used planning a trip as an example.
Ordinarily, this would involve booking your hotel, flights, restaurants, etc. on your own. But an AI agent would be able to use its knowledge of your preferences to book and purchase those things on your behalf.
AI agents [1] follow a long history of research around Multi-agent Systems (MAS) [2], esp. goal-oriented agents [3]. However, designing and deploying AI agents remains challenging in practice. In this article, we focus primarily on two aspects of AI agent platforms: stateful agent monitoring and Responsible AI governance.
Agent AI Platform Reference Architecture
In this section, we focus on identifying the key components of a reference AI agent platform:
Given a user task, the goal of an AI agent platform is to identify (or compose) an agent (or group of agents) capable of executing the given task. So the first component we need is an orchestration layer capable of decomposing a task into sub-tasks, with execution of the respective agents coordinated by an orchestration engine.
A high-level approach to solving such complex tasks involves: (a) decomposition of the given complex task into (a hierarchy or workflow of) simple tasks, followed by (b) composition of agents able to execute the simple(r) tasks. This can be achieved in a dynamic or static manner. In the dynamic approach, given a complex user task, the system comes up with a plan to fulfill the request depending on the capabilities of available agents at run-time. In the static approach, given a set of agents, composite agents are defined manually at design-time combining their capabilities.
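The dynamic approach above can be sketched in a few lines of Python. This is a minimal illustration, not a production orchestrator: the `Agent` dataclass, the `plan` function, and the capability names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Agent:
    name: str
    capabilities: set               # simple tasks this agent can execute
    run: Callable[[str], str]       # executes one simple task

def plan(subtasks: List[str], agents: List[Agent]) -> List[Agent]:
    """Dynamic composition: at run-time, map each decomposed sub-task
    to the first registered agent whose capabilities cover it."""
    schedule = []
    for task in subtasks:
        match = next((a for a in agents if task in a.capabilities), None)
        if match is None:
            raise LookupError(f"no agent can execute sub-task: {task}")
        schedule.append(match)
    return schedule

def execute(subtasks: List[str], schedule: List[Agent]) -> List[str]:
    """Orchestration engine: run the sub-tasks in order."""
    return [agent.run(task) for task, agent in zip(subtasks, schedule)]

# Example: "plan a trip" decomposed into two simple tasks.
flights = Agent("flights", {"book_flight"}, lambda t: f"{t}: done")
hotels = Agent("hotels", {"book_hotel"}, lambda t: f"{t}: done")
subtasks = ["book_flight", "book_hotel"]
print(execute(subtasks, plan(subtasks, [flights, hotels])))
```

In the static approach, the output of `plan` would instead be fixed at design-time as part of the composition schema.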
This implies the existence of an agent marketplace or registry of agents, with a well-defined description of each agent's capabilities and constraints. For example, consider a house-painting agent whose services can be reserved online (via credit card). Here, the requirement of a valid credit card is a constraint, and the fact that the user's house will be painted within a certain timeframe is a capability. In addition, we also need to consider any constraints of the agent during the actual execution phase, e.g., the fact that the agent can only provide the service on weekdays (and not on weekends). In general, constraints refer to the conditions that need to be satisfied to initiate an execution, while capabilities reflect the expected outcome after the execution terminates. Refer to [4] for a detailed discussion of the discovery aspect of AI agents.
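A registry entry for the house-painting agent above might look as follows. The schema is purely illustrative (the field names and `discover` helper are assumptions, not a standardized agent-description format):

```python
from dataclasses import dataclass, field

@dataclass
class AgentDescription:
    name: str
    constraints: dict = field(default_factory=dict)   # must hold to initiate execution
    capabilities: dict = field(default_factory=dict)  # expected outcome on termination

registry = {}

def register(desc: AgentDescription):
    registry[desc.name] = desc

def discover(required_capability: str):
    """Return registered agents whose capabilities include the requirement."""
    return [d for d in registry.values() if required_capability in d.capabilities]

register(AgentDescription(
    name="house_painter",
    constraints={"payment": "valid credit card", "availability": "weekdays only"},
    capabilities={"house_painted": "within the agreed timeframe"},
))
print([d.name for d in discover("house_painted")])
```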
Given the need to orchestrate multiple agents, we also need an integration layer supporting different agent interaction patterns, e.g., agent-to-agent APIs, an agent API providing output for human consumption, a human triggering an AI agent, and agent-to-agent interaction with a human in the loop. These integration patterns need to be supported by the underlying AgentOps platform.
Andrew Ng recently talked about this aspect from a performance perspective:
Today, a lot of LLM output is for human consumption. But in an agentic workflow, an LLM might be prompted repeatedly to reflect on and improve its output, use tools, plan and execute multiple steps, or implement multiple agents that collaborate. So, we might generate hundreds of thousands of tokens or more before showing any output to a user. This makes fast token generation very desirable and makes slower generation a bottleneck to taking better advantage of existing models.
To accommodate multiple long-running agents, we also need a shared memory layer that enables data transfer between agents and stores interaction data so that it can be used to personalize future interactions.
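A minimal sketch of such a shared memory layer, keyed by task id, is shown below. In practice this would be backed by a persistent store (a database or vector store); the in-memory dictionary and the key names here are assumptions for illustration.

```python
class SharedMemory:
    """Shared memory layer: agents write and read data scoped to a task."""

    def __init__(self):
        self._store = {}

    def put(self, task_id: str, key: str, value):
        self._store.setdefault(task_id, {})[key] = value

    def get(self, task_id: str, key: str, default=None):
        return self._store.get(task_id, {}).get(key, default)

memory = SharedMemory()
memory.put("trip-42", "preferred_airline", "Swiss")  # written by the flight agent
print(memory.get("trip-42", "preferred_airline"))    # read later by another agent
```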
Finally, the governance layer. We need to ensure that data shared by the user for a specific task, or user profile data that cuts across tasks, is only shared with the relevant agents (authentication and access control). We further consider the different Responsible AI dimensions in terms of data quality, privacy, reproducibility, and explainability to enable a well-governed AI agent platform.
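The access-control aspect can be sketched as a task-scoped policy check: an agent may read a user attribute only if the policy for the current task grants it. The policy table and attribute names below are hypothetical.

```python
# Assumed policy table: task -> user attributes agents executing it may read.
POLICY = {
    "book_flight": {"name", "passport_number", "preferred_airline"},
    "order_pizza": {"name", "delivery_address"},
}

def read_profile(profile: dict, task: str, attribute: str):
    """Return a profile attribute only if the task's policy allows it."""
    if attribute not in POLICY.get(task, set()):
        raise PermissionError(f"{attribute!r} is not shared for task {task!r}")
    return profile.get(attribute)

profile = {"name": "Alice", "passport_number": "X123", "delivery_address": "Main St 1"}
print(read_profile(profile, "order_pizza", "delivery_address"))  # allowed
# read_profile(profile, "order_pizza", "passport_number")  # would raise PermissionError
```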
Stateful Agent Monitoring
Stateful execution [5] is an inherent characteristic of any distributed systems platform, and can be considered a critical requirement for materializing the orchestration layer of an AI agent platform.
Given this, we envision that agent monitoring, together with failure recovery, will become increasingly critical as AI agent platforms become enterprise-ready and start supporting productionized deployments of AI agents.
However, monitoring AI agents (similar to monitoring large-scale distributed systems) remains challenging in practice.
To summarize, AgentOps monitoring is critical given the complexity and long-running nature of AI agents. We define agent monitoring as the ability to find out where in the process the execution currently is, and whether any unanticipated glitches have appeared. We discuss the capabilities and limitations of acquiring agent execution snapshots with respect to answering such queries.
We outline the AI agent monitoring approach and solution architecture in the next section.
AI Agent Monitoring Architecture and Snapshot Algorithm
We assume the existence of a coordinator and a log manager corresponding to each agent, as shown in the figure below. We also assume that each agent is responsible for executing a single task/operation.
The coordinator is responsible for all non-functional aspects related to the execution of the agent, such as monitoring, transactions, etc. The log manager logs information about any state transitions, as well as any messages sent or received by the agent. The state transitions and messages considered are outlined in the figure below:
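An illustrative log manager could keep one append-only, timestamped log per agent, recording both state transitions and messages. The class and entry layout below are assumptions for the sketch, not the paper's exact data model.

```python
import time

class LogManager:
    """Per-agent append-only log of state transitions and messages."""

    def __init__(self, agent_name: str):
        self.agent_name = agent_name
        self.entries = []

    def log_transition(self, from_state: str, to_state: str):
        self.entries.append(("transition", time.time(), from_state, to_state))

    def log_message(self, direction: str, peer: str, payload):
        # direction is "sent" or "received"
        self.entries.append(("message", time.time(), direction, peer, payload))

log = LogManager("flight_agent")
log.log_transition("idle", "executing")
log.log_message("sent", "payment_agent", {"amount": 120})
log.log_transition("executing", "completed")
print([entry[0] for entry in log.entries])
```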
We assume that the composition schema (static composition) specifies a partial order for agent operations. We define the happened-before relation between agent operations as follows:
An operation a happened-before operation b (a → b) if and only if one of the following holds:
An operation, on failure, is retried with the same or different agents until it completes successfully (terminates). Note that each (retry) attempt is considered a new invocation and is logged accordingly. Finally, to accommodate asynchronous communication, we assume the presence of input/output (I/O) queues. Basically, each agent has an I/O queue with respect to its parent and component agents, as shown in Fig. 2.
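The retry semantics above can be sketched as follows: a failed operation is retried (possibly with a different agent), and every attempt is logged as a fresh invocation with its own attempt number. The function and agent names are hypothetical.

```python
def execute_with_retries(operation, candidate_agents, log, max_attempts=5):
    """Retry the operation across candidate agents; each attempt is a new
    invocation and gets its own log record."""
    for attempt, agent in enumerate(candidate_agents[:max_attempts], start=1):
        log.append({"op": operation, "agent": agent.__name__, "attempt": attempt})
        try:
            return agent(operation)
        except Exception as err:
            log[-1]["error"] = str(err)  # record the failure on this attempt
    raise RuntimeError(f"{operation} failed after {len(log)} attempts")

def flaky_agent(op):
    raise TimeoutError("agent unavailable")

def backup_agent(op):
    return f"{op}: ok"

log = []
result = execute_with_retries("book_hotel", [flaky_agent, backup_agent], log)
print(result, len(log))
```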
Given synchronized clocks and logging (as discussed above), a snapshot of the hierarchical composition at time t would consist of the logs of all the “relevant” agents until time t.
The relevant agents can be determined in a recursive manner (starting from the root agent) by considering the agents of the invoked operations recorded in the parent agent’s log until time t. If message timestamps are used then we need to consider the skew while recording the logs, i.e., if a parent agent’s log was recorded until time t then its component agents’ logs need to be recorded until (t + skew). The states of the I/O queues can be determined from the state transition model.
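The recursive snapshot rule above can be sketched in a few lines: starting from the root agent, collect each relevant agent's log entries up to the cut-off time, extending the cut-off by the clock skew at every level down the hierarchy. The data layout and the skew value are assumptions for the sketch.

```python
SKEW = 0.5  # assumed maximum clock skew between any two agents (seconds)

def snapshot(agent, t, logs, children):
    """Collect a snapshot at time t.
    logs: agent -> list of (timestamp, entry); children: agent -> sub-agents."""
    cut = {agent: [e for e in logs.get(agent, []) if e[0] <= t]}
    for child in children.get(agent, []):
        # component agents' logs are recorded until (t + skew)
        cut.update(snapshot(child, t + SKEW, logs, children))
    return cut

logs = {
    "root": [(1.0, "invoke flights"), (3.0, "invoke hotels")],
    "flights": [(1.2, "booked"), (3.6, "refund issued")],
}
snap = snapshot("root", 3.0, logs, {"root": ["flights"]})
print(snap)
```

Note that the component agent's entry at 3.6 is excluded: it lies beyond the extended cut-off of 3.0 + 0.5.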
Responsible AgentOps
The growing adoption of Generative AI, esp. of Large Language Models (LLMs), has reignited the discussion around Responsible AI: ensuring that AI/ML systems are responsibly trained and deployed.
The table below summarizes the key challenges and solutions in implementing responsible AI for AI agents.
We expand on the above points in the rest of the article to enable an integrated AgentOps pipeline with responsible AI governance.
Data Consistency: The data used for training (esp. fine-tuning) the LLM should be accurate and precise, meaning that only the data relevant to the specific use case should be used to train the LLMs. For example, if the use case is to generate summaries of medical prescriptions, one should not use other data such as diagnosis Q&A; only medical prescriptions and their corresponding summaries should be used. Often, data pipelines need to be created to ingest the data and feed it to the LLMs. In such scenarios, extra caution needs to be exercised when consuming free-text fields, as these fields frequently hold inconsistent and incorrect data.
Bias/Fairness: With respect to model performance and reliability, it is difficult to control undesired biases in black-box LLMs, though they can be controlled to some extent by using uniform and unbiased data to fine-tune the LLMs and/or by contextualizing the LLMs in a RAG architecture.
Accountability: To make LLMs more reliable, it is recommended to have manual validation of the LLMs' outputs. Involving humans ensures that if an LLM hallucinates or provides a wrong response, a human can evaluate the output and make the necessary corrections.
Hallucination: When using LLM APIs or orchestrating multiple AI agents, the likelihood of hallucination increases with the number of agents involved. The right prompts can help, but only to a limited extent. To further limit hallucinations, LLMs need to be fine-tuned with curated data, and/or the search space of responses needs to be limited to relevant and recent enterprise data.
Explainability: Explainability is an umbrella term for a range of tools, algorithms, and methods that accompany AI model inferences with explanations. Chain of Thought (CoT) is a framework that addresses how an LLM solves a problem. CoT can primarily be implemented using two approaches:
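As an illustration of CoT prompting, the snippet below constructs two commonly used prompt variants from the literature, zero-shot CoT ("Let's think step by step") and few-shot CoT with a worked exemplar. These are given as examples of CoT implementation styles, not necessarily the two approaches the author has in mind; the question text is invented.

```python
question = "A trip costs 120 for flights and 80 per night for 3 hotel nights. Total?"

# Zero-shot CoT: append a reasoning trigger to the question.
zero_shot_cot = f"{question}\nLet's think step by step."

# Few-shot CoT: prepend an exemplar whose answer spells out its reasoning.
few_shot_cot = (
    "Q: A meal costs 12 and a drink 3. Total for 2 meals and 1 drink?\n"
    "A: 2 meals cost 2 * 12 = 24; one drink costs 3; 24 + 3 = 27. Answer: 27.\n"
    f"Q: {question}\nA:"
)

print(zero_shot_cot)
print(few_shot_cot)
```

Either prompt, sent to an LLM, elicits intermediate reasoning steps alongside the final answer, which is what makes the inference (partially) explainable.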
Conclusion
Agentic AI is a disruptive technology, and there is currently a lot of interest in making the underlying agent platforms ready for enterprise adoption. Towards this end, we outlined a reference architecture for AI agent platforms. We primarily focused on two aspects critical to enabling scalable and responsible adoption of AI agents: an AgentOps pipeline integrated with monitoring and Responsible AI practices.
From an agent monitoring perspective, we focused on the challenge of capturing the state of a (hierarchical) multi-agent system at any given point in time (a snapshot). Snapshots usually reflect a state of a distributed system that "might have occurred". Towards this end, we discussed the different types of agent-execution-related queries and showed how we can answer them using the captured snapshots.
To enable responsible deployment of agents, we highlighted the Responsible AI dimensions relevant to AI agents and showed how they can be integrated seamlessly with the underlying AgentOps pipelines. We believe that these measures will effectively future-proof Agentic AI investments and ensure that AI agents are able to cope as the AI agent platform and regulatory landscape evolve over time.
References
The full article is published in AI Advances: https://medium.com/ai-advances/stateful-and-responsible-ai-agents-7af386268554