Next Gen AI for Enterprises
Boris Villazon-Terrazas, PhD
Global Gen AI CoE Leader | Europe West AI Innovation Leader | AI & DS Product Manager | CAIO | CTO | Mentor | People Empowerment | 14 AI & DS patents
Executive Summary
To illustrate the power of future enterprise-grade Artificial Intelligence solutions, imagine Bob, an executive at Alpha company, who asks Alpha’s FP&A-trained AI system, “Which country in the EMEIA region will have the highest revenue this quarter, and why?” To answer Bob’s question, Alpha’s FP&A AI solution sequences and deploys a set of “Agents” that perform the following tasks:
(1) Understand the “narrative” of Bob’s question (e.g. identify “Country”, “EMEIA”, “Quarter” and “Revenue” as key question inputs)
(2) Use a Knowledge Graph to understand the relevant relationships within Alpha (e.g. which countries belong to Alpha’s EMEIA region)
(3) Write a query, and retrieve the relevant information from Alpha’s ERP system, using Retrieval-Augmented Generation (RAG)
(4) Deploy a specially trained financial analysis agent to make the relevant revenue calculations, predictions and analysis
(5) Perform a QA process on the result, and capture an audit trail explaining what the system has done
(6) Formulate the result into human understandable language / format
By combining knowledge graphs with enterprise databases and sequencing the right set of task-oriented, pre-trained AI agents, the AI system provides Bob with a step-by-step, explainable, audit-trailed answer to his question. The answer also gives insight into the “why” by detailing the main drivers behind the leading country’s result. This simple example demonstrates how Generative AI can and will support business decisions in the future.
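The six steps above can be sketched as a chain of plain functions that share a state dictionary and append to an audit trail. This is only an illustrative sketch: the function names, the region membership and the revenue figures are hypothetical, and each function is a stub standing in for the language understanding, knowledge graph, ERP retrieval and analysis components described in the example.

```python
from typing import Callable

# All data below is hypothetical and only illustrates the flow.
REGION_MEMBERS = {"EMEIA": ["Germany", "France", "India"]}
ERP_REVENUE = {"Germany": 120.0, "France": 95.0, "India": 140.0}  # forecast, in millions

def understand(state: dict) -> dict:
    # (1) Extract the key inputs of the question (stubbed keyword match).
    state["region"] = "EMEIA" if "EMEIA" in state["question"] else None
    state["metric"] = "revenue"
    return state

def resolve_countries(state: dict) -> dict:
    # (2) Ask the knowledge graph stand-in which countries belong to the region.
    state["countries"] = REGION_MEMBERS[state["region"]]
    return state

def retrieve(state: dict) -> dict:
    # (3) Retrieve the figures for those countries from the ERP stand-in.
    state["figures"] = {c: ERP_REVENUE[c] for c in state["countries"]}
    return state

def analyse(state: dict) -> dict:
    # (4) Pick the country with the highest figure.
    state["top"] = max(state["figures"], key=state["figures"].get)
    return state

def quality_check(state: dict) -> dict:
    # (5) Basic QA: every country in the region must have a figure.
    state["qa_passed"] = set(state["figures"]) == set(state["countries"])
    return state

def render(state: dict) -> dict:
    # (6) Turn the result into a human-readable sentence.
    top = state["top"]
    state["answer"] = f"{top} is expected to lead {state['region']} with {state['figures'][top]:.1f}M."
    return state

PIPELINE: list[Callable[[dict], dict]] = [
    understand, resolve_countries, retrieve, analyse, quality_check, render,
]

def run(question: str) -> dict:
    state = {"question": question, "audit_trail": []}
    for step in PIPELINE:
        state = step(state)
        state["audit_trail"].append(step.__name__)  # record what was done, in order
    return state

result = run("Which country in the EMEIA region will have the highest revenue this quarter, and why?")
print(result["answer"])
print(result["audit_trail"])
```

Keeping the sequence in a single list makes the order of agents explicit, and the audit trail is produced by the loop rather than by each agent, so no step can skip it.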
Overcoming LLM Limitations
When trying to deploy LLMs on an enterprise-grade use case, organizations run into several challenges: the LLMs’ tendency to “hallucinate” answers, the difficulty of supplying them with the relevant data and context, their high maintenance cost, and the difficulty of explaining or auditing the result [1].
We are seeing the rise of “Agentic AI Systems” that mitigate these challenges by breaking each “enterprise-grade query” down across multiple agents, which are responsible for understanding the context, retrieving the relevant information, performing the right calculations, auditing and quality checking, and displaying a trusted answer to the user.
Modulating the Thinking Process
To achieve institutional-grade Generative AI, it is essential to enhance its cognitive functions for accuracy and complex data analysis. This involves integrating various forms of knowledge, tools and sources: factual knowledge, episodic memory, contextual knowledge, common sense, mathematical reasoning, and metacognition for decision-making. Combining LLMs with Knowledge Graphs (KG-RAG) and AI agents [2] helps address critical business tasks, such as calculating a country’s revenue in the EMEIA region. Metacognition supports accuracy through self-assessment, while mathematical reasoning handles the calculations behind complex data analysis. A strong foundation in language proficiency allows the AI's situational model to bring the different parts of the AI system together, enabling precise and impactful co-pilot capabilities.
Introducing Agentic AI
Imagine an enterprise-grade AI system as having three main parts: (i) communication, which involves receiving input, understanding it, and communicating any result in a human-understandable format; (ii) brain, which involves thinking (i.e. planning and orchestrating the deployment of the right tools in the right sequence); and (iii) action, which involves doing (i.e. the AI tools executing actions and feeding the result to the next tool, in sequence) [2].
In this AI system, a growing set of smaller AI agents act as specialized, pre-trained workers, each with a specific role (plan, extract data, calculate, predict, audit, display the result). Together they give the AI system as a whole the ability to perform increasingly complex tasks.
Retrieval Augmented Generation (RAG)
RAG is a capability that lets an AI system connect to organizational (and external) databases and query or look up information [10]. Instead of relying on an LLM’s training for “factual knowledge”, the AI system has access to the organization’s most accurate, up-to-date “version of the truth”, which in turn allows it to generate answers with improved quality and accuracy [11].
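A minimal sketch of the retrieve-then-generate idea, assuming the organizational data sits in a SQL table. The in-memory SQLite database, the `revenue` table and its figures are invented stand-ins for an enterprise system, and the final call to an LLM is deliberately left out.

```python
import sqlite3

# Hypothetical stand-in for an enterprise database; table and figures are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revenue (country TEXT, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO revenue VALUES (?, ?, ?)",
    [("Germany", "Q3", 120.0), ("France", "Q3", 95.0), ("India", "Q3", 140.0)],
)

def retrieve_facts(countries: list[str], quarter: str) -> list[tuple[str, float]]:
    # Retrieval step: look the figures up instead of asking the model to recall them.
    placeholders = ",".join("?" for _ in countries)
    sql = (
        "SELECT country, amount FROM revenue "
        f"WHERE quarter = ? AND country IN ({placeholders}) ORDER BY amount DESC"
    )
    return conn.execute(sql, [quarter, *countries]).fetchall()

def build_prompt(question: str, facts: list[tuple[str, float]]) -> str:
    # Augmentation step: put the retrieved rows in front of the question.
    context = "\n".join(f"- {country}: {amount:.1f}M" for country, amount in facts)
    return f"Answer using only the facts below.\nFacts:\n{context}\nQuestion: {question}"

facts = retrieve_facts(["Germany", "France", "India"], "Q3")
prompt = build_prompt("Which country has the highest revenue this quarter?", facts)
print(prompt)  # this prompt would then be passed to whichever LLM is in use
```

The point of the sketch is the separation of concerns: the figures come from the database query, and the model is only asked to reason over what was retrieved.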
Factual Knowledge
Conversational AI systems built on Transformer-based LLMs often struggle to seamlessly integrate factual content into their generated language. These systems can benefit from knowledge bases that serve as auxiliary memory [12], offering a reservoir of external information. LLMs can be enhanced by conditioning them on structured information, typically organized in Knowledge Graphs, facilitating the incorporation of this external data into their outputs. A knowledge base is an information repository that provides factual knowledge. It addresses the challenge of LLM hallucinations by providing reliable data to improve the factual accuracy of responses to domain-specific and time-sensitive queries [1].
To answer Bob’s question, the AI Agent searches the document library and knowledge graph to retrieve data corresponding to each Country BELONGING_TO the relevant Region (e.g. EMEIA). Using its RAG capability, the AI system then queries the relevant revenue details per country and feeds this data into its “revenue calculation AI tool” to correctly calculate the required revenue figures. Because the data is structured with a knowledge graph, Bob can make flexible queries not just about total revenue, but also about the relationships between different data points, such as how revenue in one country compares to another within the same region.
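As a rough illustration of the lookup described above, the knowledge graph can be approximated with a list of subject-predicate-object triples. The triples, the revenue figures and the helper names below are hypothetical; a production system would use a graph database and retrieve the revenue through RAG.

```python
# Hypothetical triples; entity names and figures are invented for illustration.
TRIPLES = [
    ("Germany", "BELONGS_TO", "EMEIA"),
    ("France", "BELONGS_TO", "EMEIA"),
    ("India", "BELONGS_TO", "EMEIA"),
    ("Brazil", "BELONGS_TO", "Americas"),
]
REVENUE = {"Germany": 120.0, "France": 95.0, "India": 140.0, "Brazil": 80.0}

def countries_in(region: str) -> list[str]:
    # Follow BELONGS_TO edges that point at the requested region.
    return [s for s, p, o in TRIPLES if p == "BELONGS_TO" and o == region]

def regions_of(country: str) -> set[str]:
    # All regions the country has a BELONGS_TO edge to.
    return {o for s, p, o in TRIPLES if s == country and p == "BELONGS_TO"}

def revenue_gap(a: str, b: str) -> float:
    # Compare two countries, but only if the graph says they share a region.
    if not regions_of(a) & regions_of(b):
        raise ValueError(f"{a} and {b} are not in the same region")
    return REVENUE[a] - REVENUE[b]

print(countries_in("EMEIA"))
print(revenue_gap("India", "Germany"))
```

The comparison is guarded by the graph relationship, so a question about two countries "within the same region" is rejected when the graph does not support it.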
Episodic Memory and Contextual Knowledge
Bob’s AI system has episodic memory [4]. Initially, Bob asks, “Which country in the EMEIA region will have the highest revenue this quarter?” The AI Agent retrieves the historical performance of each country and calculates the revenue with its dedicated calculation tool. When Bob follows up with, “How about the 2nd and 3rd highest countries?”, the AI System recalls the previous context [6] and provides the detailed revenue for each of those countries within the EMEIA region, without Bob having to restate the region or the quarter.
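The follow-up behaviour can be sketched with a session object that stores each answered question as an episode and resolves later questions against the most recent one. The `Session` class and its method names are hypothetical and only show the idea; the figures are invented.

```python
from dataclasses import dataclass, field

@dataclass
class Session:
    # Each episode stores the region asked about and the ranked result it produced.
    episodes: list[dict] = field(default_factory=list)

    def ask_top(self, region: str, revenue: dict[str, float]) -> str:
        ranking = sorted(revenue, key=revenue.get, reverse=True)
        self.episodes.append({"region": region, "ranking": ranking, "revenue": revenue})
        return ranking[0]

    def follow_up(self, positions: list[int]) -> list[tuple[str, float]]:
        # Reuse the most recent episode instead of asking the user to restate the region.
        if not self.episodes:
            raise LookupError("no earlier question to refer back to")
        last = self.episodes[-1]
        countries = [last["ranking"][p - 1] for p in positions]
        return [(c, last["revenue"][c]) for c in countries]

session = Session()
print(session.ask_top("EMEIA", {"Germany": 120.0, "France": 95.0, "India": 140.0}))
print(session.follow_up([2, 3]))  # "How about the 2nd and 3rd highest countries?"
```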
Mathematical Reasoning and Contextual Knowledge
Identifying the country with the highest revenue in the EMEIA region showcases the LLM's ability to combine mathematical calculation [7] with contextual understanding. By utilizing knowledge graphs whose nodes represent entities such as “Revenue”, “Country” and “Region”, the LLM can discern how these components interact. For instance, with edges in the graph indicating relationships such as “Country -- BELONGS_TO --> Region” and “Country -- GENERATES --> Revenue”, the LLM can deduce that a change in the financial results of a country within a region will impact the overall revenue reported for that region. This interconnected understanding mimics human contextual knowledge [6] by recognizing that the individual countries within the enterprise’s EMEIA region contribute to the regional financial figures.
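The deduction described above can be made concrete with a small sketch in which the regional figure is derived from the two edge types, so a change at country level propagates to the region. The edge list and figures are hypothetical, and the GENERATES edge is simplified to point directly at a number.

```python
# Hypothetical typed edges: (source, relation, target). Figures are invented.
EDGES = [
    ("Germany", "BELONGS_TO", "EMEIA"),
    ("India", "BELONGS_TO", "EMEIA"),
    ("Germany", "GENERATES", 120.0),
    ("India", "GENERATES", 140.0),
]

def region_revenue(edges: list[tuple], region: str) -> float:
    # Regional revenue is the sum of what its member countries generate.
    members = {s for s, r, t in edges if r == "BELONGS_TO" and t == region}
    return sum(t for s, r, t in edges if r == "GENERATES" and s in members)

def with_change(edges: list[tuple], country: str, delta: float) -> list[tuple]:
    # Return a copy of the graph in which one country's revenue has changed.
    return [
        (s, r, t + delta) if r == "GENERATES" and s == country else (s, r, t)
        for s, r, t in edges
    ]

before = region_revenue(EDGES, "EMEIA")
after = region_revenue(with_change(EDGES, "India", 10.0), "EMEIA")
print(before, after)  # the regional total moves by exactly the country-level change
```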
Situational Model and Metacognition
The situational model orchestrates and deploys AI Tools aimed at performing contextually relevant actions [8]. AI agents, acting like specialized workers, handle specific tasks such as data retrieval, processing, and analysis. The situational model interprets the end user's query and decides which “AI Agents” to invoke, in which sequence, and what information to relay between roles, such as those involved in the Enterprise Financial Database and the Mathematical Reasoning function. It also determines the appropriate response to provide to the user.
An LLM with metacognition [9] applies self-evaluation to its planning and actions for revenue analysis. For example, it will capture the AI System’s “thinking logic”, create an “Audit Trail” of the actions taken, and “compare” the actions against its “enterprise AI policies and procedures”.
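One simple way to picture this self-evaluation is a review step that walks the audit trail and compares each logged action with a policy table. The `Action` record, the policy contents and the agent and tool names are all hypothetical; real enterprise policies would be far richer than an allow-list.

```python
from dataclasses import dataclass

@dataclass
class Action:
    agent: str
    tool: str
    detail: str

# Hypothetical policy: which tools each agent is allowed to use.
POLICY = {
    "retriever": {"erp_query"},
    "analyst": {"revenue_calculator"},
}

def review(audit_trail: list[Action]) -> list[str]:
    # Self-evaluation: compare every logged action with the policy and list violations.
    violations = []
    for step, action in enumerate(audit_trail, start=1):
        allowed = POLICY.get(action.agent, set())
        if action.tool not in allowed:
            violations.append(f"step {step}: {action.agent} may not use {action.tool}")
    return violations

trail = [
    Action("retriever", "erp_query", "revenue per EMEIA country"),
    Action("analyst", "revenue_calculator", "rank countries by revenue"),
    Action("analyst", "erp_query", "direct database access"),
]
print(review(trail))
```

An empty list means every recorded action was within policy; anything else can be surfaced to the user alongside the answer.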
Fine-Tuning
Fine-tuning LLMs provides enterprise context specific to a company, such as the method by which the enterprise calculates revenue. By tailoring the revenue calculation tool’s parameters to the distinct needs and data characteristics of each enterprise, fine-tuning optimizes LLM performance and improves reliability, particularly for specialized contexts such as revenue calculations [13]. This process helps ensure that AI systems are accurate and relevant to the specific company environments they are deployed in [14]. Such customization enhances the practicality of LLMs across the enterprise, demonstrating a sophisticated approach to integrating proprietary and domain-specific data directly into these models.
Conclusion
Institutional-grade Generative AI integrates functions such as metacognition for accuracy and self-assessment, and mathematical reasoning for detailed data analysis. It builds on a robust language foundation, merging factual, episodic, and contextual knowledge to facilitate precise task and query execution. The integration of contextual knowledge through knowledge graphs with LLMs refines AI understanding and output. Modulating LLMs' thinking processes to incorporate common sense mirrors human cognitive functions to provide contextual understanding and enhance relational knowledge. The inclusion of RAG helps overcome LLM limitations by providing up-to-date data and refining outputs, marking significant advancements in enterprise AI applications. Smaller models augmented with RAG have been shown to perform on par with larger models [15]. This evolution in Generative AI showcases its potential to deliver accurate, context-aware decision-making, significantly enhancing enterprise applications.
Indeed, incorporating a knowledge graph to represent common-sense knowledge is a challenging and labor-intensive task. Curating the right knowledge can be time-consuming, as demonstrated by Doug Lenat's extensive work on his Cyc project [16]. However, in our approach, we can leverage LLMs to transform some of the tacit knowledge into explicit knowledge. This means that we can expedite the curation process by relying on the capabilities of LLMs. Moreover, our approach is bidirectional: we build the knowledge graphs using insights from the LLMs, and in turn, we specialize the LLMs using the structured information from the knowledge graphs. Thus, while the task is undoubtedly demanding, the use of LLMs and knowledge graphs in a mutually reinforcing manner provides a viable method to manage the complexity and labor involved.
The views reflected in this article are those of the authors and do not necessarily reflect the views of Ernst & Young LLP or other members of the global EY organization.
References
[1] Yuren Mao, Xuemei Dong, Wenyi Xu, Yunjun Gao, Bin Wei, and Ying Zhang. FIT-RAG: Black-Box RAG with Factual Information and Token Reduction. Zhejiang University; Zhejiang Gongshang University. China, 2024. https://arxiv.org/html/2403.14374v1.
[2] Diego Sanmartín. KG-RAG: Bridging the Gap Between Knowledge and Creativity. IE University, Spain, 2024. https://arxiv.org/pdf/2405.12035.
[3] Jiarui Li, Ye Yuan, and Zehua Zhang. Enhancing LLM factual accuracy with RAG to counter hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases. Information Network Institute, Carnegie Mellon University, 2024. https://arxiv.org/pdf/2403.10446.
[4] David Murphy, Tony Paula, Wilhelm Staehler, Julia Vacaro, Guillermo Paz, Gabriel Marques, and Bruno Oliveira. A Proposal for Intelligent Agents with Episodic Memory. HP Labs – HP Inc., 2020. https://arxiv.org/pdf/2005.03182.
[5] Ye Liu, Yao Wan, Lifang He, Hao Peng, and Philip S. Yu. KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning. University of Illinois at Chicago, Chicago, IL, USA; Huazhong University of Science and Technology, Wuhan, China; Lehigh University, Bethlehem, PA, USA; Beihang University, Beijing, China, 2024. https://arxiv.org/pdf/2009.12677.
[6] Somnath Banerjee, Amruit Sahoo, Sayan Layek, Avik Dutta, Rima Hazra, and Animesh Mukherjee. Context Matters: Pushing the Boundaries of Open-Ended Answer Generation with Graph-Structured Knowledge Context. Indian Institute of Technology Kharagpur & Singapore University of Technology and Design, 2024. https://arxiv.org/pdf/2401.12671.
[7] Ankit Satpute, Noah Gießing, André Greiner-Petter, Moritz Schubotz, Olaf Teschke, Akiko Aizawa, and Bela Gipp. Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange. FIZ Karlsruhe; University of Göttingen; NII Japan, 2024. https://arxiv.org/pdf/2404.00344.
[8] Erik Blasch, Robert Cruise, Alexander J. Aved, Uttam Majumder, and Todd Rovito. Methods of AI for Multimodal Sensing and Action for Complex Situations, 2019. https://ojs.aaai.org/aimagazine/index.php/aimagazine/article/view/4813.
[9] Jason Toy, Phil Tabor, and Josh MacAdam. Metacognition is all you need? Using Introspection in Generative Agents to Improve Goal-directed Behavior, 2024. https://arxiv.org/abs/2401.10910.
[10] Ali Mahboub, Muhy Eddin Za’ter, Bashar Al-Rfooh, Yazan Estaitia, Adnan Jaljuli, and Asma Hakouz. Evaluation of Semantic Search and its Role in Retrieved-Augmented-Generation (RAG) for Arabic language. Maqsam. Amman, Jordan, 2024. https://arxiv.org/pdf/2403.18350.
[11] Jaewoong Kim and Moohong Min. From RAG to QA-RAG: Integrating Generative AI for Pharmaceutical Regulatory Compliance Process. Department of Applied Data Science, Sungkyunkwan University; Social Innovation Convergence Program, University College, Sungkyunkwan University, 2024. https://arxiv.org/pdf/2402.01717v1.
[12] Stephan Raaijmakers, Roos Bakker, Anita Cremers, Roy de Kleijn, Tom Kouwenhoven, and Tessa Verhoef. Memory-augmented generative adversarial transformers. Leiden University Centre for Linguistics (LUCL); TNO, The Netherlands; University of Applied Sciences, Utrecht; Institute of Psychology, Leiden University; Leiden Institute of Advanced Computer Science (LIACS), 2024. https://arxiv.org/pdf/2402.19218.
[13] Angels Balaguer, Vinamra Benara, Renato Cunha, Roberto Estevão, Todd Hendry, Daniel Holstein, Jennifer Marsman, Nick Mecklenburg, Sara Malvar, Leonardo O. Nunes, Rafael Padilha, Morris Sharp, Bruno Silva, Swati Sharma, Vijay Aski, and Ranveer Chandra. RAG vs Fine-Tuning: Pipelines, Tradeoffs, and a Case Study On Agriculture. Microsoft, 2024. https://arxiv.org/pdf/2401.08406.
[14] Nikhil Prakash, Tamar Rott Shaham, Tal Haklay, Yonatan Belinkov, and David Bau. Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking. Northeastern University, MIT CSAIL, Technion – IIT, 2024. https://arxiv.org/pdf/2402.14811.
[15] Humza Naveed, Asad Ullah Khan, Shi Qiu, Muhammad Saqib, Saeed Anwar, Muhammad Usman, Naveed Akhtar, Nick Barnes, and Ajmal Mian. A Comprehensive Overview of Large Language Models, 2024. https://arxiv.org/pdf/2307.06435.
[16] Nivash Jeevanandam. AI insights - Exploring Cyc - An AI project for comprehensive ontology. AI Research, 2022. https://indiaai.gov.in/article/exploring-cyc-an-ai-project-for-comprehensive-ontology.