Generative AI Architectural Patterns
Debmalya Biswas
AI/Analytics @ Wipro | ex-Nokia, SAP, Oracle | 50+ patents | PhD - INRIA
A short primer on the 5 most prevalent Generative AI architectural patterns today:
1. Black-box LLM APIs
This is your classic ChatGPT [1] example, where we have black-box access to an LLM API/UI. Similar LLM APIs can be considered for other core Natural Language Processing (NLP) tasks, e.g., Knowledge Retrieval, Summarization, Auto-Correct, Translation, Natural Language Generation (NLG).
Prompts are the primary interaction mechanism here, and we are all still trying to perfect our Prompt Engineering skills :)
Prompting refers to adapting the user input, providing the right context and guidance to the LLM API, to maximize the chances of getting the ‘right’ response. It has led to the rise of Prompt Engineering as a professional discipline, where prompt engineers systematically perform trials, recording their findings, to arrive at the ‘right’ prompt to elicit the ‘best’ response.
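To make this concrete, below is a minimal sketch of a structured prompt sent to a black-box LLM API. It assumes the OpenAI Python client; the model name and the summarization task are illustrative choices, not prescriptions:

```python
# Minimal prompt-engineering sketch (assumes the OpenAI Python client is
# installed and OPENAI_API_KEY is set; the model name is illustrative).
from openai import OpenAI

client = OpenAI()

def summarize(text: str) -> str:
    # The system message supplies context and guidance;
    # the user message carries the actual input.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat-capable model would do
        messages=[
            {"role": "system",
             "content": "You are a precise assistant. Summarize the user's text "
                        "in exactly 3 bullet points, in plain business English."},
            {"role": "user", "content": text},
        ],
        temperature=0.2,  # lower temperature -> more deterministic responses
    )
    return response.choices[0].message.content

print(summarize("Q3 revenue grew 12% year-on-year, driven by cloud services..."))
```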
2. Enterprise Apps in the LLM App Store
OpenAI's recent announcement of a GPT App Store is interesting (link). It is to be expected that other major players, e.g., Google, AWS, Hugging Face, will follow suit. The motive is clear: to become the preferred platform for Generative AI (GenAI) / Large Language Model (LLM) adoption. However, there is also a risk that enterprise apps published on the platform will overshadow the underlying platform.
It remains to be seen if the GenAI App Store will turn out to be as much of a game changer as the Apple App Store was for the iPhone / mobile devices. Interesting times ahead!
While Enterprise GPT Apps have the potential to become a multi-billion-dollar marketplace and accelerate LLM adoption by providing enterprise-ready solutions, the same caution needs to be exercised as before using any 3rd-party ML model: validate LLM/training data ownership, IP, and liability clauses [2].
Data ownership: Data is critical for supervised AI/ML systems, esp. so for LLMs, which are often trained on public datasets whose data usage rights for AI/ML training are not well defined and can evolve in the future. For example, Reddit recently announced (link) that it will start charging for enterprise AI/ML models learning from its vast archives of human conversations.
Given this, negotiating ownership issues around not only training data, but also input data, output data, and other generated data is critical.
On the other hand, it is also important to understand / assess how the Enterprise App Provider will be using the data received / generated as a result of its interactions with the users.
3. LLMOps: LLM Fine-tuning to Domain-specific SLMs
LLMs are generic in nature, as they are trained on public datasets, e.g., Wikipedia. To realize the full potential of LLMs for Enterprises, they need to be contextualized with enterprise knowledge captured in terms of documents, wikis, business processes, etc.
This is achieved by fine-tuning an LLM with enterprise knowledge / embeddings to develop a context-specific LLM [3].
Fine-tuning entails taking a pre-trained Large Language Model (LLM), and retraining it with (smaller) enterprise data. Technically, this implies updating the weights of the last layer(s) of the trained neural network to reflect the enterprise data and task.
Given this, access to the base model weights is needed to perform fine-tuning, which is not possible for closed models, e.g., ChatGPT.
This is where open-source pre-trained LLMs come to the rescue; e.g., Meta AI recently open-sourced their LLM, LLaMA. The Stanford Alpaca project showed that it is possible to fine-tune LLaMA for $600, to a model performance comparable with ChatGPT.
So fine-tuning an LLM does not necessarily need to be very complex or expensive.
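As an illustration, here is a minimal parameter-efficient fine-tuning sketch in the spirit of the Alpaca recipe, using Hugging Face Transformers with PEFT/LoRA. The checkpoint, dataset file, and hyperparameters are assumptions for the sketch, not a production recipe:

```python
# Minimal LoRA fine-tuning sketch (Hugging Face Transformers + PEFT).
# Checkpoint, dataset file and hyperparameters are illustrative assumptions.
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

base = "openlm-research/open_llama_3b"  # any open LLaMA-style checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
if tokenizer.pad_token is None:          # LLaMA tokenizers ship without a pad token
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains a small set of adapter weights instead of the full network,
# which is what keeps Alpaca-style enterprise fine-tuning cheap.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

data = load_dataset("json", data_files="enterprise_corpus.jsonl")["train"]
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512))

Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=2,
                           num_train_epochs=1),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```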
Given that the enterprise is responsible for the ML (fine-tuning) pipeline in this case, LLMOps (MLOps [4] for LLMs) is needed to deliver this in a scalable fashion.
LLMOps can be considered more complex than usual MLOps pipelines, esp. to enable the continuous-improvement feedback loop: Reinforcement Learning from Human Feedback (RLHF) [5].
LMFlow (link) is a good example of an emerging MLOps framework for LLMs.
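To make the RLHF feedback loop concrete, here is a purely illustrative sketch of its first step: capturing human ratings of LLM responses, which later train a reward model. The schema and file-based storage are assumptions; a production LLMOps pipeline would log to a feature store or database:

```python
# Illustrative feedback-capture step for an RLHF-style improvement loop.
# The record schema and file-based storage are assumptions for the sketch.
import json
import time

def log_feedback(prompt: str, response: str, rating: int,
                 path: str = "feedback.jsonl") -> None:
    """Append one human rating (-1 = bad, +1 = good) for later reward-model training."""
    record = {"ts": time.time(), "prompt": prompt,
              "response": response, "rating": rating}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

# Later, these (prompt, response, rating) triples train a reward model,
# which in turn steers the LLM via RL (e.g., PPO, as in standard RLHF recipes).
log_feedback("Summarize the Q3 report", "Revenue grew 12%...", rating=1)
```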
4. Retrieval Augmented Generation (RAG)
Fine-tuning is a computationally intensive process. RAG provides a viable alternative: additional context is supplied with the prompt, grounding the retrieval / responses in that context.
Prompts can be relatively long, so it is possible to embed enterprise context within the prompt. For example, referring to the solution architecture on Azure below, the Cognitive Search results are provided as additional context with the prompt, to constrain the responses.
The same RAG reference architecture looks analogous on Databricks.
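Independent of the specific cloud stack (Azure Cognitive Search, Databricks, etc.), the core RAG flow can be sketched in a few lines: embed the query, retrieve the most similar enterprise snippets, and pass them as grounding context in the prompt. The embedding model, documents, and prompt template below are illustrative assumptions:

```python
# Minimal RAG sketch: retrieve the most relevant enterprise snippets and
# pass them as grounding context in the prompt. The embedding model,
# documents and prompt template are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "Travel policy: business class is allowed for flights over 6 hours.",
    "Expense policy: meals are reimbursed up to 50 EUR per day.",
]
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 1) -> list[str]:
    # Dot product of normalized vectors == cosine similarity.
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q
    return [docs[i] for i in np.argsort(-scores)[:k]]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    # The grounding instruction limits the answer to the retrieved context.
    return (f"Answer ONLY from the context below.\n"
            f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")

print(build_prompt("Can I fly business class on an 8-hour flight?"))
```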
5. AI Agents: Multi-agent LLM Orchestration
This is the future where enterprises will be able to develop new Enterprise AI Apps by orchestrating / composing multiple existing AI Apps.
The discussion around ChatGPT has now evolved into AutoGPT. While ChatGPT is primarily a chatbot that can generate text responses, AutoGPT is a more powerful AI Agent that can execute complex tasks, e.g., make a sale, plan a trip, book a flight, book a contractor for a house job, order a pizza. LangChain (link) is probably the most mature framework today to compose LLMs.
However, designing and deploying AI Agents remains challenging in practice. Below are some initial thoughts around the essential components / frameworks needed to materialize such an AI Agent Platform:
Given a user task, the goal of an AI Agent Platform is to identify (compose) an agent (or group of agents) capable of executing the given task.
AI Agents follow a long history of research around autonomous agents, especially goal-oriented agents. A high-level approach to solving such complex tasks involves: (a) decomposition of the given complex task into (a hierarchy or workflow of) simpler tasks, followed by (b) composition of agents able to execute the simpler tasks. This can be achieved in a dynamic or static manner.
In the dynamic approach, given a complex user task, the system comes up with a plan to fulfill the request depending on the capabilities of the agents available at run-time. In the static approach, given a set of agents, composite agents are defined manually at design-time by combining their capabilities.
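As a toy illustration of the dynamic approach, the sketch below registers agents by capability and has a stand-in planner decompose a travel task into simple steps at run-time. All agent names and the planner logic are hypothetical:

```python
# Toy sketch of dynamic agent composition: decompose a complex task into
# simple steps and dispatch each step to a registered agent that declares
# the matching capability. All agents and the planner are hypothetical.
from typing import Callable

AGENTS: dict[str, Callable[[str], str]] = {}

def agent(capability: str):
    """Register a function as an agent for one capability."""
    def deco(fn):
        AGENTS[capability] = fn
        return fn
    return deco

@agent("search_flights")
def search_flights(spec: str) -> str:
    return f"[flights found for: {spec}]"

@agent("book_flight")
def book_flight(spec: str) -> str:
    return f"[booking confirmed: {spec}]"

def plan(task: str) -> list[tuple[str, str]]:
    # Stand-in planner: a real system would ask an LLM to emit this plan
    # at run-time, based on the registered capabilities (dynamic approach).
    return [("search_flights", task), ("book_flight", task)]

def execute(task: str) -> None:
    for capability, spec in plan(task):
        print(capability, "->", AGENTS[capability](spec))

execute("GVA to SFO, next Monday, economy")
```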
References