How to leverage Generative AI for external-facing applications using Retrieval-Augmented Generation (RAG)
In my role as an architect at one of the most forward-looking companies in the world, I often discuss leveraging Generative AI for enterprise use cases with peers both inside and outside my company. While almost no one is opposed to the idea, I sense a palpable hesitancy when the conversation turns to external (customer- or end-user-facing) applications.
I agree with that sentiment.
Despite the rapid maturation of Large Language Models (LLMs) over the last couple of years, their knowledge is limited to the data they were pre-trained on. Even when they are fine-tuned with the latest ML techniques like Reinforcement Learning from Human Feedback (RLHF) and reward modeling, a model's knowledge remains a snapshot in time. More importantly, models lack exposure to data and information internal to an enterprise, which leads to generic and sometimes fabricated output, the latter commonly referred to as "hallucinations".
As you can imagine, a model's inability to output factual, in-context, and relevant information inhibits an enterprise from confidently exposing the generated content to external users like customers, partners, and distributors. This is especially critical for companies in highly regulated industries such as Healthcare, Life Sciences, Pharmaceuticals, Finance, Insurance, and Telecom.
This is where Retrieval-Augmented Generation (RAG) can help.
In this blog, we will discuss the underlying mechanism behind RAG and how it can be used in an enterprise setting to increase the quality and accuracy of content generated by an LLM so it can be used with a higher degree of confidence.
So what is RAG?
RAG, or Retrieval-Augmented Generation, is a cutting-edge AI framework that enhances large language models (LLMs) by incorporating external knowledge sources. LLMs are notorious for occasional inconsistencies because their grasp of language is statistical rather than grounded in verified facts.
I talked about the key considerations when deploying enterprise-scale Generative-AI at your organization in one of my previous blogs.
RAG strategically grounds an LLM in contemporary information it was not trained on, giving it an "open-book" approach to answering questions. This elevates the quality of responses, keeping them current and accurate because they are based on facts retrieved from reliable sources.
How does RAG work?
RAG starts with organizational data, from structured databases to unstructured documents, standardized into a knowledge repository for seamless integration with a Gen-AI system. This typically involves using an embedding model to convert the content into numerical representations (vectors) that are stored in a vector database, streamlining the retrieval process.
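To make this concrete, here is a minimal indexing sketch in Python, assuming the open-source sentence-transformers library. The model name, documents, and chunking strategy are all illustrative, and a production system would write to a purpose-built vector database rather than a saved NumPy array.

```python
# A minimal indexing sketch: chunk documents, embed each chunk, and
# persist the vectors. Model name, documents, and chunk size are all
# illustrative; a production system would use a real vector database.
from sentence_transformers import SentenceTransformer
import numpy as np

def chunk(text: str, size: int = 500) -> list[str]:
    """Naively split a document into fixed-size character chunks.
    Production pipelines usually split on sentence/section boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]

# Hypothetical content pulled from internal enterprise repositories.
documents = [
    "Our return policy allows refunds within 30 days of purchase...",
    "Troubleshooting guide: if the device fails to start, check...",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works
chunks = [c for doc in documents for c in chunk(doc)]

# Normalizing the vectors lets a plain dot product act as cosine
# similarity at query time.
embeddings = model.encode(chunks, normalize_embeddings=True)
np.save("kb_vectors.npy", embeddings)  # stand-in for a vector database
```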
The actual process happens in two steps, retrieval and augmented content generation. First, algorithms extract pertinent information from the indexed, curated set of enterprise sources based on the user's prompt; that retrieved context is then appended to the prompt so the model can craft a grounded, bespoke response.
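The two steps might look like the following self-contained sketch. The chunk list is a stand-in for the indexed knowledge base built above, and call_llm is a placeholder for whichever hosted or in-house model endpoint your enterprise uses.

```python
# Retrieval + augmented generation in one self-contained sketch. The
# chunk list stands in for your indexed knowledge base, and call_llm is
# a placeholder for whatever model endpoint your enterprise uses.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
chunks = [
    "Our return policy allows refunds within 30 days of purchase.",
    "Troubleshooting: if the device fails to start, check the fuse.",
]
embeddings = model.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Step 1 (retrieval): rank chunks by cosine similarity to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    top = np.argsort(embeddings @ q)[::-1][:k]
    return [chunks[i] for i in top]

def call_llm(prompt: str) -> str:
    """Placeholder: swap in your hosted or in-house LLM's API call."""
    return f"[model response to a {len(prompt)}-character prompt]"

def answer(query: str) -> str:
    """Step 2 (augmentation): ground the prompt in the retrieved facts."""
    context = "\n\n".join(retrieve(query))
    prompt = ("Answer using only the context below; say 'I don't know' "
              f"if it is not covered.\n\nContext:\n{context}\n\n"
              f"Question: {query}")
    return call_llm(prompt)

print(answer("How long do customers have to return a product?"))
```

Instructing the model to answer only from the supplied context is what turns retrieval into grounding: the response stays tied to verifiable enterprise facts instead of the model's pre-training snapshot.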
What data sources to use for RAG?
Given that the idea behind RAG is to have your LLM access content that can be verified and trusted, you must identify the sources of data and information within your enterprise that it can use.
Some examples could include your ERP databases housing financial transactions, supply chain data, and other operational insights, or your CRM systems with proprietary customer databases detailing customer interactions, preferences, transaction history, and more. They could also be internal document repositories storing corporate policies, project specifications, technical documentation, troubleshooting guides, best practices, and team collaborations.
For a use case such as an AI "co-pilot" for a customer service team, you could use data from your ticketing system: the history of customer interactions, chat transcripts, call logs, and documented resolutions. This helps your human agents provide accurate, consistent, and up-to-date information more efficiently, leading to higher customer satisfaction.
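As a hypothetical illustration, records from such a ticketing system could be flattened into plain text before chunking and embedding. Every field name below is an assumption, not any particular vendor's schema.

```python
# A hypothetical normalization step: flatten heterogeneous records
# (tickets, CRM rows, policy docs) into one text shape for the knowledge
# base. Every field name here is an assumption, not a vendor schema.
from dataclasses import dataclass

@dataclass
class KnowledgeRecord:
    source: str   # e.g. "ticketing", "crm", "policy-repo"
    doc_id: str
    text: str     # the content that will later be chunked and embedded

def from_ticket(ticket: dict) -> KnowledgeRecord:
    """Flatten a support ticket and its resolution into plain text."""
    text = (f"Issue: {ticket['subject']}\n"
            f"Transcript: {ticket['transcript']}\n"
            f"Resolution: {ticket['resolution']}")
    return KnowledgeRecord("ticketing", ticket["id"], text)

ticket = {"id": "T-1042", "subject": "Login failure",
          "transcript": "Customer could not log in after a password reset.",
          "resolution": "Cleared stale session tokens; advised cache reset."}
print(from_ticket(ticket).text)
```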
How is RAG applied?
While there are many use cases where RAG can be a game-changing technique, let me touch on a couple that I can see from my industry background.
What to watch out for before using RAG?
Before embarking on RAG for your enterprise needs, make sure you have proper data governance in place and have identified sufficient in-house sources for your knowledge base. Rigorous validation mechanisms must be in place to ensure the accuracy and reliability of both internal and externally sourced data that RAG will reference. Any deployment of RAG with an externally hosted LLM requires several security measures: encryption, access control mechanisms, and robust audit trails are technical components critical for safeguarding sensitive information.
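One of those safeguards, access control, can be enforced at retrieval time so that a user's prompt is only ever augmented with content their role is entitled to see. The roles and sensitivity labels in this sketch are purely illustrative.

```python
# Sketch of one safeguard named above: access control enforced at
# retrieval time, so a prompt is only augmented with content the user's
# role may see. Roles and sensitivity labels are purely illustrative.
ROLE_CLEARANCE = {
    "agent": {"public", "internal"},
    "manager": {"public", "internal", "restricted"},
}

knowledge_base = [  # each chunk carries a sensitivity label as metadata
    {"text": "Published pricing sheet...", "label": "public"},
    {"text": "Internal escalation playbook...", "label": "internal"},
    {"text": "Pending litigation notes...", "label": "restricted"},
]

def authorized_chunks(role: str) -> list[str]:
    """Filter the knowledge base down to what this role may retrieve."""
    allowed = ROLE_CLEARANCE.get(role, {"public"})
    return [c["text"] for c in knowledge_base if c["label"] in allowed]

print(authorized_chunks("agent"))  # the restricted chunk is excluded
```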
Also, for any in-house hosted models, ensure technical compatibility with existing enterprise systems through industry-standard methods such as APIs and webhooks for seamless integration. Well-defined APIs facilitate interoperability, allowing RAG to align with the AI applications, collaboration tools, and databases of your enterprise infrastructure while minimizing disruption during implementation.
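As a sketch of such an integration, the RAG pipeline could be wrapped in a standard HTTP endpoint. FastAPI is used here only as an example framework, and the answer function stands in for the retrieval and generation pipeline sketched earlier.

```python
# Sketch: wrapping the RAG pipeline in a standard HTTP endpoint so other
# enterprise applications can call it. FastAPI is only an example
# framework; `answer` stands in for the pipeline sketched earlier.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Query(BaseModel):
    question: str

def answer(question: str) -> str:
    """Placeholder wired to your retrieval + generation pipeline."""
    return f"[grounded answer for: {question}]"

@app.post("/ask")
def ask(query: Query) -> dict:
    """Chatbots, CRM plugins, and other tools POST questions here."""
    return {"answer": answer(query.question)}

# Assuming this file is saved as rag_api.py, run with:
#   uvicorn rag_api:app --reload
```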
Conclusion
Despite being a relatively new technique within Natural Language Processing (NLP), RAG is an effective way for enterprises to adopt Generative AI for external users.
While there are many tools in an ML toolkit to mold LLMs to your enterprise needs, RAG should be one of the first options you consider: it grounds LLMs in the latest verifiable information, while most other options are either time-consuming or cost-prohibitive. RAG's versatility makes it an ideal solution for external-facing use cases like chatbots, email, text messaging, and other Generative AI based applications.
I hope you find this blog useful. Please share your feedback in the comments, along with other topics you would like me to discuss in the future.