Stick a RAG in it…
Have you ever looked at a tender or questions from a customer or client and thought – “These will be easy to answer! I know exactly where that information is…” – me neither, but with the current prominence of Generative AI, maybe that could be the answer. After a recent deluge of questions and some time on my hands, I decided it deserved some investigation.
Is Generative AI the answer?
We all work in complex roles within data and information-dense organisations. Navigating internal and external data to find the correct answers for customers, partners, and colleagues is often challenging. I’m sure everyone reading this has at least tried out one of the current crop of excellent Generative AI platforms. One question keeps jumping into my mind: Can this generation of Large Language Models be used to make my work more efficient while ensuring good information security and compliance with business policies? At first glance, the answer appears to be no. Most organisations have placed heavy restrictions on (or completely banned) employees from using publicly hosted or operated Generative AI for reasons such as:
Usually, all these points would drive people back to coping with the lousy search tools and continuing to hunt down disparate information. However, I have a history with AI. As a recovered LISP programmer, I know that sometimes you have to accept the 60 sets of nested brackets and keep looking deeper. So, I started my quest for answers – well, for something that could generate answers…
The current generation of Large Language Models (LLMs) excels at communicating with humans in natural language. These models are typically trained on vast public datasets, making them good at providing general information and generating responses even when they lack specific knowledge. However, they cannot handle specialised, private, or domain-specific information. To enable Generative AI to access non-public information, or to deliver answers based on a curated set of specific data, it's necessary to supplement the LLM with this extra information.
The two common approaches to extending LLMs are finetuning and Retrieval Augmented Generation. Let's have a look at how these approaches might fit.
Fine-tuning
In finetuning, we effectively send the LLM back to school, teaching it new subjects by embedding further information directly into the model. This allows the model to be adapted to specific knowledge domains or information sets. The process is one-way (like school): you can't easily remove information once it has been finetuned into the model, and the model will need retuning whenever the domain or information set changes.
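For the curious, here's a rough sketch of what finetuning looks like in code, using the open-source Hugging Face transformers library. The base model, the data file and the hyperparameters are purely illustrative assumptions: a sketch of the idea, not a recipe.

```python
# Minimal supervised finetuning sketch with Hugging Face transformers.
# Model name, file name and hyperparameters are illustrative only.
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

model_name = "gpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assume the domain knowledge has been flattened into a plain-text file.
dataset = load_dataset("text", data_files={"train": "domain_knowledge.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model",
                           num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()                         # bake the new information into the weights
trainer.save_model("finetuned-model")   # the knowledge is now part of the model
```

Note the last line: once training finishes, the new information lives inside the weights themselves, which is exactly why removing or refreshing it later is awkward.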
Advantages of Fine-tuning
Disadvantages of Fine-tuning
RAG – Retrieval Augmented Generation
If finetuning is like sending the LLM back to school, Retrieval Augmented Generation (RAG) is like a skilled debater or orator who consults a well-stocked, near-instantaneous library to answer questions.
In this analogy, the debater (the language model) is already knowledgeable and articulate, but they turn to a library (the external database) for specific facts, quotes, or references. This enhances their arguments with accurate, detailed, and relevant information beyond their internal knowledge, just as RAG uses external data to improve the language model's responses.
This approach gives Generative AI access to well-curated domain knowledge without the need to directly embed the domain information into the model.
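To make the flow concrete, here is a deliberately toy sketch of the retrieve-then-generate loop at the heart of RAG. The knowledge base, the crude word-overlap scoring and the llm_generate() stub are placeholders of my own invention; a real system would use semantic embeddings (more on those below) and an actual LLM.

```python
# Toy illustration of the retrieve-then-generate flow behind RAG.

knowledge_base = [
    "Our support hours are 09:00-17:00 GMT, Monday to Friday.",
    "All customer data is stored in EU-based data centres.",
    "The standard warranty period for hardware is three years.",
]

def score(question: str, passage: str) -> int:
    """Crude relevance score: count the words the question and passage share."""
    return len(set(question.lower().split()) & set(passage.lower().split()))

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k passages that best match the question."""
    return sorted(knowledge_base, key=lambda p: score(question, p), reverse=True)[:k]

def llm_generate(prompt: str) -> str:
    """Stand-in for a call to a real LLM; here we just echo the prompt."""
    return f"[LLM answer based on prompt]\n{prompt}"

question = "Where is customer data stored?"
context = "\n".join(retrieve(question))
prompt = ("Answer the question using only the context below.\n"
          f"Context:\n{context}\n\nQuestion: {question}")
print(llm_generate(prompt))
```

The important point is that the model's weights never change: the curated knowledge is fetched at question time and handed to the model inside the prompt.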
Eating data chunks
This dramatically simplifies curating, updating and removing information, giving an agile and adaptable approach. It also lets us integrate a wide range of information sources, whether flat files, documents, existing information stores or data-driven systems. However, to get the best results from a Retrieval Augmented Generation system, the information destined for the knowledge database needs to be prepared and processed in a specific way: we must create a “semantic embeddings” store from our information set.
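As the section title hints, the usual first step in that preparation is “chunking”: splitting source documents into overlapping pieces small enough to embed and retrieve individually. Below is a minimal, self-contained sketch; the chunk size, overlap and sample text are arbitrary choices for illustration.

```python
# Minimal chunking sketch: split a document into overlapping word windows
# before embedding and storing them. Sizes are illustrative only.

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of roughly chunk_size words, overlapping so that
    sentences straddling a boundary are not lost."""
    words = text.split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunk = " ".join(words[start:start + chunk_size])
        if chunk:
            chunks.append(chunk)
    return chunks

# Stand-in for a real source document.
document = ("Our standard warranty covers parts and labour for three years. "
            "Extended cover can be purchased within 90 days of delivery. " * 20)

for i, chunk in enumerate(chunk_text(document)):
    print(i, chunk[:60], "...")
```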
Large Language Model (LLM) semantic embeddings are a method of converting words or text into numbers; these numbers allow computers to understand and process language. Imagine each piece of text as a point in a massive multi-dimensional space: texts with similar meanings sit close together, while different meanings are farther apart. These embeddings are created by a model trained specifically on large text datasets, enabling it to recognise patterns and context in language. Essentially, embeddings are like a detailed map of language, with every word or phrase having unique coordinates that represent its meaning, helping computers "understand" language.
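Here's a small sketch of what that looks like in practice, using the open-source sentence-transformers library. The model name is just one common choice and the passages are made up; the point is simply that the question lands closest to the passage whose meaning is most similar.

```python
# Semantic embeddings sketch with sentence-transformers.
# Model name and passages are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

passages = [
    "The standard warranty period for hardware is three years.",
    "Customer data is held in EU-based data centres.",
    "Support is available Monday to Friday during business hours.",
]
question = "How long is the hardware guaranteed for?"

passage_vectors = model.encode(passages)   # one coordinate vector per passage
question_vector = model.encode(question)

# Cosine similarity: higher means the meanings are closer in the vector space.
scores = util.cos_sim(question_vector, passage_vectors)[0]
best = scores.argmax().item()
print(f"Best match ({float(scores[best]):.2f}): {passages[best]}")
```

Notice that the question and the best-matching passage share almost no words; the match is made on meaning, which is exactly what a keyword search can't do.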
Advantages of RAG with LLMs
Disadvantages of RAG with LLMs
There are other approaches to enabling LLMs to access additional domain-specific information, but for this application, Retrieval Augmented Generation meets all my requirements. Having settled on a method, the next step was to work out how to build a RAG system of my own. So began project HAILE (Hallucinating Artificial Intelligent Language Elucidator)…
In the next post, we'll look at building a RAG system from open-source libraries and tools that can take your data and produce valuable answers…