What Is Retrieval-Augmented Generation (RAG)?
Ayesha Shahzad
Google Advanced Data Analytics Certified | Exploring Data Science | Machine Learning | Deep Learning | Mathematician
Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.
Why is Retrieval-Augmented Generation (RAG) important?
RAG addresses some key challenges with large language models, including:
How does Retrieval-Augmented Generation (RAG) work?
RAG has two phases: retrieval and content generation. In the retrieval phase, algorithms search for and retrieve snippets of information relevant to the user’s prompt or question. The retrieved context can come from multiple data sources, such as document repositories, databases, or APIs. The retrieved context is then provided as input to a generator model, which is typically a large language model (LLM). The generator model uses the retrieved context to inform its generated text output, producing a response that is grounded in the relevant facts and knowledge.
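The two phases can be sketched in a few lines of Python. This is a minimal illustration, not a real implementation: `rag_answer`, `fixed_retriever`, and `placeholder_generator` are hypothetical names, the retriever returns canned passages instead of searching a data source, and the generator is a stand-in for a call to an actual LLM.

```python
from typing import Callable, List

# Hypothetical type aliases: a retriever maps a query to passages,
# a generator maps a prompt to generated text.
Retriever = Callable[[str], List[str]]
Generator = Callable[[str], str]


def rag_answer(query: str, retriever: Retriever, generator: Generator) -> str:
    """Run the two RAG phases: retrieval, then content generation."""
    # Phase 1: retrieve snippets relevant to the user's query.
    passages = retriever(query)

    # Combine the retrieved context with the original query.
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    prompt = (
        "Answer the question using only the passages below.\n\n"
        f"{numbered}\n\nQuestion: {query}"
    )

    # Phase 2: the generator produces text grounded in that context.
    return generator(prompt)


def fixed_retriever(query: str) -> List[str]:
    """Stand-in retriever (assumption): always returns the same passages."""
    return [
        "Apollo 11 landed humans on the Moon on July 20, 1969.",
        "Neil Armstrong was the first person to step onto the lunar surface.",
    ]


def placeholder_generator(prompt: str) -> str:
    """Stand-in for an LLM call (assumption): only reports the prompt size."""
    return f"<LLM response to a prompt of {len(prompt)} characters>"


if __name__ == "__main__":
    print(rag_answer("Who first walked on the Moon?", fixed_retriever, placeholder_generator))
```

Passing the retriever and generator as functions keeps the two phases separate, so either one could be swapped for a real search index or a real model without changing `rag_answer`.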
To make the formats compatible, both the document collection (the knowledge library) and user-submitted queries are converted to numerical representations using embedding language models. Embedding is the process of representing text as a numerical vector in a vector space. RAG architectures compare the embedding of the user query with the embedded vectors of the knowledge library to find the most similar documents. The original user prompt is then augmented with relevant context from those documents, and this augmented prompt is sent to the foundation model.
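The embedding-and-comparison step can be illustrated with a small, self-contained sketch. It assumes a toy embedding (word counts) in place of a real embedding language model, and cosine similarity as the comparison measure; the function names and the sample knowledge library are made up for illustration.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding (assumption): a sparse word-count vector.

    A real system would call an embedding language model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: List[str], k: int = 1) -> List[str]:
    """Return the k documents whose embeddings are most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, embed(doc)),
        reverse=True,
    )
    return ranked[:k]


def augment_prompt(query: str, context: List[str]) -> str:
    """Add the retrieved context to the original user prompt."""
    context_block = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{context_block}\n\nQuestion: {query}"


# Hypothetical knowledge library for illustration.
knowledge_library = [
    "Apollo 11 landed humans on the Moon on July 20, 1969.",
    "The Eiffel Tower is located in Paris, France.",
    "Python is a popular programming language.",
]

query = "When did humans land on the Moon?"
top_docs = retrieve(query, knowledge_library, k=1)
prompt = augment_prompt(query, top_docs)
print(prompt)  # this augmented prompt is what would be sent to the foundation model
```

Word counts only match exact words, so this toy embedding misses synonyms and paraphrases; learned embeddings are used in practice because they place texts with similar meaning close together in the vector space.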
To understand Retrieval-Augmented Generation (RAG) in a simple way, you can use a straightforward analogy and a practical example.
Analogy: The Librarian and the Storyteller
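Imagine a classroom with two key figures: a librarian, who finds and fetches relevant books and information (the retrieval step), and a storyteller, who uses that material to craft the story (the generation step).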
When the storyteller needs to create a new story on a specific topic, they ask the librarian to fetch relevant books and information. Using this information, the storyteller crafts a well-informed and accurate story.
Practical Example:
Let's apply this analogy to a simple example. Assume we want to generate a short paragraph about a historical event using RAG.
Step-by-Step Example:
Initial Query:
User's Input: "Tell me about the Moon Landing in 1969."
Retrieval Step:
The system first searches its internal or external knowledge base (like a database of documents or the internet) to find relevant information about the Moon Landing in 1969. For simplicity, let's assume it retrieves the following two key pieces of information:
"Apollo 11 was the spaceflight that first landed humans on the Moon. Commander Neil Armstrong and lunar module pilot Buzz Aldrin formed the American crew that landed the Apollo Lunar Module Eagle on July 20, 1969."
"Neil Armstrong became the first person to step onto the lunar surface, and Buzz Aldrin joined him 19 minutes later. They spent about two and a quarter hours together outside the spacecraft, and collected 47.5 pounds of lunar material to bring back to Earth."
Generation Step:
Using the retrieved information, the system (storyteller) generates a coherent and informative paragraph:
"In 1969, NASA achieved a monumental milestone with the Apollo 11 mission. On July 20, astronauts Neil Armstrong and Buzz Aldrin made history as they became the first humans to land on the Moon. Armstrong, the mission commander, was the first to step onto the lunar surface, followed by Aldrin. Together, they spent over two hours exploring the Moon and collected nearly 48 pounds of lunar rocks and soil to bring back to Earth."