What Is Retrieval-Augmented Generation (RAG)?

Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.

Why is Retrieval-Augmented Generation (RAG) important?

RAG addresses some key challenges with large language models, including:

  • Knowledge cutoff: An LLM's knowledge is frozen at the time its training data was collected. RAG provides access to external, current knowledge, enabling LLMs to generate more accurate and reliable responses.
  • Hallucination risks: LLMs may generate responses that are not factually accurate or relevant to the query. RAG allows LLMs to draw upon external knowledge sources to supplement their internal representation of information, reducing the risk of hallucinations.
  • Contextual limitations: LLMs lack context from private data, which leads to hallucinations when they are asked domain- or company-specific questions. RAG supplies up-to-date information about the world and domain-specific data to your GenAI applications, enabling them to generate more informed answers.
  • Auditability: RAG allows GenAI applications to cite their sources, making it easier to verify and track the information used to generate responses.

How does Retrieval-Augmented Generation (RAG) work?

RAG has two phases: retrieval and content generation. In the retrieval phase, algorithms search for and retrieve snippets of information relevant to the user's prompt or question. This context can come from multiple data sources, such as document repositories, databases, or APIs. It is then provided as input to a generator model, typically a large language model (LLM), which uses it to produce a response grounded in the relevant facts and knowledge.
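
As a minimal sketch of that flow, the two phases can be wired together in a few lines of Python. The names retrieve, llm, and rag_answer are illustrative placeholders, not any particular framework's API; a real system would plug in an actual search backend and model call:

```python
# A minimal sketch of the two RAG phases. Only the control flow is the
# point here; retrieve() and llm() are stubs for a real search backend
# and a real model call.

def retrieve(query: str) -> list[str]:
    """Retrieval phase: fetch snippets relevant to the query from
    document repositories, databases, or APIs (stubbed out here)."""
    return ["<snippet 1 relevant to the query>", "<snippet 2>"]

def llm(prompt: str) -> str:
    """Generation phase: stand-in for a call to a large language model."""
    return f"<answer grounded in: {prompt[:40]}...>"

def rag_answer(query: str) -> str:
    context = retrieve(query)                         # 1. retrieve context
    prompt = "\n".join(context) + "\n\nQ: " + query   # 2. augment the prompt
    return llm(prompt)                                # 3. generate the answer

print(rag_answer("Tell me about the Moon Landing in 1969."))
```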

To make the formats compatible, the document collection (or knowledge library) and user-submitted queries are both converted to numerical representations using embedding language models. Embedding is the process by which text is given a numerical representation in a vector space. RAG architectures compare the embedding of the user's query against the embeddings of the documents in the knowledge library. The original user prompt is then appended with relevant context from the most similar documents, and this augmented prompt is sent to the foundation model.
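
As a toy illustration of that comparison step, the sketch below uses hand-written three-dimensional vectors in place of real embeddings (which an embedding model would produce, typically with hundreds or thousands of dimensions) and ranks documents by cosine similarity:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors: 1.0 means they point the
    same way, values near 0 mean they are unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hand-written 3-d vectors standing in for model-produced embeddings.
query_embedding = np.array([0.9, 0.1, 0.2])
library_embeddings = [
    np.array([0.8, 0.2, 0.1]),  # points the same way as the query -> relevant
    np.array([0.1, 0.9, 0.7]),  # points elsewhere -> irrelevant
]

scores = [cosine_similarity(query_embedding, d) for d in library_embeddings]
print(int(np.argmax(scores)))  # 0: the first document is the closest match
```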


To understand Retrieval-Augmented Generation (RAG) intuitively, consider a straightforward analogy followed by a practical example.

Analogy: The Librarian and the Storyteller

Imagine a classroom with two key figures:

  1. The Librarian: This person knows where every book in the library is located and can quickly find specific information from these books.
  2. The Storyteller: This person is excellent at weaving stories but relies on information provided by the librarian to ensure accuracy and detail.

When the storyteller needs to create a new story on a specific topic, they ask the librarian to fetch relevant books and information. Using this information, the storyteller crafts a well-informed and accurate story.

Practical Example:

Let's apply this analogy to a simple example. Assume we want to generate a short paragraph about a historical event using RAG.

Step-by-Step Example:

  1. Initial Query
  2. Retrieval Step (The Librarian)
  3. Generation Step (The Storyteller)

Initial Query:

User's Input: "Tell me about the Moon Landing in 1969."

Retrieval Step:

The system first searches its internal or external knowledge base (like a database of documents or the internet) to find relevant information about the Moon Landing in 1969. For simplicity, let's assume it retrieves the following two key pieces of information:

"Apollo 11 was the spaceflight that first landed humans on the Moon. Commander Neil Armstrong and lunar module pilot Buzz Aldrin formed the American crew that landed the Apollo Lunar Module Eagle on July 20, 1969."

"Neil Armstrong became the first person to step onto the lunar surface, and Buzz Aldrin joined him 19 minutes later. They spent about two and a quarter hours together outside the spacecraft, and collected 47.5 pounds of lunar material to bring back to Earth."

Generation Step:

Using the retrieved information, the system (storyteller) generates a coherent and informative paragraph:

"In 1969, NASA achieved a monumental milestone with the Apollo 11 mission. On July 20, astronauts Neil Armstrong and Buzz Aldrin made history as they became the first humans to land on the Moon. Armstrong, the mission commander, was the first to step onto the lunar surface, followed by Aldrin. Together, they spent over two hours exploring the Moon and collected nearly 48 pounds of lunar rocks and soil to bring back to Earth."



