How does RAG work?
Introduction to RAG.

RAG stands for Retrieval-Augmented Generation.

Let's use an example to understand how RAG works. Suppose you want to ask a large language model (LLM) for the price of BMW's latest model, the BMW M8. You write a prompt like, "What is the price of the BMW M8?"

The LLM may respond with an incorrect or outdated price. It can still sound confident, because it answers from the information it was trained on, and that information stops at the time of training. Retraining a model is costly and time-consuming, so it is impractical to update the model every time new information becomes available.

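One possible solution for retrieving the actual price of the BMW M8 is to connect the LLM to a database. However, there's a challenge: a conventional database is queried with exact keywords or structured queries, not with free-form questions in English or any other natural language, so the system needs a way to find the records that match the meaning of the user's question. So, what can we do?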

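This is where a vector database comes into play. Why a vector database? Because it stores information as embeddings: numeric vectors that represent the meaning of a piece of text, so a question and the passages that answer it end up close together and can be found by similarity search.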
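To make the idea of comparing embeddings concrete, here is a minimal sketch. It assumes a toy `embed` function that just counts words; a real system would use a learned embedding model instead, and the example sentences are made up for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): a word-count vector."""
    cleaned = text.lower().replace("?", "").replace(".", "")
    return Counter(cleaned.split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = embed("What is the price of the BMW M8?")
doc_price = embed("The BMW M8 price is listed on the dealer site.")
doc_other = embed("Bananas are rich in potassium.")

# The passage about the BMW M8 scores higher than the unrelated one.
print(cosine_similarity(query, doc_price) > cosine_similarity(query, doc_other))
```

A vector database does the same kind of comparison, but over many stored vectors and with indexes that keep the search fast.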


Now, let's explore how RAG works:

1. The user submits a search query in natural language.

2. The retrieval component of RAG retrieves relevant information from a knowledge source, such as a large text corpus or a database, based on the user's query.

3. The retrieved passages are ranked and selected based on their relevance to the user's query.

4. The selected passages are encoded into a contextualized representation (an embedding) that captures the relevant information and the context in which it appears.

5. The user's query is encoded to create a representation that captures its meaning and context.

6. The encoded query is scored against the encoded passages to determine the most relevant information for generating a response.

7. A language generation model, such as GPT-3, takes the selected knowledge and the user's query as input and generates a response based on the combined information.

8. The final output is the generated response.
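The steps above can be sketched in a few lines of Python. This is a simplified illustration, not a production pipeline: `embed` is a toy word-count encoder standing in for a real embedding model, `fake_llm` is a hypothetical placeholder for a call to a real language model, and the knowledge base entries are made-up placeholder text. Retrieval, ranking, and scoring are collapsed into a single `retrieve` function.

```python
import math
import re
from collections import Counter


def embed(text):
    # Toy stand-in for a real embedding model (assumption): word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, top_k=2):
    # Encode the query, score it against every passage, keep the best ones.
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, context):
    # Combine the selected passages and the user's query into one prompt.
    joined = "\n".join(f"- {p}" for p in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {query}"
    )


def fake_llm(prompt):
    # Hypothetical placeholder: a real system would call an LLM here.
    return f"[model response to a {len(prompt)}-character prompt]"


def answer(query, passages, llm=fake_llm):
    context = retrieve(query, passages)
    return llm(build_prompt(query, context))


# Made-up knowledge base; the price text is a placeholder, not real data.
KNOWLEDGE_BASE = [
    "The price of the BMW M8 is a placeholder value.",
    "The BMW M8 is a high-performance coupe.",
    "Bananas are rich in potassium.",
]

question = "What is the price of the BMW M8?"
print(build_prompt(question, retrieve(question, KNOWLEDGE_BASE)))
print(answer(question, KNOWLEDGE_BASE))
```

Swapping `fake_llm` for a real model call and `embed` for a real embedding model, backed by a vector database, keeps the same overall shape: retrieve, build the prompt, generate.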

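By combining a vector database with the retrieval-augmented generation approach, the LLM can ground its answers in up-to-date information retrieved at query time, which makes its responses more accurate and contextually relevant without retraining the model.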
