How does RAG work?
Muhammad Arham
Graduate Aspirant | Software Engineer | React Developer | AI & ML Enthusiast | Hackathon finalist lablab.ai| Trainer @ iCodeGuru | LeetCode | Innovating Through Code
RAG stands for Retrieval-Augmented Generation.
Let's use an example to understand how RAG works. Suppose you want to ask a large language model (LLM) for the price of a recent BMW model, the BMW M8. You write a prompt like, "What is the price of the BMW M8?"
The LLM may respond with an incorrect or outdated price. It can still sound confident, because it answers from the data it was trained on, and that data stops at a cutoff date. Retraining a model is costly and time-consuming, so it is impractical to update the model every time new information becomes available.
One possible solution is to connect the LLM to a database that holds the actual price of the BMW M8. However, there's a challenge: a conventional database is queried with structured queries or exact keywords, not with free-form questions in English or any other natural language, so it cannot match a question to the relevant records by meaning. So, what can we do?
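This is where a vector database comes into play. Why a vector database? Because it stores information as embeddings: numeric vectors that represent the meaning of text, so a question and the passages that answer it end up close to each other and can be found by similarity search.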
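To make the idea of "close" embeddings concrete, here is a minimal sketch in Python. The three-dimensional vectors are hand-made for illustration only (an assumption, not the output of any real embedding model), and `cosine_similarity` is one common way to compare vectors, not necessarily the one a given vector database uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: values near 1.0 mean similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy "embeddings" (illustrative values only; real embedding
# models produce vectors with hundreds or thousands of dimensions).
toy_embeddings = {
    "BMW M8 price": [0.9, 0.8, 0.1],
    "cost of the BMW M8 coupe": [0.8, 0.9, 0.2],
    "chocolate cake recipe": [0.1, 0.2, 0.9],
}

query = toy_embeddings["BMW M8 price"]
for text, vector in toy_embeddings.items():
    # Texts about the same topic should score higher than unrelated ones.
    print(f"{text!r}: {cosine_similarity(query, vector):.3f}")
```

With these toy values, the two BMW-related texts score much closer to each other than either does to the cake recipe, which is the property a vector database relies on when it searches by meaning.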
Now, let's explore how RAG works:
1. The user submits a search query in natural language.
2. The retrieval component of RAG retrieves relevant information from a knowledge source, such as a large text corpus or a database, based on the user's query.
3. The retrieved passages are ranked and selected based on their relevance to the user's query.
4. To make this ranking possible, the passages are encoded as embeddings that capture the relevant information and the context in which it appears. This is usually done ahead of time, when the knowledge source is indexed.
5. The user's query is encoded in the same way, creating a representation that captures its meaning and context.
6. The encoded query is scored against the encoded passages, and the highest-scoring passages are the ones selected for generating a response.
7. A language generation model, such as GPT-3, takes the selected knowledge and the user's query as input and generates a response based on the combined information.
8. The final output is the generated response.
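The steps above can be sketched end to end in a few lines of Python. This is a simplified illustration under several assumptions: `embed` is a stand-in that counts words instead of calling a real embedding model, the passages are placeholder text (the price is deliberately left as `EXAMPLE_PRICE`), and `generate` is a hypothetical stub where a real system would call an LLM.

```python
import math
import re
from collections import Counter

def embed(text):
    """Stand-in embedding (assumption): a bag-of-words count vector.
    A real system would call an embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, passages, top_k=2):
    """Encode the query, score it against every passage, keep the best top_k."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]

def build_prompt(query, context):
    """Combine the selected passages and the user's query into one prompt."""
    joined = "\n".join(f"- {p}" for p in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"
    )

def generate(prompt):
    """Hypothetical placeholder for the language model call."""
    return f"[LLM response to a prompt of {len(prompt)} characters]"

# Placeholder knowledge source; EXAMPLE_PRICE is not a real figure.
passages = [
    "The BMW M8 Coupe list price is EXAMPLE_PRICE (placeholder value).",
    "The BMW M8 is a high-performance version of the 8 Series.",
    "Chocolate cake needs flour, sugar, cocoa and eggs.",
]

query = "What is the price of the BMW M8?"
context = retrieve(query, passages)
prompt = build_prompt(query, context)
print(prompt)
print(generate(prompt))
```

Word counts are a crude substitute for real embeddings: common words such as "the" and "is" influence the scores, which is one reason production systems use learned embedding models and a vector database instead.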
By combining a vector database with retrieval-augmented generation, the LLM can answer from current, relevant information instead of relying only on its training data, which makes its responses to user queries more accurate and contextually relevant.