What is RAG and Why Should You Care?
Aymen Noor
RAG, or Retrieval-Augmented Generation, merges pre-trained language models with a retrieval system, enabling access to external knowledge sources like the internet or documents for more accurate and reliable responses.
Imagine giving a super-smart LLM a buddy who can fetch helpful information from the internet and other sources. This buddy, combined with the LLM's language skills, makes sure it gives you really accurate and reliable answers. And the cool part is, they can keep learning and getting better without starting from scratch each time! It's like having a clever friend who always has the right facts at hand.
The operational framework of RAG involves two primary components: a retrieval module and a generation module.
Steps in Crafting a Simple RAG with Chroma:
Let's break down the five essential steps in creating an effective Retrieval Augmented Generation system.
Data preparation:
Load the relevant data and break it into smaller chunks.
Adjust chunk_size to balance granularity against noise.
from langchain.document_loaders import DirectoryLoader, TextLoader  # newer releases: langchain_community.document_loaders
from langchain.text_splitter import RecursiveCharacterTextSplitter  # newer releases: langchain_text_splitters
# Load every .txt article and split it into overlapping chunks
loader = DirectoryLoader('./articles/', glob="./*.txt", loader_cls=TextLoader)
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(documents)
chunk_overlap determines how much text adjacent chunks share when using fixed-size chunking, so that a sentence cut at a chunk boundary still appears whole in at least one chunk.
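As a quick illustration (a toy example with deliberately small sizes, not the production values used above), the following sketch shows adjacent chunks repeating text at their boundaries:
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Tiny sizes chosen only to make the overlap easy to see
demo_splitter = RecursiveCharacterTextSplitter(chunk_size=30, chunk_overlap=10)
demo_chunks = demo_splitter.split_text(
    "Retrieval Augmented Generation grounds LLM answers in your own documents."
)
for chunk in demo_chunks:
    print(repr(chunk))  # consecutive chunks repeat up to ~10 characters of text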
Embedding Functions: Embedding functions transform data chunks into numerical vectors. Chroma ships ready-made embedding functions in its embedding_functions utility module, and for a more tailored approach users can also create custom embedding functions.
Embedding functions serve a dual purpose in the retrieval-augmented generation process: they embed the document chunks when they are added to the vector store, and they embed the user's query at retrieval time, so that both live in the same vector space.
import os
from chromadb.utils import embedding_functions
# Embed chunks with OpenAI's text-embedding-ada-002 model
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
embedding_function = embedding_functions.OpenAIEmbeddingFunction(
    api_key=OPENAI_API_KEY,
    model_name="text-embedding-ada-002"
)
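For the custom route mentioned above, here is a minimal sketch of an embedding function backed by a local model. It assumes the sentence-transformers package and the all-MiniLM-L6-v2 model are available, and the __call__ signature shown matches recent Chroma releases (older versions name the parameter differently):
# Minimal custom embedding function sketch (assumes sentence-transformers is installed)
from chromadb import Documents, EmbeddingFunction, Embeddings
from sentence_transformers import SentenceTransformer

class LocalEmbeddingFunction(EmbeddingFunction):
    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        self.model = SentenceTransformer(model_name)

    def __call__(self, input: Documents) -> Embeddings:
        # Encode each chunk into a dense vector (list of floats)
        return self.model.encode(list(input)).tolist()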
Vector Database: The vector database stores pre-computed embeddings of all document chunks for quick retrieval. The embedding function, along with the associated data, is supplied to a vector database instance (a Chroma collection below).
import chromadb
# Create (or reuse) a collection that embeds with the function defined above
chroma_client = chromadb.Client()
vector_store = chroma_client.get_or_create_collection(name="Articles",
                                                      embedding_function=embedding_function)
vector_store.add(ids=id_list, documents=text_content_list, metadatas=metadata_list)
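The add() call above expects three parallel lists. One possible way to build them (a sketch, not the only option) from the chunks produced in the data-preparation step, each of which exposes page_content and metadata:
# Derive ids, raw texts, and per-chunk metadata from the LangChain chunks
id_list = [str(i) for i in range(len(chunks))]
text_content_list = [chunk.page_content for chunk in chunks]
metadata_list = [chunk.metadata for chunk in chunks]  # e.g. {"source": "./articles/foo.txt"}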
Context retrieval: Embed the query, search the vector database with the resulting query vector, and identify the most relevant documents by vector similarity.
# Retrieve the two chunks most similar to the query
results = vector_store.query(
    query_texts=query,
    n_results=2
)
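query() returns a dictionary of parallel lists (one inner list per query text) under the keys "ids", "documents", "metadatas", and "distances". A small sketch of turning the top matches into a single context string, if you prefer to pass only the retrieved text rather than the whole results object to the LLM:
# Unpack the retrieved chunk texts for the first (and only) query
retrieved_docs = results["documents"][0]
context = "\n\n".join(retrieved_docs)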
Query to LLM: The retrieved documents shape the generation module's context and influence the final output. This step feeds the enriched prompt, informed by the retrieved context, into the LLM to generate the final answer.
import os
import google.generativeai as genai

# Authenticate with the Gemini API (assumes GOOGLE_API_KEY is set in the environment)
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

# Build a prompt that wraps the retrieved context around the question
message = [{"role": "user",
            "parts": [
                f"We have provided context information below. \n"
                f"---------------------\n"
                f"{results}"
                f"\n---------------------\n"
                f"Given this information, please answer the question: {query}"
            ]
            }]
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content(message)
response_txt = response.text
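As a small follow-up sketch, the metadata returned by the earlier query() call can be used to show which sources grounded the answer:
# Print the generated answer alongside the source of each retrieved chunk
print(response_txt)
for metadata in results["metadatas"][0]:
    print("source:", metadata.get("source"))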
LLM Challenges and how RAG solves them:
Large Language Models (LLMs) are like super-smart computers that write human-like text, but they have some problems. Here's how Retrieval-Augmented Generation (RAG) helps: by grounding answers in retrieved documents, it makes responses more accurate and reliable, and because the knowledge lives in an external store, the system can keep learning from new sources without being retrained from scratch.