What is RAG and Why Should You Care?
Aymen Noor
RAG, or Retrieval-Augmented Generation, merges pre-trained language models with a retrieval system, enabling access to external knowledge sources like the internet or documents for more accurate and reliable responses.
Imagine giving a super-smart LLM a buddy who can fetch helpful information from the internet and other sources. This buddy, combined with the LLM's language skills, makes sure it gives you really accurate and reliable answers. And the cool part is, they can keep learning and getting better without starting from scratch each time! It's like having a clever friend who always has the right facts at hand.
The operational framework of RAG involves two primary components: a retrieval module and a generation module.
Steps in Crafting a Simple RAG with Chroma:
Let's break down the five essential steps in creating an effective Retrieval Augmented Generation system.
Data preparation:
Load the relevant data and break it into smaller chunks.
Adjust chunk_size to balance granularity against noise.
from langchain.document_loaders import DirectoryLoader, TextLoader  # newer releases: langchain_community.document_loaders
from langchain.text_splitter import RecursiveCharacterTextSplitter  # newer releases: langchain_text_splitters
# Load every .txt article and split it into overlapping chunks
loader = DirectoryLoader('./articles/', glob="./*.txt", loader_cls=TextLoader)
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(documents)
chunk_overlap determines how much text adjacent chunks share when using fixed-size chunking, so that a sentence cut at a chunk boundary still appears whole in at least one chunk.
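As a quick illustration (a toy example with deliberately small sizes, not the production values used above), the following sketch shows adjacent chunks repeating text at their boundaries:
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Tiny sizes chosen only to make the overlap easy to see
demo_splitter = RecursiveCharacterTextSplitter(chunk_size=30, chunk_overlap=10)
demo_chunks = demo_splitter.split_text(
    "Retrieval Augmented Generation grounds LLM answers in your own documents."
)
for chunk in demo_chunks:
    print(repr(chunk))  # consecutive chunks repeat up to ~10 characters of text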
Embedding Functions: Embedding functions transform data chunks into numerical vectors. Chroma ships ready-made embedding functions in its embedding_functions utility module, and for a more tailored approach users can also create custom embedding functions.
Embedding functions serve a dual purpose in the retrieval-augmented generation process: they embed the document chunks when they are added to the vector store, and they embed the user's query at retrieval time, so that both live in the same vector space.
import os
from chromadb.utils import embedding_functions
# Embed chunks with OpenAI's text-embedding-ada-002 model
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
embedding_function = embedding_functions.OpenAIEmbeddingFunction(
    api_key=OPENAI_API_KEY,
    model_name="text-embedding-ada-002"
)
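For the custom route mentioned above, here is a minimal sketch of an embedding function backed by a local model. It assumes the sentence-transformers package and the all-MiniLM-L6-v2 model are available, and the __call__ signature shown matches recent Chroma releases (older versions name the parameter differently):
# Minimal custom embedding function sketch (assumes sentence-transformers is installed)
from chromadb import Documents, EmbeddingFunction, Embeddings
from sentence_transformers import SentenceTransformer

class LocalEmbeddingFunction(EmbeddingFunction):
    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        self.model = SentenceTransformer(model_name)

    def __call__(self, input: Documents) -> Embeddings:
        # Encode each chunk into a dense vector (list of floats)
        return self.model.encode(list(input)).tolist()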
Vector Database: The vector database stores pre-computed embeddings of all document chunks for quick retrieval. The embedding function, along with the associated data, is supplied to a vector database instance (a Chroma collection below).
import chromadb
# Create (or reuse) a collection that embeds with the function defined above
chroma_client = chromadb.Client()
vector_store = chroma_client.get_or_create_collection(name="Articles",
                                                      embedding_function=embedding_function)
vector_store.add(ids=id_list, documents=text_content_list, metadatas=metadata_list)
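The add() call above expects three parallel lists. One possible way to build them (a sketch, not the only option) from the chunks produced in the data-preparation step, each of which exposes page_content and metadata:
# Derive ids, raw texts, and per-chunk metadata from the LangChain chunks
id_list = [str(i) for i in range(len(chunks))]
text_content_list = [chunk.page_content for chunk in chunks]
metadata_list = [chunk.metadata for chunk in chunks]  # e.g. {"source": "./articles/foo.txt"}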
Context retrieval: Embed the query, search the vector database with the resulting query vector, and identify the most relevant documents by vector similarity.
# Retrieve the two chunks most similar to the query
results = vector_store.query(
    query_texts=query,
    n_results=2
)
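query() returns a dictionary of parallel lists (one inner list per query text) under the keys "ids", "documents", "metadatas", and "distances". A small sketch of turning the top matches into a single context string, if you prefer to pass only the retrieved text rather than the whole results object to the LLM:
# Unpack the retrieved chunk texts for the first (and only) query
retrieved_docs = results["documents"][0]
context = "\n\n".join(retrieved_docs)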
Query to LLM: The retrieved documents shape the generation module's context and influence the final output. This step feeds the enriched prompt, informed by the retrieved context, into the LLM to generate the final answer.
import os
import google.generativeai as genai

# Authenticate with the Gemini API (assumes GOOGLE_API_KEY is set in the environment)
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

# Build a prompt that wraps the retrieved context around the question
message = [{"role": "user",
            "parts": [
                f"We have provided context information below. \n"
                f"---------------------\n"
                f"{results}"
                f"\n---------------------\n"
                f"Given this information, please answer the question: {query}"
            ]
            }]
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content(message)
response_txt = response.text
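As a small follow-up sketch, the metadata returned by the earlier query() call can be used to show which sources grounded the answer:
# Print the generated answer alongside the source of each retrieved chunk
print(response_txt)
for metadata in results["metadatas"][0]:
    print("source:", metadata.get("source"))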
LLM Challenges and how RAG solves them:
Large Language Models (LLMs) are like super-smart computers that write human-like text, but they have some problems. Here's how Retrieval-Augmented Generation (RAG) helps: by grounding answers in retrieved documents, it makes responses more accurate and reliable, and because the knowledge lives in an external store, the system can keep learning from new sources without being retrained from scratch.