Get Started with RAG Applications

Generative AI applications are everywhere nowadays. A major drawback of LLM-based applications is their reliance on pre-trained data: LLMs often lack the latest information, and they are never trained on an organization's private data.

RAG (Retrieval Augmented Generation) is a popular technique for reusing pre-trained models with organization-level context: you embed all your private documents, retrieve the most relevant ones at query time, and send them to the LLM as context along with the prompt.

With RAG, you don't need to retrain the model on your data. You can use an off-the-shelf LLM to generate responses from private data embedded in a vector store.

At a high level, the RAG architecture works like this: documents are split into chunks and converted into embeddings, which are stored in a vector database; at query time, the user's question is embedded too, the most similar chunks are retrieved, and both the question and those chunks are sent to the LLM.
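As a minimal sketch of that flow (the function and variable names here are illustrative, not from my repo; the real scripts come in the steps below):

# Illustrative sketch of the RAG flow; vectorstore and llm stand for
# LangChain objects like the ones built in the steps below.
def rag_answer(question, vectorstore, llm):
    # 1. Retrieve: find the document chunks most similar to the question
    docs = vectorstore.similarity_search(question, k=4)
    context = "\n\n".join(doc.page_content for doc in docs)
    # 2. Augment: put the retrieved context into the prompt
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    # 3. Generate: the LLM answers from the supplied context, not just its training data
    return llm.invoke(prompt)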



Below is a step-by-step guide to setting up a RAG application on your machine. I used an M1 Mac with 16 GB of RAM for my setup.

Step 1: Download the GPT4All installer from https://gpt4all.io/index.html

Step 2: Download the following two models through the GPT4All app:

a) Llama 3 Instruct (Meta-Llama-3-8B-Instruct.Q4_0.gguf)

b) Nomic Embed Text V1.5 (nomic-embed-text-v1.5.Q4_0.gguf)


Step 3: Clone my GitHub project from https://github.com/brmeena/rag-app-demo
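For example:

git clone https://github.com/brmeena/rag-app-demo.git
cd rag-app-demo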

Step 4: Create a Python virtual environment using the commands below:

python3 -m venv venv

source venv/bin/activate

Step 5: Install all required packages using the command below:

pip install -r requirements.txt
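If you need to recreate requirements.txt yourself, the imports in the scripts below suggest roughly this package list (unpinned and indicative only; the file in the repo is authoritative):

langchain
langchain-community
langchain-chroma
langchain-text-splitters
gpt4all
pymupdf
chromadb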


Step 6: Feed your local documents into the vector DB. I am using ChromaDB as the vector database. Refer to the file gpt4all-embed-docs.py, and change the path to the GPT4All models to match your home directory.

import traceback
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_community.embeddings import GPT4AllEmbeddings

try:
    # Load the PDF and split it into ~500-character chunks for embedding
    loader = PyMuPDFLoader("data/input.pdf")
    data = loader.load()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
    all_splits = text_splitter.split_documents(data)

    persist_directory = "chroma_db"
    model_name = "nomic-embed-text-v1.5.Q4_0.gguf"
    # allow_download should be a real boolean, not the string 'False';
    # adjust model_path to your own home directory
    gpt4all_kwargs = {
        "allow_download": False,
        "model_path": "/Users/user/Library/Application Support/nomic.ai/GPT4All",
    }
    # Embed every chunk and persist the vectors to the chroma_db directory
    vectorstore = Chroma.from_documents(
        documents=all_splits,
        embedding=GPT4AllEmbeddings(model_name=model_name, gpt4all_kwargs=gpt4all_kwargs),
        persist_directory=persist_directory,
    )
except Exception:
    traceback.print_exc()

Run the code below to check whether the embedding was successful:

question = "Who provides revenue management software?"
docs = vectorstore.similarity_search(question)
print(docs[0].page_content)  # inspect the top match

If everything worked, the output will show the document chunks most relevant to the question.

Step 7: Use the embeddings to enrich the LLM's answers. Refer to the file gpt4all-rag.py.

Use the code below to build a retriever and connect it to the LLM (I am using Llama 3 here):

import traceback
from langchain_chroma import Chroma
from langchain_community.embeddings import GPT4AllEmbeddings
from langchain_community.llms import GPT4All
from langchain.chains import ConversationalRetrievalChain

try:
    # Re-open the persisted Chroma store with the same embedding model used in Step 6
    persist_directory = "chroma_db"
    model_name = "nomic-embed-text-v1.5.Q4_0.gguf"
    gpt4all_kwargs = {
        "allow_download": False,
        "model_path": "/Users/user/Library/Application Support/nomic.ai/GPT4All",
    }
    vectorstore = Chroma(
        persist_directory=persist_directory,
        embedding_function=GPT4AllEmbeddings(model_name=model_name, gpt4all_kwargs=gpt4all_kwargs),
    )
    db_retriever = vectorstore.as_retriever()
    # Optional sanity check that retrieval works against the persisted store
    docs = db_retriever.invoke("RFID Experience for User")
    #print(docs)

    # Load the local Llama 3 model and wire it to the retriever
    llm = GPT4All(model="/Users/user/Library/Application Support/nomic.ai/GPT4All/Meta-Llama-3-8B-Instruct.Q4_0.gguf")
    conversation = ConversationalRetrievalChain.from_llm(
        llm,
        retriever=db_retriever,
        return_source_documents=True,
        verbose=False,
    )
    response = conversation.invoke({
        "question": "How does Rategain help in revenue management? Which company benefitted from using it?",
        "chat_history": [],
    })
    #print(response)
    if "answer" in response:
        print(response["answer"])
    else:
        print("no answer")
except Exception:
    # print_exc() takes no exception argument; passing one raises a TypeError
    traceback.print_exc()

Running the script prints the LLM's answer, grounded in the document chunks retrieved from ChromaDB.
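The chain also supports multi-turn conversations: pass your earlier question/answer pairs as chat_history and it will condense the follow-up into a standalone query before retrieving. A minimal sketch, continuing from the code above (the follow-up question is illustrative):

# Feed the previous exchange back in so the follow-up has context
chat_history = [("How does Rategain help in revenue management?", response["answer"])]
followup = conversation.invoke({
    "question": "Can you summarise that in one sentence?",
    "chat_history": chat_history,
})
print(followup["answer"])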

If you face any issues, feel free to connect with me on LinkedIn: Bhola Meena.


#rag #llm #generativeai #llama3 #langchain #chromadb #gpt4all

