Get Started with RAG Applications

Generative AI applications are everywhere nowadays. A major drawback of LLM-based applications is their reliance on pre-trained data: LLMs often lack the latest information, and they are never trained on an organization's private data.

RAG (Retrieval Augmented Generation) is a popular technique for reusing pre-trained models with organization-level context: you embed all your private documents, retrieve the most relevant ones at query time, and send them to the LLM as context along with the prompt.

With RAG, you don't need to retrain the model on your data. You can use an off-the-shelf LLM to generate responses from private data embedded in a vector store.

At a high level, the RAG architecture works like this: documents are split into chunks and converted into embeddings, which are stored in a vector database; at query time, the user's question is embedded too, the most similar chunks are retrieved, and both the question and those chunks are sent to the LLM.
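As a minimal sketch of that flow (the function and variable names here are illustrative, not from my repo; the real scripts come in the steps below):

# Illustrative sketch of the RAG flow; vectorstore and llm stand for
# LangChain objects like the ones built in the steps below.
def rag_answer(question, vectorstore, llm):
    # 1. Retrieve: find the document chunks most similar to the question
    docs = vectorstore.similarity_search(question, k=4)
    context = "\n\n".join(doc.page_content for doc in docs)
    # 2. Augment: put the retrieved context into the prompt
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    # 3. Generate: the LLM answers from the supplied context, not just its training data
    return llm.invoke(prompt)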



Below is a step-by-step guide to setting up a RAG application on your machine. I used an M1 Mac with 16 GB of RAM for my setup.

Step 1: Download the GPT4All installer from https://gpt4all.io/index.html

Step 2: Download the following two models through the GPT4All app:

a) Llama 3 Instruct (Meta-Llama-3-8B-Instruct.Q4_0.gguf)

b) Nomic Embed Text V1.5 (nomic-embed-text-v1.5.Q4_0.gguf)


Step 3: Clone my GitHub project from https://github.com/brmeena/rag-app-demo
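For example:

git clone https://github.com/brmeena/rag-app-demo.git
cd rag-app-demo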

Step 4: Create a Python virtual environment using the commands below:

python3 -m venv venv

source venv/bin/activate

Step 5: Install all required packages using the command below:

pip install -r requirements.txt
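If you need to recreate requirements.txt yourself, the imports in the scripts below suggest roughly this package list (unpinned and indicative only; the file in the repo is authoritative):

langchain
langchain-community
langchain-chroma
langchain-text-splitters
gpt4all
pymupdf
chromadb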


Step 6: Feed your local documents into the vector DB. I am using ChromaDB as the vector database. Refer to the file gpt4all-embed-docs.py, and change the path to the GPT4All models to match your home directory.

import traceback
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_community.embeddings import GPT4AllEmbeddings

try:
    # Load the PDF and split it into ~500-character chunks for embedding
    loader = PyMuPDFLoader("data/input.pdf")
    data = loader.load()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
    all_splits = text_splitter.split_documents(data)

    persist_directory = "chroma_db"
    model_name = "nomic-embed-text-v1.5.Q4_0.gguf"
    # allow_download should be a real boolean, not the string 'False';
    # adjust model_path to your own home directory
    gpt4all_kwargs = {
        "allow_download": False,
        "model_path": "/Users/user/Library/Application Support/nomic.ai/GPT4All",
    }
    # Embed every chunk and persist the vectors to the chroma_db directory
    vectorstore = Chroma.from_documents(
        documents=all_splits,
        embedding=GPT4AllEmbeddings(model_name=model_name, gpt4all_kwargs=gpt4all_kwargs),
        persist_directory=persist_directory,
    )
except Exception:
    traceback.print_exc()

Run the code below to check whether the embedding was successful:

question = "Who provides revenue management software?"
docs = vectorstore.similarity_search(question)
print(docs[0].page_content)  # inspect the top match

If everything worked, the output will show the document chunks most relevant to the question.

Step 7: Use the embeddings to enrich the LLM's answers. Refer to the file gpt4all-rag.py.

Use the code below to build a retriever and connect it to the LLM (I am using Llama 3 here):

import traceback
from langchain_chroma import Chroma
from langchain_community.embeddings import GPT4AllEmbeddings
from langchain_community.llms import GPT4All
from langchain.chains import ConversationalRetrievalChain

try:
    # Re-open the persisted Chroma store with the same embedding model used in Step 6
    persist_directory = "chroma_db"
    model_name = "nomic-embed-text-v1.5.Q4_0.gguf"
    gpt4all_kwargs = {
        "allow_download": False,
        "model_path": "/Users/user/Library/Application Support/nomic.ai/GPT4All",
    }
    vectorstore = Chroma(
        persist_directory=persist_directory,
        embedding_function=GPT4AllEmbeddings(model_name=model_name, gpt4all_kwargs=gpt4all_kwargs),
    )
    db_retriever = vectorstore.as_retriever()
    # Optional sanity check that retrieval works against the persisted store
    docs = db_retriever.invoke("RFID Experience for User")
    #print(docs)

    # Load the local Llama 3 model and wire it to the retriever
    llm = GPT4All(model="/Users/user/Library/Application Support/nomic.ai/GPT4All/Meta-Llama-3-8B-Instruct.Q4_0.gguf")
    conversation = ConversationalRetrievalChain.from_llm(
        llm,
        retriever=db_retriever,
        return_source_documents=True,
        verbose=False,
    )
    response = conversation.invoke({
        "question": "How does Rategain help in revenue management? Which company benefitted from using it?",
        "chat_history": [],
    })
    #print(response)
    if "answer" in response:
        print(response["answer"])
    else:
        print("no answer")
except Exception:
    # print_exc() takes no exception argument; passing one raises a TypeError
    traceback.print_exc()

Running the script prints the LLM's answer, grounded in the document chunks retrieved from ChromaDB.
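The chain also supports multi-turn conversations: pass your earlier question/answer pairs as chat_history and it will condense the follow-up into a standalone query before retrieving. A minimal sketch, continuing from the code above (the follow-up question is illustrative):

# Feed the previous exchange back in so the follow-up has context
chat_history = [("How does Rategain help in revenue management?", response["answer"])]
followup = conversation.invoke({
    "question": "Can you summarise that in one sentence?",
    "chat_history": chat_history,
})
print(followup["answer"])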

If you face any issues, feel free to connect with me on LinkedIn: Bhola Meena.


#rag #llm #generativeai #llama3 #langchain #chromadb #gpt4all

