Embracing the Future of Natural Language Processing - RAG
Babloo Kumar
Principal Product Developer | JavaScript | Python | Angular | Azure Dev Stack | Artificial Intelligence | Computer Vision | Algorithms | Blazor | React | NodeJs | TypeScript | Carbon Footprint
Retrieval-Augmented Generation (RAG) systems are transforming how we approach NLP by seamlessly integrating retrieval and generation models. This innovative approach bridges the gap between factual accuracy and language fluency, opening new frontiers in dialogue systems, question answering, and content generation.
Let’s break down how RAG systems work using a simple example: Imagine a chatbot querying a private knowledge base using langchain (Orchestrator), AzureOpenAI for embeddings & GPT-4 models, and FAISS for local vector DB.
1️⃣ Ingestion Phase: First, we store documents in the vector database by:
# load the document --> split the document into chunks --> create embeddings --> store in vector database
# Split the entire knowledge base or collection of documents into chunks; each chunk represents a single piece of context to be queried.
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter

pdf_path = "Vectors-in-memory/2210.03629v3.pdf"
loader = PyPDFLoader(file_path=pdf_path)
documents = loader.load()
text_splitter = CharacterTextSplitter(
    chunk_size=1000, chunk_overlap=30, separator="\n"
)
docs = text_splitter.split_documents(documents=documents)
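To make the chunking step concrete, here is a minimal plain-Python sketch of fixed-size splitting with overlap — the idea behind `chunk_size` and `chunk_overlap` above. This is an illustrative helper, not the LangChain implementation:

```python
# Hypothetical sketch: fixed-size chunking with overlap, so that
# context spanning a chunk boundary is not lost entirely.
def split_into_chunks(text, chunk_size=1000, chunk_overlap=30):
    chunks = []
    start = 0
    step = chunk_size - chunk_overlap  # each chunk re-includes the tail of the previous one
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks

sample = "abcdefghij" * 25  # 250 characters of dummy text
chunks = split_into_chunks(sample, chunk_size=100, chunk_overlap=10)
print(len(chunks))                        # → 3
print(chunks[0][-10:] == chunks[1][:10])  # → True: the 10-char overlap is shared
```

The overlap ensures a sentence cut at a chunk boundary still appears intact in at least one chunk, which improves retrieval quality.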
# Use an embedding model to transform each chunk into a vector embedding.
import os
from langchain_openai import AzureOpenAIEmbeddings

embeddings = AzureOpenAIEmbeddings(
    azure_deployment=os.environ["AZURE_EMBEDDING_OPENAI_DEPLOYMENT"],
    openai_api_key=os.environ["AZURE_EMBEDDING_OPENAI_KEY"],
    azure_endpoint=os.environ["AZURE_EMBEDDING_OPENAI_API_BASE"],
    openai_api_type=os.environ["AZURE_EMBEDDING_OPENAI_API_TYPE"],
    openai_api_version=os.environ["AZURE_EMBEDDING_OPENAI_API_VERSION"],
)
# Store all the vector embeddings in the vector database.
from langchain_community.vectorstores import FAISS

vectorstore = FAISS.from_documents(docs, embeddings)
vectorstore.save_local("faiss_index_react")
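Under the hood, a vector store answers queries by ranking stored vectors against the query vector. Here is a brute-force sketch of that idea using cosine similarity — FAISS uses optimized index structures to do this at scale, and the vectors below are made up for illustration:

```python
import math

# Hypothetical sketch of similarity search: embed the query, then rank
# stored vectors by cosine similarity (FAISS does this far more efficiently).
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

store = {
    "doc about cats": [0.9, 0.1, 0.0],
    "doc about dogs": [0.8, 0.3, 0.1],
    "doc about math": [0.0, 0.1, 0.9],
}
query_vec = [0.85, 0.2, 0.05]  # pretend this came from the embedding model

best = max(store, key=lambda k: cosine(store[k], query_vec))
print(best)  # → doc about cats
```

Semantically similar texts get nearby embeddings, so nearest-neighbor search over vectors approximates "find the most relevant chunk".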
2️⃣ Retrieval Phase: When a query (or chatbot question) comes in, the system retrieves relevant information by:
# Convert the query (text) into an embedding --> pass it to the vector DB --> map the results (context) into the prompt --> pass the prompt (query + context) to the LLM
# Embed the query (asked via the chatbot) using the same embedding model.
from langchain_openai import AzureChatOpenAI

chat = AzureChatOpenAI(
    openai_api_key="xxxxxxd8a815xxxxxxxxxxxxx",
    azure_endpoint="https://testmodel.openai.azure.com/",
    openai_api_type="azure",
    azure_deployment="GPT4",
    openai_api_version="2024-05-01-preview",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)
# Suppose this is the query coming from the chatbot
query = "What is machine learning?"
new_vectorstore = FAISS.load_local(
    "faiss_index_react", embeddings, allow_dangerous_deserialization=True
)
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up the answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "Thank you for Asking!!!" at the end of the answer.
{context}
Question: {question}
Helpful Answer:"""
from langchain_core.prompts import PromptTemplate

custom_rag_prompt = PromptTemplate.from_template(template)
# The retrieval aspect involves retrieving relevant information or contexts from a large
# corpus of text based on a given query or input. This retrieval is typically performed
# using information retrieval techniques, where the documents or passages most relevant
# to the input are selected.
from langchain_core.runnables import RunnablePassthrough

rag_chain = (
    {"context": new_vectorstore.as_retriever(), "question": RunnablePassthrough()}
    | custom_rag_prompt
    | chat
)
res = rag_chain.invoke(query)
print(res.content)  # the chat model returns a message object; .content holds the answer text
3️⃣ Generation Phase: Finally, the retrieved information is combined with the query to generate contextually relevant responses using models like GPT-4.
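The "combining" in this phase is essentially prompt assembly: the retrieved chunks are joined and substituted into the template before the result is sent to the LLM. A minimal sketch, with made-up chunks and a shortened version of the template above:

```python
# Hypothetical sketch of generation-phase prompt assembly: retrieved
# chunks fill the {context} slot, the user query fills {question}.
template = """Use the following pieces of context to answer the question at the end.

{context}

Question: {question}
Helpful Answer:"""

retrieved_chunks = [
    "Machine learning is a field of AI that learns patterns from data.",
    "Models improve with more training examples.",
]
prompt = template.format(
    context="\n\n".join(retrieved_chunks),
    question="What is machine learning?",
)
print(prompt)  # this combined string is what actually reaches the LLM
```

This is what the LCEL chain above does implicitly: the retriever output is mapped into `{context}`, the raw query passes through into `{question}`, and the filled template goes to the chat model.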
The synergy of retrieval and generation in RAG systems ensures responses are not only fluent but also grounded in accurate knowledge from the retrieval process.
However, a critical consideration emerges: How eco-friendly is this computational process? As we advance in AI, sustainability becomes pivotal. Let’s discuss how we can innovate responsibly and sustainably in the world of AI and NLP.