Improving AI Contextual Understanding - Retrieval Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) is a natural language processing technique that combines information retrieval from a knowledge base with text generation to produce accurate, relevant responses. RAG begins with a user posing a question. Instead of relying solely on its pre-trained data, the system first retrieves relevant information from a knowledge base. This retrieved data is then combined with the user's query to generate an informed, specific response. By integrating external knowledge sources during generation, RAG improves the precision of AI-powered applications and reduces hallucinations.

RAG is best utilized when working with datasets too large to fit within the LLM's context window, or when the ability to display the retrieved source documents is essential. However, RAG may not be necessary for small datasets that fit within the context window, or for tasks like single-document analysis or summarization where latency is not a critical factor. Modern RAG implementations benefit from frameworks such as LangChain and LlamaIndex, which simplify development by providing ready-made tools and classes, reducing the need to build complex retrieval pipelines from scratch.

Without RAG, the interaction is a simple API call to the LLM: the user's query is passed to the model as-is, and the model responds from its pre-trained knowledge alone (a system prompt can be added to instruct the model on tone or behavior).

[Benefits]

[1] Accuracy Improvements - Enhances LLM responses by referring to a domain-specific knowledge base.

[2] Reduced Hallucination - Minimizes false information by grounding responses in retrieved external data.

[3] External Knowledge Bases - Can be integrated with private documentation, PDFs, codebases, or SQL databases.

[4] Verifiable Responses - Allows users to validate the source of the information.

[Challenges]

[1] Latency - Response time may increase due to the extra retrieval step over large knowledge bases.

[2] Irrelevant Data - Poorly matched chunks in the dataset can degrade answer quality.

[3] Higher Costs - Larger prompts and additional retrieval infrastructure can increase execution costs.

In this article, we will explore a Retrieval-Augmented Generation (RAG) approach using a domain-specific dataset to produce contextually relevant answers. The focus is on a water treatment dataset: transforming raw documents into embeddings and using cosine similarity to identify the most relevant chunks of information. By feeding these chunks into a conversational AI system powered by the Gemini model, we aim to demonstrate how LLMs can be augmented with custom knowledge bases to minimize hallucinations. This step-by-step guide covers chunking, embedding generation, and similarity-based retrieval.

[1] Install necessary Python libraries and add imports.
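
Assuming a standard Python environment, the setup might look like the sketch below. The package list is an assumption based on the steps that follow (Hugging Face Transformers, PyTorch, pandas, and the Gemini client); pin versions as needed for your environment.

```python
# Install the libraries first (package names are assumptions; pin versions as needed):
#   pip install pandas numpy torch transformers google-generativeai

import re

import numpy as np
import pandas as pd
import torch
from transformers import AutoModel, AutoTokenizer
```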

[2] Initialize the tokenizer and model for the pre-trained "facebook/opt-125m" model using the Hugging Face Transformers library.
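
A minimal sketch of this initialization (the first call downloads the model weights from the Hugging Face Hub):

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "facebook/opt-125m"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()  # inference only; no gradient updates needed
```

OPT-125M is a small model with a 768-dimensional hidden state, which keeps embedding generation cheap for a demo; any encoder or decoder model with accessible hidden states could be swapped in.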

[3] Load your dataset into a Pandas DataFrame for further analysis and processing. Our dataset contains detailed information about water treatment proposals, including technical specifications, maintenance requirements, and commercial terms from multiple vendors. The data is kept in a CSV file with four columns: Title, Content, URL, and Source.
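
In practice this is a single `pd.read_csv("your_file.csv")` call. The sketch below uses a tiny in-memory stand-in with the same four columns, so it runs without the actual CSV file (the row contents are invented placeholders):

```python
import pandas as pd

# In practice: df = pd.read_csv("water_treatment_quotes.csv")
# A small in-memory stand-in with the same four columns:
df = pd.DataFrame(
    {
        "Title": ["Quote A", "Quote B"],
        "Content": [
            "AquaPure RO system. Regeneration every 100,000 liters.",
            "Global Water Solutions softener. Salt: 120-190 kg per regeneration.",
        ],
        "URL": ["https://example.com/a", "https://example.com/b"],
        "Source": ["Vendor A", "Vendor B"],
    }
)
print(df.head())
```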

[4] Write a method to split each quote's text into small chunks based on sentences. The chunked data should look something like this (these chunks are from the second quote).
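
One way to do this is a naive regex sentence splitter that groups a few sentences per chunk. This is a sketch, not the article's exact implementation; the split pattern and chunk size are assumptions you can tune:

```python
import re

def split_into_chunks(text: str, max_sentences: int = 3) -> list[str]:
    """Split text into chunks of up to `max_sentences` sentences each."""
    # Naive splitter: break on ., !, or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i : i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]

chunks = split_into_chunks(
    "Regeneration is needed every 120,000 liters. Salt usage is 120-190 kg. "
    "Clean the brine tank before each regeneration. Check the pre-filter mesh."
)
# Four sentences with max_sentences=3 -> two chunks.
```

Sentence-based chunking keeps each chunk semantically coherent, which tends to produce better embeddings than fixed-size character windows.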

[5] Create chunks and embeddings using the tokenizer, generating an embedding for each chunk with OPT-125M.

Put the chunks and embeddings into a new DataFrame for easier access.
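
A common way to turn a decoder model like OPT-125M into an embedder is to mean-pool its last hidden state; that choice is an assumption here, not the only option. The `chunks` list is a small stand-in for the output of the chunking step:

```python
import numpy as np
import pandas as pd
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModel.from_pretrained("facebook/opt-125m")
model.eval()

def get_embedding(text: str) -> np.ndarray:
    """Embed text by mean-pooling OPT-125M's last hidden state."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    # Shape: (1, seq_len, 768) -> (768,)
    return outputs.last_hidden_state.mean(dim=1).squeeze(0).numpy()

# `chunks` would come from the chunking step; a stand-in list here.
chunks = ["Salt: 120-190 kg per regeneration.", "Clean the brine tank first."]
chunk_df = pd.DataFrame({"chunk": chunks})
chunk_df["embedding"] = chunk_df["chunk"].apply(get_embedding)
```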

[6] Add logic to retrieve the most relevant chunks of text for a user's query by calculating cosine similarity between the query embedding and the precomputed chunk embeddings. Then combine the relevant chunks into a single string to pass into the context prompt.
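
The retrieval logic can be sketched as below. Toy 3-dimensional embeddings are used here purely to exercise the ranking; in the real pipeline the vectors come from the embedding step, and `top_k` is a tunable assumption:

```python
import numpy as np
import pandas as pd

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: dot product over norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve_relevant_chunks(query_embedding, chunk_df, top_k=2) -> str:
    """Rank chunks by similarity to the query and join the top_k into one string."""
    scores = chunk_df["embedding"].apply(lambda e: cosine_similarity(query_embedding, e))
    best = chunk_df.loc[scores.nlargest(top_k).index, "chunk"]
    return "\n".join(best)

# Toy embeddings just to demonstrate the ranking behavior:
chunk_df = pd.DataFrame({
    "chunk": ["salt usage details", "warranty terms", "regeneration schedule"],
    "embedding": [np.array([1.0, 0.0, 0.0]),
                  np.array([0.0, 1.0, 0.0]),
                  np.array([0.9, 0.1, 0.0])],
})
query = np.array([1.0, 0.0, 0.0])
context = retrieve_relevant_chunks(query, chunk_df)
# -> "salt usage details\nregeneration schedule"
```

Cosine similarity ignores vector magnitude and compares direction only, which makes it a reasonable default for comparing mean-pooled embeddings of texts with different lengths.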

[7] Build the context prompt from the user query and the most relevant chunks.
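
The prompt template below is one possible shape, not the article's exact wording; the key idea is to place the retrieved chunks before the question and instruct the model to answer only from them:

```python
def build_prompt(user_query: str, context: str) -> str:
    """Combine retrieved chunks and the user's query into one grounded prompt."""
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {user_query}\nAnswer:"
    )

prompt = build_prompt(
    "What maintenance does the softener need?",
    "Salt: 120-190 kg per regeneration. Clean the brine tank first.",
)
```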

[8] Call the generative AI model to generate the response. I am using Gemini 1.5; you can use an alternative depending on your access and API key availability.

With the RAG-based approach, the generated answer was: "Global Water Solutions' maintenance involves regular regeneration (frequency depends on water usage and hardness, e.g., every 120,000 liters for one system). Salt (120-190 kg per regeneration) is needed for this process. The brine tank should be cleaned before each regeneration. Pre-filter mesh cleaning is also recommended, depending on feed water dust levels. Additionally, adhering to the post-installation maintenance manual is crucial."

[9] If we do this without passing the chunks and the relevant dataset, the model gives a generic response without grounding in facts; it tends to invent details and can hallucinate.

Without the RAG-based approach, the generated answer was: "Global Water Solutions apartment water treatment systems require varying maintenance depending on the specific system installed. Generally, this includes regular filter replacements (frequency depends on water quality and usage), periodic inspections of the system components for leaks or damage, and occasional professional servicing for cleaning or component replacement. Specific maintenance schedules and instructions should be provided by Global Water Solutions or found in the system's user manual."

From both responses, it is clear that RAG ensures fact-based answers by retrieving relevant chunks of information from the dataset based on the user's query. The most similar chunks (by cosine similarity) are combined into a single string and passed into the prompt to guide the model. Without RAG, the model relies on generic knowledge, risking factual inaccuracies or hallucinations, which highlights the importance of grounding responses in relevant datasets.

Note - You can use different LLMs and tokenizers as per your convenience and availability.

Summary

In this article, we explored the application of RAG using a water treatment quotes dataset. We covered how to transform raw data into embeddings and use cosine similarity to identify relevant chunks of information, then integrated those chunks into prompts for the Gemini model, showcasing how RAG minimizes hallucinations by ensuring fact-based responses. Through a step-by-step guide, we covered critical aspects like chunking, embedding generation, and similarity-based retrieval. Comparing outputs with and without RAG makes clear the importance of leveraging relevant datasets to generate precise, contextually relevant answers.

