Creating Your First GenAI RAG Application: Analyzing Conversations with LangChain
Asim Junaid
Software Engineer - JL5 | Java, Scala, Spring Boot, Flutter, Microservices, Kafka, AWS | Specialist Programmer at Infosys
Retrieval-Augmented Generation (RAG) combines information retrieval with generative AI to deliver context-aware responses. In this blog, we’ll explore how to build a RAG system to process a conversation among software engineers—Ayman, Marc, and Rajesh—discussing issues and solutions for their project.
Tools used: Python, LangChain (with langchain-openai and langchain-community), OpenAI embeddings and LLM, FAISS, and tiktoken.
Step 1: Install Required Libraries
To get started, install the necessary libraries:
!pip install langchain langchain-openai langchain-community faiss-cpu tiktoken
While the other packages are fairly self-explanatory, tiktoken is worth calling out: it is used indirectly for token counting, which LangChain's OpenAI integrations and token-aware text splitters rely on when breaking documents into manageable chunks. You will see RecursiveCharacterTextSplitter further on as we discuss the code.
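As a quick aside, here's a minimal, optional sketch of what that token counting looks like, assuming the cl100k_base encoding used by OpenAI's newer models. The sample sentence is just an illustration:

import tiktoken
from langchain.text_splitter import RecursiveCharacterTextSplitter

encoding = tiktoken.get_encoding("cl100k_base")
# Count the tokens in a sample sentence
print(len(encoding.encode("Rajesh: the churn data job failed again last night")))

# A token-aware splitter, as an alternative to the character-based one used later in this post
token_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(chunk_size=200, chunk_overlap=0)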
Step 2: Load the Conversation File
We’ll start by loading the text file containing the conversation. The TextLoader class in LangChain is used for this purpose:
from langchain_community.document_loaders import TextLoader
loader = TextLoader("slack_convo.txt")
documents = loader.load()
This reads slack_convo.txt into a list of Document objects. The file itself doesn't need much explanation: it's a conversation between Ayman, Marc, and Rajesh discussing issues and solutions for their project, which I generated with an LLM.
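If you want to follow along without the original file, you can write a small stand-in to disk first. The lines below are hypothetical and only mimic the format; they are not the actual conversation I generated:

# Hypothetical stand-in conversation, just so the rest of the code has something to load
sample_convo = """Ayman: The nightly build is failing after the dependency upgrade.
Marc: I'm seeing timeouts in the reporting service too, might be related.
Rajesh: The churn data job broke as well, I'm re-running it with the corrected schema.
"""
with open("slack_convo.txt", "w") as f:
    f.write(sample_convo)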
Step 3: Split Text into Manageable Chunks
To optimize retrieval, we need to split the document into smaller chunks using RecursiveCharacterTextSplitter:
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0, length_function=len)
texts = text_splitter.split_documents(documents)
This divides the document into chunks of up to 1,000 characters, with no overlap between consecutive chunks since chunk_overlap is set to 0.
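It's worth sanity-checking the split before moving on:

print(len(texts))                   # number of chunks produced
print(len(texts[0].page_content))   # length of the first chunk, at most 1,000 characters
print(texts[0].page_content[:200])  # preview of the first chunk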
Step 4: Create Embeddings and Vector Store
Next, we generate embeddings for the text chunks using OpenAI’s embedding model and store them in a FAISS vector database:
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
embedding = OpenAIEmbeddings()
library = FAISS.from_documents(texts, embedding)
This step enables fast similarity-based search over the chunks.
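A couple of practical notes: OpenAIEmbeddings expects your API key in the OPENAI_API_KEY environment variable, and you can persist the index so you don't re-embed on every run. The folder name below is just an example, and depending on your LangChain version load_local may require the allow_dangerous_deserialization flag:

import os
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder; use your own key or export it in the shell

library.save_local("slack_convo_index")  # save the FAISS index to disk
library = FAISS.load_local("slack_convo_index", embedding, allow_dangerous_deserialization=True)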
Step 5: Search for Relevant Text
To query the data, we can use the vector database’s similarity_search method:
Query1 = "Please summarize the issues faced by Rajesh and Marc ?"
query_answer = library.similarity_search(Query1)
print(query_answer[0].page_content)
This retrieves the chunks most relevant to the query. query_answer is a list of Document objects ranked by semantic similarity, so the top results are the parts of the conversation that discuss the issues.
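If you want to see how strong each match is, similarity_search_with_score returns a distance alongside each document (for the default FAISS index, lower means more similar):

for doc, score in library.similarity_search_with_score(Query1, k=3):
    print(round(score, 3), doc.page_content[:80])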
Step 6: Combine Documents and Build the RAG Chain
We combine retrieval with a generative model to provide context-aware answers. This involves creating a retriever and a chain:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain_openai import OpenAI
from langchain import hub
retriever = library.as_retriever()
retrieval_qa_chat_prompt = hub.pull("langchain-ai/retrieval-qa-chat")
combine_docs_chain = create_stuff_documents_chain(OpenAI(), retrieval_qa_chat_prompt)
retrieval_chain = create_retrieval_chain(retriever, combine_docs_chain)
The retrieval_chain ties the retriever and the generative components together for end-to-end query processing. This part of the code deserves a bit more explanation; I'll cover the essentials here, and you can explore further on your own.
RAG has three basic steps: Retrieval, Augmentation, and Generation. Mapping them onto the code above:
Retrieval: retriever = library.as_retriever() wraps the FAISS index so the chain can fetch the chunks most similar to a query.
Augmentation: create_stuff_documents_chain takes those retrieved chunks and "stuffs" them into the retrieval-qa-chat prompt pulled from the LangChain hub, so the model sees the relevant context alongside the question.
Generation: OpenAI() is the language model that reads the augmented prompt and writes the final answer, and create_retrieval_chain wires the three steps together.
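One note: depending on your LangChain version, hub.pull may ask you to install the langchainhub package first. If you'd rather not depend on the hub at all, here's a minimal sketch of the same chain with an explicit prompt, so you can see exactly where the retrieved context is stuffed in. The prompt wording here is my own, not the hub prompt:

from langchain_core.prompts import ChatPromptTemplate

manual_prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {input}"
)
# Same structure as before: retrieve, stuff the chunks into the prompt, generate
manual_chain = create_retrieval_chain(retriever, create_stuff_documents_chain(OpenAI(), manual_prompt))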
Now, let's do the generation part and query the chain!
Step 7: Query the RAG System
Let’s query the system to summarize issues or explore specific details:
retrieval_query = "Please summarize the issues faced by Rajesh and Marc ?"
result = retrieval_chain.invoke({"input": retrieval_query})
print(result['answer'])
retrieval_query = "What did Rajesh do about the churn data issue ?"
result = retrieval_chain.invoke({"input": retrieval_query})
print(result['answer'])
The system provides detailed, context-rich answers based on the conversation data.
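The result dictionary also carries the retrieved chunks, which is handy for checking what the answer was grounded in:

print(result.keys())  # typically: input, context, answer
for doc in result["context"]:
    print("-", doc.page_content[:80])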
Congrats! You have created your first RAG application. There's more to it, but we'll save that for next time.
Conclusion
Through this exercise, we've explored the key components of a RAG system: loading the conversation, splitting it into chunks, embedding those chunks into a FAISS vector store, retrieving the most relevant pieces for a query, and generating an answer from that retrieved context.
This approach allows us to leverage the power of large language models while ensuring that the generated responses are grounded in relevant and specific information from the input data.