Creating Your First GenAI RAG Application: Analyzing Conversations with LangChain

Retrieval-Augmented Generation (RAG) combines information retrieval with generative AI to deliver context-aware responses. In this blog, we’ll explore how to build a RAG system to process a conversation among software engineers—Ayman, Marc, and Rajesh—discussing issues and solutions for their project.

Tools used:

  • LangChain: A framework for building applications with large language models and external data sources.
  • FAISS: A library for efficient similarity search and clustering of dense vectors.
  • Google Colab: A cloud-based platform for running Python code in Jupyter notebooks.
  • OpenAI Embeddings: Models designed to convert text into dense vector representations for similarity search and other applications.
  • OpenAI GPT Model: Used for generating natural language outputs and answering queries based on retrieved contexts.
  • Python: The programming language used throughout.
  • Human: To write the code. If you are an AI reading this, please learn to do my dishes and cooking; I will take care of the rest!

Step 1: Install Required Libraries

To get started, install the necessary libraries:

!pip install langchain langchain-openai langchain-community faiss-cpu tiktoken        

While the other packages are self-explanatory, tiktoken deserves a note: it is the tokenizer the RecursiveCharacterTextSplitter can lean on when you want chunk sizes measured in tokens rather than characters (a small sketch of that follows). You will see RecursiveCharacterTextSplitter further on as we discuss the code.
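If you want the splitter to measure chunk sizes in tokens instead of characters, a minimal sketch looks like the following (this alternative splitter is not used in the rest of this tutorial, and the chunk_size of 500 tokens is just an example value):

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Use tiktoken to count tokens when deciding where to cut chunks
token_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(chunk_size=500, chunk_overlap=0)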

Step 2: Load the Conversation File

We’ll start by loading the text file containing the conversation. The TextLoader class in LangChain is used for this purpose:

from langchain_community.document_loaders import TextLoader

loader = TextLoader("slack_convo.txt")
documents = loader.load()        

This reads the slack_convo.txt file into a list of Document objects. This doesn't need much explanation, but for context, `slack_convo.txt` is a conversation between Ayman, Marc, and Rajesh discussing issues and solutions for a project. I generated the conversation with an LLM.
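The exact file isn't reproduced here, but a hypothetical excerpt (the specific lines below are made up for illustration) could look like this:

Ayman: The deployment pipeline failed again last night, did anyone check the logs?
Marc: Yeah, the build keeps timing out on my end. Still digging into it.
Rajesh: Separate thing, but the churn data export is looking off again.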

Step 3: Split Text into Manageable Chunks

To optimize retrieval, we need to split the document into smaller chunks using RecursiveCharacterTextSplitter:

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0, length_function=len)
texts = text_splitter.split_documents(documents)        

This divides the document into chunks of up to 1,000 characters, with no overlap between consecutive chunks (chunk_overlap=0).
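A quick, optional sanity check to see how many chunks were produced and how long they are:

print(len(texts))                                   # number of chunks
print([len(t.page_content) for t in texts[:3]])     # character count of the first few chunks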

Step 4: Create Embeddings and Vector Store

Next, we generate embeddings for the text chunks using OpenAI’s embedding model and store them in a FAISS vector database:

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

# OpenAIEmbeddings reads your OpenAI API key from the OPENAI_API_KEY environment variable
embedding = OpenAIEmbeddings()
library = FAISS.from_documents(texts, embedding)

This step enables fast similarity-based search over the chunks.

  • OpenAIEmbeddings(): This will be used to generate vector representations (embeddings) for text using OpenAI's embedding model. (Vectors are points in a space, like dots on a graph: chunks of conversation discussing Python code will sit close to each other, and chunks discussing frontend HTML/CSS will sit close to each other. The small sketch after this list makes this concrete.)
  • FAISS.from_documents(texts, embedding): This creates a FAISS index from the given documents (texts), using the embeddings to enable fast similarity search (in other words, it builds the graph of points we just discussed).
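Here is a tiny optional sketch that embeds one sentence and checks the size of the resulting vector (the example sentence is made up; the exact vector length depends on which OpenAI embedding model you use):

vector = embedding.embed_query("the database connection keeps timing out")
print(len(vector))   # a long list of floats - this is the "point" we place in the vector space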

Step 5: Search for Relevant Text

To query the data, we can use the vector database’s similarity_search method:

Query1 = "Please summarize the issues faced by Rajesh and Marc ?"
query_answer = library.similarity_search(Query1)
print(query_answer[0].page_content)        

This retrieves the most relevant chunks for the query. Here query_answer is a list of chunks ranked by similarity, i.e. the parts of the conversation that mention or discuss the issues.
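If you want to peek at how close each match is, the FAISS vector store can also return distance scores (with the default L2 distance, lower means a closer match); this step is optional:

results_with_scores = library.similarity_search_with_score(Query1, k=2)
for doc, score in results_with_scores:
    print(round(score, 3), doc.page_content[:80])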


Step 6: Combine Documents and Build the RAG Chain

We combine retrieval with a generative model to provide context-aware answers. This involves creating a retriever and a chain:

from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain_openai import OpenAI
from langchain import hub

retriever = library.as_retriever()
retrieval_qa_chat_prompt = hub.pull("langchain-ai/retrieval-qa-chat")
combine_docs_chain = create_stuff_documents_chain(OpenAI(), retrieval_qa_chat_prompt)
retrieval_chain = create_retrieval_chain(retriever, combine_docs_chain)        

The retrieval_chain integrates the retriever and the generative components for end-to-end query processing. Now, this part of the code needs a fair bit of explanation; I will give you the essentials here, and you can use the internet to explore further.

See, RAG has three basic steps: Retrieval, Augmentation, and Generation.

  • In the previous step we saw that it picks up the chunks of text related to the issues; that's retrieval.
  • Then we take these chunks of text, AKA the context, and add them to our prompt, saying: "Hey ChatGPT, for the love of god, please only answer from the following provided context: ${context}". This is augmentation (a rough sketch of this idea follows right after this list).
  • And lastly, we send this prompt to the OpenAI GPT API to do the generation.
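To make the augmentation step tangible, here is a rough hand-rolled sketch of what the chain in the code above does for us (the prompt wording here is made up; the real template is the one pulled from the LangChain Hub):

# Illustration only: manually stuffing retrieved chunks into a prompt
context = "\n\n".join(doc.page_content for doc in library.similarity_search(Query1))
augmented_prompt = ("Answer the question using only the following context:\n\n" + context + "\n\nQuestion: " + Query1)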

So now let's break down the code:

  • hub.pull("langchain-ai/retrieval-qa-chat"): We pull a pre-defined prompt template (retrieval-qa-chat) from the LangChain Hub, which guides how the retrieved documents and the question are combined into a single prompt; a quick way to peek at this template is shown right after this list.
  • create_stuff_documents_chain(OpenAI(), retrieval_qa_chat_prompt): This creates a document-combining chain using the OpenAI model and the prompt template we just pulled. It defines how to process and combine the retrieved documents when answering questions.
  • create_retrieval_chain: This creates a full retrieval chain by combining the document retriever (retriever) with the document-combining chain (combine_docs_chain). The final chain allows for efficient document retrieval and response generation.
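If you are curious what that pulled template actually contains, you can simply print it; you should see an instruction with a {context} placeholder that the retrieved chunks get stuffed into:

print(retrieval_qa_chat_prompt)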

Let's do the generation part now!

Step 7: Query the RAG System

Let’s query the system to summarize issues or explore specific details:

retrieval_query = "Please summarize the issues faced by Rajesh and Marc ?"
result = retrieval_chain.invoke({"input": retrieval_query})
print(result['answer'])

retrieval_query = "What did Rajesh do about the churn data issue ?"
result = retrieval_chain.invoke({"input": retrieval_query})
print(result['answer'])        

The system provides detailed, context-rich answers based on the conversation data.
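One more useful detail: the chain's output is a dictionary, and alongside the final answer it also carries the chunks that were retrieved for the query, which is handy for checking what the model actually saw:

print(result.keys())              # typically: input, context, answer
for doc in result["context"]:     # the retrieved chunks
    print(doc.page_content[:80])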

Congrats! You have created your first RAG application. There's more to it; maybe we will discuss that next time.


Conclusion

Through this exercise, we've explored the key components of a RAG system:

  • Document Loading and Splitting: Efficiently handling and processing conversational text.
  • Vector Store Creation: Building a fast and efficient index for similarity search.
  • Retrieval and Generation: Combining information retrieval with a generative model to produce context-aware responses.
  • And for the love of god, people should always keep the API documentation up to date!

This approach allows us to leverage the power of large language models while ensuring that the generated responses are grounded in relevant and specific information from the input data.


