Getting Started with RAG: Building Your Own AI Assistant with Ollama and LangChain
Mahender Reddy Pokala
Research Assistant at UChicago | Building AI Applications | Computer Vision | LLM | Gen AI
Hey there! Ever wished you could build an AI assistant that actually knows about your specific documents and data? That's exactly what we're going to do today using something called RAG (Retrieval-Augmented Generation). Don't worry if that sounds complicated - I'll break it down into simple steps you can follow along with.
What We're Building and Why It's Cool
Before we dive in, let me explain what makes this project exciting. You know how ChatGPT is great but sometimes gives outdated or generic answers? With RAG, we're basically giving an AI model access to our own documents, so it can give us precise answers based on our specific information. The best part? We'll be running everything locally on our computer using Ollama (for the AI model) and LangChain (for putting everything together).
Setting Up Your Environment
First things first - let's get your computer ready. You'll need to install a few things, but don't worry, it's pretty straightforward:
# If you're on macOS, this is super easy:
brew install ollama
pip install langchain chromadb
# On other systems, check out https://ollama.com/ for installation instructions
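One thing to do before pulling any models: make sure the Ollama server is actually running in the background, since LangChain talks to it over a local API. You can run it as a background service or just start it in a terminal - here's a minimal sketch, assuming a Homebrew install:
# Run Ollama as a background service (Homebrew)
brew services start ollama
# ...or start the server directly in a terminal
ollama serve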
The Magic Ingredients: Meet Ollama and LangChain
Ollama is like having your own personal AI model that runs right on your computer. No need to worry about API keys or usage limits! We'll be using a model called Mistral, which is pretty powerful while still being lightweight enough to run locally.
# Download the Mistral model
ollama pull mistral
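Once the download finishes, it's worth a quick sanity check that the model actually runs on your machine. This is optional, and the prompt here is just an example:
# Chat with the model straight from the terminal
ollama run mistral "Say hello in one sentence"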
Building Your RAG Application: The Fun Part
Let's break this down into manageable chunks (pun intended - you'll see why later!).
Step 1: Getting Your Documents Ready
First, we need to prepare your documents. Think of this like teaching your AI assistant by giving it reading material:
from langchain.document_loaders import DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Point this at your documents folder
loader = DirectoryLoader('./my_documents', glob="**/*.txt")
documents = loader.load()
# We'll split documents into smaller chunks so they're easier to work with
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # Feel free to adjust this
    chunk_overlap=200,
    length_function=len,
)
splits = text_splitter.split_documents(documents)
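If you're curious what the splitter actually produced, a quick peek like this (purely optional) is handy when you want to tune chunk_size later:
# How many chunks did we get, and what does one look like?
print(f"Created {len(splits)} chunks")
print(splits[0].page_content[:200])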
Step 2: Creating Your AI's Memory Bank
Now comes the cool part - we're going to create what's essentially a smart filing system for your documents:
from langchain.embeddings import OllamaEmbeddings
from langchain.vectorstores import Chroma
# This is like creating a smart index for your documents
embeddings = OllamaEmbeddings(model="mistral")
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=embeddings,
    persist_directory="./my_knowledge_base"
)
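Before wiring up the full QA chain, it helps to confirm that retrieval finds something sensible. Here's a minimal check - the query string is just a placeholder, so swap in something from your own documents:
# Grab the 3 chunks most similar to a test query
docs = vectorstore.similarity_search("your test question here", k=3)
for doc in docs:
    print(doc.metadata["source"], "->", doc.page_content[:100])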
Step 3: Building Your Question-Answering System
Here's where we tie everything together:
from langchain.llms import Ollama
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
# Set up your AI assistant
llm = Ollama(model="mistral")
# Create a friendly prompt template
prompt_template = """Hey! I've found some relevant information that might help answer this question.
Let me take a look at it and give you a clear answer.
Here's what I know about this:
{context}
The question was: {question}
Based on this information, here's what I can tell you: """
PROMPT = PromptTemplate(
    template=prompt_template,
    input_variables=["context", "question"]
)
# Create your QA system
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
    return_source_documents=True,
    chain_type_kwargs={"prompt": PROMPT}
)
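At this point you can already ask a one-off question and see what the chain returns: a dictionary with the answer under "result" and the retrieved chunks under "source_documents". The question below is just a placeholder:
# Try a single question before building the interactive loop
result = qa_chain({"query": "What is this document collection about?"})
print(result["result"])
print(result["source_documents"][0].metadata["source"])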
Step 4: Making It User-Friendly
Let's create a simple way to interact with your new AI assistant:
def ask_question(question: str):
    try:
        result = qa_chain({"query": question})
        print("\nYou asked:", question)
        print("\nHere's what I found:", result["result"])
        print("\nI got this information from:")
        for doc in result["source_documents"]:
            print(f"- {doc.metadata['source']}")
    except Exception as e:
        print("Oops! Something went wrong:", str(e))

# Let's make it interactive!
if __name__ == "__main__":
    print("Hi! I'm your AI assistant. Ask me anything about your documents!")
    while True:
        question = input("\nWhat would you like to know? (type 'quit' to exit): ")
        if question.lower() == 'quit':
            print("\nGoodbye! Have a great day!")
            break
        ask_question(question)
Making It Even Better
Once you've got the basics working, here are some cool ways to enhance your assistant:
Add Some Memory
Want your assistant to remember your conversation? Here's how:
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)
qa_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    memory=memory
)
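With memory attached, the chain takes a "question" key instead of "query" and returns its reply under "answer", while the chat history is tracked for you - so follow-up questions can refer back to earlier ones. A quick sketch of a two-turn exchange (the questions are just placeholders):
# First question establishes some context
result = qa_chain({"question": "What does the report say about revenue?"})
print(result["answer"])
# The follow-up can say "it" because the chain remembers the previous turn
result = qa_chain({"question": "And how did it change from last year?"})
print(result["answer"])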
Speed Things Up
If your assistant feels a bit slow, try these tricks:
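A few general levers tend to make the biggest difference: retrieve fewer chunks per question, reuse the persisted vector store instead of re-embedding your documents on every run, and keep chunks reasonably small. Treat the snippet below as a starting point rather than a recipe:
# Reopen the existing knowledge base instead of rebuilding it each time
vectorstore = Chroma(
    persist_directory="./my_knowledge_base",
    embedding_function=embeddings
)
# Retrieve fewer chunks - less text for the model to read per question
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})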
Tips from Experience
After building several RAG applications, here are some things I've learned:
Troubleshooting Common Issues
Running into problems? Here are some common issues and fixes:
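The first thing to check in a fully local setup is connectivity: if LangChain can't reach Ollama you'll usually see a connection error, which almost always means the server isn't running or the model hasn't been pulled yet. A couple of quick checks from the terminal:
# Is the model actually available locally?
ollama list
# Start the server if nothing is listening on the default port
ollama serve
# Re-pull the model if it doesn't show up in the list
ollama pull mistral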
Wrap Up
And there you have it! You've built your own AI assistant that can actually answer questions about your specific documents. Pretty cool, right?
Remember, this is just the beginning - you can customize and expand this in countless ways. Want to add support for PDFs? Different types of questions? Multiple knowledge bases? The possibilities are endless!
Feel free to experiment and make it your own. After all, the best way to learn is by doing. Happy coding!
Resources for Learning More
If you want to dive deeper, check out the official documentation for the tools we used:
Ollama: https://ollama.com/
LangChain: https://python.langchain.com/
Chroma: https://docs.trychroma.com/
Now go forth and build something awesome! And don't forget to share what you create with others - the AI community loves seeing new projects!