Building Your First RAG System: A Practical Guide for (New) Product Managers
After exploring RAG's potential and seeing its success stories, many of you have asked for a practical starting point. Today, we'll build a simple but powerful RAG system that helps answer questions about your product documentation - a common use case that provides immediate value.
The Use Case: Documentation Q&A
We'll build a system that can:
- Answer questions about your product documentation
- Provide relevant source links
- Handle multiple document formats
- Run completely free using open-source tools
What You'll Need
All tools mentioned are free for individual use:
- Python 3.8+ (https://www.python.org/downloads/)
- Visual Studio Code (https://code.visualstudio.com/)
- A free Hugging Face account (https://huggingface.co/)
- ChromaDB, installed via pip (no account needed for local use: https://www.trychroma.com/)
Step 1: Setting Up Your Environment
A. Install Required Software
Install Visual Studio Code (VS Code):
- Go to https://code.visualstudio.com/
- Click the download for your operating system (Windows/Mac/Linux)
- Run the installer you downloaded
- Open VS Code after installation
Install Python:
- Visit https://www.python.org/downloads/
- Click "Download Python 3.x.x" (get the latest version)
- Run the installer. Important: check the box that says "Add Python to PATH" during installation
- Click "Install Now"
B. Create Your Project
Create a project folder:
- Open VS Code
- Click "File" → "Open Folder"
- Create a new folder named "docs-rag" somewhere easy to find (like Documents)
- Select this new folder and click "Select Folder"
- If VS Code asks "Do you trust the authors of the files in this folder?" click "Yes"
Open VS Code's terminal:
- Click "Terminal" in the top menu
- Click "New Terminal"
- You'll see a terminal panel appear at the bottom of VS Code
C. Set Up Python Environment
Create a virtual environment (this keeps your project dependencies separate):
- In the terminal, type these commands:
# For Windows:
python -m venv venv
.\venv\Scripts\activate
# For Mac/Linux:
python3 -m venv venv
source venv/bin/activate
- You'll know it worked when you see (venv) at the start of your terminal line
Install required packages:
- In the same terminal, type:
pip install langchain chromadb sentence-transformers pypdf markdown-it-py python-dotenv
- Wait for all installations to complete (this might take a few minutes)
D. Set Up Project Structure
Create project folders:
- In VS Code's file explorer (left sidebar), right-click in empty space
- Click "New Folder" and name it "docs"
- Create another folder named "src"
Create required files:
- Inside "src", create empty files named "__init__.py", "ingest.py", and "query.py"
- In the main "docs-rag" folder, create an empty "main.py"
(You'll add code to these files in Steps 3-5.)
E. Get Your Hugging Face Token
Create a Hugging Face account:
- Go to https://huggingface.co/
- Click "Sign Up" and complete the process
- After signing in, click your profile picture
- Click "Settings"
- Click "Access Tokens" in the left sidebar
- Click "New token"
- Give it a name (like "rag-project")
- Select "read" for role
- Click "Generate token"
- Copy your token (you'll need it later)
Set up your token:
- In VS Code, create a new file called ".env" in your main folder
- Add this line (replace with your actual token; HUGGINGFACEHUB_API_TOKEN is the exact variable name LangChain looks for):
HUGGINGFACEHUB_API_TOKEN=your_token_here
- Save the file
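To confirm the token will be picked up, you can run a quick check from the terminal (a minimal sketch; quick_check.py is a throwaway helper, and it assumes the python-dotenv package from Step 1C):
# quick_check.py: a throwaway helper to verify the .env file loads
from dotenv import load_dotenv
import os

load_dotenv()  # reads .env from the current folder
print("Token found:", os.getenv("HUGGINGFACEHUB_API_TOKEN") is not None)
Run it with python quick_check.py; it should print "Token found: True".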
F. Test Your Setup
Verify Python installation:
python --version # Should show Python 3.x.x
Verify package installation:
pip list # Should show langchain, chromadb, etc.
If you see any errors:
- Make sure Python is in your PATH
- Try closing and reopening VS Code
- Ensure your virtual environment is activated (you should see (venv) in terminal)
Step 2: Project Structure
Create (or confirm) this simple structure in VS Code:
docs-rag/
├── docs/ # Your documentation files
├── src/
│ ├── __init__.py
│ ├── ingest.py # Document processing
│ └── query.py # Query handling
└── main.py # Main application
Step 3: Document Ingestion
Add this code to src/ingest.py:
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

def load_documents(docs_dir):
    """Load documents from a directory"""
    loader = DirectoryLoader(
        docs_dir,
        glob="**/*.md",  # Load markdown files; add more patterns as needed
        loader_cls=TextLoader,  # Plain-text loader keeps dependencies minimal
        show_progress=True
    )
    documents = loader.load()
    return documents

def split_documents(documents):
    """Split documents into overlapping chunks"""
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=500,
        chunk_overlap=50
    )
    splits = splitter.split_documents(documents)
    return splits

def create_vectorstore(splits):
    """Create and save the vector store"""
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
    vectorstore = Chroma.from_documents(
        documents=splits,
        embedding=embeddings,
        persist_directory="chroma_db"
    )
    vectorstore.persist()  # Ensure the index is written to disk
    return vectorstore

def main():
    docs = load_documents("docs")
    splits = split_documents(docs)
    create_vectorstore(splits)
    print(f"Processed {len(splits)} document chunks")

if __name__ == "__main__":
    main()
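Before wiring up the full app, you can sanity-check the ingested index with a direct similarity search (a minimal sketch; the question string is just an example):
# Run after ingest.py has built chroma_db
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
vectorstore = Chroma(persist_directory="chroma_db", embedding_function=embeddings)

# Fetch the 3 chunks most similar to a test question
for doc in vectorstore.similarity_search("How do I reset my password?", k=3):
    print(doc.metadata.get("source"), "->", doc.page_content[:100])
If the printed chunks look relevant, your ingestion and embeddings are working.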
Step 4: Query Processing
Add this code to src/query.py:
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.llms import HuggingFaceHub
from dotenv import load_dotenv

load_dotenv()  # Load HUGGINGFACEHUB_API_TOKEN from your .env file

def setup_qa_chain():
    """Set up the question-answering chain"""
    # Initialize embeddings (must match the model used in ingest.py)
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
    # Load the persisted vector store
    vectorstore = Chroma(
        persist_directory="chroma_db",
        embedding_function=embeddings
    )
    # Initialize the hosted language model
    llm = HuggingFaceHub(
        repo_id="google/flan-t5-base",
        model_kwargs={"temperature": 0.5, "max_length": 512}
    )
    # Create the QA chain over the top 3 retrieved chunks
    qa_chain = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=vectorstore.as_retriever(search_kwargs={"k": 3})
    )
    return qa_chain

def answer_question(qa_chain, question):
    """Get an answer for a question"""
    try:
        result = qa_chain({"query": question})
        return result["result"]
    except Exception as e:
        return f"Error processing question: {str(e)}"
Step 5: Main Application
Add this code to main.py:
from src.ingest import main as ingest_docs
from src.query import setup_qa_chain, answer_question
import os

def main():
    # First-time setup: build the vector store if it doesn't exist yet
    if not os.path.exists("chroma_db"):
        print("First-time setup: Processing documents...")
        ingest_docs()
    # Set up the QA chain
    qa_chain = setup_qa_chain()
    # Simple CLI interface
    print("\nWelcome to Documentation Q&A!")
    print("Type 'quit' to exit")
    while True:
        question = input("\nWhat's your question? ")
        if question.lower() == 'quit':
            break
        answer = answer_question(qa_chain, question)
        print("\nAnswer:", answer)

if __name__ == "__main__":
    main()
Step 6: Using the System
Add your documentation:
- Create a docs folder in your project
- Add your markdown documentation files
- The ingestion code loads .md files by default; extend the glob pattern in ingest.py to pick up .txt and other formats
Run the system:
python main.py
Example Usage
$ python main.py
First-time setup: Processing documents...
Processed 25 document chunks
Welcome to Documentation Q&A!
Type 'quit' to exit
What's your question? How do I reset my password?
Answer: According to the documentation, you can reset your password by...
Customization Options
Change embedding model:
- In ingest.py and query.py, modify model_name
- Other free options: "sentence-transformers/all-mpnet-base-v2"
Adjust chunk size:
- In ingest.py, modify chunk_size and chunk_overlap
Support more file types:
- Add more loaders in ingest.py
- Example for PDF: PyPDFLoader from LangChain (see the sketch below)
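For example, here is one way PDF support could be added in ingest.py (a minimal sketch, assuming the pypdf package from Step 1C; load_pdf_documents is a hypothetical helper, not part of the tutorial code):
from langchain.document_loaders import DirectoryLoader, PyPDFLoader

def load_pdf_documents(docs_dir):
    """Load PDF files, mirroring load_documents in ingest.py"""
    loader = DirectoryLoader(
        docs_dir,
        glob="**/*.pdf",
        loader_cls=PyPDFLoader,  # requires the pypdf package
        show_progress=True
    )
    return loader.load()

# In main(), combine both sources before splitting:
# docs = load_documents("docs") + load_pdf_documents("docs")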
Common Issues and Solutions
- "Module not found" errors: Check virtual environment activation Verify all packages are installed
- Memory issues with large documents: Reduce chunk size Process documents in batches
- Slow responses: Use a smaller embedding model Reduce the number of retrieved documents
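For the memory issue above, one option is to add chunks to the store a batch at a time instead of all at once (a minimal sketch; ingest_in_batches and batch_size are hypothetical and not part of the tutorial code):
from langchain.vectorstores import Chroma

def ingest_in_batches(splits, embeddings, batch_size=200):
    """Build the vector store a few hundred chunks at a time"""
    vectorstore = Chroma(
        persist_directory="chroma_db",
        embedding_function=embeddings
    )
    for i in range(0, len(splits), batch_size):
        # Embedding happens per batch, keeping peak memory low
        vectorstore.add_documents(splits[i:i + batch_size])
    vectorstore.persist()
    return vectorstore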
Cost and Scaling Considerations
This setup is completely free for:
- Small to medium documentation (up to ~10,000 pages)
- Individual usage
- Basic question-answering needs
For production use, consider:
- Paid vector database options
- More powerful language models
- Professional hosting solutions
Ready to Scale Up?
I'd love to discuss your specific use case and help you plan a production-ready implementation. Feel free to reach out to schedule a call!
I'm also working on a tool that does all of this heavy lifting for you, and much more. I'd love to hear your feedback on it.
Resources
- LangChain Documentation: https://python.langchain.com/
- ChromaDB Documentation: https://docs.trychroma.com/
- Hugging Face Documentation: https://huggingface.co/docs
- Sentence Transformers: https://www.sbert.net/
Remember: This is a starting point. The beauty of RAG is that you can start small and scale as needed. The most important thing is to get hands-on experience with a working system.