Build AI RAG Chatbot with Ollama and LangChain
In this article I will demonstrate how to build your own AI chatbot over a customized dataset using Retrieval-Augmented Generation (RAG).
To develop the RAG chatbot we will use Ollama to run the Llama 3.1 LLM locally on your device, and the LangChain framework to build the chatbot application.
First of all, what is Retrieval-Augmented Generation (RAG)?
RAG is an AI development technique in which a large language model (LLM) is connected to an external knowledge base to improve the accuracy and quality of its responses.
In brief, the RAG technique allows us to feed a specific dataset to the LLM and ask questions about it.
Basic RAG Chatbot Pipeline
There are two basic steps we need to follow in order to create a RAG chatbot:
1. Indexing: load the dataset, embed it, and store it in a vector store.
2. Retrieval and generation: retrieve the records relevant to a query from the vector store and pass them to the LLM to generate an answer.
Build RAG Chatbot Workflow
We can divide our work into these sections:
- Install Ollama and run Llama 3.1 locally
- Prepare the customized dataset for RAG on Llama 3.1
- Index the dataset into a vector store using LangChain
- Set up the Llama 3.1 LLM and integrate our indexed dataset with it
Install Ollama and run Llama 3.1 locally
We will use the Ollama platform to download and run open-source LLMs locally on our device. In our project we will use Llama 3.1 8B by Meta.
To install it on Linux, run this command:
curl -fsSL https://ollama.com/install.sh | sh
If you are using Windows or macOS, you can check the installation section on the Ollama website.
We will download and install Llama 3.1 with 8B parameters, which requires at least 4.7 GB of disk space and 8 GB of available RAM to run the model.
Run this command to download the Llama 3.1 model on your device:
ollama run llama3.1
After the model download completes, it will start automatically in your terminal and you can chat with it. To exit the chat, press CTRL+D.
You can check the model's meta information by running:
ollama show llama3.1
You can also list the models downloaded on your device:
ollama list
Prepare the customized dataset for RAG on Llama 3.1
At this step we will use the public AI-Powered Job Market Insights dataset from the Kaggle platform.
The dataset's column attributes include: Job_Title, Industry, Company_Size, Location, AI_Adoption_Level, Required_Skills, Remote_Friendly, and Job_Growth_Projection.
We will take only the first 25 records of this dataset for the project demo, to minimize the processing time of embedding and indexing the dataset in the local vector store.
You can download the dataset from Kaggle and save only the first 25 records into a CSV file.
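As a sketch, this trimming step can be done with the standard csv module. The destination file name below matches the one used later in this article; the source path is an assumption for wherever you saved the Kaggle download:

```python
import csv

def save_first_records(src_path: str, dst_path: str, n: int = 25) -> int:
    """Copy the header row plus the first n data rows of a CSV file."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))  # keep the header row
        count = 0
        for row in reader:
            if count >= n:
                break
            writer.writerow(row)
            count += 1
    return count

# Example (assumes the full Kaggle CSV was saved under dataset/):
# save_first_records("dataset/ai_job_market_insights.csv",
#                    "dataset/ai_job_market_insights_mini.csv", n=25)
```

You could also do the same with pandas (`read_csv(...).head(25).to_csv(...)`); the csv version avoids an extra dependency.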
Index the dataset into vector store using LangChain
First we should set up our project and install the Python package dependencies in our virtual environment.
You can view the project structure here:
The requirements file contains the following packages:
langchain==0.2.15
langchain-community==0.2.14
langchain-ollama==0.1.2
To install these Python packages in your virtual environment, run:
pip install -r requirements.txt
Now we can load our dataset and index it into a vector store.
We can load the dataset using the TextLoader class, which lets us load document files in different plain-text formats, such as text (.txt) and spreadsheets (.csv).
from langchain_community.document_loaders import TextLoader
# Create a TextLoader object
loader = TextLoader("dataset/ai_job_market_insights_mini.csv")
Here we load our ai_job_market_insights_mini.csv data sheet.
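Note that TextLoader treats the whole CSV file as a single document. LangChain also provides CSVLoader (in langchain_community.document_loaders), which yields one document per row formatted as "column: value" lines, giving the retriever finer-grained chunks. As a rough, dependency-free sketch of that per-row format (the sample columns here are illustrative):

```python
import csv
import io

def rows_to_documents(csv_text: str) -> list[str]:
    """Mimic CSVLoader's output shape: one 'column: value' text block per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return ["\n".join(f"{col}: {val}" for col, val in row.items()) for row in reader]

sample = "Job_Title,Location\nAI Researcher,London\nUX Designer,San Francisco\n"
for doc in rows_to_documents(sample):
    print(doc)
    print("---")
```

For the small 25-record demo either loader works; per-row documents mainly matter as the dataset grows.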
Then we need to create an embeddings object that will be used to index our dataset into the vector store. In order to create the object we need to choose a specific embedding model.
What are embedding models?
Embedding models create a vector representation of a piece of text. You can think of a vector as an array of numbers that captures the semantic meaning of the text.
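To make this concrete, here is a toy illustration of how vector similarity captures closeness of meaning. The 3-number vectors below are made up for the example (real embeddings have thousands of dimensions), and cosine similarity is the standard measure retrieval uses:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up embeddings: the first two "texts" point in similar directions.
vec_ai_researcher = [0.9, 0.1, 0.2]
vec_ml_engineer = [0.85, 0.15, 0.25]
vec_sales_manager = [0.1, 0.9, 0.4]

print(cosine_similarity(vec_ai_researcher, vec_ml_engineer))    # close to 1.0
print(cosine_similarity(vec_ai_researcher, vec_sales_manager))  # noticeably lower
```

A vector store compares the embedding of your question against the stored embeddings in exactly this way to find the most relevant records.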
In our project we will use the OllamaEmbeddings model:
from langchain_community.embeddings import OllamaEmbeddings
# Create an OllamaEmbeddings object
embeddings = OllamaEmbeddings(model="llama3.1")
Then we use the VectorstoreIndexCreator wrapper to index our dataset into a vector store. By default this wrapper uses InMemoryVectorStore, which keeps the dataset index in your device's memory while the RAG chatbot is running and requires no setup. You can also use other vector databases such as Chroma, but you have to set up the vector database before running the application.
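Conceptually, an in-memory vector store just keeps (vector, text) pairs and returns the stored texts whose vectors are most similar to the query vector. Here is a minimal toy version with hand-made vectors standing in for real embeddings (the real store calls OllamaEmbeddings for every row and query):

```python
import math

class ToyVectorStore:
    """Minimal in-memory vector store: brute-force cosine-similarity search."""

    def __init__(self):
        self._entries = []  # list of (vector, text) pairs

    def add(self, vector: list[float], text: str) -> None:
        self._entries.append((vector, text))

    def search(self, query_vector: list[float], k: int = 2) -> list[str]:
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            return dot / (math.sqrt(sum(x * x for x in a)) *
                          math.sqrt(sum(x * x for x in b)))
        ranked = sorted(self._entries, key=lambda e: cos(e[0], query_vector),
                        reverse=True)
        return [text for _, text in ranked[:k]]

# Toy 2-D vectors standing in for real embeddings of dataset rows.
store = ToyVectorStore()
store.add([1.0, 0.0], "AI Researcher, Technology, London")
store.add([0.9, 0.1], "Software Engineer, Technology, Singapore")
store.add([0.0, 1.0], "Sales Manager, Retail, Berlin")

print(store.search([0.95, 0.05], k=2))
```

RAG then passes the top-k retrieved texts to the LLM as context, which is what index.query does for us below.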
In this step we will create a VectorstoreIndexCreator object and then index our loaded dataset into the vector store:
from langchain.indexes import VectorstoreIndexCreator
# Create a VectorstoreIndexCreator object
index_creator = VectorstoreIndexCreator(embedding=embeddings)
# Call from_loaders method
index = index_creator.from_loaders([loader])
print("indexing document in vector store completed!")
The indexing operation will take some time depending on the size of your dataset and the computing resources available on your device.
Set up the Llama 3.1 LLM and integrate our indexed dataset with it
Finally, after we have finished indexing our dataset into the vector store, we should set up the LLM that we will use in our chatbot application.
We will create a ChatOllama object and set the LLM to llama3.1, which we already installed locally on our device using Ollama:
from langchain_ollama import ChatOllama
# Create a ChatOllama object
chat_llama3 = ChatOllama(model="llama3.1", temperature=0.7)
Then we only need to send a prompt query to it through the vector store index object that we created for our dataset:
answer = index.query(prompt, llm=chat_llama3)
Now we can start testing our RAG chatbot application and ask it questions related to our dataset content:
prompt = "Can you list job title for only technology industry?"
answer = index.query(prompt, llm=chat_llama3)
The RAG chatbot will generate the following response based on the dataset information provided to it:
Here are the job titles from the provided context that belong to the Technology industry:
1. Marketing Specialist
2. AI Researcher (x3)
To streamline prompting the RAG chatbot, we can run the following loop (type "exit" to quit; checking for "exit" before querying avoids sending it to the LLM):
while True:
    prompt = input("Enter your query: ")
    if prompt.lower() == "exit":
        break
    # Use the vector store index with ChatOllama to answer questions
    answer = index.query(prompt, llm=chat_llama3)
    print("Llama3 Chatbot: " + answer)
You can see a complete conversation about our dataset here:
Enter your query: can you check job title for location in London?
Llama3 Chatbot: There are two job titles with their location in London:
1. Marketing Specialist: Finance, Small, London
2. AI Researcher: Retail, Large, London
I hope that helps!
Enter your query: Can you check location of Large company size?
Llama3 Chatbot: Based on the data provided, here are the locations for companies with a large company size:
* Marketing Specialist: Large (Singapore)
* AI Researcher: Large (San Francisco and London)
Let me know if I can help with anything else!
Enter your query: can you list job title that need Python skills?
Llama3 Chatbot: Here are the job titles from the provided data that require Python skills:
* Marketing Specialist
* AI Researcher (multiple instances)
* Software Engineer
Note: Some other roles may also require Python skills not listed here. This answer is based on the specific data provided.
Enter your query: can you check industry that has high AI adoption level?
Llama3 Chatbot: Based on the provided data, I see the following industries with High AI Adoption Level:
* Entertainment (has multiple instances)
* Finance (has multiple instances)
Enter your query: can you check job title that is not remote friendly ?
Llama3 Chatbot: Based on the provided data, I can identify the following job titles that are not "Remote Friendly":
1. Sales Manager (Retail, Small, Berlin)
2. UX Designer (Education, Large, San Francisco)
3. AI Researcher (Manufacturing, Large, Tokyo)
Enter your query: can you list location and industry for only software engineer?
Llama3 Chatbot: Based on the provided context, here is a list of locations and industries for Software Engineers:
1. Manufacturing
- Singapore (Medium company size)
2. Entertainment
- Dubai (Small company size)
Enter your query: can you mention job title and industry that has stable job growth projection?
Llama3 Chatbot: Based on the provided context, here are some examples of Job Title and Industry with a Stable job growth projection:
- Product Manager, Finance (Large, San Francisco)
- AI Researcher, Retail (Large, London)
As we can see from the conversation, we have developed a chatbot that can interact with an external knowledge source using the RAG technique and the Llama 3.1 8B large language model running locally on our device.
You can find the dataset and source code for this RAG Ollama-Chatbot project in this GitHub repository.
Follow me on LinkedIn (Mohammad Oghli) for more interesting technology articles!
Created by Mohamad Oghli