Build AI RAG Chatbot with Ollama and LangChain
In this article I will demonstrate how to build your own AI chatbot over a customized dataset using Retrieval-Augmented Generation (RAG).
To develop the RAG chatbot we will use Ollama to run the Llama 3.1 LLM locally on your device, and the LangChain framework to build the chatbot application.
First of all, what is Retrieval-Augmented Generation (RAG)?
RAG is an AI development technique in which a large language model (LLM) is connected to an external knowledge base to improve the accuracy and quality of its responses.
In brief, the RAG technique allows us to feed a specific dataset to the LLM and ask questions about it.
Basic RAG Chatbot Pipeline
There are two basic steps we need to follow in order to create a RAG chatbot:
1. Indexing: load the dataset, embed it, and store it in a vector store.
2. Retrieval and generation: retrieve the records relevant to a query from the vector store and pass them to the LLM to generate an answer.
Build RAG Chatbot Workflow
We can divide our work into these sections:
- Install Ollama and run Llama 3.1 locally
- Prepare the customized dataset for RAG on Llama 3.1
- Index the dataset into a vector store using LangChain
- Set up the Llama 3.1 LLM and integrate our indexed dataset with it
Install Ollama and run Llama 3.1 locally
We will use the Ollama platform to download and run open-source LLMs locally on our device. In our project we will use Llama 3.1 8B by Meta.
To install it on Linux, run this command:
curl -fsSL https://ollama.com/install.sh | sh
If you are using Windows or macOS, you can check the installation section on the Ollama website.
We will download and install Llama 3.1 with 8B parameters, which requires at least 4.7 GB of disk space and 8 GB of available RAM to run the model.
Run this command to download the Llama 3.1 model on your device:
ollama run llama3.1
After the model download completes, it will start automatically in your terminal and you can chat with it. To exit the chat, press CTRL+D.
You can check the model's meta information by running:
ollama show llama3.1
You can also list the models downloaded on your device:
ollama list
Prepare the customized dataset for RAG on Llama 3.1
At this step we will use the public AI-Powered Job Market Insights dataset from the Kaggle platform.
The dataset's column attributes include: Job_Title, Industry, Company_Size, Location, AI_Adoption_Level, Required_Skills, Remote_Friendly, and Job_Growth_Projection.
We will take only the first 25 records of this dataset for the project demo, to minimize the processing time of embedding and indexing the dataset in the local vector store.
You can download the dataset from Kaggle and save only the first 25 records into a CSV file.
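As a sketch, this trimming step can be done with the standard csv module. The destination file name below matches the one used later in this article; the source path is an assumption for wherever you saved the Kaggle download:

```python
import csv

def save_first_records(src_path: str, dst_path: str, n: int = 25) -> int:
    """Copy the header row plus the first n data rows of a CSV file."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))  # keep the header row
        count = 0
        for row in reader:
            if count >= n:
                break
            writer.writerow(row)
            count += 1
    return count

# Example (assumes the full Kaggle CSV was saved under dataset/):
# save_first_records("dataset/ai_job_market_insights.csv",
#                    "dataset/ai_job_market_insights_mini.csv", n=25)
```

You could also do the same with pandas (`read_csv(...).head(25).to_csv(...)`); the csv version avoids an extra dependency.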
Index the dataset into vector store using LangChain
First we should set up our project and install the Python package dependencies in our virtual environment.
You can view the project structure here:
The requirements file contains the following packages:
langchain==0.2.15
langchain-community==0.2.14
langchain-ollama==0.1.2
To install these Python packages in your virtual environment, run:
pip install -r requirements.txt
Now we can load our dataset and index it into a vector store.
We can load the dataset using the TextLoader class, which lets us load document files in different plain-text formats, such as text (.txt) and spreadsheets (.csv).
from langchain_community.document_loaders import TextLoader
# Create a TextLoader object
loader = TextLoader("dataset/ai_job_market_insights_mini.csv")
Here we load our ai_job_market_insights_mini.csv data sheet.
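Note that TextLoader treats the whole CSV file as a single document. LangChain also provides CSVLoader (in langchain_community.document_loaders), which yields one document per row formatted as "column: value" lines, giving the retriever finer-grained chunks. As a rough, dependency-free sketch of that per-row format (the sample columns here are illustrative):

```python
import csv
import io

def rows_to_documents(csv_text: str) -> list[str]:
    """Mimic CSVLoader's output shape: one 'column: value' text block per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return ["\n".join(f"{col}: {val}" for col, val in row.items()) for row in reader]

sample = "Job_Title,Location\nAI Researcher,London\nUX Designer,San Francisco\n"
for doc in rows_to_documents(sample):
    print(doc)
    print("---")
```

For the small 25-record demo either loader works; per-row documents mainly matter as the dataset grows.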
Then we need to create an embeddings object that will be used to index our dataset into the vector store. In order to create the object we need to choose a specific embedding model.
What are embedding models?
Embedding models create a vector representation of a piece of text. You can think of a vector as an array of numbers that captures the semantic meaning of the text.
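To make this concrete, here is a toy illustration of how vector similarity captures closeness of meaning. The 3-number vectors below are made up for the example (real embeddings have thousands of dimensions), and cosine similarity is the standard measure retrieval uses:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up embeddings: the first two "texts" point in similar directions.
vec_ai_researcher = [0.9, 0.1, 0.2]
vec_ml_engineer = [0.85, 0.15, 0.25]
vec_sales_manager = [0.1, 0.9, 0.4]

print(cosine_similarity(vec_ai_researcher, vec_ml_engineer))    # close to 1.0
print(cosine_similarity(vec_ai_researcher, vec_sales_manager))  # noticeably lower
```

A vector store compares the embedding of your question against the stored embeddings in exactly this way to find the most relevant records.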
In our project we will use the OllamaEmbeddings model:
from langchain_community.embeddings import OllamaEmbeddings
# Create an OllamaEmbeddings object
embeddings = OllamaEmbeddings(model="llama3.1")
Then we use the VectorstoreIndexCreator wrapper to index our dataset into a vector store. By default this wrapper uses InMemoryVectorStore, which keeps the dataset index in your device's memory while the RAG chatbot is running and requires no setup. You can also use other vector databases such as Chroma, but you have to set up the vector database before running the application.
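Conceptually, an in-memory vector store just keeps (vector, text) pairs and returns the stored texts whose vectors are most similar to the query vector. Here is a minimal toy version with hand-made vectors standing in for real embeddings (the real store calls OllamaEmbeddings for every row and query):

```python
import math

class ToyVectorStore:
    """Minimal in-memory vector store: brute-force cosine-similarity search."""

    def __init__(self):
        self._entries = []  # list of (vector, text) pairs

    def add(self, vector: list[float], text: str) -> None:
        self._entries.append((vector, text))

    def search(self, query_vector: list[float], k: int = 2) -> list[str]:
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            return dot / (math.sqrt(sum(x * x for x in a)) *
                          math.sqrt(sum(x * x for x in b)))
        ranked = sorted(self._entries, key=lambda e: cos(e[0], query_vector),
                        reverse=True)
        return [text for _, text in ranked[:k]]

# Toy 2-D vectors standing in for real embeddings of dataset rows.
store = ToyVectorStore()
store.add([1.0, 0.0], "AI Researcher, Technology, London")
store.add([0.9, 0.1], "Software Engineer, Technology, Singapore")
store.add([0.0, 1.0], "Sales Manager, Retail, Berlin")

print(store.search([0.95, 0.05], k=2))
```

RAG then passes the top-k retrieved texts to the LLM as context, which is what index.query does for us below.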
In this step we will create a VectorstoreIndexCreator object and then index our loaded dataset into the vector store:
from langchain.indexes import VectorstoreIndexCreator
# Create a VectorstoreIndexCreator object
index_creator = VectorstoreIndexCreator(embedding=embeddings)
# Call from_loaders method
index = index_creator.from_loaders([loader])
print("indexing document in vector store completed!")
The indexing operation will take some time depending on the size of your dataset and the computing resources available on your device.
Set up the Llama 3.1 LLM and integrate our indexed dataset with it
Finally, after we have finished indexing our dataset into the vector store, we should set up the LLM that we will use in our chatbot application.
We will create a ChatOllama object and set the LLM to llama3.1, which we already installed locally on our device using Ollama:
from langchain_ollama import ChatOllama
# Create a ChatOllama object
chat_llama3 = ChatOllama(model="llama3.1", temperature=0.7)
Then we only need to send a prompt query to it through the vector store index object that we created for our dataset:
answer = index.query(prompt, llm=chat_llama3)
Now we can start testing our RAG chatbot application and ask it questions related to our dataset content:
prompt = "Can you list job title for only technology industry?"
answer = index.query(prompt, llm=chat_llama3)
The RAG chatbot will generate the following response based on the dataset information provided to it:
Here are the job titles from the provided context that belong to the Technology industry:
1. Marketing Specialist
2. AI Researcher (x3)
To streamline prompting the RAG chatbot, we can run the following loop (type "exit" to quit; checking for "exit" before querying avoids sending it to the LLM):
while True:
    prompt = input("Enter your query: ")
    if prompt.lower() == "exit":
        break
    # Use the vector store index with ChatOllama to answer questions
    answer = index.query(prompt, llm=chat_llama3)
    print("Llama3 Chatbot: " + answer)
You can see a complete conversation about our dataset here:
Enter your query: can you check job title for location in London?
Llama3 Chatbot: There are two job titles with their location in London:
1. Marketing Specialist: Finance, Small, London
2. AI Researcher: Retail, Large, London
I hope that helps!
Enter your query: Can you check location of Large company size?
Llama3 Chatbot: Based on the data provided, here are the locations for companies with a large company size:
* Marketing Specialist: Large (Singapore)
* AI Researcher: Large (San Francisco and London)
Let me know if I can help with anything else!
Enter your query: can you list job title that need Python skills?
Llama3 Chatbot: Here are the job titles from the provided data that require Python skills:
* Marketing Specialist
* AI Researcher (multiple instances)
* Software Engineer
Note: Some other roles may also require Python skills not listed here. This answer is based on the specific data provided.
Enter your query: can you check industry that has high AI adoption level?
Llama3 Chatbot: Based on the provided data, I see the following industries with High AI Adoption Level:
* Entertainment (has multiple instances)
* Finance (has multiple instances)
Enter your query: can you check job title that is not remote friendly ?
Llama3 Chatbot: Based on the provided data, I can identify the following job titles that are not "Remote Friendly":
1. Sales Manager (Retail, Small, Berlin)
2. UX Designer (Education, Large, San Francisco)
3. AI Researcher (Manufacturing, Large, Tokyo)
Enter your query: can you list location and industry for only software engineer?
Llama3 Chatbot: Based on the provided context, here is a list of locations and industries for Software Engineers:
1. Manufacturing
- Singapore (Medium company size)
2. Entertainment
- Dubai (Small company size)
Enter your query: can you mention job title and industry that has stable job growth projection?
Llama3 Chatbot: Based on the provided context, here are some examples of Job Title and Industry with a Stable job growth projection:
- Product Manager, Finance (Large, San Francisco)
- AI Researcher, Retail (Large, London)
As we can see from the conversation, we have developed a chatbot that can interact with an external knowledge source using the RAG technique and the Llama 3.1 8B large language model running locally on our device.
You can find the dataset and source code for this RAG Ollama-Chatbot project in this GitHub repository.
Follow me on LinkedIn (Mohammad Oghli) for more interesting technology articles!
Created by Mohamad Oghli