Chat with PDFs using Generative AI, Part 4: Using the Llama-2 Model with FAISS as the Vector DB and Chainlit


In this blog, we will demonstrate how to create a knowledge bot using the FAISS vector DB and the open-source Llama-2 model, with the model stored locally.

More details are available at https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML.

LangChain is a framework designed to simplify the creation of applications using large language models. As a language model integration framework, LangChain's use cases largely overlap with those of language models in general, including document analysis and summarization, chatbots, and code analysis.

Chainlit is a Python library that lets us build chat interfaces for large language models. It is integrated with LangChain. More details and samples of how to use Chainlit are available at https://docs.chainlit.io/overview.
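For illustration, here is a minimal Chainlit app (a sketch, assuming Chainlit is installed; depending on the Chainlit version, the message handler receives either a plain string or a cl.Message object, so both cases are handled below):

```python
# hello_app.py - minimal Chainlit app; run with: chainlit run hello_app.py
import chainlit as cl

@cl.on_message
async def main(message):
    # Older Chainlit versions pass a str, newer ones a cl.Message.
    text = message.content if hasattr(message, "content") else message
    # Echo the user's input back; a real bot would call an LLM here.
    await cl.Message(content=f"You said: {text}").send()
```
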

We will try to replicate the same with the meta-llama/Llama-2-70b-chat-hf model once Meta grants us access to it.

We will be covering the following topics:

  • FAISS as the vector data source
  • Creating a PDF bot based on the open-source Llama-2 model with the FAISS DB as the vector data source
  • Running a Chainlit app to ask questions on the index created


Architecture overview

Architecture of the PDF bot, with FAISS Vector as the embedding data source.


Prerequisites

An EC2 instance in which we will run the Python code, plus access to the following:

  • AWS Management Console
  • EC2
  • Security Groups

Walkthrough

We will create a PDF bot using the FAISS vector DB and the open-source Llama-2 model.

Let us create the necessary security group with the required EC2 inbound rules (for example, SSH access and the port on which the bot will be served).


Next, let us create the EC2 instance and install the necessary packages.


Press “Launch instance”.


The instance is up and running. We will SSH into the instance and go through the setup required for running this demo. Make sure you are using LangChain version 0.0.252, even if you already have an environment where LangChain exists: async support does not work with older versions, and running the bot against them fails with an async error.

We will install the required packages after setting up a virtual environment.

Once the packages are installed, we will download the model “llama-2-7b-chat.ggmlv3.q8_0.bin” locally.

Steps to set up a virtual environment: let us first SSH into the EC2 instance.


Then create a virtual environment and activate it.
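Concretely, the venv setup might look like this (a sketch; the interpreter path and the venv name are assumptions):

```shell
# Create a virtual environment in the "venv" folder
python3 -m venv venv
# Activate it for the current shell session
source venv/bin/activate
```
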

Next, we will install the required packages.


Once the first package is installed, let us install all the other packages. Next, we need to download the model we are going to use for answering questions, “llama-2-7b-chat.ggmlv3.q8_0.bin”, by running "wget https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/resolve/main/llama-2-7b-chat.ggmlv3.q8_0.bin".
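The install and download steps can be sketched as follows (the exact package list is an assumption based on this walkthrough; the wget URL is the one given above):

```shell
# Install the packages the bot needs (list is an assumption; pin LangChain to 0.0.252)
pip install langchain==0.0.252 chainlit ctransformers sentence-transformers faiss-cpu pypdf
# Download the quantized Llama-2 7B chat model (several GB)
wget https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/resolve/main/llama-2-7b-chat.ggmlv3.q8_0.bin
```
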


Next, we will copy the PDF file we are going to ask questions about from the S3 bucket (the PDF was uploaded to the bucket beforehand). Attach a role with the S3 full-access policy to the EC2 instance.


We will next copy the file from the S3 bucket to the required folder on the EC2 instance.
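The copy step might look like this (bucket and file names here are placeholders, not the originals):

```shell
# Copy the PDF from the S3 bucket into the current working directory
aws s3 cp s3://<your-bucket>/heartdisease.pdf .
```
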


Next, we will create the FAISS vector DB and store the index locally. After that, we will load the created DB and run a semantic search against it.
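Under the hood, a semantic search simply returns the stored chunk whose embedding vector is closest to the query's embedding. A pure-Python sketch of the idea (this is not the FAISS API, and the vectors here are toy values):

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors of equal length
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query_vec, index):
    # index: list of (chunk_text, embedding_vector) pairs
    return max(index, key=lambda item: cosine(query_vec, item[1]))[0]

index = [
    ("heart disease risk factors", [0.9, 0.1, 0.0]),
    ("treatment options",          [0.1, 0.8, 0.3]),
]
print(nearest([0.85, 0.2, 0.1], index))  # -> heart disease risk factors
```

FAISS does the same nearest-neighbour lookup, but over high-dimensional embeddings and with data structures that scale to millions of vectors.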

Index creation

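The index-creation script might look like the following sketch (the PDF filename, chunk sizes, and embedding model are assumptions; the index name matches the one saved in this walkthrough):

```python
# ingest.py - build a FAISS index from the PDF and save it locally
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

# Load the PDF copied from S3 (filename is an assumption)
docs = PyPDFLoader("heartdisease.pdf").load()

# Split into overlapping chunks so each fits the model's context
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# Embed the chunks with a local sentence-transformers model
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2")

# Build the FAISS index and persist it to disk
db = FAISS.from_documents(chunks, embeddings)
db.save_local("faiss-heartdisease-index")
```
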

The index is created and saved locally; the index name is “faiss-heartdisease-index”.


Next, we will load this index in an app.py file and do a semantic search using the local model we downloaded previously, “llama-2-7b-chat.ggmlv3.q8_0.bin”. Once the files are created, we can run the app and ask questions on the index we built.
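A sketch of what app.py might contain (not the exact code from the original screenshots; the index name, embedding model, model path, and generation settings are assumptions, and the message-handler signature varies across Chainlit versions):

```python
# app.py - load the FAISS index and answer questions with the local GGML model
import chainlit as cl
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import CTransformers
from langchain.chains import RetrievalQA

def build_chain():
    # Reload the locally saved index with the same embedding model used to build it
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2")
    db = FAISS.load_local("faiss-heartdisease-index", embeddings)
    # Run the quantized Llama-2 model on CPU via ctransformers
    llm = CTransformers(
        model="llama-2-7b-chat.ggmlv3.q8_0.bin",
        model_type="llama",
        config={"max_new_tokens": 256, "temperature": 0.5},
    )
    # "stuff" chain: retrieved chunks are stuffed into the prompt as context
    return RetrievalQA.from_chain_type(
        llm=llm, chain_type="stuff",
        retriever=db.as_retriever(search_kwargs={"k": 2}))

@cl.on_chat_start
async def start():
    cl.user_session.set("chain", build_chain())

@cl.on_message
async def main(message):
    chain = cl.user_session.get("chain")
    # Older Chainlit versions pass a str, newer ones a cl.Message
    query = message.content if hasattr(message, "content") else message
    res = await chain.acall(query)
    await cl.Message(content=res["result"]).send()
```
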


Start asking questions to the bot.
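The app is started with the Chainlit CLI (the port here is an assumption and must match the security group's inbound rule):

```shell
# -w enables auto-reload on code changes; binding to 0.0.0.0 makes the app
# reachable on the EC2 instance's public IP
chainlit run app.py -w --host 0.0.0.0 --port 8000
```
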


Cleaning Up

Shut down and terminate the EC2 instance in which we deployed the bot. All these activities are done using the AWS console.

Conclusion

In this blog, we learned how to create a PDF-based question-answering bot using the open-source Llama-2 model with the FAISS vector DB. Once Meta grants us access to the meta-llama/Llama-2-70b-chat-hf model, we will repeat the whole process with that model and compare the answers. We can implement the same using Streamlit as well. In the next part of this series, we will create a sample text-summarization bot.

