External Knowledge Base for LLMs: Leveraging Retrieval Augmented Generation Framework with AWS Bedrock and FAISS
Written by: Savan Rupani
Introduction
In the realm of artificial intelligence and language models, the pursuit of enhancing their capabilities is a constant endeavor. Today, we embark on a journey to explore an innovative approach that has the potential to revolutionize the way we harness the power of AI. Picture a scenario where your language model not only excels at answering domain-specific questions with precision but also possesses the ability to sift through your organization’s confidential documents, all while effortlessly staying updated with the latest information. This is the promise of integrating an external knowledge base into your Large Language Model (LLM).
In this blog post, we delve into the vast potential of this groundbreaking approach. We’ll uncover how augmenting your LLM with external knowledge can be a game-changer in AI, and we’ll demonstrate why it stands out as a cost-effective and straightforward alternative to the conventional method of fine-tuning models with specific knowledge.
Problem Statement:
The AI Knowledge Gap: Bridging Domain-Specific Questions with LLM Models
In the world of cutting-edge AI and language models, we encounter a pressing challenge: how to enhance their performance in addressing domain-specific questions, unlocking insights from confidential company documents, and ensuring they remain current with the latest information. The problem is the inherent limitation of Large Language Models (LLMs) in offering precise, up-to-date, and domain-specific answers.
Fine-tuning these LLMs, while a common practice, presents its own set of hurdles. It demands a significant financial investment, consumes time and resources, and often yields suboptimal results. Furthermore, even after such arduous fine-tuning, these models may still fall short when confronted with highly specialized inquiries or data residing within private documents.
Compounding this issue, LLMs are typically trained on a fixed dataset, rendering them unable to adapt to the unique nuances of specific domains. This rigidity can lead to outdated responses, as these models may lack access to the most recent or real-time information.
The problem statement, therefore, revolves around the need to devise a cost-effective, easily implementable solution that equips LLMs with the ability to overcome these limitations. We must find a way to bridge the gap between the innate capabilities of LLMs and the demands of domain-specific, confidential, and dynamically evolving knowledge. This blog post addresses this challenge by introducing the concept of integrating an external knowledge base into LLMs.
Implementation
A. Process the PDF files.
B. Generate embeddings.
C. Store the embeddings in the vector store.
There were a few challenges with installing the awscli, boto3, and botocore Python packages directly via pip. To work around this, we download the dependencies from the link in the script below and install them into the virtual Python environment.
#!/bin/sh
set -e

echo "(Re)-creating directory"
rm -rf ./dependencies
mkdir ./dependencies
cd ./dependencies

echo "Downloading dependencies"
curl -sS https://d2eo22ngex1n9g.cloudfront.net/Documentation/SDK/bedrock-python-sdk.zip > sdk.zip

echo "Unpacking dependencies"
# Note: `&>` is a bashism; use POSIX-compatible redirection under #!/bin/sh
if command -v unzip > /dev/null 2>&1
then
    unzip sdk.zip && rm sdk.zip && echo "Done"
else
    echo "'unzip' command not found: Trying to unzip via Python"
    python -m zipfile -e sdk.zip . && rm sdk.zip && echo "Done"
fi
# Make sure you ran `download-dependencies.sh` from the root of the repository first!
%pip install --no-build-isolation --quiet --force-reinstall \
../dependencies/awscli-*-py3-none-any.whl \
../dependencies/boto3-*-py3-none-any.whl \
../dependencies/botocore-*-py3-none-any.whl
%pip install --quiet langchain==0.0.249 \
pypdf \
faiss-cpu \
unstructured \
pdf2image \
pdfminer.six
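Before we can call Bedrock, we need a boto3 client for the service; it is referenced later as boto3_bedrock. The snippet below is a minimal sketch of how that client might be created: the service name and region are assumptions that depend on your SDK version and AWS setup, so adjust them accordingly.
import boto3

# Hypothetical setup for the `boto3_bedrock` client used throughout this post.
# Assumption: the preview SDK installed above registers the service as "bedrock";
# newer boto3 releases expose model invocation under "bedrock-runtime" instead.
boto3_bedrock = boto3.client(
    service_name="bedrock",
    region_name="us-east-1"  # assumed region; use one where Bedrock is enabled for your account
)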
The following code reads the PDF document and splits it into pages for processing. Rather than reducing the document to plain text, we keep the LangChain Document objects, whose metadata records the source file and page number. This will later help us identify where each piece of information came from.
from langchain.document_loaders import PyPDFLoader
loader = PyPDFLoader("content.pdf")
pages = loader.load_and_split()
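As a quick sanity check, we can inspect one of the loaded pages; the metadata keys shown in the comments (source and page) are what PyPDFLoader typically attaches, so treat the exact values as illustrative.
# Inspect the first page's text and metadata
print(pages[0].page_content[:200])  # first 200 characters of the page text
print(pages[0].metadata)            # e.g. {'source': 'content.pdf', 'page': 0}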
We will use AWS Bedrock's foundation model offering because of its seamless integration with other AWS services and its LangChain support. In this demonstration, our chosen models are anthropic.claude-v1 for text generation and amazon.titan-embed-g1-text-02 for generating the embeddings stored in our vector store.
from langchain.embeddings import BedrockEmbeddings
from langchain.llms import Bedrock

# Model IDs for text generation and embeddings
model_id = "anthropic.claude-v1"
embed_model_id = "amazon.titan-embed-g1-text-02"

# Generation parameters for the Titan text model
titan_text_generation_config = {
    "maxTokenCount": 2048,
    "stopSequences": [],
    "temperature": 0.5,
    "topP": 0.5
}

# Generation parameters for the Claude model
claude_text_generation_config = {
    "max_tokens_to_sample": 100
}

if model_id == "anthropic.claude-v1":
    text_generation_config = claude_text_generation_config
else:
    text_generation_config = titan_text_generation_config

# Create an instance of BedrockEmbeddings
# This class is used to interact with a Bedrock model for embeddings.
bedrock_embed_llm = BedrockEmbeddings(
    model_id=embed_model_id,
    client=boto3_bedrock
)

# Create the Bedrock LLM used for text generation
bedrock_llm = Bedrock(
    model_id=model_id,
    client=boto3_bedrock
)
bedrock_llm.model_kwargs = text_generation_config
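As an optional check that is not part of the original walkthrough, we can embed a short string through the standard LangChain embeddings interface and confirm the Titan model returns a vector; the dimensionality noted in the comment is an assumption about this particular model.
# Quick sanity check: embed a sample query and inspect the vector size
sample_vector = bedrock_embed_llm.embed_query("What is in this document?")
print(len(sample_vector))  # Titan text embeddings are typically 1536-dimensional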
We will employ FAISS as an in-memory vector store and index to store and retrieve document embeddings. This approach works well for a small number of documents, since the entire index can be built and held in memory by the application.
from langchain.vectorstores import FAISS
from langchain.indexes.vectorstore import VectorStoreIndexWrapper

# Create a VectorStore using FAISS
# VectorStore is used to store and search for vectors (embeddings) efficiently.
vectorstore = FAISS.from_documents(
    documents=pages[:25],
    embedding=bedrock_embed_llm
)

# Create a VectorStoreIndexWrapper
# VectorStoreIndexWrapper adds an indexing layer on top of the vector store for convenient querying.
wrapper_store = VectorStoreIndexWrapper(
    vectorstore=vectorstore
)
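The wrapper also offers a one-line way to query the index directly, letting LangChain assemble a default QA chain behind the scenes. The sketch below shows that usage under the assumption of the LangChain version pinned above; in the steps that follow we instead build an explicit RetrievalQA chain so we can control the prompt and return source documents.
# Optional: query the index directly through the wrapper
answer = wrapper_store.query(
    "Where should shareholders call with questions?",
    llm=bedrock_llm
)
print(answer)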
Prompt template
from langchain.prompts import PromptTemplate

prompt_template = """
Use the following pieces of context to answer the question.
If the answer is not in the context, state that explicitly.
Context: {context}
Question: {question}
Answer:"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)
Query
from langchain.chains import RetrievalQA

question = "Where should shareholders call with questions?"

# Build a RetrievalQA chain that "stuffs" the top-k retrieved pages
# into the prompt before calling the LLM.
qa = RetrievalQA.from_chain_type(
    llm=bedrock_llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 10}),
    return_source_documents=True,
    chain_type_kwargs={"prompt": PROMPT},
)

result = qa({"query": question})
print(result['result'])
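Because we set return_source_documents=True, the result also carries the retrieved pages, so we can report where the answer came from. A small sketch assuming the standard LangChain result shape:
# Show which pages the answer was drawn from
for doc in result["source_documents"]:
    print(doc.metadata)  # e.g. {'source': 'content.pdf', 'page': 3}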
Conclusion
In conclusion, incorporating an external knowledge base into the foundation model represents a cost-effective and practical way to improve the capabilities of our LLM. This approach enables the model to answer domain-specific questions accurately, retrieve valuable insights from the company’s private documents, and stay up to date with the latest facts. By choosing this approach, we avoid the complexity and resource-intensive nature of fine-tuning the model on specific knowledge. It not only enhances the performance of the LLM but also streamlines the implementation process, making it a smart choice for harnessing the power of AI in the organization.