Amazon Bedrock: Essentials to Build your own Generative AI application using Python
What is Amazon Bedrock?
Amazon Bedrock is a fully managed service that offers a choice of foundation models (FMs) for building generative AI applications.
Amazon Bedrock is a serverless service, which means you don’t need to worry about managing the infrastructure.
Which FMs are available on Amazon Bedrock?
Amazon Bedrock supports multiple FMs, including language models (for text generation and summarization) and embeddings models from providers such as Amazon, Anthropic, AI21 Labs, Cohere, Meta, and Stability AI.
How can I use Amazon Bedrock?
Amazon Bedrock offers a playground that allows you to experiment with different FMs. It is easy to use: you only need to provide a prompt through the web interface and run it against the pretrained models. You can also use a fine-tuned model that has been adapted to your specific use case.
From the same interface, before entering your prompt, you first need to enable access to the FMs you want to use. You can then select one of the enabled models and pass it your input prompt to compare the outputs produced by different models.
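If you prefer to check model availability programmatically rather than through the console, here is a minimal sketch (assuming your AWS credentials already have Bedrock access) that lists the foundation models available in your region using the Bedrock control-plane client:
import boto3

# List the foundation models available in this account and region,
# so you know which modelId values you can invoke.
bedrock = boto3.client(service_name="bedrock", region_name="us-west-2")
for model in bedrock.list_foundation_models()["modelSummaries"]:
    print(f"{model['providerName']}: {model['modelId']}")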
What if I want to build my own Generative AI application using Amazon Bedrock?
Voilà! This is the part where you use Python to invoke Amazon Bedrock, start talking to LLMs, and integrate them into your application.
To get started, we need to install a few things:
1- LangChain, a framework for developing applications powered by large language models (LLMs).
2- pypdf, which is used to read PDF files. This is useful when you need to implement Retrieval-Augmented Generation (RAG) applications, where LLM prompts must answer queries about a given set of PDF files.
3- FAISS, a library developed by Facebook AI for efficient similarity search and clustering of dense vectors, commonly used as a vector store. It contains algorithms that search sets of vectors of any size, which is useful for indexing and grouping your files in a RAG application.
!pip3 install langchain==0.0.309 \
"pypdf>=3.8,<4" \
"faiss-cpu>=1.7,<2"
Some packages to import:
# AWS SDK and general utilities
import boto3
import json
import os
import sys
import time
import base64
import io
from io import BytesIO
from datetime import datetime
from dateutil import tz
from urllib.parse import urlparse
import requests

# Plotting, image handling, and dimensionality reduction (used in the embeddings demo)
from PIL import Image, ImageDraw, ImageFont
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# LangChain integrations: Bedrock models, document loading, text splitting, and retrieval
from langchain.embeddings import BedrockEmbeddings
from langchain.llms.bedrock import Bedrock
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
from langchain.document_loaders import TextLoader, PyPDFLoader, PyPDFDirectoryLoader
from langchain.chains.question_answering import load_qa_chain
from langchain.vectorstores import FAISS
from langchain.indexes import VectorstoreIndexCreator
from langchain.indexes.vectorstore import VectorStoreIndexWrapper
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA
# Set up sessions and clients
region_name = "us-west-2"
boto3_bedrock_client = boto3.client(service_name='bedrock-runtime',region_name=region_name)
print("\nDone! Ready to go!")
In this tutorial we use the us-west-2 region. You also need to pass bedrock-runtime as the service_name, which is the API used to run inference against the Amazon Bedrock models.
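If your AWS credentials live in a named CLI profile rather than the default credential chain, a small sketch of the same client setup from an explicit session (the profile name "bedrock-user" is only an example) would look like this:
# Optional: create the client from a named profile instead of the default credential chain.
session = boto3.Session(profile_name="bedrock-user", region_name=region_name)
boto3_bedrock_client = session.client(service_name="bedrock-runtime")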
Now let’s start passing inputs for the prompt.
prompt_data = "Explain Inflation to 8th graders"
modelId = 'anthropic.claude-v2'
body = json.dumps({
    "messages": [{
        "role": "user",
        "content": [{
            "type": "text",
            "text": prompt_data
        }]
    }],
    "anthropic_version": "bedrock-2023-05-31",
    "system": "Only reply in English. Very important.",
    "max_tokens": 2000,
    "temperature": 0.1,  # controls how creative/varied the generated text is
    "top_p": 0.99
})
response = boto3_bedrock_client.invoke_model(
body = body,
modelId = modelId
)
response_body = json.loads(response.get('body').read())
print(response_body['content'][0]['text'])
Here we used the claude-v2 model along with specific inference parameters such as temperature, max_tokens, and top_p. You can also use other parameters such as stop_sequences and top_k. If you would like to read more about the different LLM hyperparameters, see: https://learnprompting.org/docs/basics/configuration_hyperparameters
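As a quick sketch, the same request body could also include top_k and stop_sequences (the values shown below are only illustrative, not recommendations):
# Sketch: the same Anthropic Messages request body with top_k and stop_sequences added.
body = json.dumps({
    "messages": [{"role": "user", "content": [{"type": "text", "text": prompt_data}]}],
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 2000,
    "temperature": 0.1,
    "top_p": 0.99,
    "top_k": 250,                      # sample only from the 250 most likely tokens
    "stop_sequences": ["\n\nHuman:"]   # stop generating if this string appears
})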
When a large language model generates a response, it needs to know which tokens are "close" or "similar" to the prompt. Each token gets "embedded", that is, turned into a vector with over a thousand dimensions.
To demonstrate how LLM embeddings capture word similarity, you can use the following code snippet, which goes through a list of words and plots similar/close words near each other. We use the Amazon Titan model to create the word embeddings, and we simplify the visualization by reducing the vectors to two dimensions with PCA (Principal Component Analysis).
llm = Bedrock(
model_id = "anthropic.claude-v2",
client = boto3_bedrock_client,
model_kwargs = {'max_tokens_to_sample':500}
)
bedrock_embeddings = BedrockEmbeddings(
model_id = "amazon.titan-embed-text-v1",
client = boto3_bedrock_client
)
embedding_dict = {}
wordlist = ['python',
'snake',
'Java',
'coffee',
'indonesia',
'germany',
'tea',
'cat'
]
for word in wordlist:
    embedding_dict.update({word: np.array(bedrock_embeddings.embed_query(word))})

for thisword in embedding_dict:
    print(f"{thisword} has {len(embedding_dict[thisword])} dimensions:\n{embedding_dict[thisword]}\n")
#combining the vectors into a 2D array
vectors = np.vstack(list(embedding_dict.values()))
labels = embedding_dict.keys()
# Perform PCA Principal Component Analysis for dimensionality reduction (reduce to 2 dimensions for visualization)
pca = PCA(n_components=2)
reduced_vectors = pca.fit_transform(vectors)
plt.scatter(reduced_vectors[:, 0], reduced_vectors[:, 1])
offset = 0.3
for i, label in enumerate(labels):
    plt.annotate(label, (reduced_vectors[i, 0] + offset, reduced_vectors[i, 1] + offset))
plt.xticks([])
plt.yticks([])
plt.axis('off')
plt.title('Clustering words based on similarity')
plt.show()
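Beyond the 2-D plot, you can quantify how close two words are by computing the cosine similarity of their full embedding vectors. A small sketch using the embedding_dict built above:
# Sketch: cosine similarity between two of the embeddings computed above.
# Values closer to 1.0 mean the words are semantically closer.
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(embedding_dict['python'], embedding_dict['snake']))
print(cosine_similarity(embedding_dict['python'], embedding_dict['germany']))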
How can I build my own RAG application using Amazon Bedrock?
Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model so that it references an authoritative knowledge base outside of its training data sources before generating a response.
One example of using RAG is a chatbot for an organization: users enter a query prompt, and your RAG application scans a set of uploaded PDF files (policies, guidelines, etc.) and generates a response to the query from the knowledge contained in those files.
This is where FAISS (Facebook AI Similarity Search) is used to store vectors and retrieve the ones most similar to a given query prompt.
The text in the provided PDF files must be split into chunks of extracted text, and each chunk is then sent to Bedrock for embedding. The resulting vectors are stored in a FAISS vector store.
In the following example, we use chunks of 1,000 characters of extracted text. Reading the text and storing it in the FAISS vector store takes some time, so it is recommended to add an acceptable delay in your code to avoid overloading the API with requests.
loader = PyPDFDirectoryLoader("uploads/")
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(
chunk_size = 1000,
chunk_overlap = 100,
)
docs = text_splitter.split_documents(documents)
vectorstore_faiss = FAISS.from_documents(
docs,
bedrock_embeddings,
)
wrapper_store_faiss = VectorStoreIndexWrapper(vectorstore=vectorstore_faiss)
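Because embedding every chunk takes time and API calls, it can help to persist the index to disk and reload it on later runs instead of re-embedding everything. A sketch using LangChain's FAISS helpers (the folder name "faiss_index" is arbitrary, and newer LangChain versions may require an extra deserialization flag on load):
# Sketch: persist the FAISS index so later runs can skip re-embedding the PDFs.
vectorstore_faiss.save_local("faiss_index")

# ...and reload it in a later session, using the same embeddings model.
vectorstore_faiss = FAISS.load_local("faiss_index", bedrock_embeddings)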
Once the user query is provided, it gets embedded and compared against the data in the vector store. The most relevant data is then sent to the LLM and processed.
The FAISS retriever handles the similarity search between the user prompt and the relevant data from the provided PDF files.
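You can also run that similarity search directly to inspect which chunks FAISS would return, as in this quick sketch (the query string is just an example):
# Sketch: run the similarity search directly to see which chunks FAISS returns.
relevant_docs = vectorstore_faiss.similarity_search("What is the acceptable industry standard?", k=3)
for doc in relevant_docs:
    print(doc.page_content[:200], "\n---")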
Here we declare a retriever that returns the top three matching results and provides a citation.
Note that we use a prompt template to get a concise answer. You can adjust the prompt template to show or hide data depending on the scenario you need to implement.
query = """What is the acceptable Industry standard that I can follow within my tech role and approved by the company?"""
query_embedding = vectorstore_faiss.embedding_function(query)  # embed the query (shown for illustration; the retriever below embeds it for you)
prompt_template = """
Human: Use the following pieces of context to provide a concise answer to the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
<context>
{context}
</context>
Question: {question}
Assistant:"""
PROMPT = PromptTemplate(
template=prompt_template, input_variables=["context", "question"]
)
qa = RetrievalQA.from_chain_type(
llm = llm,
chain_type = "stuff",
retriever = vectorstore_faiss.as_retriever(
search_type = "similarity", search_kwargs={"k": 3}
),
return_source_documents = True,
chain_type_kwargs = {"prompt": PROMPT}
)
answer = qa({"query": query})
print(f"Query: {query}\n")
print(f"Answer: {answer['result']}\n")
print(f"source information: {answer['source_documents'][0]}")
Conclusion
This is a very basic tutorial that should give you good insight into how to build your own generative AI applications using Amazon Bedrock.
Keep an eye on https://aws.amazon.com/bedrock/faqs/ to check what's new in Amazon Bedrock. Good luck!