Creating Your Own Chatbot with Amazon Bedrock

The opinions expressed in this blog are my own, and do not necessarily represent the opinions of Amazon, Amazon Web Services, or any other entity not named David Carnahan.


This project was A LOT of fun.

My goal was to use Amazon Bedrock and LangChain for the backend and Streamlit for the front end ... then to test out two or more different Large Language Models.


Source of Inspiration: Udemy Course by Rahul Trisal


Version 1: Streamlit Chatbot with Bedrock & Llama 2

High Level Architecture


For this app ... there is a frontend script and a backend script.

Front end

The front end contains the following:

  • Title
  • Memory & Chat history
  • Input text
  • Session state

# import libraries
import streamlit as st
import chatbot_backend as demo

# parameters
st.title(":star-struck: Amazon Bedrock Chatbot") # title

# add langchain memory to session state
if 'memory' not in st.session_state:
    st.session_state.memory = demo.demo_memory()

# add chat history to session
if 'chat_history' not in st.session_state:
    st.session_state.chat_history = []

# render chat history
for message in st.session_state.chat_history:
    with st.chat_message(message["role"]):
        st.markdown(message["text"])

# input text box for chatbot
input_text = st.chat_input("Powered by Amazon Bedrock & Claude 2")
if input_text:
    with st.chat_message("user"):
        st.markdown(input_text)

    # Append user input to chat history
    st.session_state.chat_history.append({"role":"user", "text":input_text})

    # Generate chat response using the chatbot instance
    chat_response = demo.demo_conversation(input_text=input_text, memory=st.session_state.memory)

    # Display the chat response
    with st.chat_message("assistant"):
        st.markdown(chat_response["response"])

    # Append assistant's response to chat history
    st.session_state.chat_history.append({"role":"assistant", "text":chat_response["response"]})        

I adjusted a few lines in the code so that you always get just the "response" of the chat_response ... rather than the whole history, as coded in the original course.
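For reference, ConversationChain.invoke returns a dictionary rather than a bare string -- which is why the code indexes chat_response["response"]. Roughly, with the chain's default keys:

# chat_response is a dict along these lines (default ConversationChain keys):
# {
#     "input": "What is Amazon Bedrock?",   # the prompt you just sent
#     "history": "Human: ...\nAI: ...",     # what the memory has accumulated
#     "response": "Amazon Bedrock is ..."   # the model's reply
# }
print(chat_response["response"])  # display only the reply, not the history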

Backend

The backend is straightforward as well. It contains three functions that mirror the high-level architecture shown above.

  • The Bedrock connection (using LangChain) + model parameters for the API
  • Chat message (conversation) and response
  • Memory & chat history

# import modules
import os
import boto3
from langchain.llms.bedrock import Bedrock
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

# original function that invokes the bedrock model via Llama 2 (commented out -- explained below)
""" def demo_chatbot():
    demo_llm = Bedrock(
        credentials_profile_name="default",
        model_id="meta.llama2-70b-chat-v1",
        model_kwargs={
            "temperature": 0.9,
            "top_p": 0.5,
            "max_gen_len": 512
        }
    )
    return demo_llm """

# current version of the function, using Claude 2
def demo_chatbot():
    demo_llm = Bedrock(
        credentials_profile_name="default",
        model_id="anthropic.claude-v2:1",
        model_kwargs={
            "temperature": 0.9,
            "top_p": 0.5,
            "max_tokens_to_sample": 512
        }
    )
    return demo_llm

# function for conversation memory
def demo_memory():
    llm_data = demo_chatbot()
    memory = ConversationBufferMemory(
        llm=llm_data,
        max_token_limit=512
    )
    return memory

# function for conversation
def demo_conversation(input_text, memory):
    llm_chain_data = demo_chatbot()
    llm_conversation = ConversationChain(
        llm=llm_chain_data,
        memory=memory,
        verbose=True
    )

    # chat response using invoke (prompt template)
    chat_reply = llm_conversation.invoke(input=input_text)
    return chat_reply

There are several things I want to point out:

  1. The first Bedrock function is commented out ... it's the one that used Llama 2 -- we'll talk about why I dropped it in favor of Claude 2 in a little bit.
  2. You can switch between a verbose and a non-verbose version of the chatbot by toggling the verbose parameter between True and False; separately, you can allow longer replies by raising the max token limit.
  3. This script instantiates the LLM three times instead of doing it once (which we will also talk about later). That likely hurts the chatbot's performance, though for a project this small it doesn't matter much. Still, it is better to instantiate once and reuse the model for everything that follows -- see the sketch after this list.
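If you want to keep the simple function-based style, here is a minimal sketch of that single-instantiation idea (my own rearrangement of the backend above, not the course code):

# build the LLM once at import time and reuse it everywhere
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

_llm = demo_chatbot()  # instantiated a single time when the module is imported

def demo_memory():
    return ConversationBufferMemory(llm=_llm, max_token_limit=512)

def demo_conversation(input_text, memory):
    conversation = ConversationChain(llm=_llm, memory=memory, verbose=True)
    return conversation.invoke(input=input_text)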


Llama 2, 3, 4, ...

When I used Llama 2, there was a lot of verbosity ... some of it took the form of a conversation between a human and the AI, but the problem is I wasn't the one writing the human prompts. Look at the image below to see what I mean.

I provided the initial question ... but Llama 2 generated additional 'human' questions and then answered them itself. Not sure why ... but this behavior disappeared when I changed the model from Llama 2 to Claude 2.
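If you ever need to stay on Llama 2, one blunt workaround (my own suggestion, not something from the course) is to post-process the reply and drop everything after the first fabricated 'Human:' turn:

# hypothetical cleanup applied to the chat_response from the frontend code above
response_text = chat_response["response"]
if "\nHuman:" in response_text:
    # keep only the text before the model's first invented "Human:" turn
    response_text = response_text.split("\nHuman:", 1)[0].rstrip()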

Version 2: Streamlit Chatbot with Bedrock & Claude 2

I took the code for the frontend and backend and asked AI to improve its efficiency, and the following is the code it provided.

Front end

import streamlit as st
from efficient_backend import DemoChatbot  # Adjusted import to match the new backend structure

# Initialize chatbot instance & history if not already in session
if 'chatbot' not in st.session_state:
    st.session_state.chatbot = DemoChatbot()
if 'chat_history' not in st.session_state:
    st.session_state.chat_history = []  # Ensuring chat_history is initialized

# Title chatbot
st.title("?? Efficient Chatbot")  # Title with an emoji

# Render chat history
for message in st.session_state.chat_history:
    with st.chat_message(message["role"]):
        st.markdown(message["text"])

# Input text box for chatbot
input_text = st.chat_input("Powered by Claude 2")
if input_text:
    with st.chat_message("user"):
        st.markdown(input_text)

    # Append user input to chat history
    st.session_state.chat_history.append({"role": "user", "text": input_text})

    # Generate chat response using the chatbot instance
    chat_response = st.session_state.chatbot.demo_conversation(input_text)

    # Display the chat response
    with st.chat_message("assistant"):
        st.markdown(chat_response["response"])

    # Append assistant's response to chat history
    st.session_state.chat_history.append({"role": "assistant", "text": chat_response["response"]})        

Backend

import os
import boto3
from langchain.llms.bedrock import Bedrock
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

class DemoChatbot:
    def __init__(self, model_id="anthropic.claude-v2:1", model_kwargs=None, max_token_limit=512):
        self.model_id = model_id
        self.model_kwargs = model_kwargs if model_kwargs is not None else {
            "temperature": 0.9,
            "top_p": 0.5,
            "max_tokens_to_sample": 512
        }
        self.max_token_limit = max_token_limit
        self.llm = self.init_llm()
        self.memory = self.init_memory()

    def init_llm(self):
        try:
            demo_llm = Bedrock(
                credentials_profile_name="default",
                model_id=self.model_id,
                model_kwargs=self.model_kwargs
            )
            return demo_llm
        except Exception as e:
            print(f"Failed to initialize the LLM: {e}")
            return None

    def init_memory(self):
        if self.llm is not None:
            try:
                memory = ConversationBufferMemory(
                    llm=self.llm,
                    max_token_limit=self.max_token_limit
                )
                return memory
            except Exception as e:
                print(f"Failed to initialize memory: {e}")
        return None

    def demo_conversation(self, input_text):
        if self.llm is not None and self.memory is not None:
            try:
                llm_conversation = ConversationChain(
                    llm=self.llm,
                    memory=self.memory,
                    verbose=True
                )
                chat_reply = llm_conversation.invoke(input=input_text)
                return chat_reply
            except Exception as e:
                print(f"Error during conversation: {e}")
        return None

# Usage example
if __name__ == "__main__":
    chatbot = DemoChatbot()
    user_input = "Hello, how can you help me today?"
    response = chatbot.demo_conversation(input_text=user_input)
    print(response)
        

I really like the object-oriented code for the backend of the 'efficient chatbot'. I deleted the usage example and then tested it locally with the Streamlit CLI:

> streamlit run your_frontend_code.py        
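If you're starting from a clean environment, the imports used above come from these packages (a minimal set, assuming the classic langchain package, which still bundled the Bedrock wrapper when I wrote this):

> pip install streamlit boto3 langchain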

Finally, a few things I ran into that I want to make you aware of ...

  1. Each model has its own parameters ... if you look at the Llama 2 code above, you'll notice the parameter "max_gen_len": 512, which caps the length of the generated response; in Claude 2 the same kind of parameter has a different name -- "max_tokens_to_sample": 512. You'll need to look at the model's API parameters in Bedrock (or other documentation) to know the required parameters for the model of interest.
  2. Additionally, you'll need to look up the model ID in the same place in order to call the model (and your account needs access to that model for the API call to work). You can also list the available model IDs programmatically -- see the sketch after this list.
  3. Finally, you'll need to configure your AWS credentials in your IDE of choice, as always. There is a lot of documentation out there on how to do this.
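For point 2, here's a quick sketch of listing the model IDs your account can see with boto3 (it assumes your credentials and region are already configured):

import boto3

# the "bedrock" client handles management calls; "bedrock-runtime" handles inference
bedrock = boto3.client("bedrock")
for model in bedrock.list_foundation_models()["modelSummaries"]:
    print(model["modelId"])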

Where do you find these API details?

You go to the Bedrock Providers section ... click the LLM provider (in my case Anthropic) and then the model version (Claude 2.1).

Then scroll down to the API section, where you'll find the modelId and the different parameters in the request body ... notice the 'max_tokens_to_sample' parameter?
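To make that concrete, here's a minimal sketch of calling Claude 2 through the raw Bedrock runtime API with boto3, using exactly the body shape from that page (the region is an assumption -- use your own):

import json
import boto3

runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
body = {
    "prompt": "\n\nHuman: What is Amazon Bedrock?\n\nAssistant:",
    "max_tokens_to_sample": 512,  # Claude's name for the response-length cap
    "temperature": 0.9,
    "top_p": 0.5,
}
response = runtime.invoke_model(
    modelId="anthropic.claude-v2:1",
    body=json.dumps(body),
)
print(json.loads(response["body"].read())["completion"])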

The Results

Original with Claude 2


Efficient with Claude 2

Conclusion

Using Amazon Bedrock is pretty straightforward once you have credentials. Adding LangChain for the memory capability and Streamlit for a basic UI turned out to be fairly simple as well. I know there can be as much complexity as the use case demands, but hopefully this shows you what's possible with some elbow grease and time to read the docs.
