Hands on with AWS Bedrock
Amazon Bedrock image generated by Amazon Titan Image Generator using the Bedrock Playground


Introduction

I have had the opportunity recently to work on my first couple of products with Bedrock. While I have been playing with it for a while (before general availability), I had not used it in a full system.

What is Bedrock

Amazon Bedrock is the AWS GenAI hub. It hosts a number of GenAI models as well as many related GenAI features.

Types of GenAI

There are 3 basic 'modes' of GenAI model:

  • Text - Text is probably the easiest to explain. Based on a text prompt, the model creates a text output that tries to match the user's request. It is possible to include documents in the prompt. Models normally have a way of marking up the prompt to help them parse it. For example, Claude allows you to submit text inside XML-like tags. You can then create a prompt to summarise a document, and Claude can differentiate between the instruction and the text to be summarised. Text models also vary in capability: some are better suited to chatbot-style responses and others to longer one-off answers. Some text models can also accept an image as an input; they are still text models as they produce a text-based output.
  • Image generation - These models accept text as an input and produce an image as the output. Often they will also accept an image as part of the input.
  • Embedding - This is probably the most complex to explain. Embeddings produce a numerical representation of data as a high-dimensional vector (hopefully you remember some maths here). The important bit is that similar inputs produce similar outputs: text using synonyms should generate very similar vectors. Vectors are useful for making text searchable, and they can also be used for images. Some of the most powerful GenAI applications work by combining text or image retrieval with GenAI generation. If you are interested in this, look into Retrieval Augmented Generation (RAG).
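To make the embedding idea concrete, here is a minimal sketch of comparing two texts with a Bedrock embedding model. The model ID and region are assumptions - check what is enabled in your account:

```python
import json


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similar texts should give vectors whose cosine similarity is close to 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


def embed(text: str, region: str = "us-east-1") -> list[float]:
    """Call a Bedrock embedding model and return the vector for the text."""
    import boto3  # imported here so cosine_similarity is usable without AWS set up

    runtime = boto3.client(service_name="bedrock-runtime", region_name=region)
    response = runtime.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",  # assumed model ID - verify for your region
        body=json.dumps({"inputText": text}),
        accept="application/json",
        contentType="application/json",
    )
    return json.loads(response["body"].read())["embedding"]


if __name__ == "__main__":
    # Two sentences with the same meaning should score higher than unrelated ones.
    print(cosine_similarity(embed("The car is fast"), embed("The automobile is quick")))
```

The similarity helper is pure maths; only the `embed` call actually needs AWS credentials and an enabled model.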

The playground

When you first navigate to Bedrock in the AWS console you will see a splash page (assuming it's available in your chosen region). Before you do anything, you have to enable individual models by agreeing to their specific terms and conditions. After that you can go to the playground.

There are 3 distinct playground versions.

  • Image generation
  • Text
  • Chat

Each version of the playground is for testing specific model functionality. The playground is the best way to play with the models and start to design your prompts. As an example, the image playground can be used to generate an image. First select an appropriate model, then use the prompt to describe what type of image you want. You can then press run and images will be generated; by default 3 are produced. You can then select the version you want and refine the image. There are various additional parameters you can change that will affect the output. They are all explained with info links, but you can also just have a play. Image generation does take a little longer than text.


Bedrock image playground

The playground is the best way to get an understanding of GenAI. It is also very useful for testing how different prompts work. Personally I think there is a progression: first test in the playground, then, when you start to chain operations, use SageMaker to create a notebook, and finally build a product.

Boto 3

Probably the most common way to access Bedrock is the Boto 3 library. That is the official AWS SDK for Python (the language of choice for data scientists, among others).

The Boto 3 library is pretty simple to use.

A very simple example of using the Claude model is:

import json

import boto3

bedrock_runtime = boto3.client(service_name='bedrock-runtime', region_name=aws_region)

body = json.dumps({
    "anthropic_version": anthropic_version,
    "max_tokens": max_return_tokens,
    "system": system_message,
    "messages": [{
        "role": "user",
        "content": user_instructions,
    }],
    "temperature": temperature,
})

response = bedrock_runtime.invoke_model(
    body=body,
    accept='application/json',
    modelId=model_id,
    trace='ENABLED',
)

response_body = json.loads(response.get('body').read())

There are a few variables you need to fill in, but this is pretty much the minimum you need. This is obviously a very simplistic example with no error handling, and it assumes you have IAM permissions set up. There are essentially 3 steps in the example above:

  1. Load the client
  2. Build the request
  3. Send the request to Bedrock (the only external call)

The Bedrock API makes working with different models easy. They have a common way of submitting requests, and all the parameters are included in the request, so they are easy to change.

The only issue is that many models have additional parameters and prompt formats, so it is not quite as easy to plug-and-play different models.
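As an illustration, the request bodies for Claude and Titan Text look quite different. The parameter names below follow the formats documented for these model families, but double-check them for the specific model version you use:

```python
import json


def claude_body(prompt: str, max_tokens: int = 512) -> str:
    """Anthropic Claude on Bedrock uses the messages format."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })


def titan_body(prompt: str, max_tokens: int = 512) -> str:
    """Amazon Titan Text uses a flat inputText plus a config object instead."""
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {"maxTokenCount": max_tokens, "temperature": 0.5},
    })
```

Swapping models therefore usually means swapping the body-building code, even though the `invoke_model` call itself stays the same.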

Choice of models

There are several models available from the big players. Different models are available in different regions. I have been told informally that older model versions will not be launched in newer regions; so Claude 2 is only available in the original launch regions, whereas Claude 3 is being made available in all new Bedrock launch regions. If you have specific model requirements, check the Bedrock documentation for availability.

  • Amazon Titan - AWS's own models. One of the significant features is that Amazon indemnifies you against copyright issues in the model training. If this is a specific concern I would highly recommend reading the terms.
  • Anthropic Claude - Anthropic was founded by some of the early break-aways from OpenAI. They have a strong relationship with AWS and are probably the 2nd best known GenAI company.
  • Stable Diffusion XL - Focussed on text to image; it can create high-resolution photo-realistic images
  • Facebook/Meta Llama - Meta have a commitment to open model research and producing smaller highly performant models
  • Cohere Embed - Focussed on the embedding space
  • Mistral AI - Particularly good for chatbots and multi-language
  • AI21 Jamba/Jurassic - Text generation, some with multi-language support.

All the different models available have different capabilities. This includes the particular interactions they are suited to, languages supported, price and performance. The other feature to look out for is the number of tokens the model supports. If you want to have a large input prompt with lots of context, then you will need a model that supports more tokens.
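You can also check which models are available in a region programmatically. This sketch assumes AWS credentials and the us-east-1 region; note that model listing uses the control-plane `bedrock` client, not `bedrock-runtime`:

```python
def models_by_provider(summaries: list[dict]) -> dict[str, list[str]]:
    """Group model IDs by provider from a list_foundation_models response."""
    grouped: dict[str, list[str]] = {}
    for summary in summaries:
        grouped.setdefault(summary["providerName"], []).append(summary["modelId"])
    return grouped


if __name__ == "__main__":
    import boto3  # deferred so the helper above is usable without AWS set up

    bedrock = boto3.client("bedrock", region_name="us-east-1")
    response = bedrock.list_foundation_models(byOutputModality="TEXT")
    for provider, model_ids in models_by_provider(response["modelSummaries"]).items():
        print(provider, model_ids)
```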

Bedrock features

There are a few key features of Bedrock that are worth calling out:

  1. The Playground is the key UI in the AWS console. This and the API are how you interact with Bedrock
  2. The ability to create custom models by fine-tuning existing foundation models. Sometimes you can't achieve exactly what you need using prompts alone (like custom vocabulary); you then need to adapt the model. Fine-tuning is the process of training an existing model using additional data. This includes creating and hosting custom models.
  3. Model evaluation - Bedrock has two different model evaluation mechanisms. One uses automated model evaluation to assess accuracy of answers and some other features. The other is human evaluation, with a mechanism to collect feedback.
  4. Guardrails - Consistent filters on both input and output to safeguard applications. This includes off-the-shelf guardrails like profanity filters as well as the ability to create your own custom guardrails
  5. Knowledge bases - Create a simple searchable knowledge base. Ingest documents from s3, Sharepoint or a web crawler and index them using embeddings. This is basically a vector index for an existing data store. This can then be consumed by agents (see below) or your own application
  6. Agents - Agents are essentially mini applications. They can be used to chain model invocations together to produce more complex results or manage context like chat history.
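To show how a knowledge base might be consumed from your own application, here is a sketch using the `bedrock-agent-runtime` retrieve call. The knowledge base ID and query are hypothetical placeholders:

```python
def retrieved_passages(response: dict) -> list[str]:
    """Pull the plain-text passages out of a Knowledge Base retrieve response."""
    return [result["content"]["text"] for result in response.get("retrievalResults", [])]


if __name__ == "__main__":
    import boto3  # deferred so the helper above is usable without AWS set up

    agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
    response = agent_runtime.retrieve(
        knowledgeBaseId="MYKNOWLEDGEBASE",  # hypothetical ID - use your own
        retrievalQuery={"text": "What is our refund policy?"},
    )
    for passage in retrieved_passages(response):
        print(passage)
```

The retrieved passages can then be pasted into a text model's prompt, which is the essence of RAG.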

These are all on top of the 'core' Bedrock inference functionality.

Pricing

Predicting the pricing of Bedrock is quite difficult. The pricing is transparent and easy to understand, but it is based on the number of tokens sent to models and the number of tokens generated in response. It is hard enough to predict exactly how many tokens you send to a model, but predicting the response size is impossible in many cases. You can limit response sizes but that will normally give you a maximum rather than an actual number.

In reality, there are many AWS services where you have to do some cost modelling to get an idea of the cost, but by far the most accurate way is to predict a range and then reassess based on some actual usage.
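A range estimate can be as simple as the arithmetic below. The per-1k-token prices are made-up illustrative numbers - take real ones from the Bedrock pricing page for your chosen model and region:

```python
def estimate_cost_range(
    input_tokens: int,
    max_output_tokens: int,
    price_in_per_1k: float,
    price_out_per_1k: float,
) -> tuple[float, float]:
    """Lower bound assumes an empty response; upper bound assumes the token cap is hit."""
    input_cost = input_tokens / 1000 * price_in_per_1k
    return input_cost, input_cost + max_output_tokens / 1000 * price_out_per_1k


# Hypothetical prices for a request with a 2,000-token prompt and a 1,000-token cap.
low, high = estimate_cost_range(2000, 1000, price_in_per_1k=0.003, price_out_per_1k=0.015)
print(f"${low:.4f} to ${high:.4f} per request")
```

Multiply by your expected request volume for a monthly range, then tighten it once you have real token counts.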

Different models can have very different pricing. This is probably a combination of both the licensing (many of the models are 3rd party) and the vastly different compute requirements based on model complexity. It is well worth evaluating multiple models if several meet your requirements.

There is also the option to buy provisioned throughput. This can reduce costs. The saving seems to vary between models. Also provisioned throughput is not available for all models. I suspect this may reflect the element of compute involved and the different commercial relationships that AWS have brokered with the model producers (that's just my opinion).

Bedrock Availability

Bedrock is still quite a new service and availability varies by region. Some features have not been launched in all regions, and different models have different availability. For the latest info check out the official AWS docs.

Feature availability:

https://docs.aws.amazon.com/bedrock/latest/userguide/bedrock-regions.html

Model availability:

https://docs.aws.amazon.com/bedrock/latest/userguide/models-regions.html

Learnings

I have learnt a lot about GenAI while using Bedrock. I want to split my learnings into two parts.

Bedrock

First Bedrock itself. Based on what I have done so far, Bedrock is exactly what I need it to be. It works well. It is intuitive to use and works with common AWS functionality like IAM roles. It's easy to get started and it hides away all the resource management etc. It can be accessed using private endpoints. It also guarantees that my data stays private. All in all, the only particularly exciting thing about Bedrock is the possibilities that it creates - I think that is the ultimate compliment.

GenAI

GenAI is both easy and difficult. It is easy to start and create value, but optimising a solution can be much more complex. The main things are prompt engineering and using multiple inferences to complete an operation. Prompt engineering is about structuring requests so the model interprets them the way you intend.

There are a few general techniques that apply to most models like using a system prompt to define a role, breaking tasks up into subtasks, and giving examples. You need to be specific about what you expect. If you want a greeting omitted or the output to only be code then you have to specify that. If there are multiple steps then you need to specify that.
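A minimal sketch of those techniques applied to a summarisation prompt - the tag names and wording are my own, not an official template:

```python
def build_prompt(document: str, examples: list[tuple[str, str]]) -> tuple[str, str]:
    """Return a (system, user) message pair applying the techniques above."""
    # 1. Use a system prompt to define a role.
    system = "You are a technical editor who writes concise, factual summaries."
    # 2. Give examples, marked up so the model can tell them apart from the task.
    example_text = "\n".join(
        f"<example>\nInput: {src}\nSummary: {out}\n</example>" for src, out in examples
    )
    # 3. Be explicit about what you expect - including what to omit.
    user = (
        "Summarise the document in three bullet points. "
        "Return only the bullet points - no greeting or preamble.\n"
        f"{example_text}\n<document>\n{document}\n</document>"
    )
    return system, user
```

The returned pair maps directly onto the `system` and `messages` fields of the Claude request body shown earlier.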

There is a lot of knowledge you can gain about prompt engineering, but also there needs to be a lot of testing and evaluation.

Conclusion

I have really enjoyed building products with Bedrock. It has been quite easy to get started and build.

Prompt engineering is a very complex topic and there is a lot to learn. The model specific docs are the best place to start. A lot of models have documents with an explanation of how to structure prompts and often prompt libraries with concrete examples.

Also it is often necessary to use multiple model invocations to achieve the best result. Each prompt is specifically designed for a separate purpose like understanding the user input, creating the output or checking for accuracy and hallucinations. I am also lucky to have access to a team of amazing data scientists here at PA Consulting who have amassed a tonne of experience with numerous foundation models already.

Bedrock is a great product and I look forward to using it for more real world applications.


Claude specific links (as an example):

https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview

https://docs.anthropic.com/en/prompt-library/library
