Hands on with AWS Bedrock
Introduction
I have had the opportunity recently to work on my first couple of products with Bedrock. While I have been playing with it for a while (before general availability), I had not used it in a full system.
What is Bedrock
Amazon Bedrock is the AWS GenAI hub. It hosts a number of GenAI models as well as many other related GenAI features.
Types of GenAI
There are 3 basic 'modes' of GenAI models: text, chat and image.
The playground
When you first navigate to Bedrock in the AWS console you will see a splash page (assuming it's available in your chosen region). Before you do anything you have to enable individual models by agreeing to their specific terms and conditions. After that you can go to the playground.
There are 3 distinct playground versions: chat, text and image.
Each version of the playground is for testing specific model functionality. The playground is the best way to play with the models and start to design your prompts. As an example, the image playground can be used to generate an image. First select an appropriate model, then use the prompt to describe what type of image you want. You can then press run and images will be generated; by default 3 are produced. You can then select the version you want, and you can also refine the image. There are various additional parameters you can change that will affect the output. They are all explained with info links, but you can also just have a play. Image generation does take a little longer than text.
The playground is the best way to get an understanding of GenAI. It is also very useful for testing how different prompts work. Personally I think there is a natural progression: first test in the playground, then when you start to chain operations use SageMaker to create a notebook, and finally build a product.
Boto 3
Probably the most common way to access Bedrock will be using the Boto 3 library. That is the official AWS SDK for Python (the language of choice for data scientists among others).
The Boto 3 library is pretty simple to use.
A very simple example of using the Claude model is:
import json
import boto3

# Placeholder values - fill these in for your own account and use case
aws_region = 'us-east-1'
model_id = 'anthropic.claude-3-sonnet-20240229-v1:0'
anthropic_version = 'bedrock-2023-05-31'
max_return_tokens = 1024
temperature = 0.5
system_message = 'You are a helpful assistant.'
user_instructions = 'Summarise the benefits of serverless architecture.'

# Step 1: create the Bedrock runtime client
bedrock_runtime = boto3.client(service_name='bedrock-runtime', region_name=aws_region)

# Step 2: build the JSON request body in the model's expected format
body = json.dumps({
    "anthropic_version": anthropic_version,
    "max_tokens": max_return_tokens,
    "system": system_message,
    "messages": [{
        "role": "user",
        "content": user_instructions,
    }],
    "temperature": temperature,
})

# Step 3: invoke the model and parse the response
response = bedrock_runtime.invoke_model(
    body=body,
    accept='application/json',
    modelId=model_id,
    trace='ENABLED',
)
response_body = json.loads(response.get('body').read())
There are a few variables you need to fill in, but this is pretty much the minimum you need. This is obviously a very simplistic example with no error handling. It also assumes you have IAM permissions set up. There are essentially 3 steps in the example above: create the runtime client, build the JSON request body, and invoke the model and parse the response.
The Bedrock API makes working with different models easy. They have a common way of submitting requests. Also, all the parameters are included in the request, so they are easy to change.
The only issue is that many models have additional parameters and prompt formats, so it is not quite as easy to plug-and-play different models.
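To illustrate the point, here is a sketch of what the same call might look like for one of the Amazon Titan text models, reusing the variables from the earlier example. The model id and request schema here assume amazon.titan-text-express-v1; check the docs for whichever model you actually choose.

# A sketch of the same call for a Titan text model - note the different
# body format, and that there is no separate system prompt field
titan_body = json.dumps({
    "inputText": user_instructions,
    "textGenerationConfig": {
        "maxTokenCount": max_return_tokens,
        "temperature": temperature,
        "stopSequences": [],
    },
})
response = bedrock_runtime.invoke_model(
    body=titan_body,
    accept='application/json',
    modelId='amazon.titan-text-express-v1',
)
titan_response = json.loads(response.get('body').read())
# Titan returns its output under 'results' rather than Claude's 'content'
generated = titan_response['results'][0]['outputText']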
Choice of models
There are several models available from the big players. Different models are available in different regions. I have been told informally that older model versions will not be launched in newer regions. So Claude 2 is only available in the original launch regions, whereas Claude 3 is being made available in all new Bedrock launch regions. If you have specific model requirements, consult the Bedrock documentation to confirm availability.
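A quick way to see what is actually available to you is to ask the API. A minimal sketch (the region is just an example):

import boto3

# The 'bedrock' client covers control-plane operations like listing models;
# 'bedrock-runtime' (used earlier) is only for inference
bedrock = boto3.client(service_name='bedrock', region_name='us-east-1')

for model in bedrock.list_foundation_models()['modelSummaries']:
    print(model['providerName'], model['modelId'])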
The available models all have different capabilities. This includes the particular interactions they are suited to, the languages supported, price and performance. The other feature to look out for is the number of tokens the model supports: if you want a large input prompt with lots of context, then you will need a model that supports more tokens.
Bedrock features
There are a few key features of Bedrock that are worth calling out, such as knowledge bases for retrieval augmented generation, agents for orchestrating multi-step tasks, guardrails for filtering content, and fine-tuning for customising models. These are all on top of the 'core' Bedrock inference functionality.
Pricing
Predicting the pricing of Bedrock is quite difficult. The pricing is transparent and easy to understand, but it is based on the number of tokens sent to models and the number of tokens generated in response. It is hard enough to predict exactly how many tokens you send to a model, but predicting the response size is impossible in many cases. You can limit response sizes but that will normally give you a maximum rather than an actual number.
In reality, Bedrock joins many AWS services where you have to do some cost modelling to get an idea of the cost; by far the most accurate approach is to predict a range and then reassess based on actual usage.
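As a sketch of that kind of modelling, here is a back-of-the-envelope estimate. The per-token rates below are purely illustrative placeholders, not real Bedrock prices; substitute the current rates for your chosen model and region.

# Illustrative cost model - the rates are made-up placeholders, NOT real
# Bedrock prices; look up current pricing for your model and region
INPUT_RATE_PER_1K = 0.003    # hypothetical $ per 1,000 input tokens
OUTPUT_RATE_PER_1K = 0.015   # hypothetical $ per 1,000 output tokens

def estimate_monthly_cost(requests_per_month, avg_input_tokens,
                          min_output_tokens, max_output_tokens):
    """Return a (low, high) monthly cost range, since output size is a guess."""
    input_cost = requests_per_month * avg_input_tokens / 1000 * INPUT_RATE_PER_1K
    low = input_cost + requests_per_month * min_output_tokens / 1000 * OUTPUT_RATE_PER_1K
    high = input_cost + requests_per_month * max_output_tokens / 1000 * OUTPUT_RATE_PER_1K
    return low, high

low, high = estimate_monthly_cost(100_000, 1_500, 200, 1_000)
print(f"Estimated monthly cost: ${low:,.2f} to ${high:,.2f}")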
Different models can have very different pricing. This is probably a combination of both the licensing (many of the models are 3rd party) and the vastly different compute requirements based on model complexity. It is well worth evaluating multiple models if several meet your requirements.
There is also the option to buy provisioned throughput. This can reduce costs. The saving seems to vary between models. Also provisioned throughput is not available for all models. I suspect this may reflect the element of compute involved and the different commercial relationships that AWS have brokered with the model producers (that's just my opinion).
Bedrock Availability
Bedrock is still quite a new service and availability varies by region. Some features have not been launched in all regions, and different models have different availability. For the latest info check out the official AWS docs.
Feature availability:
Model availability:
Learnings
I have learnt a lot about GenAI while using Bedrock. I want to split my learnings into two parts: Bedrock itself and GenAI more generally.
Bedrock
First, Bedrock itself. Based on what I have done so far, Bedrock is exactly what I need it to be. It works well, it is intuitive to use, and it works with common AWS functionality like IAM roles. It's easy to get started and it hides away all the resource management. It can be accessed using private endpoints, and it guarantees that my data stays private. All in all, the only particularly exciting thing about Bedrock is the possibilities that it creates - I think that is the ultimate compliment.
GenAI
GenAI is both easy and difficult. It is easy to start and create value, but optimising a solution can be much more complex. The main things to learn are prompt engineering and using multiple inferences to complete an operation. Prompt engineering is about structuring requests so that the model reliably produces the output you need.
There are a few general techniques that apply to most models, like using a system prompt to define a role, breaking tasks up into subtasks, and giving examples. You need to be specific about what you expect: if you want a greeting omitted or the output to only be code, then you have to specify that. If there are multiple steps then you need to spell them out.
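To make that concrete, here is a sketch of a prompt applying those techniques. The wording is just an illustration (your own prompts will need testing), and the two variables slot straight into the earlier invoke_model example.

# Illustrative prompt: a role, explicit output constraints, subtasks
# and an example of the expected output - the wording is a sketch
system_message = (
    "You are a senior Python code reviewer. "
    "Respond with code only - no greeting and no explanation."
)
user_instructions = """Review the function below in two steps:
1. List any bugs as code comments.
2. Output a corrected version of the function.

Example of the expected output format:
# Bug: off-by-one error in range()
def fixed_function(): ...

Function to review:
def add_items(items):
    total = 0
    for i in range(len(items) - 1):
        total += items[i]
    return total
"""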
There is a lot of knowledge you can gain about prompt engineering, but also there needs to be a lot of testing and evaluation.
Conclusion
I have really enjoyed building products with Bedrock. It has been quite easy to get started and build.
Prompt engineering is a very complex topic and there is a lot to learn. The model-specific docs are the best place to start. A lot of models have documentation explaining how to structure prompts, and often prompt libraries with concrete examples.
Also, it is often necessary to use multiple model invocations to achieve the best result. Each prompt is specifically designed for a separate purpose, like understanding the user input, creating the output, or checking for accuracy and hallucinations. I am also lucky to have access to a team of amazing data scientists here at PA Consulting who have amassed a tonne of experience with numerous foundation models already.
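As a sketch of what that chaining can look like, here is a hypothetical two-step chain built on the earlier invoke_model example (the helper and prompts are my own illustration, not a Bedrock API):

# Hypothetical two-step chain: one invocation drafts an answer, a second
# invocation checks it - each prompt designed for one narrow purpose
def invoke_claude(system_message, user_instructions):
    body = json.dumps({
        "anthropic_version": anthropic_version,
        "max_tokens": max_return_tokens,
        "system": system_message,
        "messages": [{"role": "user", "content": user_instructions}],
        "temperature": temperature,
    })
    response = bedrock_runtime.invoke_model(
        body=body, accept='application/json', modelId=model_id)
    response_body = json.loads(response.get('body').read())
    return response_body['content'][0]['text']

user_question = "What is Amazon Bedrock?"

draft = invoke_claude(
    "You are a careful technical writer.",
    f"Answer the user's question: {user_question}")

review = invoke_claude(
    "You are a fact checker. Reply only with PASS, or a list of unsupported claims.",
    f"Question: {user_question}\n\nDraft answer to check:\n{draft}")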
Bedrock is a great product and I look forward to using it for more real world applications.
Claude specific links (as an example):