DeepSeek on AWS Bedrock
There is a lot of talk right now about DeepSeek. I am a bit scared about running any sort of model where I don't know where my data is going (this includes free public models like OpenAI's). So I want to experiment with DeepSeek in the safest way possible. For me that is using a Llama-compatible version in AWS Bedrock.
Why is DeepSeek so exciting?
DeepSeek has created market turmoil in the tech sector and a slew of headlines, but what is different about this large language model? There are a few things:
To be able to comment intelligently on DeepSeek, though, I want to test it. First let's get the model set up...
Setup
The best way to set up DeepSeek is using a Jupyter notebook in SageMaker. This is far more efficient than downloading the model locally and then uploading it to S3.
The total model size is about 17GB. Most of this is the model weights (split across 2 files).
Before you start you need a working SageMaker Studio notebook that can download files and upload to S3. Because of the size of the model, make sure you have enough space: the default 5GB file system will not cut it. I set mine to 30GB (probably bigger than I needed).
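As a quick sanity check before downloading anything, you can see how much space the notebook actually has. This is just a standard shell command run from a notebook cell, assuming your working directory sits on the volume you resized:

# Show free space on the file system backing the current directory
!df -h .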
You will also need an S3 bucket you can upload to.
Step 1 - install dependencies (you may not need this if using SageMaker, but it does not hurt)
!pip install huggingface_hub boto3
Step 2 - Download the model from Hugging Face
from huggingface_hub import snapshot_download

# Download all of the model files (weights, tokenizer, config) from Hugging Face
model_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
local_dir = snapshot_download(repo_id=model_id, local_dir="DeepSeek-R1-Distill-Llama-8B")
Step 3 - Upload to S3
Set the name of your S3 bucket in bucket_name below.
import boto3
import os

s3_client = boto3.client('s3', region_name='us-east-1')

local_directory = 'DeepSeek-R1-Distill-Llama-8B'
bucket_name = '...'

# Walk the local model directory and mirror it into the bucket,
# keeping the directory name as the S3 key prefix
for root, dirs, files in os.walk(local_directory):
    for file in files:
        local_path = os.path.join(root, file)
        s3_key = local_path
        s3_client.upload_file(local_path, bucket_name, s3_key)
Step 4 - create a custom model import job
Note: You can't import a model from the root of an S3 bucket. The job will fail after about 6 minutes.
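I created the import job through the console, but if you would rather script it, a minimal sketch with boto3 looks something like the following. The job name, model name, account ID and role ARN are all placeholders: you need an IAM role that Bedrock can assume with read access to your bucket.

import boto3

bedrock = boto3.client('bedrock', region_name='us-east-1')

# Placeholder names and ARNs - substitute your own values.
# Note the S3 URI points at the model's prefix, not the bucket root.
response = bedrock.create_model_import_job(
    jobName='deepseek-8b-import-job',
    importedModelName='deepseek-r1-distill-llama-8b',
    roleArn='arn:aws:iam::123456789012:role/BedrockModelImportRole',
    modelDataSource={
        's3DataSource': {
            's3Uri': 's3://your-bucket/DeepSeek-R1-Distill-Llama-8B/'
        }
    }
)
print(response['jobArn'])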
Step 5 - experiment via playground
In the Imported Models section make sure you are on the Models tab. Click on the name of your model and then click Open in playground.
Note: If you are too quick you will get a red banner about the model not being available yet. Your tea break was not quite long enough. Wait a little while and try again.
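If you would rather test from code than the playground, invoking an imported model is a normal bedrock-runtime call against the model's ARN. This is only a sketch: the ARN is a placeholder, and the request body fields are my assumption based on the Llama-style request schema.

import boto3
import json

runtime = boto3.client('bedrock-runtime', region_name='us-east-1')

# Copy the ARN from the Imported models page in the console (placeholder below)
model_arn = 'arn:aws:bedrock:us-east-1:123456789012:imported-model/abcd1234'

body = json.dumps({
    'prompt': 'Explain the difference between a list and a tuple in Python.',
    'max_tokens': 512,
    'temperature': 0.6,
})

response = runtime.invoke_model(
    modelId=model_arn,
    body=body,
    accept='application/json',
    contentType='application/json',
)
print(json.loads(response['body'].read()))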
One final point - this uses the smaller of the two Llama models available. If you change all 8B references to 70B you will get the larger, more capable model. The only caveat is that it is around 150GB. You will need to increase the maximum space allowed in the SageMaker domain storage settings and then kick off a space with around 160GB of storage for the download to work.
Evaluation I
First, it is worth saying there are two Llama-based versions to try. I have chosen the smaller 8B version. I am going to do further evaluation with the 70B version later.
I have tried DeepSeek with a few different types of query. I have so far been impressed by the answers. Here are some of the things I have been trying:
In my initial evaluation I deliberately used very naïve prompts. The goal was to use something that was easy to demonstrate. I have also tried some of the prompt styles in the DeepSeek documentation on GitHub and experimented with some code generation.
Evaluation II
Based on my experience I decided I really had to try the larger model, as there were a few items where the 8B version had not performed well. I tried again with the 70B version.
I have also experimented with a few other tasks like summarising and code generation. It seems to do very well at reasoning tasks, but less well at 'tell me about X' tasks where it may lack context.
Hopefully over the next few days I will be able to experiment more and do some direct comparisons.
My thoughts on DeepSeek
I have tried both of the Llama distilled versions of DeepSeek. So far I have been quite impressed, although there are a few glitches. These distilled versions are significantly smaller than the actual R1 model, and for their size they are very impressive.
The Llama 8B distilled version is pretty portable at that size!
There are a couple of interesting things they have done that have made this model as performant as it is:
One of the main things is how much they have achieved with a much smaller budget and fewer resources, and in only a couple of months. I think this may have two long-term outcomes. The first is that they have shown it is possible for smaller, more nimble players to enter the market. The second is that LLM development and inference will become more efficient.
As all of this is open source (both the model and the training approach), other model providers will be incorporating these techniques into their own model training. I would be surprised if there were not improvements in other models in the next few weeks.
My thoughts on Bedrock model import
Model import is a new Bedrock feature. It is only available in a limited number of regions and supports a few model families (Flan T5, Llama and Mixtral). It is really easy to upload a model from Hugging Face or a model you have customised locally, and you can start using compatible models with the benefit of Bedrock's managed infrastructure. It's a great addition to Bedrock and has made this evaluation of DeepSeek very easy.
One issue that has caused a few problems is the model not being available. If a model is not used for a while it appears to be evicted from the cache and has to be retrieved from colder storage. You can catch the exception and wait a while, but that is no use for a service that requires an instant response (like a chatbot). This may change as the feature becomes more widely available.
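For non-interactive use, a simple retry loop is enough of a workaround. This sketch assumes the cold start surfaces as a ModelNotReadyException from the bedrock-runtime client; tune the attempt count and delay to taste.

import time
import json
import boto3

runtime = boto3.client('bedrock-runtime', region_name='us-east-1')

def invoke_with_retry(model_arn, body, attempts=6, delay=30):
    # Retry while the imported model is being restored from colder storage
    for _ in range(attempts):
        try:
            response = runtime.invoke_model(
                modelId=model_arn,
                body=body,
                accept='application/json',
                contentType='application/json',
            )
            return json.loads(response['body'].read())
        except runtime.exceptions.ModelNotReadyException:
            time.sleep(delay)
    raise TimeoutError('Model did not become available in time')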
If you are using custom imported models for any interactive applications you really need to use provisioned throughput. Unfortunately this does not appear to be an option for imported models yet. As imported models are a new feature, I expect it to improve over the course of the year. Some of the features that I hope to see are:
I am looking forward to other projects where I can use it!
Credits
I have based my work on this AWS community blog, although I have had to make a couple of tweaks to get it to work.
Also a big thank you to my colleague Tom Carmichael for the idea!