openEuler × DeepSeek 3: Containerized vLLM Deployment Guide

Welcome back to our openEuler × DeepSeek series!

In our previous blog, we explored deploying vLLM with DeepSeek on openEuler using GPUs and CPUs. While effective, that setup was relatively complex. Today, we're introducing a much simpler method that lets you deploy DeepSeek quickly. The entire process consists of three straightforward steps:

  1. Prepare the environment – Use a Kunpeng server or a server with NVIDIA GPUs.
  2. Pull the image & start the container – Download the image and start a container with a single command.
  3. Start DeepSeek – Access the container and initiate your AI inference journey.


System Requirements

Before deployment, ensure your hardware meets the necessary specifications; a quick way to inspect what your host provides is shown below.

CPU Inference Requirements:

GPU Inference Requirements:
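
As a quick sanity check, you can inspect the host from the shell before pulling any images (the nvidia-smi call applies only to servers with NVIDIA GPUs and the driver installed):

lscpu | grep -E 'Architecture|Model name|CPU\(s\)'
nvidia-smi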


Deploying vLLM × DeepSeek on CPUs

To deploy and run inference using Kunpeng CPUs, follow these steps:

  • Pull the container image:

docker pull hub.oepkgs.net/neocopilot/deepseek_vllm:openeEuler2203-lts-sp4_cpu        
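You can confirm the image was pulled successfully before moving on (the image ID and size shown will vary):

docker images hub.oepkgs.net/neocopilot/deepseek_vllm
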

  • Create and start a container:

docker run --name deepseek_kunpeng_cpu -it hub.oepkgs.net/neocopilot/deepseek_vllm:openeEuler2203-lts-sp4_cpu bash        
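If you intend to call the inference API from the host rather than from inside the container, a variant of the command above also publishes vLLM's default port (assuming the service stays on port 8000):

docker run --name deepseek_kunpeng_cpu -p 8000:8000 -it hub.oepkgs.net/neocopilot/deepseek_vllm:openeEuler2203-lts-sp4_cpu bash
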

  • Inside the container, launch the vLLM model service for DeepSeek from the command line:

vllm serve /home/deepseek/model/DeepSeek-R1-Distill-Qwen-7B/ --max_model_len 32768 &        

Explanation of Key Parameters:

  • /home/deepseek/model/DeepSeek-R1-Distill-Qwen-7B/ specifies the path to the preloaded model.
  • --max_model_len 32768 sets the maximum context length (prompt plus generated tokens); requests that exceed this limit are rejected by the server.

When the startup logs show that the API server is up and listening (port 8000 by default), your deployment is complete.
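
A quick way to confirm the server is up, assuming the default port 8000, is to query its health endpoint from inside the container:

curl http://localhost:8000/health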


Deploying vLLM × DeepSeek on GPUs

To run inference on an NVIDIA GPU, follow these steps:

  • Pull the container image:

docker pull hub.oepkgs.net/neocopilot/deepseek_vllm:openeEuler2203-lts-sp4_gpu        

  • Create and start a GPU-enabled container:

docker run --gpus all --name deepseek_kunpeng_gpu -it hub.oepkgs.net/neocopilot/deepseek_vllm:openeEuler2203-lts-sp4_gpu bash
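
Once inside the container, it is worth confirming that all GPUs are visible before launching the service; the number of GPUs listed is also the upper bound for --tensor-parallel-size:

nvidia-smi -L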

  • Launch the vLLM Model Service:

vllm serve /home/deepseek/model/DeepSeek-R1-Distill-Qwen-7B/ --tensor-parallel-size 8 --max_model_len 32768 &        

Explanation of Key Parameters:

  • --tensor-parallel-size 8 enables tensor parallelism across 8 GPUs. Adjust this based on your available hardware.
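
For example, on a server with only 4 GPUs you would launch the service like this (a variant of the command above, not an additional required step):

vllm serve /home/deepseek/model/DeepSeek-R1-Distill-Qwen-7B/ --tensor-parallel-size 4 --max_model_len 32768 &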


Testing Your Deployment

To make sure everything is running smoothly, try asking your model to tell you something about openEuler OS. Test your deployment with this simple curl command:

curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "/home/deepseek/model/DeepSeek-R1-Distill-Qwen-7B/",
    "messages": [{"role": "user", "content": "Tell me about openEuler OS."}]
  }'

Note: vLLM listens on port 8000 by default, and the "model" field must match the served model name, which defaults to the path passed to vllm serve (use --served-model-name to register a shorter alias such as deepseek-r1).

If everything is set up correctly, you'll receive a response with some cool insights about openEuler! :D
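
If you are unsure which model names the server has registered, you can list them first (assuming the default port 8000):

curl http://localhost:8000/v1/models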


What's Next?

With this streamlined approach, deploying vLLM × DeepSeek on openEuler with CPUs or GPUs has never been easier. By using containerized deployment, you can set up and run AI inference in just a few minutes, enabling efficient scaling across different hardware architectures.

Stay tuned for our next openEuler × DeepSeek blog as we continue to explore AI deployment optimizations on openEuler!

Got questions or feedback? Feel free to reach out to us via the openEuler Intelligent SIG. Let's continue to innovate and build the future of AI together!


Quick Links for More openEuler × DeepSeek Blogs

DeepSeek-R1 671B Distributed Training Achieved on openEuler 24.03

openEuler × DeepSeek 1: Quick Deployment of DeepSeek-R1 on openEuler 24.03 LTS

openEuler × DeepSeek 2: vLLM Deployment Guide (CPU + GPU)

