Scaling Kubernetes Pods Automatically with the Vertical Pod Autoscaler
Vertical Pod Autoscaler (VPA) is a Kubernetes component that automatically adjusts the CPU and memory requests and limits of pods based on observed usage. By sizing requests to match what pods actually consume, VPA improves resource utilization in your cluster. In this tutorial, we'll cover how to use VPA with a sample application.
Prerequisites
There are a few requirements to use VPA on your Kubernetes cluster:
Kubernetes Version
Your Kubernetes cluster must be running version 1.8 or higher, since VPA relies on pod resource metrics APIs that are not available in earlier releases.
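You can confirm what your cluster is running with kubectl (the exact output format varies by kubectl release):

```shell
# Print the client and server (cluster) versions to confirm the minimum is met.
kubectl version
```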
Install VPA
You'll need to install the Vertical Pod Autoscaler into your cluster. The official installation instructions live in the kubernetes/autoscaler repository on GitHub.
To install VPA you'll need to:
- Clone the kubernetes/autoscaler repository.
- Run the vpa-up.sh installation script from its vertical-pod-autoscaler directory.
We recommend reading the VPA installation guide thoroughly before deploying.
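As a sketch, the standard installation looks like this; check the guide in the repository for the steps matching your VPA version:

```shell
# Clone the autoscaler repository and install VPA with the provided script.
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh
```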
Metrics Server
The metrics-server monitoring component must be deployed on your cluster as well. This is because VPA relies on resource metrics (CPU/memory usage) from each pod to determine how to adjust resource requests/limits.
Without metrics-server, VPA has no way to fetch current pod resource consumption. Follow the metrics-server installation guide to deploy it.
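To check that metrics-server is deployed and serving metrics (assuming it runs in the default kube-system namespace):

```shell
# Confirm the metrics-server deployment exists and is ready.
kubectl get deployment metrics-server -n kube-system

# If metrics are flowing, this prints current CPU/memory usage per pod.
kubectl top pods --all-namespaces
```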
Sample App
You'll need a sample application running on the cluster for VPA to manage. This can be any application - we use nginx in the tutorial for simplicity.
The key point is that the application should be deployed with CPU/memory requests specified. Otherwise, VPA won't have a baseline to compare against when adjusting resources.
In short, VPA needs:
- A supported Kubernetes version with the VPA components installed.
- metrics-server running so usage data is available.
- A target workload with initial CPU/memory requests set.
With these prerequisites met, you'll be ready to have VPA automatically manage pod resources!
Deploy the Sample Application
We need a simple application deployed to demonstrate how VPA works. For this tutorial, we'll deploy an nginx pod and service.
Create a file called nginx-deployment.yaml:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "256Mi"
            cpu: "500m"
          limits:
            memory: "512Mi"
            cpu: "1000m"
```
This configures a simple nginx deployment with 1 replica. Notice it defines CPU and memory requests of 500 millicores and 256Mi.
We also configured resource limits of 1 CPU core and 512Mi memory. Resource limits are optional, but recommended to prevent any one pod from using too many cluster resources.
Next, create the nginx service:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
```
This is a standard service to expose the nginx deployment.
Deploy both of these to your cluster:
```shell
kubectl apply -f nginx-deployment.yaml
kubectl apply -f nginx-service.yaml
```
You should now have a running nginx application with initial CPU and memory requests of 500 millicores and 256Mi.
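You can confirm everything is up before moving on (the label selector matches the app: nginx label from the manifest):

```shell
# List the nginx pods and confirm they are Running.
kubectl get pods -l app=nginx

# Confirm the service was created.
kubectl get service nginx-service
```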
We can now create a VPA that will adjust these initial requests based on actual usage.
Create a VPA Resource
Now that nginx is running, we can create a VerticalPodAutoscaler resource to manage its resources.
Create a file called nginx-vpa.yaml:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: nginx-deployment
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: "*"
      minAllowed:
        cpu: 50m
        memory: 50Mi
      maxAllowed:
        cpu: 1
        memory: 500Mi
```
This configures a VPA with the following key points:
- The targetRef points the VPA at the nginx-deployment Deployment.
- updateMode: "Auto" lets VPA apply its recommendations by evicting pods and recreating them with updated requests.
- The resourcePolicy sets a minimum CPU request of 50 millicores and minimum memory of 50Mi. This prevents VPA from setting requests too low.
- It also sets maximums of 1 CPU core and 500Mi memory. This prevents VPA from setting requests too high.
Applying bounds prevents bad configurations from being set automatically.
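If you'd rather review recommendations before letting VPA restart pods, you can set the update mode to "Off"; VPA then only publishes recommendations in its status without evicting anything. A minimal sketch (the name nginx-vpa-recommend is illustrative):

```yaml
# Same target, but recommendation-only: no pods are evicted.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa-recommend
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: nginx-deployment
  updatePolicy:
    updateMode: "Off"
```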
Deploy the VPA:
```shell
kubectl apply -f nginx-vpa.yaml
```
The VPA will now start monitoring resource usage of the nginx pods, and scale the CPU/memory requests between the min and max bounds based on the collected usage metrics.
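VPA needs a few minutes of usage data before it adjusts anything. To give it something to measure, you can generate light HTTP load against the service; the pod name load-generator and the busybox loop below are an illustrative sketch, not part of VPA itself:

```shell
# Run a throwaway pod that repeatedly fetches the nginx service.
kubectl run load-generator --image=busybox --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://nginx-service; sleep 0.1; done"
```

Delete it with kubectl delete pod load-generator when you're done.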
Verify VPA Changes Resource Requests
Once the VPA is running, it will start automatically adjusting the CPU and memory requests of the nginx pods based on monitored usage.
We can verify the changes by inspecting the nginx pods:
```shell
kubectl get pods

# Get nginx pod name
kubectl describe pod nginx-deployment-XXXXX
```
In the pod description, you should see the Requests section with new CPU and memory values:
```
Requests:
  cpu:    300m
  memory: 200Mi
```
In this example, VPA decreased the CPU request from 500 millicores to 300 millicores. The memory request went from 256Mi down to 200Mi.
These new requests are within the min/max bounds we configured in the VPA resourcePolicy. They reflect the actual usage that VPA detected.
The initial requests we specified acted as a starting point, which VPA then optimized based on real metrics. This ensures pods have just enough resources to run smoothly, without overprovisioning.
To further verify, you can view the VPA objects themselves:
```shell
kubectl get vpa

# Get recommendations
kubectl describe vpa nginx-vpa
```
This shows the recommendations made by VPA and the actual resources set on the target pods.
Over time, VPA will continue monitoring usage and adjusting requests to balance resource utilization in your cluster.
Cleanup
Once you are done testing out VPA, you can delete the resources created in this tutorial:
Delete the VPA
Delete the VPA object itself:
```shell
kubectl delete vpa nginx-vpa
```
This will stop VPA from managing the nginx deployment.
Delete the Sample App
Delete the nginx deployment and service:
```shell
kubectl delete deployment nginx-deployment
kubectl delete service nginx-service
```
This removes the sample application from your cluster.
Remove VPA Installation
If you no longer need VPA, you can optionally uninstall it from your cluster:
The official way to uninstall is the vpa-down.sh script, run from the same kubernetes/autoscaler checkout you used during installation:

```shell
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-down.sh
```

This removes the VPA components (recommender, updater, admission controller) along with their CRDs and RBAC resources.
Follow the official VPA uninstall guide for more details.
This completes the cleanup of the VPA tutorial. You removed the sample application resources, deleted the VPA object, and optionally uninstalled VPA from the cluster.
Common use cases where Vertical Pod Autoscaler is helpful
Unpredictable Workloads
If you have workloads with variable traffic patterns, VPA can automatically scale up resources when demand spikes occur. This prevents performance issues without over-provisioning.
Multiple Environments
When deploying the same app to dev, test, and prod, VPA ensures each environment sizes resources appropriately based on actual usage in that environment.
Batch Jobs
For batch jobs that run periodically, VPA can scale resource requests high during the job, then down again after. So you use more resources while the job runs, without static over-provisioning.
Auto-Scaling
VPA pairs nicely with the Horizontal Pod Autoscaler (HPA) for fully automated scaling: as your pods scale out, VPA makes sure each new pod is right-sized. One caveat: avoid having VPA and HPA both act on the same CPU or memory metrics for the same workload, since they will fight over the same signal; in that case, drive HPA from custom or external metrics instead.
Skeletal Pods
You can deploy "skeleton" pods with very low resource requests, relying on VPA to size them correctly as usage ramps up.
Resource Optimization
Since VPA only allocates resources needed based on metrics, you can improve cluster utilization and reduce costs.
New Applications
When deploying a new app, VPA removes the guesswork around estimating resource needs. Just deploy and let VPA find the optimal requests.
VPA shines for workloads that are unpredictable, rapidly changing, or new. It reduces the need for manual resource tuning.
Conclusion
In this tutorial, we covered the basics of using Vertical Pod Autoscaler to automatically adjust resource requests and limits for pods.
The key points are:
- VPA requires metrics-server and a target workload with initial resource requests defined.
- A VerticalPodAutoscaler object targets a workload and adjusts its requests based on observed usage.
- minAllowed and maxAllowed bounds keep VPA's adjustments within safe limits.
With VPA, you don't need to manually tune pod resource requests as workloads change. VPA adapts requests dynamically based on real metrics.
This can improve cluster resource utilization, application performance, and automation. Resources are sized just-right for your workloads.
Of course, VPA is not a "set it and forget it" solution. You should still monitor its behavior and tweak bounds/policies as needed. But overall VPA can simplify resource management for your Kubernetes clusters.