Scaling Kubernetes Pods Automatically with the Vertical Pod Autoscaler
Vertical Pod Autoscaler (VPA) is a Kubernetes component that automatically adjusts the CPU and memory requests and limits of pods based on observed usage. By sizing requests to match what pods actually consume, VPA improves resource utilization in your cluster. In this tutorial, we'll cover how to use VPA with a sample application.
Prerequisites
There are a few requirements to use VPA on your Kubernetes cluster:
Kubernetes Version
Your Kubernetes cluster must be running version 1.8 or higher, since VPA relies on pod resource metrics APIs that are not available in earlier releases.
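You can confirm what your cluster is running with kubectl (the exact output format varies by kubectl release):

```shell
# Print the client and server (cluster) versions to confirm the minimum is met.
kubectl version
```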
Install VPA
You'll need to install the Vertical Pod Autoscaler into your cluster. The official installation instructions live in the kubernetes/autoscaler repository on GitHub.
To install VPA you'll need to:
- Clone the kubernetes/autoscaler repository.
- Run the vpa-up.sh installation script from its vertical-pod-autoscaler directory.
We recommend reading the VPA installation guide thoroughly before deploying.
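As a sketch, the standard installation looks like this; check the guide in the repository for the steps matching your VPA version:

```shell
# Clone the autoscaler repository and install VPA with the provided script.
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh
```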
Metrics Server
The metrics-server monitoring component must be deployed on your cluster as well. This is because VPA relies on resource metrics (CPU/memory usage) from each pod to determine how to adjust resource requests/limits.
Without metrics-server, VPA has no way to fetch current pod resource consumption. Follow the metrics-server installation guide to deploy it.
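To check that metrics-server is deployed and serving metrics (assuming it runs in the default kube-system namespace):

```shell
# Confirm the metrics-server deployment exists and is ready.
kubectl get deployment metrics-server -n kube-system

# If metrics are flowing, this prints current CPU/memory usage per pod.
kubectl top pods --all-namespaces
```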
Sample App
You'll need a sample application running on the cluster for VPA to manage. This can be any application - we use nginx in the tutorial for simplicity.
The key point is that the application should be deployed with CPU/memory requests specified. Otherwise, VPA won't have a baseline to compare against when adjusting resources.
In short, VPA needs:
- A supported Kubernetes version with the VPA components installed.
- metrics-server running so usage data is available.
- A target workload with initial CPU/memory requests set.
With these prerequisites met, you'll be ready to have VPA automatically manage pod resources!
Deploy the Sample Application
We need a simple application deployed to demonstrate how VPA works. For this tutorial, we'll deploy an nginx pod and service.
Create a file called nginx-deployment.yaml:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "256Mi"
            cpu: "500m"
          limits:
            memory: "512Mi"
            cpu: "1000m"
```
This configures a simple nginx deployment with 1 replica. Notice it defines CPU and memory requests of 500 millicores and 256Mi.
We also configured resource limits of 1 CPU core and 512Mi memory. Resource limits are optional, but recommended to prevent any one pod from using too many cluster resources.
Next, create the nginx service:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
```
This is a standard service to expose the nginx deployment.
Deploy both of these to your cluster:
```shell
kubectl apply -f nginx-deployment.yaml
kubectl apply -f nginx-service.yaml
```
You should now have a running nginx application with initial CPU and memory requests of 500 millicores and 256Mi.
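You can confirm everything is up before moving on (the label selector matches the app: nginx label from the manifest):

```shell
# List the nginx pods and confirm they are Running.
kubectl get pods -l app=nginx

# Confirm the service was created.
kubectl get service nginx-service
```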
We can now create a VPA that will adjust these initial requests based on actual usage.
Create a VPA Resource
Now that nginx is running, we can create a VerticalPodAutoscaler resource to manage its resources.
Create a file called nginx-vpa.yaml:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: nginx-deployment
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: "*"
      minAllowed:
        cpu: 50m
        memory: 50Mi
      maxAllowed:
        cpu: 1
        memory: 500Mi
```
This configures a VPA with the following key points:
- The targetRef points the VPA at the nginx-deployment Deployment.
- updateMode: "Auto" lets VPA apply its recommendations by evicting pods and recreating them with updated requests.
- The resourcePolicy sets a minimum CPU request of 50 millicores and minimum memory of 50Mi. This prevents VPA from setting requests too low.
- It also sets maximums of 1 CPU core and 500Mi memory. This prevents VPA from setting requests too high.
Applying bounds prevents bad configurations from being set automatically.
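If you'd rather review recommendations before letting VPA restart pods, you can set the update mode to "Off"; VPA then only publishes recommendations in its status without evicting anything. A minimal sketch (the name nginx-vpa-recommend is illustrative):

```yaml
# Same target, but recommendation-only: no pods are evicted.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa-recommend
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: nginx-deployment
  updatePolicy:
    updateMode: "Off"
```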
Deploy the VPA:
```shell
kubectl apply -f nginx-vpa.yaml
```
The VPA will now start monitoring resource usage of the nginx pods, and scale the CPU/memory requests between the min and max bounds based on the collected usage metrics.
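VPA needs a few minutes of usage data before it adjusts anything. To give it something to measure, you can generate light HTTP load against the service; the pod name load-generator and the busybox loop below are an illustrative sketch, not part of VPA itself:

```shell
# Run a throwaway pod that repeatedly fetches the nginx service.
kubectl run load-generator --image=busybox --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://nginx-service; sleep 0.1; done"
```

Delete it with kubectl delete pod load-generator when you're done.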
Verify VPA Changes Resource Requests
Once the VPA is running, it will start automatically adjusting the CPU and memory requests of the nginx pods based on monitored usage.
We can verify the changes by inspecting the nginx pods:
```shell
kubectl get pods

# Get nginx pod name
kubectl describe pod nginx-deployment-XXXXX
```
In the pod description, you should see the Requests section with new CPU and memory values:
```
Requests:
  cpu:    300m
  memory: 200Mi
```
In this example, VPA decreased the CPU request from 500 millicores to 300 millicores. The memory request went from 256Mi down to 200Mi.
These new requests are within the min/max bounds we configured in the VPA resourcePolicy. They reflect the actual usage that VPA detected.
The initial requests we specified acted as a starting point, which VPA then optimized based on real metrics. This ensures pods have just enough resources to run smoothly, without overprovisioning.
To further verify, you can view the VPA objects themselves:
```shell
kubectl get vpa

# Get recommendations
kubectl describe vpa nginx-vpa
```
This shows the recommendations made by VPA and the actual resources set on the target pods.
Over time, VPA will continue monitoring usage and adjusting requests to balance resource utilization in your cluster.
Cleanup
Once you are done testing out VPA, you can delete the resources created in this tutorial:
Delete the VPA
Delete the VPA object itself:
```shell
kubectl delete vpa nginx-vpa
```
This will stop VPA from managing the nginx deployment.
Delete the Sample App
Delete the nginx deployment and service:
```shell
kubectl delete deployment nginx-deployment
kubectl delete service nginx-service
```
This removes the sample application from your cluster.
Remove VPA Installation
If you no longer need VPA, you can optionally uninstall it from your cluster:
The official way to uninstall is the vpa-down.sh script, run from the same kubernetes/autoscaler checkout you used during installation:

```shell
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-down.sh
```

This removes the VPA components (recommender, updater, admission controller) along with their CRDs and RBAC resources.
Follow the official VPA uninstall guide for more details.
This completes the cleanup of the VPA tutorial. You removed the sample application resources, deleted the VPA object, and optionally uninstalled VPA from the cluster.
Common use cases where Vertical Pod Autoscaler is helpful
Unpredictable Workloads
If you have workloads with variable traffic patterns, VPA can automatically scale up resources when demand spikes occur. This prevents performance issues without over-provisioning.
Multiple Environments
When deploying the same app to dev, test, and prod, VPA ensures each environment sizes resources appropriately based on actual usage in that environment.
Batch Jobs
For batch jobs that run periodically, VPA can scale resource requests high during the job, then down again after. So you use more resources while the job runs, without static over-provisioning.
Auto-Scaling
VPA pairs nicely with the Horizontal Pod Autoscaler (HPA) for fully automated scaling: as your pods scale out, VPA makes sure each new pod is right-sized. One caveat: avoid having VPA and HPA both act on the same CPU or memory metrics for the same workload, since they will fight over the same signal; in that case, drive HPA from custom or external metrics instead.
Skeletal Pods
You can deploy "skeleton" pods with very low resource requests, relying on VPA to size them correctly as usage ramps up.
Resource Optimization
Since VPA only allocates resources needed based on metrics, you can improve cluster utilization and reduce costs.
New Applications
When deploying a new app, VPA removes the guesswork around estimating resource needs. Just deploy and let VPA find the optimal requests.
VPA shines for workloads that are unpredictable, rapidly changing, or new. It reduces the need for manual resource tuning.
Conclusion
In this tutorial, we covered the basics of using Vertical Pod Autoscaler to automatically adjust resource requests and limits for pods.
The key points are:
- VPA requires metrics-server and a target workload with initial resource requests defined.
- A VerticalPodAutoscaler object targets a workload and adjusts its requests based on observed usage.
- minAllowed and maxAllowed bounds keep VPA's adjustments within safe limits.
With VPA, you don't need to manually tune pod resource requests as workloads change. VPA adapts requests dynamically based on real metrics.
This can improve cluster resource utilization, application performance, and automation. Resources are sized just-right for your workloads.
Of course, VPA is not a "set it and forget it" solution. You should still monitor its behavior and tweak bounds/policies as needed. But overall VPA can simplify resource management for your Kubernetes clusters.