What you should know about autoscaling in Kubernetes
What is autoscaling?
Autoscaling is one of the reasons I fell in love with Kubernetes. Kubernetes is built to scale workloads with ease. But what is autoscaling?
Imagine having a party in your house and having all your family and friends attend. If you are from Nigeria, that number can easily get to 500. So you need approximately 500 chairs, 500 cutlery sets, glasses, 50 tables, and almost everything else. So you go to the market and buy all these. On the “D” day, as we call it, you find that only 200 people came, so you have wasted money on the things you bought for the other 300. Or imagine that 1000 people show up instead; then you don’t have enough to take care of that number.
A few days to the “D” day, you find that there is an event vendor close by from whom you can rent all these things. He asks you to pay for only the things you use. So you can take 200, and if your visitors exceed 200, you can rent more to make up for it. If the number is less, then you can return the ones you do not need, and you won’t need to pay for them. So you have saved yourself the disappointment of not having enough resources to cater to your guests, and the cost of buying or renting more than you require. This example illustrates the basic concept of scaling.
Scaling is the ability of a system to increase or decrease in size. Autoscaling is the ability of a system to scale based on resource demand or pressure without manual intervention.
Cloud computing makes scaling easy and cost-effective. Before cloud computing, if you wanted to scale your resources, you had to buy physical memory, processors, or whole machines. Then you had to worry about space and maintenance. With cloud computing, you can scale your resources in seconds. With Kubernetes, it’s like magic.
We will look at three types of autoscaling:
Vertical Pod Autoscaler (VPA)
In vertical autoscaling, we either increase or decrease the amount of CPU or memory assigned to a pod automatically.
A vertical autoscaler monitors the resource usage of a pod and recommends or updates the requests and limits of its resources based on the configuration passed. For example, if a pod is requesting 200m CPU and 500Mi memory and is using up to 350m CPU and 400Mi memory, the VPA can increase the requests to 500m CPU and 700Mi memory based on the configuration you pass.
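To make the "configuration you pass" concrete, here is a minimal sketch of a VerticalPodAutoscaler object. It assumes the VPA components are installed in the cluster (they are not part of a default Kubernetes install), and the names web-vpa and web, as well as the minimum and maximum values, are placeholders chosen for illustration.

```yaml
# Sketch only: "web-vpa" and the target Deployment "web" are hypothetical names.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa
spec:
  # The workload whose pods the VPA watches.
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  updatePolicy:
    # "Auto" lets the VPA apply its recommendations;
    # "Off" only records recommendations without changing pods.
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: "*"            # apply to every container in the pod
      controlledResources: ["cpu", "memory"]
      minAllowed:                   # illustrative lower bound for requests
        cpu: 100m
        memory: 128Mi
      maxAllowed:                   # illustrative upper bound for requests
        cpu: "1"
        memory: 1Gi
```

Starting with updateMode set to "Off" is a cautious way to see what the VPA would recommend before letting it change running pods.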
Horizontal Pod Autoscaler (HPA)
A horizontal pod autoscaler automatically increases or decreases the number of pods of a deployment based on defined configurations.
The HPA monitors metrics such as CPU, memory, and custom metrics, and scales the number of pods based on a predefined target. For example, if the average CPU utilization across the pods exceeds 50% of their requested CPU, the HPA may add more pods to distribute the load.
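The 50% CPU example can be sketched as a HorizontalPodAutoscaler manifest. This assumes a metrics source such as metrics-server is running and that the target containers declare CPU requests, since utilization is measured against requests. The names web-hpa and web and the replica bounds are placeholders.

```yaml
# Sketch only: "web-hpa" and the target Deployment "web" are hypothetical names.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  # The workload whose replica count the HPA adjusts.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2       # illustrative floor
  maxReplicas: 10      # illustrative ceiling
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50   # target: 50% of requested CPU, averaged over pods
```

The HPA picks the new replica count as roughly ceil(currentReplicas × currentMetric / targetMetric). With 4 pods averaging 75% CPU against a 50% target, that gives ceil(4 × 75 / 50) = 6 pods.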
Cluster Autoscaler
The cluster autoscaler automatically increases or reduces the number of nodes running in a cluster based on the resources requested by the pods that are running on, or waiting for, those nodes.
One way the cluster autoscaler is triggered is the presence of unschedulable pods. Unschedulable pods are pods that do not meet the necessary conditions for them to be placed in any node in the cluster. But what could make a pod unschedulable?
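There are several reasons why a pod could be unschedulable. A common one is that no node has enough unreserved CPU or memory to satisfy the pod's resource requests. Other things, such as node taints that the pod does not tolerate, could also make a pod unschedulable.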
The cluster autoscaler checks whether adding a new node would make the unschedulable pods schedulable, and if so, it creates one. Also, if the autoscaler notices an underutilized node whose pods can fit on other nodes, it moves those pods to the other nodes and then deletes the node it has freed.
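Because the scheduler places pods according to their resource requests, those requests are what the cluster autoscaler reasons about. Here is a minimal sketch of a Deployment that declares them. The name web, the image tag, and the request sizes are placeholders, not values from a real setup.

```yaml
# Sketch only: the name "web", the image, and the sizes are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.27
        resources:
          # The scheduler needs a node with this much unreserved capacity.
          # If no node has it, the pod stays Pending (unschedulable),
          # which is the signal the cluster autoscaler can react to.
          requests:
            cpu: 500m
            memory: 512Mi
```

If the replica count grows beyond what the existing nodes can hold, the extra pods stay Pending until a node is added. On the scale-down side, a pod annotated with cluster-autoscaler.kubernetes.io/safe-to-evict: "false" keeps the autoscaler from removing the node it runs on.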
What are the benefits of autoscaling?
What should you watch out for when setting up autoscaling?
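# Argo CD Application setting: ignore fields that an autoscaler rewrites,
# so a sync does not revert the adjusted requests and limits.
ignoreDifferences:
- group: apps   # Deployments belong to the "apps" API group, not the core ("") group
  kind: Deployment
  jsonPointers:
  - /spec/template/spec/containers/0/resources/requests
  - /spec/template/spec/containers/0/resources/limits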
Conclusion
Autoscaling in Kubernetes is a very powerful feature. It helps you manage your server resources efficiently. The different methods of autoscaling provide us with rich options for ensuring that our workloads are managed efficiently without manual intervention.
While autoscaling can be very useful for managing resources, it is important to implement it correctly. The autoscalers should be configured properly, following best practices and with an understanding of the potential issues.
Please leave your thoughts in the comment section. I want to hear from you.