Still Struggling With Scaling?
The 'Horizontal Pod AutoScaler' has made scaling far better than before. As the name 'AutoScaler' suggests, scaling is taken care of at runtime as load / demand changes. Horizontal scaling means that the response to increased load is to deploy more Pods. Key features of this concept are as follows:
~ The Horizontal Pod AutoScaler is implemented as a Kubernetes API resource and a controller.
~ It works on workload resources that support scaling, such as Deployments and StatefulSets. Each of these resources exposes a sub-resource called 'scale', through which the autoscaler reads and sets the number of replicas.
~ Sometimes, while working with the Horizontal Pod AutoScaler, the number of replicas keeps fluctuating frequently because the metrics being evaluated are dynamic. This is known as 'thrashing' or 'flapping'.
~ Kubernetes manages the workload by placing containers into Pods to run on Nodes. A node may be a virtual or physical machine, depending on the cluster.
~ The official Kubernetes documentation discusses all of these concepts in detail. Feel free to consult this valuable source: Kubernetes Documentation
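To make the thrashing / flapping point above more concrete, here is a minimal Python sketch of the idea behind a scale-down stabilization window: instead of acting on every new recommendation, the autoscaler looks at the recommendations from a recent window and uses the highest one. The class name ScaleDownStabilizer and its method are hypothetical and written only for illustration; the 300-second default mirrors the default scale-down stabilization window in Kubernetes. This is a simplified model, not the controller's actual code.

```python
from collections import deque


class ScaleDownStabilizer:
    """Hypothetical helper: remembers recent replica recommendations and
    returns the highest one seen inside the stabilization window."""

    def __init__(self, window_seconds=300):
        # 300 s mirrors the Kubernetes default scale-down stabilization window.
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp_seconds, recommended_replicas)

    def stabilize(self, now, recommendation):
        # Record the newest recommendation.
        self._history.append((now, recommendation))
        # Forget recommendations that are older than the window.
        while self._history[0][0] < now - self.window_seconds:
            self._history.popleft()
        # Using the maximum delays scale-down but still allows immediate scale-up.
        return max(replicas for _, replicas in self._history)


stabilizer = ScaleDownStabilizer(window_seconds=300)
for second, recommended in [(0, 10), (60, 4), (120, 6), (301, 4), (500, 3)]:
    print(second, recommended, stabilizer.stabilize(second, recommended))
```

In this sketch a short dip in load (10, then 4, then 6) does not remove Pods right away; the replica count only comes down once the higher recommendations have aged out of the window.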
Q. What happens if the load decreases?
A. If the demand / load decreases at runtime, the Horizontal Pod AutoScaler instructs the workload resource to scale the number of Pods back down to the required number, but never below the configured minimum.
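The answer above can be sketched with the replica formula described in the Kubernetes documentation: desired replicas = ceil(current replicas * current metric value / target metric value), with the result kept between the configured minimum and maximum. The Python function below is only an illustrative model; the function name, parameter names, and the example min / max values are assumptions, and the 0.1 tolerance mirrors the documented default.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Illustrative model of the HPA replica calculation (names are assumptions)."""
    ratio = current_metric / target_metric
    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Never go below the minimum or above the maximum.
    return max(min_replicas, min(max_replicas, desired))


# Load doubles: 4 Pods at 200% of target.
print(desired_replicas(4, 200, 100))
# Load drops sharply: 8 Pods at 10% of target, minimum of 2 Pods.
print(desired_replicas(8, 10, 100, min_replicas=2))
```

When the load drops, the ratio falls below 1 and the computed replica count shrinks, but the clamp keeps it at or above the minimum.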