Kubernetes - Horizontal Pod Autoscaling (HPA)

Kubernetes - Horizontal Pod Autoscaling (HPA)


Horizontal Pod Autoscaling (HPA) automatically creates replicas in a deployment or replica set based on metrics like CPU utilization, memory usage etc based on the traffic at any given point of time and scales down when the traffic is less.


Components of HPA:

  1. Metrics Server: Needed for resource metrics (CPU/Memory) from pods. This data is used by HPA to make scaling decisions.
  2. HPA Resource: Defines the scaling policy (e.g., CPU utilization, Min/Max replicas).
  3. Pods with Resource Limits: Pods must specify CPU/memory requests and limits in their configuration for HPA to calculate.


Lets us now implement HPA by following the below steps:

Step 1: Kind installation - you can refer this article for installation - https://www.dhirubhai.net/pulse/kubernetes-install-kind-create-multi-node-cluster-kiran-kulkarni-2ya8c

Step 2: Install the metrics API.

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml        

Step 3: Edit the Deployment metrics-server

kubectl -n kube-system edit deployment metrics-server        

Add the security bypass to deployment under container.args

Note: This is only done for development or testing environment

- --kubelet-insecure-tls        

Step 4: Restart the metrics server

kubectl -n kube-system rollout restart deployment metrics-server        
kubectl get pods -n kube-system        

Note: You can check that metrics-server-744898d85-g7pxf pod is created

kubectl top nodes        

Step 5: Create the Apache deployment apache-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: apache-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: apache
  template:
    metadata:
      labels:
        app: apache
    spec:
      containers:
      - name: apache
        image: httpd:2.4
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
          limits:
            cpu: 200m        
kubectl apply -f apache-deployment.yml        

Step 9: Create the HPA apache-hpa.yml

#apache-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: apache-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: apache-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 2        

Step 10: Create the load generator and test

kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while true; do wget -q -O- https://apache-deployment.default.svc.cluster.local; done"        

Step 11: Watch the automatic scaling of pods - Open this in another terminal to see the imapact on the server

kubectl get hpa php-apache -w        

Step 12: Watch the automatic scaling down of pods

Note: Check it after 4-5 mins as it does not scales down immediately. This prevents Rapid scaling.



要查看或添加评论,请登录

Kiran Kulkarni的更多文章

社区洞察

其他会员也浏览了