Prometheus Triggers for KEDA Kubernetes Autoscaling - KEDA 2/4
Vishwas N.
Sr.Solutions Engineer at Innatemetrics | Reinventing AI Acceleration for Enterprise(R2V2.ai) | Playing the Big Boy Sport(Startups)
In this post, you'll discover how to set up Keda to deploy a Kubernetes HPA that makes use of Prometheus metrics.
You can see my session on Prometheus at Azure Developer Community
You can also learn more from the articles by OpsTree Solutions and Buildpiper - By OpsTree
The Kubernetes Horizontal Pod Autoscaler has the ability to scale pods based on how much CPU and memory are being used. This is helpful in many situations, but in some, more sophisticated metrics are required, such as the number of connections waiting in a web server or the latency in an API. Additionally, in some circumstances, you might need to aggregate data or integrate several metrics in a formula.
HPA - Horizontal Pod Autoscaler in Kubernetes
Kubernetes HPA can scale objects by utilising measurements from one of the Kubernetes metrics API endpoints. Although Kubernetes HPA is incredibly beneficial, it has two significant drawbacks. The first is that combining metrics is not permitted. Combining various metrics can be useful in some circumstances, such as when figuring out connection utilisation by comparing the current number of established connections with the maximum number of connections.
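As a sketch of the kind of combined query that stock Kubernetes HPA cannot express on its own, connection utilisation could be computed in PromQL. The metric names `nginx_connections_active` and `nginx_connections_max` here are illustrative assumptions, not guaranteed to match your exporter:

```promql
# Ratio of established connections to the configured maximum,
# as a percentage, summed across all instances of the job.
100 * sum(nginx_connections_active{job="nginx"})
    / sum(nginx_connections_max{job="nginx"})
```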
The second restriction is the limited set of metrics that Kubernetes by default provides, which is limited to CPU and memory utilisation. Applications occasionally make more sophisticated metrics available, either directly or through exporters. You must publish them on the Kubernetes API metrics endpoint if you want to offer extra metrics.
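For comparison, here is a minimal sketch of a stock Kubernetes HPA limited to CPU utilisation. The deployment name `nginx-server` is borrowed from the example later in this post; the 80% target is an arbitrary illustration:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-cpu-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-server
  minReplicas: 1            # cannot go below 1 with plain HPA
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80   # scale out above 80% average CPU
```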
Integrating KEDA with Prometheus metrics and HPA
An open-source solution called Keda makes it easier to use Prometheus metrics for Kubernetes HPA.
So refer to my previous article for the installation and tips and tricks of using KEDA
How does Keda work?
When you define a ScaledObject Custom Resource Definition (CRD) object, the Keda Kubernetes operator creates both the metrics server and the HPA. This object lets you specify what you want to scale and how.
What Scaling of Infrastructure can be taken into consideration?
You can scale the typical Kubernetes workloads, such as Deployments or StatefulSets, with Keda. You can also scale other CRDs; in fact, Keda provides a second CRD, ScaledJob, for scaling Kubernetes Jobs.
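As a rough sketch of that second CRD, here is a minimal ScaledJob. The container image, command, query, and threshold are placeholders I've invented for illustration, not values from this post:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: worker-scaledjob
spec:
  jobTargetRef:
    template:
      spec:
        containers:
        - name: worker
          image: busybox                     # placeholder image
          command: ["sh", "-c", "echo processing"]
        restartPolicy: Never
  maxReplicaCount: 10                        # upper bound on parallel jobs
  triggers:
  - type: prometheus
    metadata:
      serverAddress: https://prometheus.monitoring:9090
      query: sum(pending_tasks)              # hypothetical metric
      threshold: "10"
```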
How to Scale accordingly?
The magic takes place here. Keda lets you define triggers, and there are many different kinds of them. The Prometheus trigger is the main topic of this article.
When configuring a Prometheus trigger for a ScaledObject, you specify a Prometheus endpoint and a Prometheus query. Keda uses that information to query your Prometheus server and publishes the result as a metric in the Kubernetes external metrics API. Once the ScaledObject is created, Keda automatically generates the associated Kubernetes HPA.
That's it. You don't even need to worry about creating the Kubernetes HPA object or publishing metrics to the Kubernetes metrics API endpoint!
Here is an example scenario you can consider
Suppose you want an HPA for a deployment of nginx servers. You want it to scale from 1 to 5 replicas depending on the nginx_connections_waiting metric from the Nginx exporter. If there are more than 500 waiting connections, a new pod should be scheduled.
Let's understand by creating a query to trigger the HPA:
sum(nginx_connections_waiting{job="nginx"})
Simple, right? This query simply returns the value of the nginx_connections_waiting metric for the nginx job.
More information is available on the Prometheus query language, PromQL. Check out the PromQL cheat sheet in the getting started guide!
For this example, let's define the ScaledObject:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-scale
  namespace: keda-hpa
spec:
  scaleTargetRef:
    kind: Deployment
    name: nginx-server
  minReplicaCount: 1
  maxReplicaCount: 5
  cooldownPeriod: 30
  pollingInterval: 1
  triggers:
  - type: prometheus
    metadata:
      serverAddress: https://prometheus.monitoring:9090 # update to your Prometheus server address
      metricName: nginx_connections_waiting_keda
      query: |
        sum(nginx_connections_waiting{job="nginx"})
      threshold: "500"
Note the metricName parameter. This is a unique name you choose for the value returned by the query. Keda takes the query's response and uses it to create the nginx_connections_waiting_keda metric, which then drives the scaling. Don't forget to update the serverAddress as well.
Simply apply the ScaledObject specification and the HPA will be up and running.
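Behind the scenes, Keda will have created an HPA from the ScaledObject. A rough sketch of what it might look like follows; the generated name and the external metric name format vary between Keda versions, so treat this as illustrative rather than exact:

```yaml
# Sketch of the HPA Keda generates from the nginx-scale ScaledObject
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: keda-hpa-nginx-scale        # Keda derives this from the ScaledObject name
  namespace: keda-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-server
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: External                  # served by Keda's external metrics server
    external:
      metric:
        name: nginx_connections_waiting_keda  # exact name depends on Keda version
      target:
        type: AverageValue
        averageValue: "500"
```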
What else does Keda offer?
In addition to letting you use the metrics in your Prometheus server and combine them with Prometheus queries however you choose, Keda provides some extra capabilities:
- In contrast to the default Kubernetes HPA, which only permits a minimum replica count of 1 or higher, it lets you scale an object down to zero.
- It lets you define a fallback number of replicas to use when it cannot obtain the metric value, for example when a connection error occurs.
- It allows an authenticated secure connection to Prometheus endpoints.
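A sketch of the first two capabilities expressed in a ScaledObject spec. The `fallback` field exists in Keda v2; the specific values below are illustrative assumptions, not recommendations:

```yaml
# Fragment of a ScaledObject spec (not a complete manifest)
spec:
  minReplicaCount: 0        # allows scaling the workload all the way down to zero
  maxReplicaCount: 5
  fallback:
    failureThreshold: 3     # after 3 consecutive failed metric fetches...
    replicas: 2             # ...hold the workload at 2 replicas
```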