Cost optimization in a Kafka data pipeline
Vivek Anandaraman
Help Project Managers Estimate and Track AWS Cost during Build using Jira | Mentor | Speaker
The customer has an event-driven architecture built on Kafka data pipelines. The producers and consumers are Spring Boot Java applications.
Consumer A consumes from topic T1, enriches the messages, and produces them to topic T2. Consumer B consumes from T1 and produces to T3, and so on. Since each topic has more than one consumer, you can appreciate that the Kafka data flow diagram looks like a node graph.
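To make the pattern concrete, a consumer stage such as Consumer A could look like the Spring Boot sketch below. This is a minimal illustration assuming String payloads; the class name, group id, and enrich() helper are hypothetical:

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;

@Component
public class ConsumerA {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public ConsumerA(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    // Consume from T1, enrich, and forward the result to T2
    @KafkaListener(topics = "T1", groupId = "consumer-a-group")
    public void onMessage(String message) {
        String enriched = enrich(message);
        kafkaTemplate.send("T2", enriched);
    }

    private String enrich(String message) {
        // Placeholder for the actual enrichment logic
        return message + " | enriched";
    }
}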
Considerations
The application is deployed in Kubernetes, but the Kubernetes-native Horizontal Pod Autoscaler is not turned on because it had introduced Kafka partition rebalancing, leading to consumer lag.
How to optimize this workload
On analyzing the consumer lag and message ingestion rates for the entire pipeline, we found that for the vast majority of the time there were no messages to process in the topics, yet the consumers were always up, polling for new messages.
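Consumer lag here is the gap between a partition's latest offset and the consumer group's committed offset. One way such a lag check could be done with the plain Kafka AdminClient is sketched below; the bootstrap server and group id are illustrative assumptions:

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;

public class ConsumerLagCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Offsets the consumer group has committed, per partition
            Map<TopicPartition, OffsetAndMetadata> committed =
                admin.listConsumerGroupOffsets("my-group")
                     .partitionsToOffsetAndMetadata().get();

            // Latest (end) offsets for the same partitions
            Map<TopicPartition, OffsetSpec> latestSpec = committed.keySet().stream()
                .collect(Collectors.toMap(tp -> tp, tp -> OffsetSpec.latest()));
            Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> latest =
                admin.listOffsets(latestSpec).all().get();

            long totalLag = 0;
            for (TopicPartition tp : committed.keySet()) {
                if (committed.get(tp) == null) continue;  // no committed offset yet
                long lag = latest.get(tp).offset() - committed.get(tp).offset();
                totalLag += lag;
                System.out.printf("%s lag=%d%n", tp, lag);
            }
            System.out.println("Total lag: " + totalLag);
        }
    }
}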
The fix: scale the consumers in to zero when there is no consumer lag, and scale them out when lag builds up. This was implemented using KEDA. The projected savings for the customer are more than 30% of the on-prem hardware cost.
Create a ScaledObject (sample below)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: kafka-application      # Deployment running the consumer
  pollingInterval: 30            # seconds between lag checks
  minReplicaCount: 0             # allow KEDA to scale the consumer in to zero
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: localhost:9092
      consumerGroup: my-group    # must match the application's consumer group
      topic: test-topic
      # Optional
      lagThreshold: "50"         # target average lag per replica
      offsetResetPolicy: latest
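With minReplicaCount at 0, KEDA scales the target Deployment in to zero replicas while the consumer group's lag stays at zero, and scales it out again (through the HPA that KEDA manages) once the lag crosses lagThreshold. Note that consumerGroup in the trigger metadata must match the group id the Spring Boot consumer actually uses; otherwise KEDA will never see the lag and never scale out.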