Karpenter: Streamlining Kubernetes Scaling & Cost Efficiency
Tamjidul Islam
Cloud Engineer @bkash | DevOps | DevSecOps | FinTech | AWS | Kubernetes | Terraform | Python | Automation | CI/CD
Karpenter is an open-source Kubernetes node provisioning tool that calls the Amazon EC2 APIs directly to launch instances for your cluster. This gives you fine-grained control over the instances it provisions, including the AMI, instance types, and other configuration. Configuration is managed through two YAML files:
1. NodePool: This file defines what Karpenter is allowed to provision (instance categories, CPU sizes, architectures, capacity type) and how nodes are consolidated or expired.
nodepool.yaml file:
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: "karpenter-nodepool"   # NodePool is cluster-scoped, so no namespace is needed
spec:
  template:
    spec:
      # nodeClassRef points to the EC2NodeClass that defines the EC2-level configuration
      # (AMI, subnets, security groups, storage) for the nodes in this pool.
      nodeClassRef:
        group: karpenter.k8s.aws              # API group for Karpenter's EC2 node classes
        kind: EC2NodeClass                    # The kind of node template
        name: "karpenter-nodeclass-al2023"    # Name of the EC2NodeClass defined below
      # Requirements constrain which instances Karpenter may provision.
      requirements:
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["c", "m", "r"]             # Instance families: compute optimized, general purpose, memory optimized
        - key: "karpenter.k8s.aws/instance-cpu"
          operator: In
          values: ["4", "8", "16", "32"]      # Allowed vCPU counts
        - key: "kubernetes.io/arch"
          operator: In
          values: ["arm64", "amd64"]          # Architecture requirement (ARM or x86)
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["spot"]                    # Use Spot instances for cost efficiency
      # Defines when a node should expire and be replaced, if ever.
      expireAfter: Never                      # No expiry; nodes stay until explicitly removed
  # Disruption settings control how Karpenter consolidates and removes nodes.
  disruption:
    # How long a node must be unneeded before Karpenter considers consolidating it
    consolidateAfter: 1m
    # Consolidate when nodes are empty or underutilized
    consolidationPolicy: WhenEmptyOrUnderutilized
2. EC2NodeClass: This file serves as a template for the node's EC2 configuration, specifying details like the AMI, IAM role, subnets, security groups, tags, and storage that Karpenter should apply when launching instances.
nodeclass.yaml file:
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: "karpenter-nodeclass-al2023"   # EC2NodeClass is cluster-scoped, so no namespace is needed
spec:
  # amiSelectorTerms selects the Amazon Machine Image (AMI). The alias below pins the
  # AMI family to Amazon Linux 2023 (AL2023) and always uses the latest released AMI.
  amiSelectorTerms:
    - alias: al2023@latest
  # role specifies the IAM role that the EC2 instances will assume. It must have the
  # permissions nodes need to join and interact with the Kubernetes cluster.
  role: "{{ .Values.PROJECT_NAME }}-eks-worker-role"   # IAM role for the worker nodes (parameterized for Helm)
  # subnetSelectorTerms define which subnets Karpenter may launch instances into,
  # matched here by the cluster discovery tag.
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "{{ .Values.CLUSTER_NAME }}"   # Subnet tags for cluster discovery
  # securityGroupSelectorTerms select the security groups attached to the instances,
  # again matched by the cluster discovery tag.
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "{{ .Values.CLUSTER_NAME }}"   # Security group tags for cluster discovery
  # metadataOptions control how the EC2 Instance Metadata Service is exposed on the node.
  metadataOptions:
    httpTokens: required          # Require session tokens (IMDSv2)
    httpPutResponseHopLimit: 2    # Limit the number of network hops for metadata responses
    httpEndpoint: enabled         # Keep the metadata endpoint enabled on the instances
  # tags are applied to the EC2 instances Karpenter provisions, making them easy to identify and manage.
  tags:
    Name: "{{ .Values.PROJECT_NAME }}-karpenter-worker-node"   # Custom name for the worker nodes (parameterized for Helm)
  # blockDeviceMappings define the storage attached to the instances, such as the root volume.
  blockDeviceMappings:
    - deviceName: /dev/xvda       # Root device name
      ebs:
        volumeSize: 30Gi          # Size of the EBS root volume (30 GiB)
        volumeType: gp3           # General-purpose SSD (gp3)
        iops: 3000                # Provisioned IOPS (input/output operations per second)
        deleteOnTermination: true # Delete the volume when the instance is terminated
Add the following affinity configuration to your deployment file to schedule your application on Karpenter-managed nodes:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: karpenter.sh/nodepool
              operator: In
              values:
                - karpenter-nodepool
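If you do not need the expressiveness of node affinity, a plain nodeSelector on the same label enforces the equivalent hard constraint. A minimal sketch, assuming the NodePool name used above:
# Simpler alternative in the pod template: select on the label Karpenter adds to its nodes
nodeSelector:
  karpenter.sh/nodepool: karpenter-nodepool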
The beauty of Karpenter lies in its ability to do more than just scale. It's a powerful tool for cost optimization, efficiently utilizing resources and reducing unnecessary expenses.
Cost Optimization: Karpenter enables better utilization of worker nodes, reducing costs. For example, if you have five nodes and two of them are underutilized, running only a few pods, you can configure a consolidation policy in the NodePool. This lets Karpenter automatically move pods off underutilized nodes and delete the nodes that are no longer needed. Karpenter can also replace a node with a cheaper instance type when it detects that the workload would fit on one, further reducing operational costs.
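To keep consolidation from disrupting too many nodes at once, and to cap the total capacity a NodePool may ever launch, you can add disruption budgets and resource limits to the NodePool. A minimal sketch; the 10% budget and 200-vCPU limit are illustrative values, not recommendations:
spec:
  # Hard cap on the total resources this NodePool may provision
  limits:
    cpu: "200"
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
    budgets:
      - nodes: "10%"   # at most 10% of this NodePool's nodes may be disrupted at a time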
Before running workloads with Karpenter, it's crucial to deploy system-critical pods, such as CoreDNS, the ingress controller, and ExternalDNS, on EKS-managed nodes. Use taints, tolerations, and affinity rules to keep these critical components off Karpenter-managed nodes and pinned to the EKS-managed ones, as sketched below. Karpenter's own pods should also run on EKS-managed nodes.
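One way to wire this up (a sketch, assuming your managed node group carries the standard eks.amazonaws.com/nodegroup label and is named system-ng, a hypothetical name) is to pin the add-on Deployments to managed nodes with node affinity, and optionally taint the Karpenter NodePool so nothing schedules there without an explicit toleration:
# In the Deployment spec of CoreDNS, the ingress controller, ExternalDNS, and Karpenter itself:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: eks.amazonaws.com/nodegroup   # label applied to EKS managed node group nodes
              operator: In
              values:
                - system-ng                      # hypothetical managed node group name
# Optionally, in the NodePool, taint Karpenter-provisioned nodes so only workloads that
# explicitly tolerate the taint land on them ("workload-type" is a hypothetical key):
spec:
  template:
    spec:
      taints:
        - key: workload-type
          value: general
          effect: NoSchedule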
This setup helps prevent interruptions to essential services. Because Karpenter may launch nodes as Spot instances, those nodes can terminate unexpectedly, risking downtime for both critical components and Karpenter itself. Keeping these essential services on separate, stable nodes reduces the chance of unavailability.
What about your application, which serves many users? It could also become unavailable due to the sudden termination of Spot instances. To mitigate this risk, you can deploy the AWS Node Termination Handler.
aws-node-termination-handler includes two components:
1. Instance Metadata Service Monitor: Runs node-termination-handler as a DaemonSet that monitors IMDS paths (such as /spot or /events) and responds by cordoning and draining the node when necessary.
2. Queue Processor: Monitors an SQS queue for events from Amazon EventBridge, including ASG lifecycle events, EC2 status changes, Spot Interruption Notices, and Rebalance Recommendations. When it detects an impending instance shutdown, it uses the Kubernetes API to cordon and drain the node. The Queue Processor requires IAM permissions to interact with the SQS queue and the EC2 API.
You can use the Instance Metadata Service Monitor if you want to avoid creating an SQS queue or EventBridge rules, as in the sketch below.
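For the simpler IMDS mode, a minimal Helm values sketch for the aws-node-termination-handler chart could look like this (option names as found in the eks-charts values; verify them against the chart version you deploy):
# values.yaml for aws-node-termination-handler running as a DaemonSet in IMDS mode (a sketch)
enableSqsTerminationDraining: false    # keep Queue Processor mode off; use IMDS monitoring
enableSpotInterruptionDraining: true   # cordon and drain on Spot Interruption Notices
enableRebalanceMonitoring: true        # react to Rebalance Recommendations
enableScheduledEventDraining: true     # drain ahead of EC2 scheduled maintenance events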