Karpenter: Streamlining Kubernetes Scaling & Cost Efficiency
Tamjidul Islam
Cloud Engineer @bkash | DevOps | DevSecOps | FinTech | AWS | Kubernetes | Terraform | Python | Automation | CI/CD
Karpenter is an open-source Kubernetes node provisioning tool that calls the Amazon EC2 APIs directly to launch instances for your cluster. This gives you fine-grained control over the instances it provisions, including the AMI, instance types, and other configuration. Configuration is managed through two YAML files:
1. NodePool: This file defines what Karpenter is allowed to provision (instance categories, CPU sizes, architectures, capacity type) and how nodes are consolidated or expired.
nodepool.yaml file:
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: "karpenter-nodepool"   # NodePool is cluster-scoped, so no namespace is needed
spec:
  template:
    spec:
      # nodeClassRef points to the EC2NodeClass that defines the EC2-level configuration
      # (AMI, subnets, security groups, storage) for the nodes in this pool.
      nodeClassRef:
        group: karpenter.k8s.aws              # API group for Karpenter's EC2 node classes
        kind: EC2NodeClass                    # The kind of node template
        name: "karpenter-nodeclass-al2023"    # Name of the EC2NodeClass defined below
      # Requirements constrain which instances Karpenter may provision.
      requirements:
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["c", "m", "r"]             # Instance families: compute optimized, general purpose, memory optimized
        - key: "karpenter.k8s.aws/instance-cpu"
          operator: In
          values: ["4", "8", "16", "32"]      # Allowed vCPU counts
        - key: "kubernetes.io/arch"
          operator: In
          values: ["arm64", "amd64"]          # Architecture requirement (ARM or x86)
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["spot"]                    # Use Spot instances for cost efficiency
      # Defines when a node should expire and be replaced, if ever.
      expireAfter: Never                      # No expiry; nodes stay until explicitly removed
  # Disruption settings control how Karpenter consolidates and removes nodes.
  disruption:
    # How long a node must be unneeded before Karpenter considers consolidating it
    consolidateAfter: 1m
    # Consolidate when nodes are empty or underutilized
    consolidationPolicy: WhenEmptyOrUnderutilized
2. EC2NodeClass: This file serves as a template for the node's EC2 configuration, specifying details like the AMI, IAM role, subnets, security groups, tags, and storage that Karpenter should apply when launching instances.
nodeclass.yaml file:
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: "karpenter-nodeclass-al2023"   # EC2NodeClass is cluster-scoped, so no namespace is needed
spec:
  # amiSelectorTerms selects the Amazon Machine Image (AMI). The alias below pins the
  # AMI family to Amazon Linux 2023 (AL2023) and always uses the latest released AMI.
  amiSelectorTerms:
    - alias: al2023@latest
  # role specifies the IAM role that the EC2 instances will assume. It must have the
  # permissions nodes need to join and interact with the Kubernetes cluster.
  role: "{{ .Values.PROJECT_NAME }}-eks-worker-role"   # IAM role for the worker nodes (parameterized for Helm)
  # subnetSelectorTerms define which subnets Karpenter may launch instances into,
  # matched here by the cluster discovery tag.
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "{{ .Values.CLUSTER_NAME }}"   # Subnet tags for cluster discovery
  # securityGroupSelectorTerms select the security groups attached to the instances,
  # again matched by the cluster discovery tag.
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "{{ .Values.CLUSTER_NAME }}"   # Security group tags for cluster discovery
  # metadataOptions control how the EC2 Instance Metadata Service is exposed on the node.
  metadataOptions:
    httpTokens: required          # Require session tokens (IMDSv2)
    httpPutResponseHopLimit: 2    # Limit the number of network hops for metadata responses
    httpEndpoint: enabled         # Keep the metadata endpoint enabled on the instances
  # tags are applied to the EC2 instances Karpenter provisions, making them easy to identify and manage.
  tags:
    Name: "{{ .Values.PROJECT_NAME }}-karpenter-worker-node"   # Custom name for the worker nodes (parameterized for Helm)
  # blockDeviceMappings define the storage attached to the instances, such as the root volume.
  blockDeviceMappings:
    - deviceName: /dev/xvda       # Root device name
      ebs:
        volumeSize: 30Gi          # Size of the EBS root volume (30 GiB)
        volumeType: gp3           # General-purpose SSD (gp3)
        iops: 3000                # Provisioned IOPS (input/output operations per second)
        deleteOnTermination: true # Delete the volume when the instance is terminated
Add the following affinity configuration to your deployment file to schedule your application on Karpenter-managed nodes:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: karpenter.sh/nodepool
              operator: In
              values:
                - karpenter-nodepool
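If you do not need the expressiveness of node affinity, a plain nodeSelector on the same label enforces the equivalent hard constraint. A minimal sketch, assuming the NodePool name used above:
# Simpler alternative in the pod template: select on the label Karpenter adds to its nodes
nodeSelector:
  karpenter.sh/nodepool: karpenter-nodepool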
The beauty of Karpenter lies in its ability to do more than just scale. It's a powerful tool for cost optimization, efficiently utilizing resources and reducing unnecessary expenses.
Cost Optimization: Karpenter enables better utilization of worker nodes, reducing costs. For example, if you have five nodes and two of them are underutilized, running only a few pods, you can configure a consolidation policy in the NodePool. This lets Karpenter automatically move pods off underutilized nodes and delete the nodes that are no longer needed. Karpenter can also replace a node with a cheaper instance type when it detects that the workload would fit on one, further reducing operational costs.
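To keep consolidation from disrupting too many nodes at once, and to cap the total capacity a NodePool may ever launch, you can add disruption budgets and resource limits to the NodePool. A minimal sketch; the 10% budget and 200-vCPU limit are illustrative values, not recommendations:
spec:
  # Hard cap on the total resources this NodePool may provision
  limits:
    cpu: "200"
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
    budgets:
      - nodes: "10%"   # at most 10% of this NodePool's nodes may be disrupted at a time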
Before running workloads with Karpenter, it's crucial to deploy system-critical pods, such as CoreDNS, the ingress controller, and ExternalDNS, on EKS-managed nodes. Use taints, tolerations, and affinity rules to keep these critical components off Karpenter-managed nodes and pinned to the EKS-managed ones, as sketched below. Karpenter's own pods should also run on EKS-managed nodes.
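One way to wire this up (a sketch, assuming your managed node group carries the standard eks.amazonaws.com/nodegroup label and is named system-ng, a hypothetical name) is to pin the add-on Deployments to managed nodes with node affinity, and optionally taint the Karpenter NodePool so nothing schedules there without an explicit toleration:
# In the Deployment spec of CoreDNS, the ingress controller, ExternalDNS, and Karpenter itself:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: eks.amazonaws.com/nodegroup   # label applied to EKS managed node group nodes
              operator: In
              values:
                - system-ng                      # hypothetical managed node group name
# Optionally, in the NodePool, taint Karpenter-provisioned nodes so only workloads that
# explicitly tolerate the taint land on them ("workload-type" is a hypothetical key):
spec:
  template:
    spec:
      taints:
        - key: workload-type
          value: general
          effect: NoSchedule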
This setup helps prevent interruptions to essential services. Because Karpenter may launch nodes as Spot instances, those nodes can terminate unexpectedly, risking downtime for both critical components and Karpenter itself. Keeping these essential services on separate, stable nodes reduces the chance of unavailability.
What about your application, which serves many users? It could also become unavailable due to the sudden termination of Spot instances. To mitigate this risk, you can deploy the AWS Node Termination Handler.
aws-node-termination-handler includes two components:
1. Instance Metadata Service Monitor: Runs node-termination-handler as a DaemonSet that monitors IMDS paths (such as /spot or /events) and responds by cordoning and draining the node when necessary.
2. Queue Processor: Monitors an SQS queue for events from Amazon EventBridge, including ASG lifecycle events, EC2 status changes, Spot Interruption Notices, and Rebalance Recommendations. When it detects an impending instance shutdown, it uses the Kubernetes API to cordon and drain the node. The Queue Processor requires IAM permissions to interact with the SQS queue and the EC2 API.
You can use the Instance Metadata Service Monitor if you want to avoid creating an SQS queue or EventBridge rules, as in the sketch below.
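For the simpler IMDS mode, a minimal Helm values sketch for the aws-node-termination-handler chart could look like this (option names as found in the eks-charts values; verify them against the chart version you deploy):
# values.yaml for aws-node-termination-handler running as a DaemonSet in IMDS mode (a sketch)
enableSqsTerminationDraining: false    # keep Queue Processor mode off; use IMDS monitoring
enableSpotInterruptionDraining: true   # cordon and drain on Spot Interruption Notices
enableRebalanceMonitoring: true        # react to Rebalance Recommendations
enableScheduledEventDraining: true     # drain ahead of EC2 scheduled maintenance events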