Topology-Aware Routing in Kubernetes: Improved Efficiency and Lower Costs!

Topology-aware routing is a powerful feature in Kubernetes that optimizes internal traffic flows, reduces latency, and minimizes inter-zone data transfer costs.


What is Topology-Aware Routing?

Topology-aware routing (known as topology-aware hints before Kubernetes 1.27) is a feature that lets a Kubernetes Service prefer endpoints in the zone where the traffic originates, based on the topology of the cluster (e.g., which Availability Zone each node belongs to). The goal is to keep traffic within the local zone whenever possible, which reduces cross-zone traffic costs and latency. This is particularly beneficial for highly available applications spread across several zones, where both cost and performance matter.



Credits: AWS Official Documentation

How Topology-Aware Routing Works

When a Kubernetes Service with a selector is created, one or more EndpointSlices are automatically created for it. These EndpointSlices hold metadata about each endpoint, such as its IP address, node, and zone. Topology-aware routing works by attaching zone hints to these endpoints, which kube-proxy then uses when choosing where to send traffic.

In practical terms, here’s how the routing happens:

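  1. EndpointSlices with Hints: The EndpointSlice controller assigns each endpoint a hint naming the zone it should serve. Endpoints are allocated to zones in proportion to the allocatable CPU of the nodes in each zone.
  2. Traffic Direction by kube-proxy: kube-proxy on each node filters the endpoints down to those hinted for its own zone, so traffic stays within the local zone. If the controller's safeguards cause it to omit hints (for example, when there are fewer endpoints than zones or a node is missing its zone label), kube-proxy falls back to using all endpoints cluster-wide.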
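
To make the two steps above concrete, here is roughly what a hinted EndpointSlice can look like. This is an illustrative sketch, not output captured from a real cluster: the slice name, pod IP, node name, and zone are hypothetical, and in practice the object is generated by the controller rather than written by hand.

```yaml
# Illustrative only: EndpointSlices are created by the EndpointSlice controller.
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: api-service-abc12            # hypothetical generated name
  namespace: backend
  labels:
    kubernetes.io/service-name: api-service   # links the slice to its Service
addressType: IPv4
ports:
  - protocol: TCP
    port: 8080
endpoints:
  - addresses:
      - "10.0.1.15"                  # hypothetical pod IP
    conditions:
      ready: true
    nodeName: node-a                 # hypothetical node
    zone: us-east-1a                 # zone the endpoint actually runs in
    hints:
      forZones:
        - name: us-east-1a           # zone this endpoint is allocated to serve
```

If the endpoints of a Service carry no hints block, the controller has not applied topology-aware routing to it, and kube-proxy treats all endpoints equally.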


Credits: AWS Official Documentation

This approach is particularly effective for reducing cross-zone data transfer charges, which can accumulate significantly if workloads are distributed across multiple zones.

Configuration Example

To configure a Kubernetes Service with topology-aware routing, you can add an annotation to your Service manifest as follows:

apiVersion: v1
kind: Service
metadata:
  name: api-service
  namespace: backend
  annotations:
    service.kubernetes.io/topology-mode: Auto
spec:
  selector:
    app: api
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
        

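The annotation service.kubernetes.io/topology-mode: Auto asks the Kubernetes EndpointSlice controller to apply zone hints based on the node topology. With hints in place, kube-proxy prefers endpoints within the same zone. This is a heuristic rather than a guarantee: if the controller judges the allocation unsafe (for example, fewer endpoints than zones), it omits the hints and traffic is spread across all endpoints as usual.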
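
On clusters older than Kubernetes 1.27, the feature was still called topology-aware hints and was controlled by a different annotation. The sketch below shows the same Service using that older annotation; check it against the documentation for your cluster version before relying on it.

```yaml
# Variant for clusters running a Kubernetes version before 1.27.
apiVersion: v1
kind: Service
metadata:
  name: api-service
  namespace: backend
  annotations:
    service.kubernetes.io/topology-aware-hints: auto   # pre-1.27 annotation
spec:
  selector:
    app: api
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
```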

Key Benefits

  1. Reduced Cross-Zone Traffic Costs: Routing traffic within the same zone reduces cross-zone data transfer costs, a significant advantage in multi-AZ environments.
  2. Lower Latency: By keeping traffic local, the latency involved in cross-zone communication is minimized, enhancing the performance of latency-sensitive applications.
  3. Better Resource Allocation: This routing strategy also ensures efficient use of network resources by minimizing unnecessary data hops.

Best Practices

  • Even Workload Distribution: For the best results, ensure that workloads are evenly distributed across all zones. This helps in maintaining high availability and prevents one zone from being overloaded.
  • Pod Topology Spread Constraints: Use PodTopologySpreadConstraints to distribute pods evenly across zones, which works in tandem with topology-aware routing to ensure efficiency and fault tolerance.
  • Node Affinity and Pod Affinity: To further control traffic locality, consider using node affinity or pod affinity to place related workloads on nodes that will benefit from lower-latency communication.
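
As a sketch of the pod affinity idea above, the following Deployment asks the scheduler to prefer zones that already run pods labeled app: api. The web-frontend name, its labels, and the web-image image are hypothetical placeholders. A preferred (soft) rule is used so that scheduling does not fail when no such zone has capacity; note that strong affinity can concentrate pods in one zone, which works against even distribution.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend                 # hypothetical client of the api pods
  namespace: backend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
    spec:
      affinity:
        podAffinity:
          # Soft preference: favor zones that already host app=api pods.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: api
                topologyKey: topology.kubernetes.io/zone
      containers:
        - name: web
          image: web-image           # hypothetical image name
```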

Considerations and Limitations

  • Service Internal Traffic Policy: Hints are not used when a Service sets internalTrafficPolicy: Local, which restricts traffic to endpoints on the same node as the client. Both features can be used in the same cluster, but not on the same Service.
  • Scaling Issues: If replicas are not scaled with zones in mind, the zone with the fewest pods can become overloaded or even unavailable. An HPA may not help here: it evaluates metrics averaged across all pods of the workload, so pods saturated in one zone may not raise the average enough to trigger a scale-up while pods in the other zones sit underused.
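
For contrast with the annotated Service shown earlier, here is a minimal sketch of a Service that uses the node-local policy instead. The node-local-agent name, its selector, and the port are hypothetical; the point is only that this Service carries no topology-mode annotation, since the two mechanisms are meant for different Services.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: node-local-agent             # hypothetical Service name
  namespace: backend
spec:
  selector:
    app: node-local-agent            # hypothetical label
  type: ClusterIP
  internalTrafficPolicy: Local       # only endpoints on the client's own node are used
  ports:
    - protocol: TCP
      port: 9000                     # hypothetical port
      targetPort: 9000
```

With this policy, a client on a node that has no local endpoint gets no backend at all, so it is typically paired with workloads that run on every node.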

Use Cases in Production Environments

Topology-aware routing is especially useful in scenarios where cross-zone data transfer is frequent and costly. Some typical use cases include:

  • Microservices Architectures: Where frequent inter-service communication can lead to high costs if services are distributed across multiple zones.
  • Latency-Sensitive Applications: Where keeping network traffic within the local zone is critical for maintaining low latency and high user experience.
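  • Hybrid Workloads: Clusters whose nodes span multiple cloud environments or on-premises data centers, where cost and performance management are crucial. This only applies if the nodes carry the topology.kubernetes.io/zone label, since hints are derived from it.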


Special note

How to Handle Scaling in a Topology-Aware Routing Scenario?

Using Topology Spread Constraints:

We can use Topology Spread Constraints in Kubernetes to distribute pods evenly across different topology domains, such as zones or nodes, to ensure reliability and availability. Here's how:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
  namespace: backend  # same namespace as api-service, so its selector matches these pods
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: api
      containers:
      - name: api-container
        image: api-image
        

  • maxSkew: The maximum allowed difference in the number of matching pods between any two topology domains (here, zones).
  • topologyKey: The label used to define the domain, e.g., topology.kubernetes.io/zone.
  • whenUnsatisfiable: Specifies action when constraints cannot be met (DoNotSchedule or ScheduleAnyway).
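
Spread constraints control where pods land, but not how many there are. One way to pair them with autoscaling is sketched below. It assumes a three-zone cluster, a metrics source such as metrics-server, and CPU requests declared on the api-container (the Deployment above does not set them). The api-hpa name and the replica and utilization numbers are illustrative, not recommendations.

```yaml
# Create this in the same namespace as api-deployment.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa                      # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-deployment
  minReplicas: 3                     # assumption: at least one pod per zone in a three-zone cluster
  maxReplicas: 9                     # illustrative ceiling
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60     # averaged over all pods, not per zone
```

Because the target is an average over every pod, a lower threshold leaves more headroom for a single busy zone, at the price of running more replicas overall.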


Conclusion

Topology-aware routing is a valuable feature for optimizing cost and performance in Kubernetes clusters, particularly in AWS EKS environments. By using this feature effectively, you can ensure that your applications are not only highly available but also efficient in their network usage, reducing costs and improving performance.

For Kubernetes administrators managing multi-zone clusters, enabling topology-aware routing can be a key step towards more efficient cloud resource usage.
