Practical Approach to Design Non-blocking, High-Performance Computing (HPC) Infrastructure for AI Workloads Clusters

Solution:

Designing a non-blocking, high-performance computing (HPC) infrastructure for AI workloads involves careful planning and selection of components to ensure that data flows seamlessly between compute resources, storage, and network without any bottlenecks.

We need to design a 16-GPU AI training cluster that provides high-performance computing (HPC) and low-latency communication between nodes.

Suppose each compute node is equipped with 2 GPUs and 8x 25G ports for data transfer. A total of 8 nodes is therefore required to meet the 16-GPU requirement.

To fulfil the technical requirements, 8 leaf switches (each with 32x 25G downlink ports and 2x 400G uplink ports) are needed. On each leaf switch, 8x 25G ports are allocated for server connectivity, while the 2x 400G uplink ports are used for spine connectivity.
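As a quick sanity check, this sizing arithmetic can be written out as a few lines of Python. The sketch below is illustrative only; the constants mirror the numbers in the example above, including the assumption that 8 of each leaf's 25G ports are reserved for server connectivity.

    import math

    REQUIRED_GPUS = 16
    GPUS_PER_NODE = 2
    NIC_PORTS_PER_NODE = 8      # 25G data ports on each compute node
    SERVER_PORTS_PER_LEAF = 8   # 25G leaf ports allocated to servers, as above

    nodes_needed = math.ceil(REQUIRED_GPUS / GPUS_PER_NODE)                 # 8
    server_ports_needed = nodes_needed * NIC_PORTS_PER_NODE                 # 64
    leaves_needed = math.ceil(server_ports_needed / SERVER_PORTS_PER_LEAF)  # 8

    print(f"Compute nodes required:  {nodes_needed}")
    print(f"Server-facing 25G ports: {server_ports_needed}")
    print(f"Leaf switches required:  {leaves_needed}")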

Additionally, 2 spine switches will be needed to connect the leaf switches. This configuration will use a VXLAN-EVPN fabric in a spine-leaf architecture, as shown in the figure below:


Figure: Non-Blocking Fabric Providing 16x GPU-to-GPU Connectivity
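The leaf-to-spine wiring in the figure can also be sketched programmatically. This is a minimal illustration, assuming the usual spine-leaf convention that each leaf's two 400G uplinks terminate on different spine switches; the switch names are placeholders.

    # Enumerate the fabric links implied by the figure: 8 leaves, 2 spines,
    # one 400G link from every leaf to every spine (8 x 2 = 16 fabric links).
    LEAVES = [f"leaf{i}" for i in range(1, 9)]
    SPINES = ["spine1", "spine2"]

    fabric_links = [(leaf, spine, "400G") for leaf in LEAVES for spine in SPINES]

    for leaf, spine, speed in fabric_links:
        print(f"{leaf} <-> {spine} ({speed})")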


In the figure above, each GPU node (server) connects to the leaf switches over its 25G ports. Let's evaluate the capacity of this fabric to ensure it meets the AI cluster's requirements for GPU-to-GPU connectivity:

  • Total Leaf Switches: 8, each equipped with 32x 25G ports. Assuming each leaf switch is fully populated, the downlink bandwidth per leaf would be 32x 25G = 800G.
  • Total Spine Switches: 2, with each leaf switch having 2x 400G uplinks, giving an uplink bandwidth per leaf of 2x 400G = 800G.
  • Oversubscription Ratio: downlink bandwidth to uplink bandwidth is 800G/800G, which gives a 1:1 oversubscription ratio (see the sketch after this list).
  • Total GPU Nodes: 8, each with 8x 25G ports and 2 GPUs, for a total of 16 GPUs.
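The per-leaf bandwidth figures in the list above reduce to simple arithmetic; the sketch below is a minimal check using only the port counts and speeds already stated, with no vendor-specific behaviour assumed.

    # Per-leaf bandwidth and oversubscription check for the fabric above.
    DOWNLINK_PORTS, DOWNLINK_SPEED_G = 32, 25   # 25G server-facing ports per leaf
    UPLINK_PORTS, UPLINK_SPEED_G = 2, 400       # 400G spine-facing ports per leaf

    downlink_gbps = DOWNLINK_PORTS * DOWNLINK_SPEED_G   # 800
    uplink_gbps = UPLINK_PORTS * UPLINK_SPEED_G         # 800
    ratio = downlink_gbps / uplink_gbps                 # 1.0 -> 1:1, non-blocking

    print(f"Downlink per leaf: {downlink_gbps}G")
    print(f"Uplink per leaf:   {uplink_gbps}G")
    print(f"Oversubscription:  {ratio:.0f}:1")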

Assuming the fabric is fully populated, with every 25G downlink port connected to a node equipped with 2 GPUs, the 8 leaf switches provide 8 x 32 = 256 downlink ports, so the fabric can support up to 512 GPUs while retaining the 1:1 oversubscription ratio, ensuring a non-blocking fabric that meets the required performance for GPU-to-GPU connectivity.
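The 512-GPU figure follows from the same numbers; the sketch below assumes, as the scaling statement above does, that every 25G downlink port is populated and each connected node carries 2 GPUs.

    # Maximum scale of the fabric at a 1:1 ratio when every downlink is populated,
    # assuming one 25G downlink per 2-GPU node.
    LEAF_SWITCHES = 8
    DOWNLINK_PORTS_PER_LEAF = 32
    GPUS_PER_NODE = 2

    total_downlink_ports = LEAF_SWITCHES * DOWNLINK_PORTS_PER_LEAF   # 256
    max_gpus = total_downlink_ports * GPUS_PER_NODE                  # 512

    print(f"Fully populated fabric supports up to {max_gpus} GPUs at 1:1")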

Disclaimer: The above example is vendor-agnostic.

Author: Altaf Ahmad
