SuperNIC Explained? Part 2
Scott Schweitzer (c) 2024


Earlier this summer, in Part 1, I speculated on NVIDIA's definition of a SuperNIC. On Friday, I received an email newsletter from NVIDIA Networking pointing me to a November blog post by Itay Ozery titled "What is a SuperNIC?" I'm not sure how I missed it while researching that piece; my apologies.

"Get to Know the SuperNIC" email by NVIDIA on July 19, 2024

NVIDIA defines a SuperNIC as different from a DPU in the following six ways:

1. 400 Gbps RDMA over Converged Ethernet (RoCE)

AI is simply another High-Performance Computing (HPC) workload; the only differences are the typical packet payload sizes and how latency-sensitive the applications are. For the past three decades, we've been tuning HPC networking fabrics to squeeze out every bit of latency while improving overall throughput and performance. The design of the car you drove to work this morning was likely crashed hundreds of times in HPC simulations to improve your safety and comfort. Two decades ago, Ford and GM each ran several HPC clusters performing finite element analysis for exactly this purpose, while Boeing and Caterpillar ran Computational Fluid Dynamics codes on their clusters for wing and engine design. RDMA is well over two decades old, and RoCE, while newer, was around before Taylor Swift first released her Red album (2012). The only thing even remotely new is the speed, and NVIDIA demonstrated these capabilities at 400 Gbps with the ConnectX-7 at ISC 2022, two years ago.
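To put the one genuinely new element, the speed, in perspective, here is a quick back-of-the-envelope sketch of what 400 Gbps implies in packet rates. The payload sizes and the roughly 66-byte per-packet overhead are illustrative assumptions, not NVIDIA figures.

```python
# Back-of-the-envelope sketch: what a 400 Gbps link implies in packet rates.
# Payload sizes and the ~66-byte RoCEv2 framing overhead are assumptions.

LINK_GBPS = 400                          # line rate claimed for ConnectX-7 / BlueField-3
LINK_BITS_PER_SEC = LINK_GBPS * 1e9

for payload_bytes in (256, 1024, 4096, 8192):   # assumed typical payload sizes
    wire_bytes = payload_bytes + 66              # rough Ethernet + IP + UDP + BTH overhead
    packets_per_sec = LINK_BITS_PER_SEC / (wire_bytes * 8)
    goodput_gbps = LINK_GBPS * payload_bytes / wire_bytes
    print(f"{payload_bytes:>5} B payload -> ~{packets_per_sec/1e6:6.1f} Mpps, "
          f"{goodput_gbps:5.1f} Gbps goodput")
```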

2. High-speed packet reordering

High-speed packet reordering has been an HPC requirement since the beginning. Myrinet-2000, the leading HPC networking fabric in 2003, was reordering packets long before InfiniBand came onto the scene a few years later. Casting this as something "new" requires pointing to what is actually new, and that isn't obvious at this point.
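For readers unfamiliar with the mechanism, here is a minimal sketch of what receive-side packet reordering amounts to: packets sprayed across multiple paths arrive out of order and are released to the upper layer strictly in sequence. This is illustrative Python, not anyone's actual NIC implementation.

```python
import heapq

class ReorderBuffer:
    """Toy receive-side reorder buffer: releases packets strictly in
    sequence-number order, holding any that arrive early."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq     # next sequence number we may deliver
        self.pending = []             # min-heap of (seq, payload) held out of order

    def receive(self, seq, payload):
        """Accept one packet; return the payloads now deliverable in order."""
        heapq.heappush(self.pending, (seq, payload))
        delivered = []
        while self.pending and self.pending[0][0] == self.next_seq:
            _, data = heapq.heappop(self.pending)
            delivered.append(data)
            self.next_seq += 1
        return delivered

# Packets 0..4 sprayed over multiple paths arrive out of order:
rb = ReorderBuffer()
for seq in (2, 0, 1, 4, 3):
    print(seq, "->", rb.receive(seq, f"pkt{seq}"))
```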

3. Advanced congestion control using real-time telemetry data

Again, this has existed in one form or another in HPC networking since at least Myrinet-2000. Back then, every NIC in the HPC fabric knew about every other NIC and all the possible paths between them. One NIC was dynamically elected master of the fabric and kept the node tables in all the other NICs up to date. Each NIC then used this real-time data to route packets between compute nodes over multiple paths simultaneously, even for a single message.
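A toy model of that Myrinet-style behavior, where per-path telemetry steers where the next chunk of a message goes, might look like the sketch below. The path names and congestion scores are invented for illustration.

```python
import random

# Toy adaptive routing: per-path congestion telemetry (e.g., measured queue
# depth or RTT) drives where the next chunk of a message is sent.
# Path names and telemetry values are made up for illustration.

telemetry = {           # lower score = less congested
    "spine-1": 0.2,
    "spine-2": 0.7,
    "spine-3": 0.1,
    "spine-4": 0.4,
}

def pick_path(telemetry):
    """Weight each path inversely to its reported congestion and pick one,
    so a single message can be spread across several paths over time."""
    weights = {path: 1.0 / (score + 1e-3) for path, score in telemetry.items()}
    total = sum(weights.values())
    r = random.uniform(0, total)
    for path, w in weights.items():
        r -= w
        if r <= 0:
            return path
    return path  # fallback for floating-point edge cases

chunks_per_path = {p: 0 for p in telemetry}
for _ in range(10_000):                 # send 10,000 chunks of one large message
    chunks_per_path[pick_path(telemetry)] += 1
print(chunks_per_path)                  # least-congested paths carry the most chunks
```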

4. Programmable computing on the I/O path

This could be the single best "new" feature, but the description NVIDIA gives is nebulous: "to enable customization and extensibility of network infrastructure in AI cloud data centers." If the SuperNIC is doing some programmable transformation on the data to improve overall performance and throughput, that would be worth the SuperNIC moniker.
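If "programmable computing on the I/O path" means user-supplied transforms applied to buffers as they move through the NIC, a toy model might look like the sketch below. The stage names and pipeline class are my assumptions for illustration, not NVIDIA's API.

```python
import zlib

# Toy model of a programmable I/O path: user-supplied stages applied to each
# buffer as it flows through the NIC. Stage names are illustrative assumptions.

def compress(buf):            # shrink the payload before it hits the wire
    return zlib.compress(buf)

def checksum_tag(buf):        # append an integrity tag computed on the NIC
    return buf + zlib.crc32(buf).to_bytes(4, "big")

class IOPipeline:
    def __init__(self, *stages):
        self.stages = stages

    def transmit(self, buf):
        for stage in self.stages:      # each stage runs "on the I/O path"
            buf = stage(buf)
        return buf

pipeline = IOPipeline(compress, checksum_tag)
payload = b"gradient shard " * 1000
wire_bytes = pipeline.transmit(payload)
print(len(payload), "->", len(wire_bytes), "bytes on the wire")
```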

5. Power-efficient design to meet AI power budgets

All semiconductor companies strive to limit the power their chips consume, but the fact remains that more transistors typically translate to greater power consumption. Back in the day, standard NICs drew 15W, and SmartNICs drew anywhere from 35 to 75W, the latter being the power limit of the gold fingers of a PCIe x16 connector. More recently, DPUs have required a supplemental 6-pin PCIe auxiliary power connector providing an additional 75W. The "SuperNIC" version of the BlueField-3 has an optional 8-pin auxiliary power connector capable of supplying up to 150W of additional power, although, as stated in the "NVIDIA BlueField-3 DPU Controller User Manual," the SuperNIC version's "maximum power consumption does not exceed 150W."
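Laying the power figures quoted above side by side shows why this matters at scale. A quick sketch; the 32-NICs-per-rack count is an assumed example, not a vendor number.

```python
# Per-NIC power budgets implied by the figures quoted above: ~15W for a
# standard NIC, up to 75W from the PCIe x16 slot for a SmartNIC, slot plus a
# 6-pin auxiliary connector for a DPU, and a stated 150W maximum for the
# BlueField-3 SuperNIC configuration. Rack density below is an assumption.

nic_power_budget_w = {
    "standard NIC": 15,
    "SmartNIC (slot-powered)": 75,
    "DPU (slot + 6-pin aux)": 75 + 75,
    "BlueField-3 SuperNIC (stated max)": 150,
}

NICS_PER_RACK = 32   # assumed example for illustration

for name, watts in nic_power_budget_w.items():
    print(f"{name:34s} {watts:4d} W/NIC -> {watts * NICS_PER_RACK / 1000:5.2f} kW/rack")
```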

6. Full-stack AI optimization

This defining feature is outlined only as "Full-stack AI optimization, including compute, networking, storage, system software, communication libraries, and application frameworks," which lacks detail. The rest of the blog post doesn't provide specific optimizations or examples.

Given what has been published thus far, it appears that a SuperNIC may just be a DPU leveraging RoCE.

Functionally, all of these things (SuperNICs, SmartNICs, DPUs) are the same thing. It's all about the actual solutions that are being delivered with this architecture. So if Nvidia delivers a DPU solution that cuts CPU/GPU utilization by 50%, everyone will be calling their solution a DPU. If AMD/Pensando squashes IO calls by 50%, people will ask to buy SmartNICs. Etc.

Aldrin Isaac

Director, Site Network Engineering at eBay

4 months ago

SmartNICs don't always make sense at scale. They add cost, including recurring power costs. Features like high speed packet reordering should not require a SmartNIC. I don't see any reason why high speed packet reordering for RDMA can't be implemented in a standard NIC, other than as an excuse to force customers to pay more for a fancy SmartNIC. All that is needed is standard NICs with high speed packet reordering support for RDMA over a simple packet spray fat-tree fabric that can fast-fail on link issues. In such a case, advanced real-time telemetry is only needed for diagnostics, not for influencing the data path characteristics.

Aidan Herbert

Decentralized transactional ecosystem enabler

4 months ago

It's odd that more organizations do not use FPGA-enabled SmartNICs for line-speed packet filtering. So much more efficient than shipping all packets through Cloudflare for DDoS protection!

Great content. Excellent analysis!
