How to Understand InfiniBand: To Infini(Band)ty and Beyond
Fancy Wang
Helping Global Enterprises Optimize Network Performance | Ethernet Card & Switch Solutions
April 26, 2021
The following is from Gilad Shainer, Senior Vice President of Marketing.
The heart of a data center is the network that connects all the compute and storage elements together. In order to get these elements working together and form a supercomputer (for research, cloud or deep learning), the network must be highly efficient and extremely fast. InfiniBand is an industry standard technology that was (and continues to be) developed with the vision of forming a highly scalable, pure software-defined network (SDN). Back in 2003, it connected one of the top three supercomputers in the world. The June 2020 TOP500 supercomputing list stated that InfiniBand connects seven of the top ten supercomputers in the world. InfiniBand is strongly adopted for deep learning infrastructures, and is increasingly being used for hyperscale cloud data centers such as Microsoft Azure and others. The performance, scalability, and efficiency advantages of InfiniBand continue to drive the growing and strong adoption of InfiniBand, as it is the ideal technology for compute and data intensive applications.
InfiniBand provides key advantages: It is a full-transport offload network, which means that all the network operations are managed by the network and not by the CPU or the GPU; it enables the most efficient data traffic, which means that more data gets transported with less overhead; it is the only 200 gigabit-per-second high-performance end-to-end network today; it has the lowest latency compared to any other standard or proprietary network; and most importantly, it incorporates data processing engines inside the network that accelerate data processing for deep learning and high-performance computing.
The answer to why InfiniBand presents these advantages and continuously maintains a one-generation-ahead technology leadership can be found in the four main InfiniBand technology fundamentals:
A very smart endpoint – an endpoint that can execute and manage all of the network functions (unlike Ethernet or proprietary networks), and can therefore increase the CPU or GPU time that can be dedicated to the real applications. Since the endpoint is located near CPU/GPU memory, it can also manage memory operations in a very effective and efficient way, for example via RDMA, GPUDirect RDMA, or GPUDirect Storage (a minimal sketch follows after these four points).
A switch network that is designed for scale – it is a pure software-defined network (SDN). InfiniBand switches, for example, do not require an embedded server within every switch appliance for managing the switch and for running its operating system (as needed in the case of other networks). This makes InfiniBand a leading cost-performance network fabric compared to Ethernet or any proprietary network out there. It also enables unique technology innovations such as In-Network Computing, which means that data calculations get performed on the data as it is being transferred in the network. An important example is the Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)™ technology, which has demonstrated great performance improvements for scientific and deep learning application frameworks.
Centralized management – you can manage, control and operate the InfiniBand network from a single place. You can also design and build any sort of network topology and customize and optimize the data center network for its target applications. There is no need to create multiple and different switch boxes for the different parts of the network, and there is no need to deal with so many complex network algorithms. The philosophy behind InfiniBand technology is to improve performance on the one side and to reduce OPEX on the other.
Last but not least, InfiniBand is a standard technology that ensures backward and forward compatibility, and it is open source with open APIs. By carrying software from one generation to the next, you protect your investments. And unlike proprietary networks, which require reinventing the same wheel over and over again, InfiniBand enjoys the support of a large software ecosystem and a rich set of software frameworks.
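To make the "smart endpoint" idea a bit more concrete, here is a minimal sketch of posting a one-sided RDMA write with the libibverbs API. It assumes a queue pair, a registered memory region, and the peer's remote address and rkey have already been created and exchanged out of band; post_rdma_write and the parameter names are placeholders for this illustration, not code from any particular vendor example.

```c
#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

/* Sketch: post a one-sided RDMA WRITE so the adapter moves the data
 * directly into the peer's memory with no remote CPU involvement.
 * Assumes qp, mr, local_buf, remote_addr and rkey were set up earlier. */
static int post_rdma_write(struct ibv_qp *qp, struct ibv_mr *mr,
                           void *local_buf, size_t len,
                           uint64_t remote_addr, uint32_t rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)local_buf,  /* local source buffer   */
        .length = (uint32_t)len,
        .lkey   = mr->lkey,              /* local protection key  */
    };

    struct ibv_send_wr wr, *bad_wr = NULL;
    memset(&wr, 0, sizeof(wr));
    wr.opcode              = IBV_WR_RDMA_WRITE;   /* one-sided write   */
    wr.sg_list             = &sge;
    wr.num_sge             = 1;
    wr.send_flags          = IBV_SEND_SIGNALED;   /* request a CQE     */
    wr.wr.rdma.remote_addr = remote_addr;         /* peer's buffer     */
    wr.wr.rdma.rkey        = rkey;                /* peer's remote key */

    /* The NIC executes the transfer; the host CPU only posts the work
     * request and later polls the completion queue. */
    return ibv_post_send(qp, &wr, &bad_wr);
}
```

The point of the pattern is that the host merely describes the transfer; the adapter executes it, and the remote CPU never touches the data path.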
InfiniBand-connected data centers can of course be easily connected to external Ethernet networks via low-latency (200-nanosecond) InfiniBand-to-Ethernet gateways. InfiniBand also offers long-reach connectivity spanning tens to thousands of miles, enabling remote data centers to connect to each other.
The InfiniBand Trade Association (IBTA) has just released an update to the InfiniBand roadmap, calling out the future generations of InfiniBand illustrated in Figure 1.
A typical InfiniBand adapter or switch port includes 4 differential serial pairs, also referred to as an InfiniBand 4X port. The latest InfiniBand roadmap specifies NDR 400 gigabits per second (Gb/s) for an InfiniBand 4X port as the next speed, followed by XDR 800Gb/s and then GDR 1.6 terabits per second (1600Gb/s). This is the most aggressive interconnect roadmap in the industry, aiming to sustain the generation-ahead advantage and to provide the data speeds needed for future compute- and data-intensive applications.
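Since a 4X port aggregates four lanes, dividing each roadmap speed by four gives a rough per-lane data rate. The short sketch below only performs that arithmetic; it ignores signaling and encoding overheads, and the generation list simply restates the roadmap figures above.

```c
#include <stdio.h>

/* Rough per-lane data rates for an InfiniBand 4X port (4 lanes).
 * This is just port-rate-divided-by-four arithmetic on the roadmap
 * numbers; encoding and signaling overheads are ignored. */
int main(void)
{
    struct { const char *gen; int port_gbps; } roadmap[] = {
        { "HDR", 200 }, { "NDR", 400 }, { "XDR", 800 }, { "GDR", 1600 },
    };
    const int lanes = 4;

    for (int i = 0; i < 4; i++)
        printf("%s: %4d Gb/s per 4X port -> %4d Gb/s per lane\n",
               roadmap[i].gen, roadmap[i].port_gbps,
               roadmap[i].port_gbps / lanes);
    return 0;
}
```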
A key technology enabling high supercomputing performance and scalability is In-Network Computing. In-Network Computing engines are pre-configured or programmable computing engines located on the data path of network adapters or switches. These engines can process data or perform pre-defined algorithmic tasks on the data as it is transferred within the network. Two examples of such engines are InfiniBand hardware MPI tag matching and the InfiniBand Scalable Hierarchical Aggregation and Reduction Protocol (SHARP).
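For a sense of what SHARP offloads, the code below is a plain MPI_Allreduce, the kind of collective reduction that a SHARP-enabled fabric can aggregate inside the switches rather than staging it through the hosts. The application code itself is unchanged standard MPI; the vector length here is arbitrary and chosen only for illustration.

```c
#include <mpi.h>
#include <stdio.h>

/* A plain MPI_Allreduce: every rank contributes a vector of partial
 * sums and receives the global sum. This is the type of collective
 * that In-Network Computing engines such as SHARP can aggregate in
 * the fabric; the application-level call does not change. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local[1024], global[1024];
    for (int i = 0; i < 1024; i++)
        local[i] = (double)rank;          /* each rank's partial data */

    /* Element-wise sum across all ranks, result delivered to everyone. */
    MPI_Allreduce(local, global, 1024, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global[0] = %f\n", global[0]);

    MPI_Finalize();
    return 0;
}
```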
SHARP has been described in multiple earlier publications, including recently at ISC’20 in a paper titled “Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) Streaming-Aggregation Hardware Design and Evaluation” by Richard L. Graham, Lion Levi, Devendar Burredy, Gil Bloch, Gilad Shainer, David Cho, George Elias, Daniel Klein, Joshua Ladd, Ophir Maor, Ami Marelli, Valentin Petrov, Evyatar Romlet, Yong Qin, and Ido Zemah.
The InfiniBand hardware MPI tag matching technology is illustrated in Figure 2.
The Message Passing Interface (MPI) standard allows messages to be received based on tags embedded in the message. Processing every message to evaluate whether its tags match the conditions of interest can be time-consuming and wasteful.
MPI send/receive operations require matching source and destination message parameters in order to deliver data to the correct destination, and the matching must follow the order in which sends and receives are posted. The key challenges in providing efficient tag-matching support include managing the metadata needed for tag matching, making temporary copies of data to minimize the latency between tag matching and data delivery, keeping track of posted receives that have not yet been matched, handling unexpected message arrivals, and overlapping tag matching and the associated data delivery with ongoing computation.
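As a generic illustration of what the matching logic has to do, the sketch below posts two receives that differ only by tag; each incoming message is delivered to the first posted receive whose source, tag, and communicator match. The tag values and buffer names are arbitrary and not tied to any specific benchmark.

```c
#include <mpi.h>

/* Illustration of MPI tag matching: two receives are posted with
 * different tags, and each incoming send is steered to the receive
 * whose (source, tag, communicator) triple matches, in the order
 * the receives were posted. Run with at least 2 ranks. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int ctrl = 0, data = 0;

    if (rank == 0) {
        MPI_Request reqs[2];
        /* Two pre-posted receives distinguished only by their tags. */
        MPI_Irecv(&ctrl, 1, MPI_INT, 1, /*tag=*/100, MPI_COMM_WORLD, &reqs[0]);
        MPI_Irecv(&data, 1, MPI_INT, 1, /*tag=*/200, MPI_COMM_WORLD, &reqs[1]);
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    } else if (rank == 1) {
        int one = 1, two = 2;
        /* Sent in the "wrong" order; the tags, not arrival order,
         * decide which posted receive each message lands in. */
        MPI_Send(&two, 1, MPI_INT, 0, /*tag=*/200, MPI_COMM_WORLD);
        MPI_Send(&one, 1, MPI_INT, 0, /*tag=*/100, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```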
Support for asynchronous hardware-based tag matching and data delivery is provided by HDR InfiniBand ConnectX-6 network adapters and beyond. Network hardware-based tag matching reduces the latency of multiple MPI operations while also increasing the overlap between computation and MPI communication, as shown in Figure 3.
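The overlap being described follows the familiar nonblocking pattern sketched below: the receive is posted early, independent computation proceeds, and the application waits only when the data is actually needed. With hardware tag matching, the adapter can match and deliver the message while that computation runs; independent_compute is a made-up placeholder for the unrelated work.

```c
#include <mpi.h>

static void independent_compute(double *x, int n)
{
    /* Placeholder for application work that does not depend on the
     * message being transferred (and does not touch its buffers). */
    for (int i = 0; i < n; i++)
        x[i] = x[i] * 2.0 + 1.0;
}

/* Pattern that benefits from hardware tag matching: post the receive,
 * keep computing, and wait only when the incoming data is needed. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double msg[4096]  = { 0 };    /* buffer being sent / received    */
    double work[4096] = { 0 };    /* unrelated data used for compute */
    MPI_Request req = MPI_REQUEST_NULL;

    if (rank == 0)
        MPI_Irecv(msg, 4096, MPI_DOUBLE, 1, /*tag=*/7, MPI_COMM_WORLD, &req);
    else if (rank == 1)
        MPI_Isend(msg, 4096, MPI_DOUBLE, 0, /*tag=*/7, MPI_COMM_WORLD, &req);

    independent_compute(work, 4096);   /* overlaps with the transfer */

    MPI_Wait(&req, MPI_STATUS_IGNORE); /* data needed from here on   */

    MPI_Finalize();
    return 0;
}
```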
The Ohio State University MVAPICH team has demonstrated a 1.8X performance increase with InfiniBand hardware tag matching. The team has also demonstrated a 1.4X performance increase for 3DStencil applications at 128 nodes on the Texas Advanced Computing Center's Frontera supercomputer.
The suite of InfiniBand In-Network Computing engines does not exist in any other network, whether long-established Ethernet or proprietary networks such as Omni-Path, Aries, or Slingshot (referred to as “HPC Ethernet” for marketing purposes). So while InfiniBand delivers many advantages such as high data throughput, extremely low latency, and advanced adaptive routing and congestion control mechanisms, it is InfiniBand’s In-Network Computing technology – which transforms the InfiniBand network into a data processing unit – that is the main reason for the growing use of InfiniBand in supercomputing, deep learning, and large-scale cloud platforms.
Engineer & Manufacturer | Internet Bonding routers to Video Servers | Network equipment production | ISP Independent IP address provider | Customized Packet level Encryption & Security | On-premises Cloud
1y · Fancy Wang: You talked about InfiniBand's remarkable capabilities, especially in the context of supercomputing and data-intensive applications. Given the rapid advancements in AI and machine learning, how do you see InfiniBand evolving to address the unique networking demands of future quantum computing applications? Quantum computing presents distinct challenges and opportunities, and I'm curious about your insights on how InfiniBand might play a role in this emerging field.