Catching Up with HPC Supercomputers
Neil Raden
Author, Advisor, Mathematician. Thinkers360 Global Thought Leader/Influencer in AI, Analytics, Predictive Analytics, National Security, GenAI, International Relations, Design Thinking, InsurTech, Quantum, and Health Tech
I was the Chairman of the Advisory Board at Sandia Labs for the WIPP project (nuclear waste storage) and the Nuclear Weapons Stockpile Stewardship program from 1997-1999. For the latter program, we brought up ASCI Red, the first teraFLOP computer and also the first built largely from COTS components. It was used to run the finite-element modeling and other codes for the WIPP plant, for example:
Disposal System Geometry, Culebra Hydrogeology, Repository Fluid Flow, Impure Halite Salado Interbeds, Disturbed Rock Zone, Actinide Transport in the Salado, Direct Brine Release, and dozens of others. Mainly, though, it was used to test the efficacy of the existing nuclear warhead stockpile and to simulate new weapon designs.
Today the fastest supercomputers in operation are:
Summit, Oak Ridge, 0.2 exaFLOPS, 2018
Sierra, Lawrence Livermore, 0.125 exaFLOPS, 2018
Summit and Sierra are both built on IBM POWER9 CPUs, NVIDIA Tesla V100 GPUs, and Mellanox EDR InfiniBand.
The next two (Frontier and El Capitan) are both HPE/Cray Shasta systems with the Slingshot interconnect, AMD EPYC CPUs, and Radeon Instinct GPUs in a 4:1 GPU-to-CPU ratio, with high-speed links and coherent memory between them within the node. It is a totally different architecture, and gone are the IBM, Intel, and NVIDIA chips. Aurora is also a Cray "Shasta" system, but based on a future generation of Intel Xeon Scalable processors, Intel's Xe compute architecture, a future generation of Intel Optane Datacenter Persistent Memory, and Intel's oneAPI software.
The only reason I bring this up is that the three new ones are one to two million times faster than ASCI Red. Now, if we could do something with a teraFLOP, what the hell can they do with a quintillion or two double-precision floating-point calculations per second? (A quick back-of-the-envelope check follows the list below.) I can't imagine the DOE is spending billions on these behemoths for scientific research.
Frontier, Oak Ridge, 1.5 exaFLOPS, 2022: claimed for science, not weapons
El Capitan, Lawrence Livermore, 2.0 exaFLOPS, 2023: claimed it will be used for nuclear weapons simulation, but "some" scientific work too
Aurora, Argonne Labs, 1.0 exaFLOPS, 2021: also claimed for science
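To make that speedup concrete, here is a minimal sketch of the arithmetic, taking ASCI Red at roughly 1 teraFLOPS (as above) and the nominal peak figures quoted for the three new machines; these are planned numbers, not measured benchmarks:

```python
# Back-of-the-envelope speedup of the planned exascale machines over ASCI Red.
# Figures are the nominal ones quoted above, not measured results.
ASCI_RED_FLOPS = 1.0e12  # ~1 teraFLOPS, the late-1990s milestone

exascale_systems = {
    "Frontier":   1.5e18,  # 1.5 exaFLOPS (planned)
    "El Capitan": 2.0e18,  # 2.0 exaFLOPS (planned)
    "Aurora":     1.0e18,  # 1.0 exaFLOPS (planned)
}

for name, flops in exascale_systems.items():
    print(f"{name}: ~{flops / ASCI_RED_FLOPS:,.0f}x ASCI Red")
# Frontier: ~1,500,000x, El Capitan: ~2,000,000x, Aurora: ~1,000,000x --
# i.e. roughly one to two million times faster.
```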
Anyway, here are the top ten from the TOP500 list as of June 2020. To make the list of top 500 supercomputers, the entry level is 1.65 petaFLOPS. Summit and Sierra are #2 and #3 (and why does Italy have two???). The US has 116 on the list, China has 219:
(Each entry: rank, system and architecture, site, country; cores; Rmax and Rpeak in TFlop/s; power in kW.)
1. Supercomputer Fugaku (Fujitsu A64FX 48C 2.2GHz, Tofu interconnect D), RIKEN Center for Computational Science, Japan: 7,299,072 cores, Rmax 415,530.0, Rpeak 513,854.7, 28,335 kW
2. Summit (IBM Power System AC922, POWER9 22C 3.07GHz, NVIDIA Volta GV100, dual-rail Mellanox EDR InfiniBand), DOE/SC/Oak Ridge National Laboratory, United States: 2,414,592 cores, Rmax 148,600.0, Rpeak 200,794.9, 10,096 kW
3. Sierra (IBM Power System AC922, POWER9 22C 3.1GHz, NVIDIA Volta GV100, dual-rail Mellanox EDR InfiniBand), United States: 1,572,480 cores, Rmax 94,640.0, Rpeak 125,712.0, 7,438 kW
4. Sunway TaihuLight (Sunway MPP, SW26010 260C 1.45GHz, NRCPC), National Supercomputing Center in Wuxi, China: 10,649,600 cores, Rmax 93,014.6, Rpeak 125,435.9, 15,371 kW
5. Tianhe-2A (TH-IVB-FEP Cluster, Intel Xeon E5-2692v2 12C 2.2GHz, TH Express-2, Matrix-2000, NUDT), National Super Computer Center in Guangzhou, China: 4,981,760 cores, Rmax 61,444.5, Rpeak 100,678.7, 18,482 kW
6. HPC5 (Dell EMC PowerEdge C4140, Xeon Gold 6252 24C 2.1GHz, NVIDIA Tesla V100, Mellanox HDR InfiniBand), Italy: 669,760 cores, Rmax 35,450.0, Rpeak 51,720.8, 2,252 kW
7. Selene (NVIDIA DGX A100 SuperPOD, AMD EPYC 7742 64C 2.25GHz, NVIDIA A100, Mellanox HDR InfiniBand), United States: 272,800 cores, Rmax 27,580.0, Rpeak 34,568.6, 1,344 kW
8. Frontera (Dell C6420, Xeon Platinum 8280 28C 2.7GHz, Mellanox InfiniBand HDR), Texas Advanced Computing Center/Univ. of Texas, United States: 448,448 cores, Rmax 23,516.4, Rpeak 38,745.9, power not listed
9. Marconi-100 (IBM Power System AC922, POWER9 16C 3GHz, NVIDIA Volta V100, dual-rail Mellanox EDR InfiniBand), Italy: 347,776 cores, Rmax 21,640.0, Rpeak 29,354.0, 1,476 kW
10. Piz Daint (Cray XC50, Xeon E5-2690v3 12C 2.6GHz, Aries interconnect, NVIDIA Tesla P100, Cray/HPE), Swiss National Supercomputing Centre (CSCS), Switzerland: 387,872 cores, Rmax 21,230.0, Rpeak 27,154.3, 2,384 kW
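One quick derived number from the columns above is power efficiency. Here is a small sketch using the Rmax and power figures exactly as listed (Frontera reports no power figure, so it is omitted):

```python
# Power efficiency in GFLOPS per watt, derived from the listed
# Rmax (TFlop/s) and Power (kW) columns above.
systems = {
    # name: (Rmax in TFlop/s, Power in kW)
    "Fugaku":            (415_530.0, 28_335),
    "Summit":            (148_600.0, 10_096),
    "Sierra":            (94_640.0,   7_438),
    "Sunway TaihuLight": (93_014.6,  15_371),
    "Tianhe-2A":         (61_444.5,  18_482),
    "HPC5":              (35_450.0,   2_252),
    "Selene":            (27_580.0,   1_344),
    "Marconi-100":       (21_640.0,   1_476),
    "Piz Daint":         (21_230.0,   2_384),
}

for name, (rmax_tflops, power_kw) in systems.items():
    gflops_per_watt = (rmax_tflops * 1000) / (power_kw * 1000)
    print(f"{name}: {gflops_per_watt:.1f} GFLOPS/W")
# Selene (A100 GPUs) leads this group at ~20 GFLOPS/W; Tianhe-2A trails at ~3.
```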
Why do we need exascale supercomputers that cost $600M-$1B each and are obsolete and dismantled in 5-10 years? That cost does not include the facility to house them: roughly the size of two basketball courts, with liquid and air cooling systems and something like 30-50MW of power to run, enough to power tens of thousands of homes. More importantly, you can't just drop one of these anywhere. Where are you going to find 30-50MW of service? That's the primary reason they scrap these things every few years: to make space for a new one in a facility that is already in place.

Titan was a 27 petaFLOP supercomputer that went into service at Oak Ridge in 2012 and was, at the time, the fastest in the world. In 2019 it was disassembled and sold for scrap to make room for Frontier.
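For a rough sense of scale, here is a minimal sketch of that power arithmetic; the per-home draw and the electricity rate are assumptions for illustration, not figures from any lab:

```python
# Rough sketch of what a 30-50 MW facility load means, with assumed figures.
MW_LOW, MW_HIGH = 30, 50        # facility power draw quoted above
AVG_HOME_KW = 1.2               # assumed average US household draw (~10,500 kWh/yr)
RATE_PER_KWH = 0.07             # assumed industrial electricity rate, $/kWh

for mw in (MW_LOW, MW_HIGH):
    homes = mw * 1000 / AVG_HOME_KW
    annual_mwh = mw * 8760      # hours in a year
    annual_cost = annual_mwh * 1000 * RATE_PER_KWH
    print(f"{mw} MW ~ {homes:,.0f} average homes, "
          f"{annual_mwh:,.0f} MWh/yr, about ${annual_cost/1e6:.0f}M/yr in electricity")
# 30 MW ~ 25,000 homes and roughly $18M/yr; 50 MW ~ 42,000 homes and roughly $31M/yr.
```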
Neil Raden nraden@hiredbrains.com
CEO & Principal Analyst
Hired Brains Research Santa Fe, NM
Contributing Analyst Diginomica https://diginomica.com/author/neil-raden
Co-Author "Smart (Enough) Systems," Prentice-Hall
Principal Investigator: Ethical Use of Artificial Intelligence for Actuaries
Onalytica Big Data: Top 100 Influencers
2019 Analytics Insight Top 100 Artificial Intelligence and Big Data Influencers
Chairman Advisory Board, Sandia National Laboratories
LinkedIn: neilraden
Twitter @neilraden
+1 505-982-6397 US MST