SmartAB™ Wisdom #52: AI In Supply Chain Management (SCM) – Part 2: Hardware To The Rescue...
Ask Copilot about NVIDIA’s AI chips, and it will tell you a lot about present-day AI hardware. In particular, it will say:
“Nvidia is at the forefront of AI innovation, offering a range of solutions and partnerships to advance the field. Here are some highlights:
Generative AI Solutions: NVIDIA provides a comprehensive platform for generative AI, which is used across various industries like automotive, healthcare, telecommunications, and more. Their platform supports the development and deployment of AI applications at scale…
Educational Partnerships: NVIDIA has partnered with California to train 100,000 residents in AI, aiming to create new career pathways and meet industry needs.
Space Exploration: NVIDIA’s AI technology is set to be launched into space on a SpaceX mission. This mission will test the operation of an AI-capable GPU in orbit. NVIDIA’s contributions to AI are vast and impactful, driving advancements in technology and education.
GPU Technology: NVIDIA’s Graphics Processing Units (GPUs) are specifically designed to handle the parallel processing required for AI tasks. This makes them more efficient than traditional CPUs for training and running AI models…
Generative AI Platform: NVIDIA offers one of the most advanced platforms for generative AI, which is continuously updated to support enterprise-level applications. This platform allows for the deployment of AI applications at scale…
AI Ecosystem: NVIDIA has built a comprehensive ecosystem that includes hardware, software, and services. This ecosystem supports a wide range of AI applications, from autonomous vehicles to healthcare and beyond…
Energy Efficiency: NVIDIA’s GPUs are designed to perform technical calculations faster and with greater energy efficiency than CPUs, making them ideal for AI training and inference.”
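The quoted points about parallel processing and energy efficiency are easy to sanity-check on your own machine. Below is a minimal sketch, assuming PyTorch is installed and a CUDA-capable GPU may or may not be present, that times the same large matrix multiply (the core operation behind neural-network training and inference) on the CPU and, if available, on the GPU:

```python
# Minimal sketch: comparing a large matrix multiply on CPU vs. GPU.
# Assumes PyTorch is installed; falls back to CPU-only timing if no CUDA GPU is present.
import time
import torch

def time_matmul(device: str, n: int = 4096, repeats: int = 10) -> float:
    """Return average seconds for an n x n matrix multiply on the given device."""
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    # Warm-up run so lazy initialization does not skew the timing.
    torch.matmul(a, b)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        torch.matmul(a, b)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats

cpu_time = time_matmul("cpu")
print(f"CPU: {cpu_time * 1000:.1f} ms per 4096x4096 matmul")

if torch.cuda.is_available():
    gpu_time = time_matmul("cuda")
    print(f"GPU: {gpu_time * 1000:.1f} ms per 4096x4096 matmul "
          f"(~{cpu_time / gpu_time:.0f}x faster)")
else:
    print("No CUDA GPU detected; only the CPU baseline was measured.")
```

On typical hardware the GPU run comes out one to two orders of magnitude faster, which is exactly the gap the summary above is alluding to.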
And there is no shortage of articles, blogs, and posts describing Nvidia’s hardware in more detail. Take The Verge, for example, which reports:
“Nvidia’s must-have H100 AI chip made it a multitrillion-dollar company, one that may be worth more than Alphabet and Amazon, and competitors have been fighting to catch up. But perhaps Nvidia is about to extend its lead — with the new Blackwell B200 GPU and GB200 “super chip.”
Nvidia says the new B200 GPU offers up to 20 petaflops of FP4 horsepower from its 208 billion transistors. Also, it says, a GB200 that combines two of those GPUs with a single Grace CPU can offer 30 times the performance for LLM inference workloads while also potentially being substantially more efficient. It “reduces cost and energy consumption by up to 25x” over an H100, says Nvidia, though there’s a question mark around cost — Nvidia’s CEO has suggested each GPU might cost between $30,000 and $40,000.
On a GPT-3 LLM benchmark with 175 billion parameters, Nvidia says the GB200 has a somewhat more modest seven times the performance of an H100, and Nvidia says it offers four times the training speed.”
Nvidia is counting on companies to buy large quantities of these GPUs, of course, and is packaging them in larger designs, like the GB200 NVL72, which plugs 36 CPUs and 72 GPUs into a single liquid-cooled rack for a total of 720 petaflops of AI training performance or 1,440 petaflops (aka 1.4 exaflops) of inference. It has nearly two miles of cables inside, with 5,000 individual cables.
Each tray in the rack contains either two GB200 chips or two NVLink switches, with 18 of the former and nine of the latter per rack. In total, Nvidia says one of these racks can support a 27-trillion parameter model. GPT-4 is rumored to be around a 1.7-trillion parameter model.
The company says Amazon, Google, Microsoft, and Oracle are all already planning to offer the NVL72 racks in their cloud service offerings, though it’s not clear how many they’re buying.
And of course, Nvidia is happy to offer companies the rest of the solution, too. Here’s the DGX Superpod for DGX GB200, which combines eight systems in one for a total of 288 CPUs, 576 GPUs, 240TB of memory, and 11.5 exaflops of FP4 computing.
Nvidia says its systems can scale to tens of thousands of the GB200 superchips, connected together with 800Gbps networking with its new Quantum-X800 InfiniBand (for up to 144 connections) or Spectrum-X800 ethernet (for up to 64 connections).
Training a 1.8 trillion parameter model would have previously taken 8,000 Hopper GPUs and 15 megawatts of power, Nvidia claims. Today, Nvidia’s CEO says 2,000 Blackwell GPUs can do it while consuming just four megawatts.”
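The rack-level and power figures quoted above are simple arithmetic, and a few lines of Python make the ratios explicit. All inputs below are taken directly from the quoted text; nothing is independently measured:

```python
# Back-of-the-envelope arithmetic for the quoted GB200 NVL72 rack and power figures.
# All inputs come straight from the quoted text above.

# One NVL72 rack: 18 compute trays x 2 GB200 superchips = 36 superchips,
# each pairing 1 Grace CPU with 2 Blackwell GPUs.
superchips_per_rack = 18 * 2
cpus_per_rack = superchips_per_rack * 1      # 36 Grace CPUs
gpus_per_rack = superchips_per_rack * 2      # 72 Blackwell GPUs

b200_fp4_petaflops = 20                      # quoted per-GPU FP4 figure
rack_inference_petaflops = gpus_per_rack * b200_fp4_petaflops
print(f"{cpus_per_rack} CPUs, {gpus_per_rack} GPUs, "
      f"~{rack_inference_petaflops} PFLOPS FP4 inference per rack")

# Quoted training-power comparison for a 1.8-trillion-parameter model:
hopper_gpus, hopper_megawatts = 8000, 15
blackwell_gpus, blackwell_megawatts = 2000, 4
print(f"GPU count ratio: {hopper_gpus / blackwell_gpus:.0f}x fewer GPUs")
print(f"Power ratio:     {hopper_megawatts / blackwell_megawatts:.2f}x less power")
```

The 72 GPUs at 20 petaflops each recover the quoted 1,440-petaflop inference figure, and the claimed Hopper-to-Blackwell comparison works out to 4x fewer GPUs at roughly a quarter of the power.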
It all sounds truly impressive, especially as I recall my own “AI acceleration” efforts 30+ years ago… It was all happening at the time when Intel 386 computers were… all the rage!
So, yes, unless you have your own power generation station nearby, leave the heavy-duty AI training exercises to the cloud computing mafia. And remember: when joining their cloud-computing parties, wear long pants, not shorts…
Neural Computing in the 1990s…
Many AI enthusiasts at the time understood the computing limitations quite well. For example, Wikipedia will gladly tell you that: “Robert Hecht-Nielsen was an American computer scientist, neuroscientist, entrepreneur, and professor of electrical and computer engineering at the University of California, San Diego. He co-founded HNC Software Inc. (NASDAQ: HNCS) in 1986, which went on to develop the pervasive card fraud detection system, Falcon. He became a vice president of R&D at Fair Isaac Corporation when it acquired the company in 2002.”
Recently, Neil Bloom covered the history of HNC and reminded us all of the following: “In 1989, a young scientist named Krishna Gopinathan was working in symbolic computation at the University of Waterloo in Canada when he came across a book about the concepts of neural networks that he felt were visionary and far-reaching. That quickly led him to jump into the neural networks field.
“Fascinated and intrigued by neural networks, I started researching companies and universities focusing on neural networks and found an article in Inc. Magazine about Hecht-Nielsen Neurocomputer Corporation, as HNC was known back then. It talked about how the company was applying neural networks to image processing, optical character recognition, medical diagnosis, and multiple other fields, including financial services.”
Gopinathan continued: “I sent my resume to the company, and after a few phone interviews and a wonderful trip to San Diego with my wife, I was asked to join Hecht-Nielsen Neurocomputer Corporation as a staff scientist (in 1990).”
That turned out to be a win-win-win for HNC and Gopinathan, as well as for San Diego’s data analytics community, which has blossomed tremendously since HNC’s pioneering efforts. In fact, HNC fueled dozens of local entrepreneurs and companies, many of whom are still working in the analytics space today.
The pair eventually grew HNC to over 1,200 employees and expanded into international markets. Hecht-Nielsen, who became a vice president of R&D at Fair Isaac Corporation when it acquired HNC in 2002 for about $810 million, is remembered as an influential neuroscientist and entrepreneur who set the stage for today’s data science/analytics industry in San Diego. In fact, many key HNC employees went on to launch their own analytics startups.”
From Selling Hardware To Building Applications
Initially, HNC was quite ready to sell its neural network development software and computational hardware that would overcome the i386 limitations. This is how HNC’s floating-point ANZA boards were born, and I was a proud early adopter of the ANZA computational platform at NCR…
Yet as the company grew, it became clear to HNC that helping others build thousands of different AI applications was highly unfocused and detrimental to its exponential growth.
As Neil Bloom adds: “In 1995, HNC had gone public at approximately $50 million in revenue based on the dominant market share achieved by its core product Falcon, a first-of-its-kind product that used propagated neural network models to detect fraud in credit card transactions. The product was developed by a team of talented scientists, led by Krishna Gopinathan and Anu Pathria.
One of the key elements that enabled the Falcon product is that HNC went to all of the major credit card issuers and got them to contribute data to a ‘consortium model,’ which gave the technology a cross-issuer view of transactions and made the product much more accurate. In the early days of enterprise analytics and pattern recognition technologies, there was a high spirit of industry cooperation.
That cooperation was fostered by HNC’s late Co-Founder Robert Hecht-Nielsen, who has been described as a larger-than-life scientist and entrepreneur who had a profound impact on the global analytics scene. The former UC San Diego professor and his late Co-Founder Todd Gutschow are also credited with giving birth to San Diego’s current data science community.
“HNC was a center for the growth of analytics capabilities in San Diego. Through the ebb and flow generated by its own acquisition, growth, and spin-off cycles, HNC magnetized the San Diego analytics community, infusing it with the entrepreneurial energy that fueled its expansion, attracted new talent, and helped bring analytics to its current prominence on the national stage.”
Under John Mutch’s leadership, HNC’s portfolio included Onyx, an ASP provider of credit and fraud checks; Systems/Link, which focused on fraud detection related to telecommunications using real-time call data; and Blaze Advisor, a business rules management system (BRMS) for managing and deploying business logic.
One of Mutch’s contributions to HNC was helping Robert Hecht-Nielsen and the board think through branding “pattern recognition” and a vertical-industry strategy. “The acquisition of Risk Data and Retek began the march to expand the company based on a vertical market strategy – the goal was to find companies in the industry where the ability to recognize patterns in transaction data could improve the company’s business results, and we simultaneously expanded our R&D capability to try to develop effective models outside of the core financial industry,” he said.
Shortly after, HNC grew to $250 million in revenue and hired close to 1,000 full-time employees across three or four geographies in the US. “Retek was the growth star of the portfolio – they were the closest vertical business unit to a pure-play software company, and we made a decision to spin them out in an IPO,” Mutch said.
“We launched this into a frothy market in 2001, and the IPO got a $3 billion valuation. The spinout took a company valued at $800 million based on $250 million in revenue and turned it into a company worth $3 billion.” In August 2002, Mutch helped lead the sale of HNC to Fair Isaac & Co. (FICO) for $810 million.
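Falcon’s core idea, scoring each card transaction with a neural network trained on pooled, cross-issuer consortium data, can be illustrated with a toy sketch. To be clear, the features, data, and model below are invented purely for illustration and have nothing to do with HNC’s actual Falcon models; the sketch only assumes NumPy and scikit-learn are available:

```python
# Toy illustration of consortium-style fraud scoring with a small neural network.
# The feature names, data, and model size are hypothetical -- not HNC's actual Falcon model.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def synthetic_transactions(n, fraud_rate=0.02):
    """Generate fake transactions: [amount, merchant_risk, hour_of_day, txns_last_24h]."""
    X = np.column_stack([
        rng.lognormal(3.5, 1.0, n),   # transaction amount
        rng.random(n),                # merchant risk score
        rng.integers(0, 24, n),       # hour of day
        rng.poisson(3, n),            # transactions in the last 24 hours
    ])
    y = rng.random(n) < fraud_rate
    # Make "fraud" look different: larger amounts, riskier merchants, more velocity.
    X[y, 0] *= 3.0
    X[y, 1] = np.clip(X[y, 1] + 0.4, 0, 1)
    X[y, 3] += 5
    return X, y.astype(int)

# "Consortium" training set: data pooled from several (simulated) issuers.
X_train, y_train = synthetic_transactions(50_000)
model = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=300, random_state=0)
model.fit(X_train, y_train)

# Score new transactions from a single issuer.
X_new, y_new = synthetic_transactions(5_000)
fraud_scores = model.predict_proba(X_new)[:, 1]
print("Mean score, legit:", fraud_scores[y_new == 0].mean().round(3))
print("Mean score, fraud:", fraud_scores[y_new == 1].mean().round(3))
```

The point of the consortium model was exactly what the pooled training set mimics here: the more issuers contribute transactions, the more representative the patterns the network can learn.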
HNC’s ANZA Was Not The Only Hardware Option…
“SAIC Delta Floating Point Coprocessor board, assembled as intended to go into a PC. This system design was conceived by Works and used a pair of bipolar BIT Inc. processors to implement an efficient stream processor for artificial neural networks. It could pull an instruction and two operands on every clock cycle to do multiply-and-accumulate or other microcode matrix operations at a 22 MFLOP rate. This was patented.
Unfortunately, it was very heavy, power-hungry, hot, and expensive, and it did not have a long product life. It was eclipsed by Sky Computers’ and Mercury Systems’ DSP offerings before it got into meaningful production…”
First Neural Chips, Anyone?
According to Wikipedia, “ETANN (Electronically Trainable Analog Neural Network) was one of the first commercial neural processors, introduced by Intel around 1989. Implemented on a 1.0 μm process, this chip incorporated 64 analog neurons and 10,240 analog synapses. The ETANN is also the first commercial analog neural processor and is considered to be the first successful commercial neural network chip.
The ETANN was originally announced at the 1989 International Joint Conference on Neural Networks (IJCNN). The chip was implemented using analog non-volatile floating-gate technology on Intel's CHMOS-III 1 μm non-volatile memory process, and it integrates a total of 64 analog neurons and 10,240 analog non-volatile synapses.
The network calculated the dot product between the 64x64 non-volatile EEPROM analog synaptic weight array and a 64-element analog input vector. The chip was reported to reach 2,000 MCPS (million connections per second).
The chip has two synapse weight arrays, each consisting of 4,096 normal weights and 1,024 (64x16) bias weights. There are 16 bias weights per neuron, allowing for sufficient influence in the sum. With two synapse weight arrays, it is possible to implement two 64x64 layers.
The outputs of the 64 neurons in the first layer are stored in buffers, which are then fed as inputs to the second synapse array. The same neurons are used again for the second layer, so although the chip is described as having 64 neurons, it effectively behaves like a 128-neuron, two-layer network. It is worth pointing out that the weights are stored in EEPROM, meaning Intel was able to eliminate the refresh circuitry that would otherwise be required to retain the weight values and consume precious die area.
EEPROM also brings some significant disadvantages, such as long update times (hundreds of microseconds). This means the chip is not particularly suitable for applications that require frequent reprogramming, and there is a practical limit to how many times the weights can be reprogrammed before degradation is observed.
Once the learning phase is complete and the weights are constant (i.e., no EEPROM updating), Intel claimed the chip is capable of reaching 2,000 MCPS, or million connections per second.”
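The architecture described above (two 64x64 synapse arrays, 16 bias weights per neuron, and the same 64 neurons reused for both layers) is easy to emulate functionally. The sketch below is a plain NumPy model of that topology plus the arithmetic behind the 2,000 MCPS figure; it mimics only the math, not the analog floating-gate circuitry:

```python
# Functional sketch of the ETANN topology described above: two 64x64 synapse arrays,
# 16 bias weights per neuron, and the same 64 neurons reused for both layers.
# The real chip is analog (floating-gate EEPROM weights, sigmoid-like neurons);
# this NumPy model only mimics the arithmetic, not the device physics.
import numpy as np

N = 64       # neurons per layer
BIAS = 16    # bias weights per neuron

rng = np.random.default_rng(1)
W1 = rng.uniform(-1, 1, (N, N))       # first 64x64 synapse array
B1 = rng.uniform(-1, 1, (N, BIAS))    # 16 bias weights per neuron
W2 = rng.uniform(-1, 1, (N, N))       # second 64x64 synapse array
B2 = rng.uniform(-1, 1, (N, BIAS))

def layer(x, W, B):
    """Dot product of the weight array with the input, plus summed biases, through a sigmoid."""
    z = W @ x + B.sum(axis=1)
    return 1.0 / (1.0 + np.exp(-z))   # sigmoid-like analog neuron response

x = rng.uniform(0, 1, N)              # 64-element analog input vector
hidden = layer(x, W1, B1)             # first pass through the shared neurons
output = layer(hidden, W2, B2)        # buffered outputs fed through the second array

# Connections evaluated per full two-layer pass, per the figures quoted above:
connections = 2 * (N * N + N * BIAS)        # 2 x (4,096 + 1,024) = 10,240
passes_per_second = 2_000e6 / connections   # at the claimed 2,000 MCPS
print(f"{connections} connections per pass; "
      f"~{passes_per_second:,.0f} full two-layer passes/second at 2,000 MCPS")
```

At the quoted 2,000 MCPS, the 10,240 connections of a full two-layer pass work out to roughly 195,000 complete evaluations per second.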
Notable developments around ETANN included the Naval Air Warfare Center Weapons Division designing and developing a real-time neural processor for missile seeker applications: “The system uses a high-speed digital computer as the user interface and as a monitor for processing.
The use of a standard digital computer as the user interface allows the user to develop the process in whatever programming environment desired. With the capability to store up to 64 k of output data on each frame, it is possible to process two-dimensional image data in excess of video rates.
The real-time communication bus, with user-defined interconnect structures, enables the system to solve a wide variety of problems. The system is best suited to perform local area processing on two-dimensional images.
Using this system, each layer has the capacity to represent up to 65,536 neurons. The fully operational system may contain up to 12 of these layers, giving the total system a capacity in excess of 745,000 neurons.”
I experimented with ETANN right at the time it was released. Yet the developer platform around the chip was poorly supported by Intel. The i486 and Intel’s other cash-cow CPUs overshadowed it, and the neural chip slowly slipped into oblivion…
Was it a strategic mistake by Intel? In hindsight, it certainly seems that way when you look at Seeking Alpha’s charts, especially once you realize that the company’s market cap has dropped by almost 60% over the past five years…
Will Intel Finally Catch A Break?
But not all is lost. Did Intel finally receive the AI memo? Well, according to Top500.org, “Intel announced they would be releasing the company’s first AI processor before the end of 2017. The new chip, formerly codenamed “Lake Crest,” will be officially known as the Nervana Neural Network Processor, or NNP, for short.
As implied by its name, the chip will use technology from Nervana, an AI startup Intel acquired for more than $350 million last year. Unlike GPUs or FPGAs, NNP is a custom-built coprocessor aimed specifically at deep learning, that is, processing the neural networks upon which these applications are based.
In that sense, Intel’s NNP is much like Google’s Tensor Processing Unit (TPU), a custom-built chip the search giant developed to handle much of its own deep learning work. Google is already using the second generation of its TPU, which supposedly delivers 180 teraflops of deep learning performance, and is used for both neural net training and inferencing.
In fact, Intel seems to be aiming the NNP at hyperscale companies and other businesses that don’t have access to Google’s proprietary TPU technology. Of course, Intel is also looking to dislodge NVIDIA GPUs from their dominating perch in the AI-accelerated datacenter.
NVIDIA’s latest offering for this market is the V100 GPU, a chip that can deliver 120 teraflops of deep learning. Microsoft, Tencent, Baidu, and Alibaba have all indicated interest in deploying these NVIDIA accelerators in their respective clouds.
To be taken seriously, the NNP needs to exhibit performance in the neighborhood of the V100 and second-generation TPU, and ideally exceed both of them. It also needs to demonstrate that deep learning codes will be able to effectively extract this performance from the NNP chip, and do so with multi-processor setups.”
Recently, Intel announced that “Intel Labs has established the Intel Neuromorphic Research Community (INRC), a global collaborative research effort that brings together teams from academic groups, government labs, research institutions, and companies to overcome the wide-ranging challenges in the field of neuromorphic computing. Together with an ecosystem of leading researchers, Intel is working to pioneer the frontier of brain-inspired AI, progressing this technology from research prototypes to industry-leading products over the coming years. Membership is free and open to all qualified groups.
Recent breakthroughs in AI have swelled our appetite for intelligence in computing devices at all scales and form factors. This new intelligence ranges from recommendation systems, automated call centers, and gaming systems in the data center, to autonomous vehicles and robots, to more intuitive and predictive interfacing with our personal computing devices, to smart city and road infrastructure that immediately responds to emergencies.
Meanwhile, as today’s AI technology matures, a clear view of its limitations is emerging. While deep neural networks (DNNs) demonstrate a near-limitless capacity to scale to solve large problems, these gains come at a very high price in computational power and pre-collected data.
Many emerging AI applications—especially those that must operate in unpredictable real-world environments with power, latency, and data constraints—require fundamentally new approaches.
Neuromorphic computing represents a fundamental rethinking of computer architecture at the transistor level, inspired by the form and function of the brain’s biological neural networks. Despite many decades of progress in computing, biological neural circuits remain unrivaled in their ability to intelligently process, respond to, and learn from real-world data at microwatt power levels and millisecond response times.
Guided by the principles of biological neural computation, neuromorphic computing intentionally departs from the familiar algorithms and programming abstractions of conventional computing so it can unlock orders of magnitude gains in efficiency and performance compared to conventional architectures.
The goal is to discover a computer architecture that is inherently suited for the full breadth of intelligent information processing that living brains effortlessly support. Intel Labs is pioneering research that drives the evolution of computing and algorithms toward next-generation AI. In 2018, Intel Labs launched the Intel Neuromorphic Research Community (Intel NRC) and released the Loihi research processor for external use.
The Loihi chip represented a milestone in the neuromorphic research field. It incorporated self-learning capabilities, novel neuron models, asynchronous spike-based communication, and many other properties inspired by neuroscience modeling, with leading silicon integration scale and circuit speeds.
Over the past three years, Intel NRC members have evaluated Loihi in a wide range of application demonstrations. Some examples include:
• Adaptive robot arm control
• Visual-tactile sensory perception
• Learning and recognizing new odors and gestures
• Drone motor control with state-of-the-art latency in response to visual input
• Fast database similarity search
• Modeling diffusion processes for scientific computing applications
• Solving hard optimization problems such as railway scheduling
Building on the insights gained from the research performed on the Loihi chip, Intel Labs introduces Loihi 2. A complete tour of the new features, optimizations, and innovations of this chip is provided in the final section.”
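To make the quoted phrase “asynchronous spike-based communication” concrete, here is a minimal leaky integrate-and-fire (LIF) neuron, the textbook building block of spiking, neuromorphic systems. This is a generic illustration in NumPy, not Intel’s actual Loihi neuron model or its Lava programming framework:

```python
# Minimal leaky integrate-and-fire (LIF) neuron -- the textbook building block behind
# spiking, neuromorphic designs like Loihi. This is a generic illustration, not Intel's
# actual Loihi neuron model or its Lava programming framework.
import numpy as np

def lif_simulate(input_current, dt=1e-3, tau=20e-3, v_rest=0.0,
                 v_threshold=1.0, v_reset=0.0):
    """Simulate one LIF neuron; return the membrane-voltage trace and spike times (in steps)."""
    v = v_rest
    voltages, spikes = [], []
    for t, i_in in enumerate(input_current):
        # Leak toward rest, integrate the input, spike and reset when the threshold is crossed.
        v += dt / tau * (v_rest - v) + dt * i_in
        if v >= v_threshold:
            spikes.append(t)
            v = v_reset
        voltages.append(v)
    return np.array(voltages), spikes

# A step of constant input current applied halfway through the simulation.
steps = 200
current = np.zeros(steps)
current[100:] = 80.0   # arbitrary input drive

_, spike_times = lif_simulate(current)
print(f"Neuron emitted {len(spike_times)} spikes, "
      f"first at step {spike_times[0] if spike_times else None}")
```

The neuron stays silent until the input arrives and then communicates only through discrete spikes, which is the event-driven behavior that lets neuromorphic hardware idle at very low power.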
Techopedia reports that “Since OpenAI launched its generative AI chatbot, ChatGPT, in 2022, we have already witnessed many wonders from artificial intelligence.
However, if AI is to continue expanding its capability boundaries, it will require not only a vast amount of data but also advanced chips capable of powering its computational demands.
From training colossal neural networks to running intricate image recognition, natural language processing, voice recognition, machine translation, and autonomous systems, AI chips are the brawn behind the AI brain.
As this field explodes, many companies known for making the more traditional central processing units (CPUs) have shifted their focus to specialized processors for AI workloads — while new ones are springing up.
Given the rush for these specialized processors, the market generated revenue of $15.9 billion in 2022 and is projected to reach $207 billion by 2030, according to MarketDigits.
Right now, Nvidia dominates the market with a 95% market share, while newcomers like SambaNova are showing rapid growth. As competition increases, expect constant innovation across the industry.”
As competition intensifies, we can expect more innovation and improvements in the AI sector, as well as more breakout startups from the wild. This progress has the potential to transform diverse industries and enhance the capabilities of AI applications. Innovations in AI chip design are essential for a wide array of applications, from data centers to automotive, and are shaping the future of computing.
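For what it’s worth, the market figures quoted above imply a compound annual growth rate that is easy to check (using the quoted $15.9 billion for 2022 and $207 billion for 2030):

```python
# Implied compound annual growth rate (CAGR) for the quoted AI-chip market figures.
start, end, years = 15.9, 207.0, 2030 - 2022
cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")   # roughly 38% per year
```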
For More Information
Please see my other posts on LinkedIn, Twitter, Substack, and CGE’s website.
AI Boogeyman
You can also find additional info in my hardcover and paperback books published on Amazon: “AI Boogeyman – Dispelling Fake News About Job Losses” and on our YouTube Studio channel…
A Radically Innovative Advisory Board Subscription Service