The AI Semiconductor Landscape Primer

To get my best coverage, consider becoming a paid subscriber for less than $2 a week. To read this post with infographics and proper formatting, please visit the original here.

This was written on January 20th, 2025. Happy Trump inauguration day! With the U.S. continuing a number of stringent export controls, and the next administration expected to keep them up, perhaps even with elevated tariffs, it’s a super interesting time to think more about the semiconductor industry. The AI arms race, and the national security due diligence tied to U.S. exceptionalism, is upon us.

A flurry of Executive Orders in the Biden Administration’s last two weeks in office was telling. Trump and the new administration will be carefully watched and their actions scrutinized. Meanwhile, I’ve long admired the work of Eric Flaningam for his macro overviews of various aspects of technology stacks. Let’s feature some of them here:


His newsletter, Generative Value, by Eric Flaningam, provides great insights into how everything is connected.

Articles to check out!


  1. A Deep Dive on Inference Semiconductors
  2. The Current State of AI Markets
  3. A Primer on Data Centers
  4. A Primer on AI Datacenters
  5. The Inference Landscape
  6. Nvidia: Past, Present and Future

Whether you are an investor, a technologist, or just a casual reader, his overviews are easy to understand and scan, and they will provide you value.

As you know, the Biden administration implemented a series of executive orders (EOs) and export controls (ECs) aimed at regulating the semiconductor industry, particularly in response to national security concerns and competition with China. Trump has said various things with regard to tariffs on China as well. The U.S. appears to be trying to control how AI spreads to other nations, limiting China’s ability to access, for example, Nvidia’s best AI-related GPUs and chips.

Jake Sullivan, with three days left as White House national security adviser and wide access to the world’s secrets, called on journalists and news media to deliver a chilling, “catastrophic” warning for America and the incoming administration:

The AI Arms Race circa 2025

What happens from this point on is fundamentally a new world of innovation, and of competition in innovation.

“The next few years will determine whether artificial intelligence leads to catastrophe — and whether China or America prevails in the AI arms race.”

  • According to Sullivan, as reported by Axios, “AI development sits outside of government and security clearances, and in the hands of private companies with the power of nation-states.”
  • U.S. failure to get this right, Sullivan warns, could be “dramatic, and dramatically negative — to include the democratization of extremely powerful and lethal weapons; massive disruption and dislocation of jobs; an avalanche of misinformation.” It wasn’t clear from his briefing whether OpenAI, Anthropic, Google and others can be expected to “get this right.” The U.S. believes it is the AI leader heading into the new year and the new administration.
  • Clearly, in 2025, corporations and the financial elite with the most say (majority shareholders) hold enormous power in the AI arms race ahead. The 2025 to 2035 period, an incredible decade of datacenters, semiconductors, and a sprawling new AI landscape, is arguably the most important decade of innovation human civilization has ever witnessed.
  • Geopolitics aside, the semiconductor industry is becoming far more important with the growth of datacenters and the emergence of new AI capabilities. I will be covering the semiconductor industry more closely in 2025, in this and related publications.

But how does it all work? What are the companies involved? Why are companies like Nvidia, TSMC, ASML and others so pivotal? What about the big picture and landscape?

The AI Semiconductor Landscape

By Eric Flaningam, December 2024.


Hi, my name’s Eric Flaningam. I’m the author of Generative Value, a technology-focused investment newsletter. My investment philosophy is centered on value. I believe that businesses are valued based on the value they provide to customers, the difference between that value and the value of competitors, and the ability to defend that value over time. I also believe that technology has created some of the best businesses in history and that finding those businesses will lead to strong returns over time. Generative Value is the pursuit of those businesses.

1. Introduction

Nvidia’s rise in the last 2 years will go down as one of the great case studies in technology.

Jensen envisioned accelerated computing back in 2006. As he described in a 2023 commencement speech: “In 2007, we announced [released] CUDA GPU accelerated computing. Our aspiration was for CUDA to become a programming model that boosts applications from scientific computing and physics simulations, to image processing. Creating a new computing model is incredibly hard and rarely done in history. The CPU computing model has been the standard for 60 years, since the IBM System 360.”

For the next 15 years, Nvidia executed on that vision.

With CUDA, they created an ecosystem of developers using GPUs for machine learning. With Mellanox, they became a (the?) leader in data center networking. They then integrated all of their hardware into servers to offer vertically integrated compute-in-a-box.

When the AI craze started, Nvidia was the best-positioned company in the world to take advantage of it: a monopoly on the picks and shovels of the AI gold rush.

That led to the rise of Nvidia as one of the most successful companies ever to exist.

With that rise came competition, including from its biggest customers. Tens of billions of dollars have flowed into the ecosystem to take a share of Nvidia’s dominance.

This article will be a deep dive into that ecosystem today and what it may look like moving forward. A glimpse at how we map out the ecosystem before we dive deeper:

  • To read the entire piece, consider supporting the Newsletter for less than $2 a week.



A mental model for the AI semiconductor value chain. The graphic is not exhaustive of companies and segments.

2. An Intro to AI Accelerators

At a ~very~ high level, all logic semiconductors have the following pieces:

  1. Computing Cores - run the actual computing calculations.
  2. Memory - stores data to be passed on to the computing cores.
  3. Cache - temporarily stores data that can quickly be retrieved.
  4. Control Unit - controls and manages the sequence of operations of other components.

Traditionally, CPUs are general-purpose processors. They’re designed to run any calculation, including complex multi-step processes. As shown below, they have more cache, more control units, and much smaller compute cores (Arithmetic Logic Units, or ALUs, in CPUs).

Source: https://cvw.cac.cornell.edu/gpu-architecture/gpu-characteristics/design

On the other hand, GPUs are designed for many small calculations or parallel processing. Initially, GPUs were designed for graphics processing, which needed many small calculations to be run simultaneously to load displays. This fundamental architecture translated well to AI workloads.
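To make that contrast concrete, here’s a rough sketch in Python with NumPy (my own illustration, running on a CPU, not actual GPU code): the explicit loop mirrors the one-element-at-a-time style a single core handles well, while the vectorized line expresses the same work as one bulk operation, the shape of computation a GPU’s thousands of cores run in parallel.

```python
import time
import numpy as np

n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

# CPU-style thinking: process one element at a time, in sequence.
start = time.perf_counter()
out_loop = np.empty(n)
for i in range(n):
    out_loop[i] = a[i] + b[i]
loop_time = time.perf_counter() - start

# GPU-style thinking: express the whole computation as one bulk,
# data-parallel operation over all elements at once.
start = time.perf_counter()
out_bulk = a + b
bulk_time = time.perf_counter() - start

print(f"sequential loop: {loop_time:.3f}s")
print(f"bulk operation:  {bulk_time:.4f}s")
```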

Why are GPUs so good for AI?

The base unit of most AI models is the neural network, a series of layers with nodes in each layer. These neural networks represent scenarios by weighting each node to most accurately represent the data they’re trained on.

Once the model is trained, new data can be given to the model, and it can predict what the output data should be (inference).

This “passing through of data” requires many, many small calculations in the form of matrix multiplications [(one layer, its nodes, and weights) times (another layer, its nodes, and weights)].

This matrix multiplication is a perfect application for GPUs and their parallel processing capabilities.
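A minimal sketch of that “passing through of data” in NumPy (the layer sizes are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# A batch of 32 inputs with 784 features each, flowing through two
# layers. Each layer's "pass" is one matrix multiplication.
batch = rng.standard_normal((32, 784))
w1 = rng.standard_normal((784, 256))   # weights between layer 1 and layer 2
w2 = rng.standard_normal((256, 10))    # weights between layer 2 and the output

hidden = np.maximum(batch @ w1, 0)     # matmul plus a ReLU nonlinearity
logits = hidden @ w2                   # the next layer: another matmul

print(logits.shape)  # (32, 10): every row computed by identical, independent math
```

Every one of those multiply-adds is independent of its neighbors, which is exactly the kind of work parallel hardware is built for.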

(Stephen Wolfram has a wonderful article about how ChatGPT works.)

The GPU today

GPUs continue to get larger, with more computing power and memory, and they are more specialized for matrix multiplication workloads.

Let’s look at Nvidia’s H100, for example. It consists of CUDA and Tensor cores (basic processors), processing clusters (collections of cores), and high-bandwidth memory. The H100’s goal is to process as many calculations as possible, with as much data flow as possible.

Source: https://resources.nvidia.com/en-us-tensor-core

The goal is not just chip performance but system performance. Outside of the chip, GPUs are connected to form computing clusters, servers are designed as integrated computers, and even the data center is designed at the systems level.
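One way to see the “calculations plus data flow” framing is a back-of-the-envelope roofline check: does a workload offer enough math per byte of memory traffic to keep the cores busy? The sketch below uses approximate public H100 figures (roughly 1,000 TFLOPS of dense FP16 tensor throughput and ~3.35 TB/s of HBM3 bandwidth; treat both peaks as ballpark assumptions):

```python
# Roofline-style estimate for one large FP16 matrix multiply.
M = N = K = 8192
flops = 2 * M * N * K                      # multiply-adds in an (MxK) @ (KxN) matmul
bytes_moved = 2 * (M * K + K * N + M * N)  # FP16 = 2 bytes per element, read + write

peak_flops = 1e15   # ~1,000 TFLOPS dense FP16 (approximate)
peak_bw = 3.35e12   # ~3.35 TB/s HBM3 bandwidth (approximate)

intensity = flops / bytes_moved  # FLOPs of work per byte this matmul offers
balance = peak_flops / peak_bw   # FLOPs per byte the chip needs to stay busy

print(f"arithmetic intensity: {intensity:,.0f} FLOPs/byte")
print(f"machine balance:      {balance:,.0f} FLOPs/byte")
print("compute-bound" if intensity > balance else "memory-bound")
```

Large matrix multiplies clear the machine-balance line comfortably; smaller or skinnier operations quickly become memory-bound, which is why the memory system is engineered as aggressively as the cores themselves.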

Training vs Inference

To understand the AI semiconductor landscape, we have to take a step back to look at AI architectures.

Training iterates through large datasets to create a model that represents a complex scenario, and inference provides new data to that model to make a prediction.

Source: https://www.dhirubhai.net/pulse/difference-between-deep-learning-training-inference-mark-robins-mdq8c/
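To make the split concrete, here’s a toy sketch in NumPy (my own illustration; the linear model and iteration counts are arbitrary). Training loops over the whole dataset many times while updating weights; inference is a single, cheap pass over new data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: inputs x and targets y generated by hidden "true" weights.
x = rng.standard_normal((1000, 8))
true_w = rng.standard_normal(8)
y = x @ true_w

# TRAINING: many iterations over the dataset, adjusting the weights.
w = np.zeros(8)
lr = 0.1
for _ in range(200):
    grad = x.T @ (x @ w - y) / len(x)  # least-squares gradient (up to a constant factor)
    w -= lr * grad

# INFERENCE: one forward pass on a single new input.
new_x = rng.standard_normal(8)
prediction = new_x @ w
print(prediction)
```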

A few key differences are particularly important with inference:

  1. Latency & Location Matter - Since inference runs workloads for end users, speed of response matters, meaning inference at the edge or in edge-cloud environments can make more sense; training, by contrast, can happen anywhere.
  2. Reliability Matters (A Little) Less - Training a leading-edge model can take months and requires massive training clusters. The interdependence of those clusters means a mistake in one part can slow down the entire training process. Inference workloads are much smaller and less interdependent; if a mistake occurs, only one request is affected, and it can be rerun quickly.
  3. Hardware Scalability Matters Less - One of Nvidia’s key advantages is its ability to scale to larger systems via its software and networking; with inference, this scalability matters less.

Combined, these reasons help explain why so many new semiconductor companies are focused on inference: the barrier to entry is lower.

Nvidia's networking and software allow it to scale to much larger, more performant, and more reliable training clusters.

On to the competitive landscape.

3. The AI Semiconductor Landscape

We can broadly look at the AI semiconductor landscape in three main buckets:

  1. Data Center Chips used for Training
  2. Data Center Chips used for Inference
  3. Edge Chips used for Inference

Visualizing some of those companies below:

Read the entire article here.

