AI in the Era of Tech Investing - The Big Picture
Michael Spencer
A.I. Writer, researcher and curator - full-time Newsletter publication manager.
This is a guest post by Eric Flaningam of the Newsletter Generative Value.
In partnership with Read Futurist
With so many newsletters already out there, I wanted to return to my roots and start a blog. This is a passion project in development (born August 20th, 2024).
Emerging tech and breaking news, on the pulse of the future.
Read Futurist is my new blog, where I will seek to cover breaking tech and innovation news in a minimalist, short-form way. It’s experimental, still taking shape, and a passion project starting from zero.
Only those who are crazy obsessed with the future of technology will benefit from it fully.
Where Value Accrues Across the AI Landscape
August 2024.
Introduction
I’ve spent the last 12-18 months studying opportunities in the AI value chain across both public and private markets. My goal is to understand where long-term value will be created from AI and which businesses can defend themselves over a 5+ year period in an incredibly competitive market.
Some background on Generative Value:
Generative Value started out as a mental exercise to study AI, and to determine where value would accrue along its value chain. I’ve come to the opinion that AI is not a new technology, but an evolution of computing. LLMs happen to be one application of that evolution. I think computers themselves are a form of AI, Generative AI is the current phase we’re in, and we’ll undoubtedly have AI innovations in the next two decades that will make LLMs look rudimentary.
Once I came to that conclusion, I decided to take a long-term approach to understanding technology and how to invest in it. Eight months ago, I decided to start writing industry primers with the goal of studying every major industry in tech.
Since then, I’ve published articles on semiconductors, data centers, the cloud, data, and cybersecurity. When Michael and I decided to team up on a guest post, it was a good opportunity to give an overview of technology investing. These primers happen to provide a strong foundation for thinking about where value accrues along the AI value chain.
I can summarize my takeaways on value accretion in AI here:
AI applications will ultimately determine the revenue created across the AI value chain. The primary questions in AI are these: “What problems is AI solving? How large are those problems? What infrastructure needs to be in place to support those applications?”
Thus far, the vast majority of AI revenue has gone to infrastructure. Candidly, most of that has gone to Nvidia. (OpenAI is at a $3.4B run rate; in comparison, Nvidia did $26B in revenue last quarter.) Even with talk of the looming AI data center buildout, these data centers take time to build, and we haven’t seen large-scale revenue flowing to storage and networking providers yet.
We’re seeing the start of those investments now as the hyperscalers ramp up Capex for data center spending:
However, as Satya Nadella noted on Microsoft’s last earnings call, about 50% of this spend is on land, leases, and construction. The other 50% will be customer demand-driven for GPUs, networking, storage, etc.
This article will be an overview of the tech investing landscape (at least what I’ve covered thus far on Generative Value…consider this an overview of Generative Value to date).
I haven’t broken down many AI application markets yet because it’s unclear to me what sustainable value creation looks like in these markets. Additionally, I suspect that a large percentage of value generation will come from cost savings (example here), and I’m still thinking through the implications of that reality.
So we’re at a point in technology markets where AI has been the focal point of attention for over a year. Nvidia has captured a huge amount of revenue from the buildout of infrastructure for AI applications. Now, we’re waiting to see what the value of those applications will look like.
The markets I’ll be covering in this article are as follows:
My goal for this article is to break down the tech investment landscape and provide a summary of the articles I’ve published thus far on the intersection of technology and investing.
1. Semiconductors
The basis of modern technology is compute power, the ability to make computations on data to automate tasks. Semiconductors are at the heart of that compute power.
This industry will do well as long as we’re in the semiconductor age. I think it’s fair to say it is the most important industry in the world, and it will likely soon be the largest.
The industry can be visualized here:
The industry can broadly be broken down into design and manufacturing. Prior to the late 1980s, when TSMC was founded, most semiconductor companies did both. These were (and still are) called integrated device manufacturers. The most famous example is Intel; others include Samsung, Texas Instruments, Analog Devices, Micron, and SK Hynix.
Outside of the integrated device manufacturers, design firms operate on a fabless model. Nvidia and AMD design their chips using EDA software from Synopsys and Cadence, then hand those designs off to a foundry like TSMC for manufacturing. Those foundries buy hyper-specialized equipment (semiconductor capital equipment) to manufacture the chips.
The output is one of five main types of chips: CPUs, GPUs, memory, analog, or application-specific integrated circuits (ASICs).
In future articles, I’ll break down EDA software and some of the large fabless chip companies. My first industry breakdown was of the semiconductor capital equipment industry.
The semiconductor capital equipment companies provide the equipment that manufactures semiconductors. The most well-known example is ASML’s lithography machines. Other segments include deposition, etch, process control, testing, and packaging. Those steps, summarized, go like this:
The wafer will then be cleaned and polished before receiving another layer of materials. For complex chips, this will be done 40 to 100 times. After the wafer is finished, it will be sliced into dies and packaged into semiconductors. Throughout the entire process, machines from KLA will inspect the wafers for defects.
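The repeated layering described above can be sketched as a simple loop. This is a toy model for intuition only, not a process-accurate description of fab steps; the step names are illustrative:

```python
def build_wafer(num_layers=40):
    """Toy model of wafer processing: each cycle deposits material,
    patterns it (lithography), etches it, then cleans/polishes the
    wafer before the next layer. Complex chips repeat this 40-100 times."""
    steps = []
    for layer in range(num_layers):
        for step in ("deposit", "pattern", "etch", "clean_polish"):
            steps.append((layer, step))
    return steps

log = build_wafer(num_layers=3)
print(len(log))  # 12 steps for 3 layers (4 steps per layer)
```

Even at a handful of steps per layer, 100 layers implies hundreds of machine passes per wafer, which is why the equipment vendors sit at such a critical chokepoint.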
Applied Materials has the broadest equipment offering on the market and can be considered a “do-it-all” provider. Tokyo Electron has a similar strategy, though TEL and AMAT each lead in specific verticals. In 2013, the two companies tried to merge, which would have created an incredibly dominant semicap company; however, the DoJ shot the deal down over anticompetitive concerns.
Lam Research is the last of the big 5 wafer fab equipment companies and is the market leader in etch tooling.
The semicap industry has some of the deepest competitive advantages in technology. At the leading edge of semiconductors, each machining vertical is a monopoly or oligopoly. The technology is so complex that an increasingly small number of firms have been able to deliver the machines necessary to manufacture chips.
This has led to these firms showing strong gross margins, returns on capital, and cash returns to shareholders. Add to that the long-term exposure to the growth of semiconductors, and the industry has one of the more compelling structures in the markets.
2. Data Centers
Those semiconductors will be housed in one of two places: on the edge (phones, devices, cars, IoT devices) or in data centers. The data center industry is facing a looming buildout with hundreds of billions of dollars of incoming investment.
We can visualize what that value chain looks like here:
Data centers can be broken down into four main segments:
Compute refers to the GPUs and CPUs that run the processing in data centers. For AI, they run the training and the inference workloads. They’re the highest area of value added in the data center, and it’s why Nvidia’s chips are in such high demand. This also includes AI Accelerators like Google TPUs, Amazon’s Trainium, and Microsoft’s Maia.
Networking components like switches, interconnects, and routers connect the semiconductors with storage and ultimately deliver the compute power outside of the data center. Two primary technologies lead in networking equipment: Ethernet and InfiniBand. Cisco and Arista lead in Ethernet, and Nvidia is by far the market leader in InfiniBand networking. InfiniBand is the common networking standard used in AI workloads, though that’s also influenced by Nvidia’s dominant position in GPUs.
Storage, as implied, stores data both for long-term archiving and for short-term retrieval like AI workloads. Market leaders include Dell, HPE, NetApp, Pure Storage, and recently Vast Data (valued at $9B this year).
Finally, supporting elements like energy, power management, cooling, server manufacturing, and data center operations are essential for AI to function properly.
Increasingly, this foundational element of energy is becoming the bottleneck for data centers:
Mark Zuckerberg discussed this on the Dwarkesh podcast:
“There is a capital question of at what point it stops being worth it to put the capital in…But I actually think that, before we run into that, we're going to run into energy constraints…I think we would probably build out bigger clusters than we currently can if we could get the energy to do it.”
The necessary electrical infrastructure buildout is years away, and a problem without an easy solution.
3. The Cloud
For fans of Clay Christensen’s work, the cloud provides the best example of the Innovator’s Dilemma I’ve seen. It fundamentally changed the way we interact with computing power, significantly lowering the barrier of entry to computing in the process.
Over the last twenty years, the cloud has become the primary means of delivering software. This includes AI.
We can visualize what that value chain looks like here:
The hyperscalers have leveraged their economies of scale to dominate the industry. They act as the middleman for today’s compute power…offering infrastructure, platforms, and applications for customers.
Cloud software like Snowflake, Databricks, and the hundreds of other SaaS products are then built on top of that cloud infrastructure. One of the interesting competitive dynamics in software is the competition between hyperscalers and their customers. The hyperscalers fundamentally have a lower cost structure, meaning cloud software vendors must provide a significantly better service to overcome the hyperscalers’ cost advantages and the benefits of their integrated platforms.
The hyperscalers are some of the best and largest businesses to ever exist, and that doesn’t look likely to change in the near future.
I consider semiconductors, data centers, and the cloud as “compute infrastructure.” These form the modern backbone for most computing needs. Everything built on top of that is what I consider applications, or mostly software-driven computing.
4. Cybersecurity
Cybersecurity continues to be one of the most important industries in the world. It’s the insurance of the technology world and is essential to quite literally every piece of technology we use. The risk of threats is so high that top-tier security solutions can command a premium price (both for their services and stock valuations).
We can visualize its value chain here:
I think about cybersecurity in three segments:
The edge, as I’m calling it, refers to the perimeter of an organization’s assets. This includes the users, the devices they use, and the technologies that determine if/how users can access the network. Identity & Access Management (IAM) and Endpoint Security are the two major segments in the edge category.
Traditionally, cybersecurity was a castle and moat architecture - i.e. keep the bad guys out of the castle and you’ll be fine. However, as technologies have become more complex and distributed, network security has expanded as well. At its core, networking is the way to transfer data between devices. The “network” in the context of cybersecurity is a company’s network of connected devices. This includes data centers, cloud environments, applications, offices, and data.
I’m using the term security operations loosely here; I’m including processes that go on throughout the security lifecycle. Generally, these are process tools aimed at the prevention, detection, or response to security incidents. These technologies include managed services, monitoring, governance, security information and event management (SIEM), and security orchestration, automation, and response (SOAR).
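To make the edge checks above concrete, here is a minimal sketch of an IAM-style authorization decision combined with an endpoint posture check. The policy, role names, and device attributes are all hypothetical, invented for illustration, not any vendor’s API:

```python
# Hypothetical role-to-permission policy (IAM): which actions each role may take.
POLICY = {
    "engineer": {"read", "deploy"},
    "analyst": {"read"},
}

def is_device_compliant(device):
    # Endpoint security posture check: e.g., require disk encryption
    # and an up-to-date OS before granting network access.
    return bool(device.get("encrypted")) and bool(device.get("patched"))

def authorize(user_role, action, device):
    # Access is granted only if the role permits the action AND the
    # device passes the posture check — both halves of "the edge."
    return action in POLICY.get(user_role, set()) and is_device_compliant(device)

laptop = {"encrypted": True, "patched": True}
print(authorize("analyst", "read", laptop))    # True
print(authorize("analyst", "deploy", laptop))  # False
```

The point of the sketch is the AND: identity controls and endpoint controls are separate products in the market, but a request must clear both before it touches the network.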
As AI continues to develop, cybersecurity solutions will have to innovate to meet the unique threats it creates. The history of cybersecurity has been one of keeping up with the “bad guys” to ensure consumers can safely use technology. I don’t see a world where that role changes.
5. Data
Finally, most modern software is built on data platforms. Software is essentially a database, a front end, and a unique application/means of actioning that data.
AI is no different, although agents have the potential to change the interaction layer with software. High-quality data is a prerequisite for training models, and enterprise data management is a prerequisite for customizing models for specific use cases.
The overall data landscape can be divided into transactional and analytical processes. Transactional data systems sit behind applications and store/interact with data rapidly, while analytical systems store large amounts of data to be analyzed.
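The transactional/analytical split can be illustrated with Python’s built-in sqlite3 and a hypothetical orders table (my example, not from any particular product): the transactional workload writes and reads individual rows behind an application, while the analytical workload scans the table to aggregate it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")

# Transactional (OLTP): fast single-row writes and point lookups,
# the pattern an application performs thousands of times per second.
conn.execute("INSERT INTO orders VALUES (1, 'acme', 120.0)")
conn.execute("INSERT INTO orders VALUES (2, 'globex', 80.0)")
conn.execute("INSERT INTO orders VALUES (3, 'acme', 50.0)")
row = conn.execute("SELECT amount FROM orders WHERE id = 2").fetchone()

# Analytical (OLAP): scan many rows at once to aggregate for insight,
# the pattern a warehouse or lakehouse is optimized for.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

print(row)     # (80.0,)
print(totals)  # [('acme', 170.0), ('globex', 80.0)]
```

Real systems specialize: a transactional database optimizes for the first pattern, an analytical system for the second, which is why most companies run both.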
We can break down the data landscape into 5 main segments:
Databases are the backbone of transactional data systems. There are hundreds of databases on the market and several multi-billion-dollar database companies (Oracle, MongoDB, and Cockroach Labs, each at a different stage of its lifecycle). Additionally, the hyperscalers all have multi-billion-dollar database product lines.
Broadly, the market can be segmented along two axes: SQL vs. NoSQL and open-source vs. closed-source. When a company decides which database to use for a specific application, it will ask what the data looks like and what it needs to do with that data. The answers to those questions will determine the decision.
Analytical systems enable companies to gain insights from their data. They aim to centralize a company’s data, analyze it, and run security/governance checks on it.
Two leading architectures exist for data analytics: the data warehouse and the data lakehouse. The data lakehouse disaggregates the warehouse into individual components, allowing companies to piece together their preferred technologies (open-source or closed-source).
The data warehouse can be broken down into three segments:
Data is stored on the backend on storage hardware in data centers, typically accessed through the cloud. For analytical systems, it’s common to store data lakes in S3 buckets and data warehouses in SQL databases on the backend. An emerging trend in storage is open table formats like Iceberg, which bring table structure to data sitting in a lake.
A much larger share of revenue comes from compute, or query processing. When a user runs a SQL query, a query engine processes the command on the backend, pulls the data, and returns the results to the user.
Finally, services like data catalogs, data observability, data security, access control, data lineage, and data governance round out data warehouse offerings.
The core value of data warehouses is the ability to centralize a company’s data analytics operations on one platform.
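The storage/compute separation behind the warehouse can be sketched in a few lines: raw bytes sit in a “storage” layer (here a CSV string standing in for, say, an S3 object), and a toy query engine supplies the “compute,” scanning and aggregating on demand. The data and names are illustrative, not drawn from any real system:

```python
import csv
import io
from collections import defaultdict

# "Storage": raw rows in a file-like object, cheap and dumb —
# a stand-in for objects sitting in an S3 bucket.
RAW = "region,revenue\nus,100\nus,250\neu,300\n"

def run_query(storage):
    """A toy query engine: scan the stored data, group it, aggregate it,
    and return results — the 'compute' half of a warehouse, priced and
    scaled separately from where the bytes live."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(storage)):
        totals[row["region"]] += float(row["revenue"])
    return dict(totals)

print(run_query(RAW))  # {'us': 350.0, 'eu': 300.0}
```

Because storage and compute scale independently, most warehouse revenue follows the expensive half: every query burns compute, while the stored bytes mostly just sit there.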
The lakehouse, on the other hand, offers the flexibility to customize a company’s data offerings.
The lakehouse is made up of the same components as the data warehouse: storage, compute, and services. However, companies can choose which tools to implement in each category. So, the key point of the lakehouse is that it’s an architecture, not a product. A company like Databricks offers the ability to integrate those various open-source tools onto one platform.
The most important trend in data is not the data warehouse or the data lakehouse; it’s the consolidation of platforms. Over the last decade, we’ve seen the rise of “the modern data stack.” Companies have come to realize that they prefer to manage a few tools instead of twenty, so data tools are converging onto platforms from Snowflake, Databricks, or the hyperscalers.
When I think about investing across the data landscape, I come back to the question of “which companies solve a big enough problem to justify purchases outside of the major platforms?”
As consolidation continues, that’s not an easy question to answer.
Summarizing the technology landscape:
At one end of the technology landscape, we have applications solving tangible problems in our lives. At the other end we have semiconductors storing and computing data. Then, we have trillions of dollars worth of value created in the technologies connecting those two things.
If there’s one opinion I have on the coming years in tech investing, it’s that the companies solving tangible problems at scale will continue to do well. Put another way, companies will be valued based on the value they generate for their customers. AI has the potential to enable much of that value creation, and those are the opportunities I continue to be excited to invest in.
As always, thanks for reading!