got GPUs?

As investors trip over themselves to tweet (er, X?) the hottest post-game take on the ZIRP era, the post-ZIRP era, the death of crypto, and everything in between, the greatest get-rich-quick scheme is unfolding right under their Santal-filled noses. GPU cloud upstarts like CoreWeave and Lambda Labs are uniquely positioned to take on the hyperscaler incumbents (AWS, GCP, and MSFT) as stratospheric demand for GPUs overwhelms supply. Use cases range from ChatGPT to gaming to research to rendering whatever the hell this is:

Courtesy of Midjourney (or however we should cite GenAI renderings).

Consumers and enterprises alike are pushing GPU markets to their limits. The (investment) equation (for CoreWeave and Lambda et al.) becomes increasingly attractive when you consider that every $1 these GPU upstarts spend returns $1 per year for 5 years. (Quick math: that’s a 5x return.) “Meow does this work?” asks the rainbow-farting kitten. Here is some context….

The cloud: between 2014 and 2023 (i.e., the years that encompassed my tenure at Amazon), AWS revenue went from about $4B to $100B (run rate). And IT spend attributed to cloud went from <10% of total to, I dunno, something much, much bigger. (Feel free to chime in with the 2023 apples-to-apples figure - if you act now I’ll send you a “From the Porch” t-shirt!) Why? Because it’s faster, cheaper, and easier to deploy and scale via cloud rather than running your own infrastructure, and also because the compute power and features are far superior when using hyperscalers (like AWS, GCP, and MSFT). A small 1,000 square foot data center might contain almost 1,000 CPUs consuming ~11 GWh of energy per year. And hyperscaler data centers can be hundreds of thousands of square feet with hundreds of thousands of CPUs doing cloud compute.

The chips: CPUs from companies like Intel and AMD have powered, and continue to power, the largest proportion of data center servers (as well as PCs, etc.). As per above, large data centers contain hundreds of thousands of CPU-based servers. CPUs are great at general-purpose tasks like running operating systems and applications, and they’re pretty good at math too.

By contrast, GPUs, introduced in the 1990s and popularized by companies like Nvidia (and ATI, which was acquired by AMD), are great at parallel processing; i.e., breaking down tasks into smaller operations executed simultaneously. This makes them particularly well-suited for graphics rendering (thus the “G” in GPU) and, consequently, gaming. In fact, gaming (and other rendering) was the predominant market for GPUs for years until Nvidia introduced CUDA in 2007, which let developers access GPU resources for tasks other than graphics rendering. Turns out, these GPUs are pretty awesome for blockchain computation, machine learning, and artificial intelligence workloads. They still require a CPU to operate; i.e., every GPU is attached to a CPU (but not vice versa). I peg the “attach rate” at about 10% today - roughly 40M of the 400M CPUs shipped per year are connected to a GPU - and I suspect that rate will go up (a lot).

The providers: AWS, GCP, and MSFT (the hyperscalers) control well over 60% of the cloud computing market, which is currently dominated by CPU-based server technology. Given the sudden, insatiable demand for higher performance compute for things like AI, the hyperscalers have been making GPU server instances available in their existing data centers. Given GPU (chip) pricing and scarcity, GPU cloud pricing is much higher than CPU cloud pricing. For example, CPU pricing is well under $1/hour (and as low as ~$0.05/hour) while GPU pricing can average around ~$5/hour (or more, depending upon the specific instance/chip type).

The upstarts: GPU upstarts like CoreWeave and Lambda offer (mostly only) GPU instances from their (mostly only) GPU cloud infrastructure, which, as luck (and math and physics) would have it, allows them to offer (crazy) competitively priced alternatives to hyperscaler GPU instances. Turns out that a data center designed specifically for GPU cloud is much more efficient at running AI workloads than one originally designed for predominantly CPU-based workloads. Case in point: according to Nvidia, a pure CPU data center can train only 1 model for $10M, versus 44 models for the same price (and actually less power consumption) in a pure GPU data center. So when you notice that Nvidia H100 instances from CoreWeave and Lambda are ~$5/hour or lower (i.e., ~50% below GPU instances from the hyperscalers), that’s (part of) why.

$how me the money….

CoreWeave and Lambda Labs were originally built for cryptomining and for selling powerful gaming and research PCs, respectively, but both companies pivoted on a dime to the more lucrative GPU cloud infrastructure market. In 2022, neither had any material revenue (if any) from GPU cloud, but Lambda was said to have exited 2023 with $150M in run rate GPU cloud revenue, and CoreWeave is rumored to have exited 2023 with over $500M in revenue and another $2B locked in for 2024. They're both serious players now in the most important, fastest growing, highest margin part of the cloud computing market. (Respect.)

The hyperscalers continue to invest billions of dollars per year in cloud infrastructure and, specifically, to add GPU products to their existing infrastructure. But the pure GPU cloud upstarts, with their purpose-built GPU cloud infrastructure and ostensibly more efficient data center design, have been able to build war chests as well. Lambda was reported to be raising a $300M round at a valuation of $1B or more. CoreWeave not only raised over $400M in equity (including a slug of money from Nvidia itself), but also raised $2B+ in debt to finance its build-out. That $2B+ was collateralized by those scarce, supercomputer-grade GPUs as well as by guaranteed contracts from companies like Microsoft to purchase GPU cloud service. Meanwhile, DigitalOcean shelled out $111M in cash to acquire Paperspace; and a slew of other companies are popping up with analogous plans (e.g., Crusoe Energy, Arkon Energy, etc.). The market is ripe for upheaval.

Are these upstarts and their investors crazy? Maybe. But they also seem to be pretty good at economics. Here’s why….

An Nvidia H100 GPU costs about $30K. Add another ~15% on top for service, maintenance, additional infrastructure, etc. You’ll need 8 of those GPUs in a server box, so a single box will cost you ~$276K.
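Here’s that math as a quick Python sketch (using the $30K price and ~15% overhead figures above; the exact overhead composition is obviously a rough assumption):

```python
GPU_PRICE = 30_000   # ~cost of one Nvidia H100
OVERHEAD = 0.15      # service, maintenance, additional infrastructure (rough assumption)
GPUS_PER_BOX = 8     # H100s per server box

box_cost = GPUS_PER_BOX * GPU_PRICE * (1 + OVERHEAD)
print(f"Cost per 8-GPU box: ${box_cost:,.0f}")  # -> $276,000
```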

Not cheap. But… that box you bought (for ~$276K) will generate revenue of ~$262,800/year for five years:

This assumes the “sticker price” of an H100 instance gets discounted 25% to ~$5/hour (for an on-demand instance) and that you run that instance for 730 hours per month at 75% utilization. TLDR: you break even in year one; everything on top of that is gravy (or barbecue sauce, Ted). So if you can generate revenue from that box for five years, you’ll gross >$1M on your ~$276K investment.
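Running the revenue math with those same (hedged) assumptions:

```python
PRICE_PER_GPU_HOUR = 5.00  # ~sticker price discounted 25% (assumption from above)
HOURS_PER_MONTH = 730
UTILIZATION = 0.75
GPUS_PER_BOX = 8
YEARS = 5

annual_revenue = GPUS_PER_BOX * PRICE_PER_GPU_HOUR * HOURS_PER_MONTH * 12 * UTILIZATION
print(f"Annual revenue per box: ${annual_revenue:,.0f}")          # -> $262,800
print(f"5-year revenue per box: ${annual_revenue * YEARS:,.0f}")  # -> $1,314,000
```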

You don’t need to shell out 100% of that ~$276K box cost upfront; instead, lease it over that five-year term and you’re free cash flow positive almost from day one; e.g., at a 10% lease rate over that period, you’ll generate >$170K in free cash flow in year one (and again in years 2, 3, 4, and 5), so still near $1M in free cash flow over the term of the lease.
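Here’s a sanity check, assuming that 10% means a standard five-year lease amortized at a 10% annual rate:

```python
def annual_lease_payment(principal: float, rate: float, years: int) -> float:
    """Level annual payment that amortizes principal at rate over years."""
    return principal * rate / (1 - (1 + rate) ** -years)

BOX_COST = 276_000
ANNUAL_REVENUE = 262_800  # per-box revenue from the sketch above

payment = annual_lease_payment(BOX_COST, 0.10, 5)
print(f"Annual lease payment: ${payment:,.0f}")                    # ~$72,800
print(f"Annual free cash flow: ${ANNUAL_REVENUE - payment:,.0f}")  # ~$190,000 before opex
```

(The ~$190K here is before any operating costs - power, colo, people - which is presumably how you land at the >$170K figure.)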

You can reinvest the free cash flow generated each year from that server into building out more infrastructure, so revenue and free cash flow start to get quite big. Now go ahead and scale up that single-box investment to $100M or $400M or over $2B. The numbers are… compelling; e.g., below is the model for a $100M investment….
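As a naive, linear extrapolation of the single-box math (ignoring volume discounts, networking, staffing, and everything else a real build-out entails):

```python
INVESTMENT = 100_000_000
BOX_COST = 276_000
ANNUAL_REVENUE_PER_BOX = 262_800

boxes = INVESTMENT // BOX_COST
print(f"Boxes: {boxes}")                                          # -> 362
print(f"Annual revenue: ${boxes * ANNUAL_REVENUE_PER_BOX:,.0f}")  # -> ~$95M
```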

Are these numbers right? Probably not. But they’re directionally accurate. After all, a capacity glut is likely coming, which clearly would put a lot of pressure on pricing. But we can stress test the model a bit. Here’s how revenue changes (on the $100M investment) depending upon price per hour and utilization rate….

And here’s what free cash flow looks like….
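A sketch that generates that grid (per-box assumptions carried over from above; revenue is just these numbers before subtracting the lease, and opex beyond the lease isn’t modeled):

```python
GPUS_PER_BOX = 8
HOURS_PER_YEAR = 730 * 12  # 8,760
BOXES = 362                # ~$100M / $276K per box
LEASE_PER_BOX = 72_808     # 10%/5-year amortized lease, from the earlier sketch

utilizations = (0.50, 0.60, 0.75, 0.90)
print(" $/GPU-hr | " + " | ".join(f"{u:>6.0%}" for u in utilizations))
for price in (2.0, 3.0, 4.0, 5.0):
    cells = []
    for u in utilizations:
        revenue = BOXES * GPUS_PER_BOX * price * HOURS_PER_YEAR * u
        fcf = revenue - BOXES * LEASE_PER_BOX  # annual free cash flow
        cells.append(f"{fcf / 1e6:>5.0f}M")
    print(f"{price:>9.2f} | " + " | ".join(cells))
```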

You’ll notice that even at $2/hour, which is around what Lambda charges at the low end and also what CoreWeave is rumored to be getting from its Microsoft commitment, you’re profitable at just over 60% utilization; and rates this low are typically for reserved capacity, which, by definition, runs at 100% (revenue) utilization.

There are also reasons to believe pricing could hold up given the insatiable demand for GPU compute, especially if you subscribe to the thesis (like I do) that LLMs (and analogous evolved models) increasingly become the core compute engines of modern applications. This doesn’t mean CPUs are going away. After all, every GPU requires a CPU to operate, and right now roughly 10% of the 400M CPUs sold per year are attached to GPUs. But that 10% number will go up, and CPU pricing will come down even more quickly than GPU pricing as it commoditizes (even) more. So we’re likely to see a time when dollars spent on GPU compute exceed dollars spent on CPU compute even though the shipping volume will show the opposite. (Fwiw, both the global microprocessor and DRAM markets are ~$100B per year, and there was a time when MPUs were a niche market….) But even if and when GPU pricing decreases, history and my “LLMs everywhere!” thesis above suggest that volume (and margin) will continue to support top line and free cash flow growth for these businesses.

So… who’s got GPUs?

The field is about to turn into an ass-to-ankles gladiator arena. Aside from incumbents that offer GPUs (i.e., the hyperscalers and our new GPU cloud upstart friends), other cryptominers are following CoreWeave’s lead and offering GPU capacity to willing clients. And so are energy companies building high-density, energy- and heat-efficient data centers. But what’s stopping (cloud) gaming providers, media and media-related companies (e.g., graphics and video editing, film and video studios, animation studios), telecom companies, researchers (scientific, healthcare, and others), and, of course, adult entertainment companies (who were, as usual, early adopters of new technology and maintain high-performance infrastructure to support video rendering, streaming, and virtual reality workloads) from doing the same? The list goes on. After all, selling access to infrastructure to other Internet companies worked out pretty well for one online bookstore I know (and by whom I was once employed). Will the next hot AI infrastructure company pop up from one of these spaces? If you’ve got GPUs (or capital)… let’s talk.

