Emerging Cloud Infrastructure Trends: Part 2

Key insights from industry leaders shaping the future of Edge Computing, AI Hardware, Databases & Data Center Energy Usage.

Thank you for your expert insights! Chris Miller (Author of Chip War), David Stout (CEO of webAI), Evan Caron (Co-Founder of Montauk Climate), Joselyn Lai (CEO of Bedrock Energy), Kelly Toole (Product at ClickHouse), Krishna Rangasayee (CEO of SiMa.ai), Kurt Busch (CEO of Syntiant Corp.), Mitchel Tsing (Founder of InchFab), Philip Krim (Co-Founder of Montauk Climate), Philip Rathle (CTO at Neo4j), PJ Jamkhandi (VP of Finance at d-Matrix), Reiner Pope (CEO of MatX), and Tyler Mauer (CTO of webAI).

Thank you Tejas Pruthi for your help researching this!

Introduction

Following our earlier exploration of cloud infrastructure, databases, and artificial intelligence (linked here), we're excited to present Part 2 of our Cloud Infrastructure research.

During our research, we stumbled upon a fascinating term: nephology - the scientific study of clouds. Traditionally reserved for meteorologists studying atmospheric phenomena, we believe a new form of nephology has emerged. Call it Nephology 2.0: the systematic study of how digital clouds are reshaping our technological landscape.

This research represents our journey into four critical areas transforming cloud infrastructure:

  • Edge AI: The shift of computation from centralized clouds to distributed "sensing" networks
  • AI Hardware: The emergence of specialized chips challenging traditional GPU dominance
  • Databases: The evolution of data storage systems in an AI-first world
  • Data Center Energy: The quest for sustainable power in an era of exponential AI growth

It happened by accident, but this research ventures slightly beyond pure software infrastructure and reveals a crucial intersection: where hardware meets software in building tomorrow's cloud. Through extensive market analysis and direct conversations with industry leaders, we've uncovered several key trends shaping 2024 and beyond.

Unlike some of my other posts, this research is more about learning in public. I hope you learn with me. By exploring these areas, we hope to provide some valuable insights for investors, technologists, and industry leaders alike.

In the following sections, we'll unpack our findings on each of these critical categories, highlighting key trends, challenges, and potential opportunities that are shaping the next wave of cloud infrastructure innovation in 2024 and beyond.

Key Trends and Takeaways for 2024 (and beyond)

  1. The Decentralization of AI - From Cloud to Edge: The cloud infrastructure landscape is experiencing a fundamental shift - toward decentralized AI computing... No, not web3. Today's centralized cloud infrastructure, while efficient for early generative AI tasks, cannot meet the demanding requirements of next-generation AI applications. Edge AI emerges as the solution, bringing scalable AI systems closer to where they're needed through smaller, specialized models distributed across computing nodes.
  2. The Rise of Specialized AI Hardware: The monopoly of general-purpose GPUs faces a new challenge. A wave of innovative companies is developing chips optimized for specific AI workloads, particularly inference tasks. This specialization isn't just about performance - it's about fundamentally rethinking how AI hardware scales and operates.
  3. The Database Revolution: In our research, one question kept emerging: which databases will thrive in the AI era? The answer reveals a shifting landscape. Vector databases are taking center stage, offering efficient handling of high-dimensional data crucial for AI applications. Meanwhile, graph databases have carved out their own critical niche, excelling at managing the complex relationships and interconnected data that power AI-driven analytics and knowledge graphs. But the evolution goes deeper than new database types. We're witnessing the rise of hybrid solutions that blend traditional database functionality with AI-specific features.
  4. Sustainable Power for AI: As AI workloads grow exponentially, a critical question emerges: how do we power this new wave of computing? The answer might lie beneath our feet. Geothermal energy, particularly geo-exchange technology, is emerging as a viable option for data centers, offering consistency and scalability where cooling demands are high. But that's not the only solution gaining traction. Nuclear energy, including Small Modular Reactors (SMRs), is re-emerging in the conversation around data center power.

Trend 1: Edge Computing / Edge AI

Two years ago, I had lunch with Chris Miller, author of the #1 NY Times bestseller "Chip War." (link to book here). His prescient insights about the semiconductor industry and the growing need for GPUs were striking. As our meal concluded, I asked him, "What's next?" Without hesitation, he replied, "Edge."

What is Edge AI?

Unlike traditional cloud-based AI, Edge AI brings computation directly to our devices. Krishna Rangasayee, CEO of SiMa.ai (a Machine Learning System-on-Chip (MLSoC) company), frames the shift clearly: "The first wave of generative AI happened in the cloud, in mostly the form of consumer-like experiences. The second—and more meaningful—will happen at the edge; We know the cloud can't handle the scale, energy efficiency and performance required to succeed in supporting the mission critical work happening at the edge."

Chris Miller elaborates on the core benefits: "The benefit of computing on the edge is that you get privacy benefits, latency benefits, and efficiency benefits because you don't need to send data to a data center. A lot of the capabilities of LLMs that we initially thought would require very big models can now be shrunk down and possibly computed on your phone."

This shift allows everything from your thermostat to your smartwatch to run AI models on-device, with significant advantages in privacy, latency, and efficiency.

Edge AI vs. IoT: A Key Distinction

While Edge AI and IoT (Internet of Things) occupy similar spaces, they differ in a crucial way: typical IoT devices collect data and ship it to the cloud for processing, whereas Edge AI aims to remove reliance on internet connectivity by running AI models locally—critical for applications requiring split-second decisions.

Kurt Busch, CEO of Syntiant (a leading AI edge chip company backed by Microsoft's M12, Intel Capital and Amazon's Alexa Fund), sees Edge AI as "the new interface between the real world and the digital realm." His company focuses on "battery-powered devices, aiming to run continuous AI on 12-volt batteries."

This approach opens new possibilities for AI integration in everyday objects. Krishna stated: "AI/ML at the edge must be deployed with a versatile platform that supports a variety of modalities across all use cases, ranging from computer vision, to transformers to multimodal generative AI, in preparation for generative AI's move to the edge."

The possibilities extend from smart home devices to wearable technology. Imagine a world where your fitness tracker not only monitors your vitals but also provides real-time health insights and personalized workout suggestions, all without needing to connect to the cloud.

Perhaps the most compelling use case for Edge AI is in self-driving cars. Chris Miller elaborates: "Autonomous driving might be one of the clearest examples where edge computing shows its strengths—real efficiency gains can be made with chips specifically designed for self-driving applications."

The ability to process vast amounts of sensor data locally, with minimal latency, is critical for the safety and performance of autonomous vehicles. Edge AI enables split-second decision-making that could mean the difference between avoiding an accident and causing one.
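To make the latency point concrete, here is a back-of-envelope sketch showing how far a vehicle travels while waiting on a cloud round trip versus on-device inference. The speed and latency figures are illustrative assumptions, not measurements.

```python
# Back-of-envelope: distance a vehicle covers while waiting for an inference result.
# The latency figures below are illustrative assumptions, not benchmarks.

speed_kmh = 100.0                          # highway speed
speed_m_per_s = speed_kmh * 1000 / 3600    # ~27.8 metres per second

latencies_ms = {
    "on-device (edge) inference": 10,     # assumed local inference budget
    "cloud round trip + inference": 150,  # assumed network RTT + queueing + inference
}

for scenario, latency_ms in latencies_ms.items():
    distance_m = speed_m_per_s * (latency_ms / 1000)
    print(f"{scenario:32s} {latency_ms:4d} ms  ->  {distance_m:5.1f} m travelled")
```

Even under these rough assumptions, the gap is several metres of travel per decision, which is why local processing matters for safety-critical systems.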

Democratizing Silicon: Silicon as a Service

Mitchel Tsing, founder of Inchfab (a boutique silicon foundry), is working to democratize access to advanced manufacturing capabilities. "We're not trying to be the next TSMC; they are masters of scale. What we offer is agility and accessibility," Tsing explains. "TSMC is optimized for the Apples and Qualcomms of the world. We're optimized for the next big thing, whether that's a startup or a new tech spin-out."

This "Silicon as a Service" model makes cutting-edge chip manufacturing accessible to a broader range of companies, potentially accelerating innovation in the Edge AI space.

The Evolution of Local AI

The concept of Edge AI is expanding to include a broader definition of local AI, encompassing not just individual devices but also on-premise inference clusters.

David Stout and Tyler Mauer, CEO and CTO of WebAI (a local AI company), explain: "Local AI includes Edge in the sense that if your phone or your device is able to run the AI, you do run it locally. Then secondly, if you can't run it on your device specifically, you run it on a separate device that you control and own that is local."

This broader definition of local AI opens up new possibilities for businesses and organizations. It allows them to leverage existing hardware investments while maintaining control over their data and AI processes, addressing concerns about data privacy and sovereignty that have become increasingly important in our digital age.

Tyler from WebAI introduces an intriguing concept: From Monolithic to Mosaic. This phrase describes the shift from large, centralized AI models to more modular, distributed models that work together like pieces of a mosaic. It suggests a future where AI is more adaptable and scalable, leveraging smaller, specialized models instead of relying solely on massive, monolithic ones.
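To illustrate the "monolithic to mosaic" idea, here is a minimal sketch of a router that dispatches requests to small specialized models instead of one large general model. The model names and routing rule are hypothetical illustrations, not webAI's implementation.

```python
# A toy "mosaic" of small specialized models behind a simple router.
# Names and routing logic are hypothetical; this is not a vendor API.

def vision_model(payload: str) -> str:
    return f"[vision model] analyzed image: {payload}"

def speech_model(payload: str) -> str:
    return f"[speech model] transcribed audio: {payload}"

def text_model(payload: str) -> str:
    return f"[text model] summarized text: {payload}"

SPECIALISTS = {"image": vision_model, "audio": speech_model, "text": text_model}

def route(modality: str, payload: str) -> str:
    """Send each request to the smallest specialized model that can handle it."""
    handler = SPECIALISTS.get(modality, text_model)  # fall back to the text model
    return handler(payload)

print(route("image", "frame_0042.jpg"))
print(route("text", "quarterly maintenance report"))
```

The design choice is the point: each specialist can be small enough to run on local hardware, while the router provides the "mosaic" that covers the broader workload.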

The Economic Imperative

The rise of Edge and local AI solutions is also driven by economic factors. Tyler points out that "AI is costing a fortune," referring to the high costs associated with running large AI models in the cloud. Local AI aims to reduce these costs by leveraging existing, locally available hardware.

Moreover, Tyler argues that "Cloud isn't capable of running the really valuable use cases that AI is actually set to solve." He provided a vivid example: "You're not going to be able to stream a car's visual data up to the cloud to make determinations and then route it back down with low enough latency to avoid hitting something."

Challenges and Opportunities

Despite impressive growth, deploying Edge AI faces significant challenges. Kurt Busch of Syntiant frames the core hurdle succinctly: "The hurdle is achieving cloud-level performance in local devices." This challenge demands specialized hardware capable of handling sophisticated AI models locally - no small feat given the computational demands of modern AI.

Yet companies are rising to meet this challenge head-on. Syntiant stands out by positioning itself as "The Nvidia of Edge." Busch explains their advantage comes from flexible hardware that stays "one generation ahead due to early insights from software," enabling them to implement advanced features like attention layers and transformer support before competitors enter the market.

As Edge AI technology matures, its impact on our technological landscape grows increasingly clear. We're witnessing the emergence of a new computing paradigm where devices become not just smarter, but truly autonomous. This shift addresses the latency issues that have long plagued cloud-dependent applications, opening doors to use cases that were previously impossible.

The implications reach far beyond simple convenience. Edge AI is poised to transform how we interact with technology at a fundamental level. Imagine digital assistants that understand context instantly, without cloud connectivity. Picture smart cities that optimize traffic and resources in real-time, or agricultural systems that make split-second decisions based on local sensor data. In healthcare, Edge AI could enable devices that deliver personalized insights immediately, while in industrial settings, it promises to revolutionize predictive maintenance through instantaneous, on-site analysis.

As Mitchel Tsing of Inchfab aptly observes, "Edge computing isn't just a trend; it's a fundamental shift in how we think about data and processing." His insight captures the essence of this transformation - we're not just improving existing systems, we're fundamentally reimagining how computing integrates into our world. Edge AI isn't a distant future; it's actively reshaping our digital landscape today, bringing intelligence closer to where decisions matter most.

Trend 2: AI Hardware

The landscape of AI hardware is experiencing a seismic shift. While Nvidia's GPUs have reigned supreme in the first wave of AI computing, a new generation of chip companies is emerging with a bold thesis: the future of AI processing won't be one-size-fits-all.

Chris Miller, having chronicled the semiconductor industry's evolution in "Chip War," sees this market disruption through a historical lens. He highlights the delicate balance these new entrants must navigate: "Companies must carefully weigh the benefits of specialization—achieving superior efficiency for specific tasks—against the risks of limiting their addressable market."

This tension between specialization and market breadth isn't just theoretical. As AI workloads become more diverse, from large language models to computer vision, the demand for purpose-built silicon grows. The market is responding with a wave of innovative architectures optimized for specific AI tasks—inference engines that run more efficiently than general-purpose GPUs, chips designed specifically for transformer models, and processors built from the ground up for edge deployment.

The GPU Dominance and Its Challengers

While Nvidia's GPUs and their CUDA platform have dominated AI computing for years, the industry is reaching an inflection point. As AI models grow increasingly complex, the limitations of traditional GPU architectures—particularly in power consumption and performance—are becoming impossible to ignore.

Enter d-Matrix (a company developing in-memory computing chips for AI inference workloads), which is pioneering a fundamentally different approach. PJ Jamkhandi, VP of Finance at d-Matrix, frames the challenge clearly: "GPUs are exorbitantly expensive, so that is not an architecture many want to gravitate towards for inference. What we provide is a different approach—one that efficiently balances memory, compute, and power."

What makes d-Matrix's solution unique is their innovative integration of memory and compute, specifically optimized for AI inference workloads. This stands in stark contrast to traditional GPU architectures, which rely heavily on a combination of compute units and high-bandwidth memory (HBM) primarily designed for training tasks.

But the landscape is shifting in another crucial way. PJ highlights an emerging trend that's reshaping hardware requirements: "Fundamentally, when we think of models, we often view the training of models and the deployment of those models in production as two distinct worlds, right? But what's happening now is the concept of fine-tuning is starting to gain significant traction. Inference hardware is now expected to handle fine-tuning as well."

This convergence of training and inference capabilities points to a future where AI hardware must be more versatile while maintaining efficiency—a challenge that's driving innovation across the industry.

The Rise of Specialized AI Chips

While Nvidia's GPUs remain the gold standard for FLOPS per dollar, a new wave of innovation is reshaping the AI hardware landscape. Application-Specific Integrated Circuits (ASICs) are emerging as powerful alternatives, offering dramatic improvements in both performance and energy efficiency for specialized AI tasks.

Consider Etched (designer of the Sohu chip optimized for transformer models), which claims performance up to 20 times faster than Nvidia's H100 GPUs while maintaining lower power consumption. Similarly, Groq (developer of the Language Processing Unit) has carved out its niche by optimizing specifically for inference workloads.

Among these innovators, MatX (an AI hardware innovator optimizing performance and efficiency) stands out with an intriguing approach. Founded by former Google engineers, the company brings deep expertise in both hardware and software development to the challenge. We spoke with CEO Reiner Pope, whose background in AI software complements his co-founder's hardware design expertise.

MatX's strategy focuses on radical simplification—designing chips exclusively for LLM processing. Pope's perspective on the market is refreshingly direct: "We haven't seen anyone beat Nvidia on the metrics that matter the most, which is the Flops per dollar." This focus on economic efficiency rather than raw performance metrics sets MatX apart in an increasingly crowded field.

When we asked Pope how MatX plans to navigate advances by incumbents and the evolving needs of AI developers, his response pointed to a crucial tension in the industry: balancing innovation with practical market demands.

Pope offered a nuanced perspective on MatX's strategy: "There are two sides to that: the competitors and the customers. On the competitors' side, you can model out where they will go. For Nvidia, you can model it out. It's true for Google as well. With incumbents, we have a pretty clear roadmap. We can project forward based on what they've said and done. On the customers' side, it's a question of what if the workload changes. What if it's not transformers but something else? That's the joy of designing hardware—wrestling with this problem. The operating point we've targeted is to focus on large matrices."

This focus on large matrices isn't just a technical detail—it's a strategic cornerstone of AI hardware design. At their core, these matrices are the fundamental building blocks of modern AI, powering everything from deep learning to transformer models. Think of them as the engine room of neural networks, where the real computational heavy lifting happens.

Why do matrices matter so much? AI-specific semiconductors—whether GPUs, TPUs, or custom ASIC chips—are essentially massive parallel computing machines designed to handle these matrix operations with remarkable efficiency. They achieve this through a combination of:

  • Multiple processing cores running in parallel
  • High-bandwidth memory systems
  • Specialized circuits optimized for matrix math

This architecture choice reflects a careful balance between current needs and future flexibility. By optimizing for matrix operations, hardware manufacturers can deliver significant performance improvements for today's AI workloads while maintaining the adaptability needed for tomorrow's innovations.
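To ground this, the sketch below (plain NumPy, purely illustrative shapes) shows the kind of large matrix multiplication that dominates a transformer layer's compute. Specialized accelerators are essentially built to execute this one operation as quickly and efficiently as possible.

```python
import numpy as np

# The workhorse of modern AI: multiplying large matrices.
# Shapes are illustrative of a single transformer projection.
batch_tokens = 2048      # tokens processed at once
d_model = 4096           # hidden dimension

activations = np.random.randn(batch_tokens, d_model).astype(np.float32)
weights = np.random.randn(d_model, d_model).astype(np.float32)

# One projection: roughly 2 * 2048 * 4096 * 4096 ≈ 69 billion floating-point operations.
output = activations @ weights

flops = 2 * batch_tokens * d_model * d_model
print(f"Output shape: {output.shape}, approx FLOPs for this single matmul: {flops:.2e}")
```

A full model chains thousands of these multiplications per token, which is why FLOPs per dollar on exactly this operation is the metric Pope keeps coming back to.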

Fine-Tuning

Here's a fascinating insight about the future of AI hardware, shared by PJ Jamkhandi of d-Matrix (a company developing in-memory computing chips for AI inference workloads). He draws a compelling parallel between AI model adaptation and human cognition:

"Inference hardware is now expected to do fine-tuning as well... Instead of retraining and changing all the model weights of your 70 billion parameter model... What you want to do is keep 95% of the model's structure—similar to how the human mind works. We don't erase our entire memory every time we learn something new... you layer on knowledge as you continue to go through life experiences."

This analogy isn't just elegant—it points to a fundamental shift in how we think about AI hardware design. Traditional GPU-based solutions, while powerful, can be excessive and costly for many inference tasks. The future might look more nuanced: hardware specifically engineered to handle on-the-fly fine-tuning, allowing AI models to adapt and learn incrementally, much like our own brains.

This approach could revolutionize how AI systems evolve in real-world applications. Instead of complete retraining—imagine wiping your entire memory to learn a new fact—future hardware could support more efficient, targeted updates to AI models. It's a vision of AI that's not just more efficient, but more natural in its learning process.
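A minimal NumPy sketch of this idea, loosely in the spirit of low-rank adapter (LoRA-style) fine-tuning: the large base weight matrix stays frozen, and only a small add-on is trainable. The shapes and names are illustrative assumptions, not d-Matrix's design.

```python
import numpy as np

# Keep the bulk of the model frozen; learn only a small low-rank "adapter".
d_model, rank = 4096, 8

W_frozen = np.random.randn(d_model, d_model).astype(np.float32)  # base weights, never updated
A = np.zeros((d_model, rank), dtype=np.float32)                  # trainable adapter (down-projection)
B = np.random.randn(rank, d_model).astype(np.float32) * 0.01     # trainable adapter (up-projection)

def forward(x: np.ndarray) -> np.ndarray:
    # Effective weights are W_frozen + A @ B, but the sum is never materialized.
    return x @ W_frozen + (x @ A) @ B

x = np.random.randn(1, d_model).astype(np.float32)
y = forward(x)

frozen = W_frozen.size
trainable = A.size + B.size
print(f"frozen params: {frozen:,}  trainable params: {trainable:,} "
      f"({100 * trainable / (frozen + trainable):.2f}% of total)")
```

In this toy example, well under 1% of parameters are updated, which is the hardware-friendly property PJ is describing: adaptation without rewriting the whole model.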

Energy Efficiency and Future Innovations

As AI models grow increasingly complex, the industry faces a critical inflection point: energy efficiency isn't just a nice-to-have feature—it's become a crucial differentiator in chip design. The days of brute-force computing power are giving way to more nuanced approaches that balance raw performance with power consumption.

The next wave of innovation looks particularly promising. DARPA's OPTIMA Program exemplifies this shift, developing ultra-efficient AI chips for military and defense applications. Their focus on in-memory computing techniques points to a future where processing power and energy efficiency aren't trade-offs, but complementary goals.

Meanwhile, several breakthrough technologies are emerging:

  • Optical computing: Using light for data processing, promising unprecedented speed and efficiency
  • Neuromorphic computing: Chips designed to mirror biological neural networks, potentially offering more efficient processing for specialized AI tasks
  • Quantum AI: The convergence of quantum computing and AI could revolutionize how we tackle complex computational problems

As PJ Jamkhandi of d-Matrix succinctly puts it: "Silicon is hard and hardware is harder." This observation captures both the challenge and opportunity in AI hardware development.

Trend 3: Database Technology

Remember when databases were just about storing and retrieving data? As artificial intelligence reshapes our technological landscape, the database market is experiencing its own transformation. With projections showing the global database management system market reaching $125.6 billion by 2026 (growing at a CAGR of 12.5% from 2021), the stakes are higher than ever.

While traditional players like Oracle, Microsoft, IBM, and AWS continue to dominate, new entrants such as MongoDB and Snowflake have disrupted the market with fresh approaches to data management. But a more interesting question emerges: which databases will thrive in the AI era?

Two critical questions drive our research:

  1. How will AI transform database performance, management, and functionality?
  2. Which database types will see surging demand as AI applications proliferate?

The AI Challenge to Traditional Databases

The rise of AI is pushing traditional databases to their limits. Today's AI applications demand something different: the ability to process massive volumes of unstructured data—text, images, audio—in real-time. These requirements extend far beyond what conventional database systems were designed to handle.

Enter vector databases, a technology perfectly aligned with AI's unique needs. Companies like Pinecone, Chroma, and Weaviate (leading vector database providers) have emerged to address this gap. These systems store data as high-dimensional vectors, enabling lightning-fast similarity searches and efficient management of vector data at scale.

The team at Pinecone (the leading vector database company) describes a market at an inflection point: "The market's understanding of the importance of vector databases and what they can do, especially with techniques such as RAG, has grown exponentially in the past year. We don't see this trajectory changing any time soon. More and more companies are realizing the benefits of vector search – and specifically purpose-built vector databases – in achieving knowledgeable results to their LLM queries based on their proprietary information."

This surge in vector database adoption represents a fundamental change in how we think about data storage and retrieval in the AI era. Instead of simply storing information, these databases create a form of AI-native memory, enabling systems to understand relationships and similarities in ways that traditional databases never could.
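A minimal sketch of the core operation a vector database optimizes: store embeddings and return the nearest neighbors to a query by cosine similarity. The embeddings here are random stand-ins, and a real system would use a purpose-built index (such as HNSW) rather than this brute-force scan.

```python
import numpy as np

# Toy "vector store": a matrix of document embeddings plus a brute-force search.
# Real vector databases replace the scan below with approximate nearest-neighbor indexes.
rng = np.random.default_rng(0)
num_docs, dim = 10_000, 384
doc_embeddings = rng.normal(size=(num_docs, dim)).astype(np.float32)

def top_k_similar(query: np.ndarray, k: int = 5) -> list[tuple[int, float]]:
    """Return the indices and cosine similarities of the k nearest documents."""
    doc_norms = np.linalg.norm(doc_embeddings, axis=1)
    query_norm = np.linalg.norm(query)
    sims = doc_embeddings @ query / (doc_norms * query_norm + 1e-9)
    best = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in best]

query_embedding = rng.normal(size=dim).astype(np.float32)  # produced by an embedding model upstream
print(top_k_similar(query_embedding))
```

In a RAG pipeline, the top-k documents returned here are what get stuffed into the LLM's context to ground its answer in proprietary data.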

The Power of Graph Databases in the AI Era

In the evolving AI landscape, graph databases have carved out a unique and powerful niche. While traditional databases store data in rows and columns, graph databases excel at something more fundamental: understanding relationships.

Neo4j (the leading graph database company) stands at the forefront of this revolution. Philip Rathle, their CTO, articulates the key difference: "Graph databases don't just store data; they store the connections between data, which is increasingly where the real value lies. Graph databases allow for more intuitive data modeling that reflects real-world entities and their relationships, which is a game-changer for sectors like fraud detection, recommendation engines, and supply chain optimization."

The marriage between graph databases and AI creates a particularly powerful synergy. As Philip explains, "Graph databases and AI complement each other perfectly—AI thrives on large, connected datasets, and graph databases excel at storing and querying these connections." This combination enables breakthrough applications in fraud detection, where AI algorithms can now traverse vast networks of transactions to spot suspicious patterns that would be nearly impossible to detect in traditional database structures.

Think of it as giving AI a map of relationships rather than just a list of facts. In industries where understanding connections is crucial—from financial services to healthcare—this capability isn't just an advantage; it's becoming a necessity.
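To make the "map of relationships" idea concrete, here is a pure-Python sketch that walks a toy transaction graph to find every account within two hops of a flagged account, the kind of neighborhood query graph databases like Neo4j run natively (in Cypher) at far larger scale. The data is invented for illustration.

```python
from collections import deque

# Toy transaction graph: account -> accounts it has sent money to (invented data).
transfers = {
    "acct_A": ["acct_B", "acct_C"],
    "acct_B": ["acct_D"],
    "acct_C": ["acct_D", "acct_E"],
    "acct_D": ["acct_F"],
    "acct_E": [],
    "acct_F": [],
}

def accounts_within(start: str, max_hops: int) -> dict[str, int]:
    """Breadth-first search: every account reachable from `start` within `max_hops`."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue
        for neighbor in transfers.get(node, []):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen

# Which accounts sit within two hops of a flagged account?
print(accounts_within("acct_A", max_hops=2))
```

The relational-database equivalent of this query is a chain of self-joins that grows with every hop; in a graph model the hop is the primitive, which is exactly the advantage Rathle describes.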

Emerging Database Technologies and Techniques

Beyond vector and graph architectures, a new category of databases is emerging to meet AI's insatiable appetite for real-time data processing. ClickHouse (a column-oriented database built for real-time analytics) exemplifies this evolution, particularly in environments where every millisecond matters.

Kelly Toole, a PM at ClickHouse and former partner at Index Ventures, frames the importance of speed in practical terms: "The difference between milliseconds and seconds? There is the load time for those dashboards. You're not having this customer experience where they're waiting for the loading spinners."

This focus on speed isn't just about user experience—it's about enabling the next generation of AI applications. Whether it's recommendation engines making split-second decisions or predictive models requiring instant data access, the ability to minimize delays has become crucial for AI-driven systems.

Toole highlights a key market dynamic driving adoption: "BigQuery, Redshift, and Snowflake gate critical real-time features behind higher pricing tiers." This pricing model has created an opening for ClickHouse, which offers high performance without the premium price tag, allowing companies to scale their AI operations without watching costs spiral.

What makes ClickHouse particularly valuable for AI workloads is its deep integration with machine learning frameworks. By supporting User Defined Functions (UDFs) and vector operations, the platform enables sophisticated AI tasks—like semantic search and text embeddings—to run directly within the database. This integration eliminates the complexity of managing multiple systems, streamlining AI workflows and reducing operational overhead.

For machine learning systems that depend on continuous learning, ClickHouse's ability to seamlessly integrate fresh data becomes a crucial advantage, ensuring models stay accurate and relevant in real-time.

The evolution of search technology is reaching a new frontier. Hybrid search engines are emerging, combining the precision of traditional keyword search with the intuitive understanding of vector-based approaches. Think of it as giving search engines both analytical and intuitive capabilities—like having both a dictionary and an understanding friend to help you find what you're looking for.

While keyword search excels at finding exact matches in structured data, vector search brings a deeper level of comprehension, understanding semantic relationships and context. This combination proves particularly powerful when users explore complex topics without knowing the exact keywords they need, or when searching across diverse data types like text, images, and metadata.
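A minimal sketch of the hybrid idea: blend a keyword-overlap score with an embedding-based cosine similarity into one ranking. The tiny corpus, the hand-made "embeddings", and the 50/50 weighting are illustrative assumptions; production systems use proper lexical scoring (such as BM25) and learned embedding models.

```python
import numpy as np

# Toy corpus with hand-made "embeddings"; real systems use learned embedding models.
docs = {
    "doc1": ("resetting your router fixes most connectivity drops", np.array([0.9, 0.1, 0.0])),
    "doc2": ("our quarterly revenue grew thanks to cloud services", np.array([0.1, 0.9, 0.2])),
    "doc3": ("troubleshooting wifi and network connection issues", np.array([0.8, 0.2, 0.1])),
}

def keyword_score(query: str, text: str) -> float:
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / max(len(q), 1)          # fraction of query words found verbatim

def vector_score(query_vec: np.ndarray, doc_vec: np.ndarray) -> float:
    return float(query_vec @ doc_vec / (np.linalg.norm(query_vec) * np.linalg.norm(doc_vec)))

def hybrid_search(query: str, query_vec: np.ndarray, alpha: float = 0.5):
    """Rank documents by a weighted blend of lexical and semantic relevance."""
    scored = []
    for doc_id, (text, vec) in docs.items():
        score = alpha * keyword_score(query, text) + (1 - alpha) * vector_score(query_vec, vec)
        scored.append((doc_id, round(score, 3)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

print(hybrid_search("wifi connection issues", np.array([0.85, 0.15, 0.05])))
```

The weighting parameter is the interesting knob: lean lexical for exact part numbers or error codes, lean semantic for vague, exploratory queries.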

Parallel to this search revolution, a new category of specialized data management systems is gaining prominence: machine learning feature stores. Two companies stand out in this space: Tecton (a feature platform for machine learning teams) automates the journey from raw data to production-ready features, enabling efficient, low-latency AI applications through sophisticated data pipelines that handle real-time, batch, and streaming data. Feast (an open-source feature platform) focuses on streamlining machine learning by managing feature serving across both offline and online environments.

These feature stores solve a crucial challenge in AI development: ensuring consistency and quality across different models and applications. Think of them as central kitchens for AI—where ingredients (features) are prepared, stored, and served when needed. Perhaps most importantly, feature stores are becoming collaboration hubs, providing data scientists and engineers with a shared repository for feature definitions and metadata—essentially creating a common language for AI development teams.
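A minimal sketch of the feature-store pattern described above, with an offline log for building training sets and an online view for low-latency serving. The class and method names are hypothetical and are not the Tecton or Feast APIs.

```python
from datetime import datetime, timezone

class MiniFeatureStore:
    """Toy feature store: an append-only offline log plus an online key-value view.
    Names are illustrative; this is not the Tecton or Feast API."""

    def __init__(self):
        self.offline_log = []   # full history used to build training sets
        self.online_view = {}   # latest value per (entity, feature) for serving

    def ingest(self, entity_id: str, feature: str, value: float) -> None:
        event = {
            "entity_id": entity_id,
            "feature": feature,
            "value": value,
            "ts": datetime.now(timezone.utc),
        }
        self.offline_log.append(event)                   # keep history for offline training
        self.online_view[(entity_id, feature)] = value   # overwrite with freshest value for inference

    def get_online_features(self, entity_id: str, features: list[str]) -> dict[str, float]:
        return {f: self.online_view.get((entity_id, f)) for f in features}

store = MiniFeatureStore()
store.ingest("user_42", "7d_txn_count", 11)
store.ingest("user_42", "avg_basket_usd", 37.5)
print(store.get_online_features("user_42", ["7d_txn_count", "avg_basket_usd"]))
```

The consistency benefit comes from the single ingest path: training and serving both read features that were computed the same way, which is the problem feature stores exist to solve.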

As Philip from Neo4j aptly puts it, "Data's made this gradual move over the last decade or two from being a cost center to being viewed as the basis for not only a profit center but actually driving a lot of what companies do." This shift underscores the transformative power of data in the AI era.

Trend 4: Data Center Energy Usage

Behind every AI model, streaming service, and cloud application lies a vast network of data centers—massive facilities consuming energy at staggering rates. As AI capabilities expand, a crucial question emerges: how will we power the future of computing?

The industry stands at a critical juncture. With major tech companies pledging carbon neutrality by 2030, the challenge isn't just about providing more power—it's about reinventing how we think about energy in the digital age.

Philip Krim and Evan Caron, founders of Montauk Climate (an incubator for sustainable ventures), identify a watershed moment: "The market has seen cracks... I think the lid got blown off with Sam Altman and Jensen saying, 'Hey, we're not gonna have enough power to do any of this stuff.' It's the second or third coming of a very old business model... but this time, it's powered by AI and cloud."

Their perspective goes beyond traditional data center management: "We have a cradle to grave view of everything from energy generation to consumption... data centers are no longer just about computing power, they're about power, period."

Caron highlights the industry's transformation: "You're just seeing big changes in the fundamentals and the dynamics of the market and new players coming in with a much larger energy usage … combined with a historical, sleepy, semi-regulated/pseudo-regulated industry."

This convergence of AI's explosive growth and energy constraints raises a fundamental question: what is the future of energy going to look like?

Nuclear Energy: A Resurgence

A new solution to AI's voracious energy appetite is emerging from an unexpected source: nuclear power. But this isn't your grandfather's nuclear plant—we're witnessing the rise of Small Modular Reactors (SMRs), a revolutionary approach to nuclear energy that could transform how we power our digital future.

The momentum is building rapidly. In October 2024, the U.S. Department of Energy made a decisive move, announcing up to $900 million in funding to accelerate SMR deployment. This milestone-based program targets two critical areas: initial deployments and supply chain improvements, aiming to navigate the complex regulatory landscape of nuclear energy.

Tech giants aren't just watching—they're leading the charge:

Amazon has embarked on multiple strategic initiatives:

  • Partnering with X-energy to develop SMR units in Washington State
  • Collaborating with Dominion Energy to explore SMR development near North Anna Power Station
  • Acquiring a nuclear-powered data center from Talen Energy

These moves align with Amazon's ambitious Climate Pledge to achieve net-zero carbon by 2040.

Google isn't far behind, signing a landmark deal with Kairos Power to secure up to 500 MW of SMR energy by the 2030s, while Microsoft has taken a bold step into fusion energy, partnering with Helion Energy to secure at least 50 MW of power.

Why SMRs? These reactors represent a fundamental rethinking of nuclear power. Unlike traditional nuclear plants, SMRs can be factory-built and transported directly to data center sites, dramatically reducing construction time and regulatory hurdles. This flexibility and scalability make them uniquely suited for the dynamic needs of data centers.

While tech giants and government initiatives paint an optimistic picture of nuclear's role in powering AI's future, some industry experts offer a stark reality check. Evan Caron of Montauk Climate (an incubator for sustainable ventures) puts it bluntly: "Small and nuclear don't work in the same sentence. Nobody wants nuclear in their backyard. So you're not going to get anyone to agree to nuclear in their backyard. It's just not going to happen."

This tension between promise and practicality defines the current state of nuclear energy in the data center industry. Despite the allure of Small Modular Reactors (SMRs), the path to implementation is fraught with challenges: extended development timelines, complex regulatory landscapes, and substantial capital requirements.

Established industry giants like GE-Hitachi and Westinghouse might actually be best positioned to navigate these hurdles, leveraging their decades of experience and diverse revenue streams.

Yet innovation continues. Beyond SMRs, researchers are exploring next-generation technologies that could reshape the nuclear landscape, such as liquid fluoride thorium reactors (LFTRs) and fast breeder reactors.

Fission vs. Fusion

Nuclear power, we learned, isn't one technology but two distinct paths. While both harness the fundamental forces of the atom, their trajectories—and timelines—couldn't be more different.

Today, the conversation centers on nuclear fission, i.e., the emerging Small Modular Reactors (SMRs), which split heavy atomic nuclei to release energy. It's a mature technology with a proven track record of providing reliable baseload power. The timeline is concrete: we could see fission-powered data centers within the next 5-10 years.

In the future, however, the conversation may shift to nuclear fusion, which represents something more ambitious: joining light atomic nuclei to create virtually limitless, clean energy with minimal radioactive waste. We read some great research on breakthroughs at the National Ignition Facility, but the path to commercialization still seems long.

So for data centers and AI infrastructure, this suggests a two-phase approach: fission serving as the bridge technology while fusion develops into the long-term solution. It's not about choosing between technologies, but understanding how each fits into the broader timeline of powering our digital future.

Geothermal Energy: Tapping Earth's Heat

While the tech industry races to find sustainable power sources, an elegant solution might lie right beneath our feet. Geothermal energy—specifically geo-exchange technology—is emerging as a promising option for data center sustainability, offering what solar and wind cannot: constant, reliable power.

Joselyn Lai, CEO of Bedrock Energy (a geothermal energy company), shares an exciting timeline: "Our geo-exchange technology is aiming for commercialization in the next three years." What makes this particularly compelling is its accessibility. Unlike traditional geothermal approaches that require drilling 10-15,000 feet into the Earth, geo-exchange needs only 1-2,000 feet—making it viable almost anywhere on the planet.

The science is elegantly simple: while surface temperatures fluctuate wildly, underground temperatures remain remarkably stable. Think of the Earth as a massive thermal battery, one that data centers can tap into for both cooling and heating.

This matters because cooling represents the largest energy drain in data centers after the servers themselves. Leading facilities now report a Power Usage Effectiveness (PUE) of around 1.1, with cooling being the primary obstacle to reaching the perfect score of 1.0. Geo-exchange technology could push data centers closer to this ideal efficiency.
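As a quick illustration of what PUE measures (the energy figures are made up for the example): PUE is total facility energy divided by the energy delivered to IT equipment, so shrinking cooling overhead is the main lever for pushing it toward 1.0.

```python
# PUE = total facility energy / IT equipment energy. Figures below are illustrative only.
it_load_kwh = 10_000          # energy consumed by servers, storage, networking

scenarios = {
    "conventional chilled-water cooling": 1_400,  # assumed cooling + overhead energy (kWh)
    "geo-exchange assisted cooling":        600,  # assumed lower overhead (kWh)
}

for name, overhead_kwh in scenarios.items():
    pue = (it_load_kwh + overhead_kwh) / it_load_kwh
    print(f"{name:38s} PUE = {pue:.2f}")
```

Under these made-up numbers the facility moves from a PUE of 1.14 to 1.06; at data-center scale, a few hundredths translate into large absolute energy savings.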

The implications are profound: by treating the ground as a natural heat sink, data centers could dramatically reduce their energy footprint while maintaining the constant cooling crucial for AI and cloud operations.

Major tech companies like Meta and Amazon have shown interest in geothermal solutions, recognizing its potential to enhance data center efficiency and reduce overall energy footprint. Google, for instance, has invested in a project in Nevada that aims to harness geothermal energy to power its data centers near Las Vegas and Reno.

Optimizing Data Center Operations

The future of data center efficiency isn't just about finding new power sources—it's about using what we have more intelligently. AI is emerging as the maestro of this optimization symphony, conducting everything from cooling systems to power management with unprecedented precision.

Evan Caron of Montauk Climate (an incubator for sustainable ventures) frames the opportunity: "It's going to be in heavy infrastructure, Capex or in deep software and operational efficiency like coordination, orchestration, operational efficiency. Engineering feasibility, asset optionality." This isn't just about tweaking settings—it's about fundamentally reimagining how data centers operate.

The Merck-Phaidra Success Story

A compelling example comes from Merck's massive West Point, PA facility. Partnering with Phaidra (an AI optimization company), they implemented an AI Virtual Plant Operator to manage cooling across a 7-million-square-foot campus. The system oversees four interconnected chiller plants totaling 60,000 refrigerant tons—accounting for 20% of the site's energy consumption.

The results were remarkable:

  • 16.2% reduction in energy usage
  • 70.5% improvement in thermal stability
  • 50.9% decrease in excess equipment runtime

What makes this particularly impressive is Phaidra's "AI Conductor" approach—a higher-level AI coordinating four individual AI agents, each managing its own chiller plant. Think of it as an orchestra conductor ensuring each section plays in perfect harmony.
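A minimal sketch of the "conductor over agents" pattern described above: a coordinator splits a site-wide cooling target across per-plant agents and collects their local decisions. The class names, numbers, and proportional allocation rule are hypothetical, not Phaidra's system.

```python
# Toy "conductor" coordinating per-plant agents. Entirely illustrative; not Phaidra's implementation.

class ChillerPlantAgent:
    def __init__(self, name: str, capacity_tons: float):
        self.name = name
        self.capacity_tons = capacity_tons

    def plan(self, assigned_load_tons: float) -> dict:
        # A real agent would optimize setpoints from live sensor data; here we just report utilization.
        utilization = min(assigned_load_tons / self.capacity_tons, 1.0)
        return {"plant": self.name, "assigned_tons": round(assigned_load_tons, 1),
                "utilization": round(utilization, 2)}

class Conductor:
    """Splits the campus cooling demand across plants in proportion to their capacity."""
    def __init__(self, agents: list[ChillerPlantAgent]):
        self.agents = agents

    def dispatch(self, total_demand_tons: float) -> list[dict]:
        total_capacity = sum(a.capacity_tons for a in self.agents)
        return [a.plan(total_demand_tons * a.capacity_tons / total_capacity) for a in self.agents]

# Four plants totaling 60,000 tons, mirroring the figure above; the even split is an assumption.
plants = [ChillerPlantAgent(f"plant_{i}", capacity_tons=15_000) for i in range(1, 5)]
print(Conductor(plants).dispatch(total_demand_tons=42_000))
```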

The optimization doesn't stop at cooling. Modern data centers are implementing AI-driven power management systems that treat computing power as a dynamic resource. This intelligent approach ensures that every watt of power is used effectively, whether it's coming from the grid or renewable sources.

We have a lot more to dig into here, but we're excited to keep deepening our understanding of data center energy usage.

