Slow Growth in Memory Semiconductors aka The Memory Wall
Kakao Ventures
Korea's best early-stage VC. We invest in the US as well as Korea, with a focus on IT/software-related start-ups.
Hello, this is Kakao Ventures.
Our investment team works closely with early-stage startups while keeping an eye on market trends. Along the way, we run into questions and have many discussions, and we believe the more diverse the thoughts and the deeper the conversations, the better. That's why we'd like to share some of these thoughts with you, one piece at a time. We hope they will be of help to entrepreneurs, investors, or anyone interested in the market.
Written by Youngmoo Kim
NVIDIA is a rapidly growing company, mainly focused on AI semiconductors for data centers. While they also sell GPUs for gaming, the majority of their revenue comes from AI chips for data centers. With the explosive popularity of large language models (LLMs), NVIDIA is enjoying an incredible wave of momentum, with no signs of slowing down. In fact, this past June they scored the top spot in global market capitalization.
Meanwhile, behind the scenes of this AI semiconductor market for data centers, another battle is unfolding - fierce competition to win orders from NVIDIA.
Leading this battle are memory semiconductor companies such as SK Hynix, Samsung Electronics, and Micron, all competing in the HBM (High Bandwidth Memory) space.
In previous articles, we discussed next-generation semiconductors that emerged through new materials and devices. HBM is one such next-generation semiconductor that has come to the forefront, largely due to the ongoing market transformation.
In this piece, we’ll revisit how HBM came to receive such attention.
1. The Slow Growth of Memory Semiconductors: The “Memory Wall”
To understand how HBM became so significant, we need to first look at the concept of the “Memory Wall.”
What is the Memory Wall?
Semiconductors can be broadly categorized into two types: memory and non-memory semiconductors.
Simply put, non-memory semiconductors handle computation, while memory semiconductors handle storage. The CPU is the representative example of a non-memory semiconductor.
This separation of storage and computation is known as the von Neumann architecture.
The problem arises because the development speed of memory semiconductors lags far behind that of CPUs.
As computational power has steadily increased, memory bandwidth has failed to keep pace. As shown in the chart, the development speed of widely used DRAM (Dynamic Random Access Memory) has significantly diverged from that of CPUs since the 1990s.
The reason lies in the structural differences between memory and non-memory semiconductors.
In the diagram above, the left side represents the structure of a DRAM cell, while the right side shows the structure of a NAND gate in a non-memory semiconductor.
At first glance, the NAND gate on the right looks more complex, so one might wonder why memory semiconductor development is slower.
However, the key lies in the capacitor within the DRAM structure.
The left-side DRAM structure consists of one transistor and one capacitor, while the NAND gate in non-memory semiconductors is made up of four transistors.
As mentioned in previous articles, there’s intense competition to improve transistor density using advanced processes like 5nm and 3nm nodes.
Then what about capacitor density?
Unfortunately, capacitors are much harder to miniaturize than transistors: a DRAM capacitor must hold enough charge to be reliably read and refreshed, so it cannot shrink the way a transistor can.
This means that even if the transistor density in DRAM increases, the presence of the capacitor limits performance improvements. As a result, while CPU processes are pushing into the 3nm range, memory semiconductors are still reliant on conventional processes above 10nm. This discrepancy has created a bottleneck in computing performance known as the “Memory Wall.”
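To make the wall concrete, here is a minimal back-of-the-envelope sketch in Python. Every hardware number in it is an illustrative assumption rather than a measurement of any particular chip; the point is only the ratio between compute time and memory time for a simple matrix-vector multiply.

```python
# Back-of-the-envelope illustration of the Memory Wall (illustrative numbers).
# A matrix-vector multiply performs ~2*N*N floating-point operations while
# streaming ~N*N matrix elements from DRAM, so at large N its speed is set
# by memory bandwidth, not by raw compute.

N = 16_384                      # matrix dimension (assumption)
flops = 2 * N * N               # one multiply and one add per matrix element
bytes_moved = N * N * 2         # fp16 matrix, read once from memory

peak_compute = 100e12           # 100 TFLOP/s -- hypothetical accelerator
peak_bandwidth = 1e12           # 1 TB/s      -- hypothetical memory system

t_compute = flops / peak_compute
t_memory = bytes_moved / peak_bandwidth

print(f"compute-limited time: {t_compute * 1e6:.1f} us")   # ~5.4 us
print(f"memory-limited time:  {t_memory * 1e6:.1f} us")    # ~536.9 us
```

With these made-up numbers, the memory-limited time is roughly a hundred times the compute-limited time - the processor spends most of its life waiting on DRAM, which is exactly the imbalance the "Memory Wall" names.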
2. Overcoming the Memory Wall: The Demand from LLMs and Current Limitations
Up until recently, the Memory Wall wasn’t a pressing issue.
However, with the advent of LLMs, it has become an immediate challenge that needs to be addressed. New performance enhancement methodologies are now required.
The development of memory, particularly DRAM, has largely centered on LPDDR products. LPDDR stands for "Low Power Double Data Rate," meaning it transfers data twice per clock cycle while consuming less power.
Why is low power consumption significant in this context?
Inside a computer there are countless transistors and other components, and information has to move smoothly among them through a complex structure. This is where the concept of the clock comes in.
Imagine three people - A, B, and C - playing catch on a beach. If A throws a ball to B, but B hasn't passed the previous ball to C in time, B's hands are still full and the new ball is dropped. So A, B, and C need to throw and catch in sync, at predetermined intervals.
Computers work similarly: they process and transfer data in sync, with the clock governing the timing.
This clock toggles on and off continuously. In this analogy, SDR (Single Data Rate) means passing the ball only when the clock turns "on," while DDR (Double Data Rate) passes the ball both when the clock turns "on" and when it turns "off."
This allows DDR to transfer twice the data per clock cycle compared to SDR.
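The difference can be counted directly. Below is a toy Python model of the catch game (the cycle count is an arbitrary assumption): SDR hands over a ball only when the clock turns "on," while DDR hands one over on both transitions.

```python
# Toy model of SDR vs DDR, continuing the ball-passing analogy.
# The clock alternates between "on" (rising edge) and "off" (falling edge).
# SDR transfers only on rising edges; DDR transfers on both.

def transfers(cycles: int, double_data_rate: bool) -> int:
    count = 0
    for _ in range(cycles):
        count += 1              # clock turns "on": both SDR and DDR transfer
        if double_data_rate:
            count += 1          # clock turns "off": only DDR transfers again
    return count

cycles = 1_000_000              # e.g. a 1 MHz clock observed for one second
print("SDR:", transfers(cycles, double_data_rate=False))   # 1,000,000
print("DDR:", transfers(cycles, double_data_rate=True))    # 2,000,000
```

Same clock, twice the hand-offs: that is all "Double Data Rate" means.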
Of course, doubling the data transfer also increases energy consumption - your arms are going to hurt if they have to continuously relay the ball - which is why low-power DDR is essential.
LPDDR has improved performance by increasing the number of data transfers per second while consuming less power.
The problem is, as the number of data transfers increases and the process becomes faster, the likelihood of errors also rises, and handling the load becomes more difficult. Your arms are likely to tire and grow sore. Moreover, as process scales shrink and density increases, the pathways for transferring data become narrower. Essentially, we are trying to pass millions of balls per second through a needle-sized hole.
Initially, companies like NVIDIA, Rebellions, and Furiosa AI used DDR-type memory, not HBM. But as AI models grew ever larger, DDR-type memory hit its limits and could no longer handle these massive workloads.
This led to the debut of HBM as a new semiconductor in the market.
3. HBM: The Savior to Overcome the Memory Wall
In my previous articles, I've emphasized that semiconductors have entered the era of 3D stacking because 2D integration has hit its limits.
HBM follows this same principle.
Instead of increasing density horizontally, memory semiconductors are now being stacked vertically. The latest HBM stacks up to 12 layers, with plans to reach 16 layers in the next 2–3 years.
SK Hynix has been researching HBM for over a decade, unlike Samsung, which paused its HBM research at one point. Today SK Hynix holds a 53% market share and has formed strategic alliances with TSMC and NVIDIA.
While competitors like Samsung and Micron are catching up, they have repeatedly failed to pass NVIDIA’s tests, leaving them at a disadvantage for now.
Even once a winner emerges from this battle, can HBM fully overcome the challenges posed by LLMs?
I may be biased toward high expectations, but HBM still feels somewhat lacking.
Users are experiencing lag in LLMs as they sputter during text and image generation, while on the other end, cloud companies are grappling with massive data center operations and energy consumption issues.
4. Can HBM Fully Conquer LLMs?
To overcome the current challenges, how must HBM performance be improved?
First, let's look at the structure of HBM: DRAM dies are stacked like an apartment building on top of an interposer, which in turn rests on the package substrate. Vertical pathways called TSVs (Through-Silicon Vias) run through the stacked DRAMs.
These pathways allow data to be transferred much faster, like express elevators that run through the DRAM. These ‘express elevators’ number 1,024 right now, and they’re expected to go up to 2,048 pretty soon.
The interposer, which sits beneath the memory stack, manages these express elevators. It is not a typical plastic PCB but a new kind of board, which is why advanced packaging (back-end processing) and new materials such as glass substrates are gaining attention.
Thanks to TSVs and the interposer, HBM can move massive amounts of data.
Think of it as increasing the number of hands throwing balls from 2 to 1,024, while stacking up to 12 or even 16 floors of humans tossing the balls.
In summary, HBM performance improves by 1) increasing memory capacity through more layers and 2) widening memory bandwidth by adding more TSVs.
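Both levers reduce to simple arithmetic. The sketch below uses illustrative parameters loosely in the spirit of recent HBM generations - the layer count, per-die capacity, and per-pin data rate are all assumptions, not vendor specifications.

```python
# Rough HBM stack arithmetic (illustrative parameters, not vendor specs).

layers = 12                 # stacked DRAM dies ("floors")
die_capacity_gb = 2         # GB per die (assumption)
bus_width_bits = 1024       # TSV-based interface width ("express elevators")
pin_rate_gbps = 6.4         # data rate per pin in Gbit/s (assumption)

capacity_gb = layers * die_capacity_gb
bandwidth_gbs = bus_width_bits * pin_rate_gbps / 8   # bits -> bytes

print(f"capacity per stack:  {capacity_gb} GB")          # 24 GB
print(f"bandwidth per stack: {bandwidth_gbs:.0f} GB/s")  # ~819 GB/s

# Going from 12 to 16 layers grows the capacity term;
# doubling the TSV count to 2,048 doubles the bandwidth term.
```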
However, this is only the beginning of the performance headroom within HBM's 3D integration structure: more TSVs need to be added, and more layers need to be stacked.
As the number of TSVs increases, the memory grows larger and the pathways more complex, which means the interposer must become smarter to manage all these express elevators. There are also physical limits to the space available within the HBM - one cannot bore tunnels indefinitely, or the stack ends up like Swiss cheese.
Even if we manage to stack higher, problems will arise.
HBM doesn't start out vertical. Each DRAM die is first fabricated as a conventional 2D chip, and only then are multiple dies stacked - think of building several single-story houses and then layering them on top of each other. With the current structure, constructing more than 16 layers would be physically very challenging.
Even if that does happen by some chance, the problem is TSVs.
Currently, vertical holes are drilled through the DRAM layers to form the TSVs, but beyond 16 layers it becomes increasingly difficult to keep those holes uniform.
And even if, by some minute chance, uniform pathways are created, the next problem is power consumption and heat dissipation.
As its structure shows, HBM has evolved into a physically larger form of memory. And as more components are packed into a fixed structure, power consumption and heat generation inevitably rise in proportion.
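A first-order sketch of why: dynamic power scales roughly with switched capacitance, voltage squared, and frequency, summed over every die in the stack. Every constant below is an illustrative assumption - only the scaling behavior matters.

```python
# First-order dynamic power: P ~ activity * C * V^2 * f, summed over dies.
# All constants are illustrative assumptions; the point is the scaling.

def stack_power_watts(dies: int, freq_hz: float,
                      switched_cap_farads: float = 1e-9,
                      voltage_v: float = 1.1,
                      activity: float = 0.2) -> float:
    per_die = activity * switched_cap_farads * voltage_v**2 * freq_hz
    return dies * per_die

base = stack_power_watts(dies=8, freq_hz=3.2e9)
taller = stack_power_watts(dies=16, freq_hz=6.4e9)
print(f"8-die stack:                {base:.1f} W")    # ~6.2 W
print(f"16-die stack, double speed: {taller:.1f} W")  # ~24.8 W, ~4x the heat
```

Doubling the layer count and the data rate together roughly quadruples the heat that must escape a package whose footprint barely changes.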
Technologically speaking, HBM still has a loooong way to go.
So… does this mean the end of technological progress?
Of course not.
The reason I point out limitations in the tech market isn’t to suggest that these challenges are insurmountable.
Rather, it’s to emphasize that methods to overcome these limits are being intensely researched, and new market opportunities arise from efforts to push past these barriers.
Challenges in the market are actually a good thing.
In the spirit of "Won-young-style thinking" (a Korean meme about relentless positivity), it's "totally lucky" - because it creates momentum for everyone to work on solving a shared problem.
The limitations of HBM will, once again, open up new market opportunities.
New materials may emerge that are favorable for vertical stacking; CFET technology, which stacks transistors vertically, could make advancements by leveraging foundational stacking technologies. Or we might see the introduction of innovative new interposers or low-power memory devices.
Furthermore, deep-tech startups armed with strong momentum can survive the fierce battles among tech giants by anchoring themselves with proprietary technology or sharp solutions.
Additionally, having a long-term vision and solid faith that things will work out will be crucial for deep-tech startup founders.
SK Hynix, despite negative market sentiment around HBM, has maintained its edge by consistently researching the technology with a long-term vision and conviction. In the end, the teams that stay true to the essence of innovation, rather than making short-term changes for immediate gains, will emerge victorious.
Words can hardly capture the long-term vision and deep conviction of those who have ventured into this market, faced the daunting hurdles of technology development, and founded proactive, ambitious startups amid such challenges.
We intend to bet on the vision and conviction of these innovators, perhaps even more than on the sales and profits that will follow. The goal is to invest in startups that tenaciously establish themselves like immovable rocks, armed with inimitable proprietary technology and razor-sharp solutions.
The next topic I’d like to share is <4. How NVIDIA Establishes Its Hegemony>.
Thank you.