The future shape of semiconductors
Ashutosh K.
Ex-banker, now self-employed; MD & CEO of Kumar Group of Companies; author of many books.
WHAT WILL BE THE SHAPE OF THE SEMICONDUCTOR TECHNOLOGY LANDSCAPE?
Foreword
The science and technology sector has always been driven by scientists with some of the best minds among us. With tremendous patience and consistent hard work on new concepts they have, if history is any guide, produced inventions that have made us more civilized and products that have made us more comfortable. Each day we see improvements in existing technology and the development of new technologies. The semiconductor is the soul of every electric and electronic product we use in our day-to-day lives.
Introduction
The growth of the global semiconductor industry over the past few decades has been driven largely by the demand for cutting-edge electronic devices such as desktops, laptops, and wireless communication products, and by the rise of cloud-based computing; this growth will continue with new application drivers in the high-performance computing market segment.
As the amount of data keeps growing at an astounding rate, a swing that gained momentum after the launch of 5G networks, the requirement for servers has surged in order to process and store the data traffic. As per the 2020 Yole report, a compound annual growth rate of 29% is expected for the high-end central processing units (CPUs) and graphics processing units (GPUs) at the core of these servers. They will support a host of data center applications, such as supercomputing and high-performance computing as a service. An even faster growth rate is anticipated for GPUs, triggered by emerging applications such as cloud gaming and artificial intelligence. Recent corona-related remote work and education have also made their presence felt in internet traffic. Over the past year, internet traffic grew by nearly 70%, and the commercial internet exchange in Frankfurt set a record for data throughput at more than 9.1 terabits per second.
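As a back-of-the-envelope check on what a 29% compound annual growth rate implies, the sketch below compounds that rate over a five-year horizon (the five-year window is an illustrative assumption, not a figure from the Yole report):

```python
def compound_growth(cagr: float, years: int) -> float:
    """Total growth multiple after `years` at a given compound annual growth rate."""
    return (1 + cagr) ** years

# A 29% CAGR multiplies the market by (1.29)^n after n years.
multiple = compound_growth(0.29, 5)
print(f"5-year growth multiple at 29% CAGR: {multiple:.2f}x")  # ~3.57x
```

In other words, sustained 29% annual growth more than triples the segment in five years, which is why server CPUs and GPUs dominate the application drivers discussed here.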
A second main driver is mobile systems-on-chips (SoCs), the chips in our smartphones. This market segment is not growing as rapidly, but the demand for more functionality in these SoCs within form-factor-constrained chip areas will drive further technology innovations. In view of the above, beyond traditional dimensional scaling of logic, memory, and 3D interconnects, these emerging applications will need to leverage cross-domain innovations. New modules, new materials, and architecture changes at the device, block, and SoC levels are needed to realize the benefit at the system level.
The semiconductor is one field with tremendous scope for development. Moore's Law has, at least for today, not been superseded by some other law that can be better comprehended. How ultra-scaled technologies improve in the future will mostly depend on the latest progress in semiconductors. How do semiconductors help manage the overwhelming flow of data to the data center? Can semiconductors break the memory wall of traditional Von Neumann computing architectures?
The last couple of years have been very exciting for the scientists working on semiconductors, who observe that they are very near a major transformation that will enhance the potency of semiconductors in the near future.
We may observe some of the major trends shaping current and future semiconductor technology development. These are as under:
The relevance of Moore's Law
CMOS transistor density scaling is expected to continue following Moore's Law for the next decade. This will be enabled mainly by advances in EUV patterning and by the emergence of novel device architectures that enable logic standard cell scaling. Extreme ultraviolet (EUV) lithography was introduced at the 7nm technology node to pattern some of the most critical chip structures in a single exposure step. Beyond the 5nm technology node (i.e., when critical back-end-of-line (BEOL) metal pitches fall below 28-30nm), multi-patterning EUV lithography becomes inevitable, adding considerably to the cost.
Imec contributes to advancing EUV lithography, for example by investigating stochastic defectivity. Stochastic printing failures are random, non-repeating, isolated defects such as microbridges, locally broken lines, and missing or merged contacts. Improvement in stochastic defectivity could allow lower-dose exposures and thus improve throughput and cost. Efforts to understand, detect, and mitigate stochastic failures recently yielded an order-of-magnitude improvement in stochastic defectivity.
To accelerate the introduction of high-NA EUV, we are installing the AttoLab, which permits testing of some of the critical materials for high-NA EUV (such as mask absorber layers and resists) before the high-NA tool becomes available. The spectroscopic characterization tools in this lab will allow us to look at crucial EUV photon reactions with resists at attosecond timeframes, which are also relevant to understanding and mitigating stochastic defect formation. Phase one of the AttoLab installation has been completed successfully, and high-NA EUV exposures are expected in the coming months.
Apart from advancements in EUV lithography, Moore's Law cannot continue without innovations in the front-end-of-line (FEOL) device architecture. Today, FinFET devices are the mainstream transistor architecture, with the most advanced nodes having 2 fins in a 6-track (6T) standard cell. However, scaling FinFETs down to 5T standard cells results in fin depopulation, with only 1 fin per device in the standard cell, causing a dramatic drop in device performance per unit area. Vertically stacked nanosheet devices are considered the next-generation device, making more efficient use of the device footprint. Another critical scaling booster is the buried power rail (BPR). Buried in the chip's FEOL instead of in the BEOL, BPRs free up interconnect resources for routing.
Scaling nanosheets into the 2nm generation will be limited by n-to-p space constraints. The forksheet architecture is envisioned as the next-generation device: by defining the n-to-p space with a dielectric wall, the track height can be scaled further. Another standard cell architecture evolution that will help with routing efficiency is a vertical-horizontal-vertical (VHV) design for metal lines, as opposed to traditional HVH designs. Ultimate standard cell scaling down to 4T will be enabled by complementary FETs (CFETs), which fully exploit the third dimension at the cell level by folding n-FETs over p-FETs or vice versa.
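To see why track-height scaling matters, note that standard cell height is roughly the track count times the metal pitch, so cell area shrinks linearly with track height at fixed pitches. A minimal sketch of that arithmetic (the 30nm metal pitch and 200nm contacted poly pitch below are illustrative assumptions, not node specifications):

```python
def cell_area_nm2(tracks: int, metal_pitch_nm: float, cpp_nm: float) -> float:
    """Approximate standard cell footprint: height (tracks * metal pitch) x width (CPP)."""
    return tracks * metal_pitch_nm * cpp_nm

# Moving from a 6-track to a 4-track cell at fixed pitches shrinks area by ~33%.
area_6t = cell_area_nm2(6, 30, 200)  # hypothetical 30nm metal pitch, 200nm CPP
area_4t = cell_area_nm2(4, 30, 200)
print(f"Area ratio 4T/6T: {area_4t / area_6t:.2f}")  # 0.67
```

This is why architectures like the forksheet and CFET, which allow fewer tracks per cell, deliver density gains even when the lithographic pitch itself stops shrinking.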
Logic performance improvement at fixed power will slow down
With the above innovations, the path mapped out by Gordon Moore can be continued. But node-to-node performance improvements at fixed power, referred to as Dennard scaling, have slowed down due to the inability to scale the supply voltage. Researchers worldwide are looking for ways to compensate for this slowdown and further improve chip performance. The aforementioned buried power rails are expected to offer a performance boost at the system level due to improved power distribution. Other work looks at incorporating stress into nanosheet and forksheet devices, and at improving the contact resistance in the middle-of-line (MOL). Further out, the sequential CFET device will provide flexibility for incorporating high-mobility materials, since the n-device and p-device can be optimized independently.
2D materials such as tungsten disulfide (WS2) in the channel promise performance improvements because they enable more aggressive gate length scaling than Si or SiGe. A promising 2D-based device architecture involves multiple stacked sheets, each surrounded by a gate stack and contacted from the side. Simulations suggest these devices can outperform nanosheets at scaled dimensions targeting the 1nm node or beyond. Dual-gate transistors with bilayer WS2 on 300mm wafers have already been demonstrated, with gate lengths down to 17nm. To further improve the drive current of these devices, work must focus strongly on improving channel growth quality, incorporating dopants, and reducing contact resistance in these novel materials.
To improve via resistance, hybrid metallization using Ru or Mo is being explored. Semi-damascene metallization modules are expected to simultaneously improve resistance and capacitance in the tightest-pitch metal layers. Semi-damascene allows the aspect ratio of the metal lines to be increased (to lower resistance) by direct patterning, with airgaps as the dielectric between the lines. At the same time, a variety of alternative conductors, such as binary alloys, can be screened as a replacement for 'good old' Cu to further reduce line resistance.
More heterogeneous integration, enabled by 3D technologies
In industry, we see more and more examples of systems being built through heterogeneous integration leveraging 2.5D or 3D connectivity. These options help address the memory wall, add functionality in form-factor-constrained systems, and improve yields on large chip systems. With logic PPAC (performance-power-area-cost) slowing, smart functional partitioning of the SoC (system on chip) can provide another knob for scaling. A typical example is the high-bandwidth memory (HBM) stack, consisting of stacked dynamic random access memory (DRAM) chips that connect directly through a short interposer link to a processor chip, such as a GPU or CPU. More recent examples include die-on-die stacking in Intel's Lakefield CPU and chiplets on an interposer in AMD's 7nm Epyc CPU. In the future, we expect to see many more of these heterogeneous SoCs as an attractive way to improve system performance.
In order to connect the technology options to the performance at the system level, we have set up a framework called S-EAT (System benchmarking for Enablement of Advanced Technologies). This framework allows us to evaluate the impact of specific technology choices on system-level performance.
As an illustration, we have used this platform to find the optimal partitioning of a high-performance mobile SoC containing a CPU and L1, L2, and L3 caches. In a traditional design, the CPU resides next to the caches in a planar configuration. We assessed the impact of moving the caches to another chip, stacked onto the CPU chip with 3D wafer bonding techniques. As the signals between cache and CPU now travel shorter distances, an improvement in speed and latency can be expected. The simulation experiments concluded that moving the L2 and L3 caches to the top tier was optimal, rather than moving only L1 or all three caches.
To enable partitioning at these deeper levels of the cache hierarchy, a high-density wafer-to-wafer stacking technology is required. We have demonstrated wafer-to-wafer hybrid bonding at a 700nm interconnect pitch and believe advancements in bonding technology will enable 500nm-pitch interconnects in the near future.
Heterogeneous integration is enabled by 3D integration technologies such as die-to-die or die-to-Si-interposer stacking using Sn micro bumps, or die-to-silicon stacking using hybrid Cu bonding. State-of-the-art Sn micro bump pitches in production have saturated at about 30μm. An Sn-based micro bump interconnect approach with pitches down to 7μm has been demonstrated. Such high-density connections leverage the full potential of through-Si-via technology and enable >16x higher 3D interconnect densities between dies or between dies and a Si interposer. This allows a strongly reduced SoC area for the HBM I/O interface (from 6 down to 1 mm2) and potentially shortens the interconnect length to the HBM memory stack by up to 1 mm. Direct bonding of die to silicon is also possible using hybrid Cu bonding. We are developing die-to-wafer hybrid bonding down to 3μm pitches with high-tolerance pick-and-place accuracy, leveraging the learning from wafer-to-wafer hybrid bonding.
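The >16x figure follows directly from interconnect density scaling with the inverse square of pitch. A quick sketch, assuming a simple area-array bump layout:

```python
def density_gain(old_pitch_um: float, new_pitch_um: float) -> float:
    """Interconnects per unit area scale as 1/pitch^2 for an area-array layout."""
    return (old_pitch_um / new_pitch_um) ** 2

# Shrinking the micro bump pitch from 30um to 7um:
gain = density_gain(30, 7)
print(f"3D interconnect density gain: {gain:.1f}x")  # ~18.4x, i.e. >16x
```

The same inverse-square relation explains why the later step from 3μm hybrid bonding pitches promises another large jump in die-to-wafer connection density.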
As SoCs become increasingly heterogeneous, the different functions on a chip (logic, memory, I/O interfaces, analog, ...) need not come from a single CMOS technology. It may be more advantageous to use different process technologies for different sub-systems to optimize design cost and yield. This evolution can also answer the need for more chip diversification and customization.
NAND, DRAM, and emerging non-volatile memories on the rise
NAND storage will continue to scale incrementally, without disruptive architectural changes in the next few years. Today's most advanced NAND products feature 128 layers of storage capability. 3D scaling will continue, with additional layers potentially enabled by wafer-to-wafer bonding. Imec contributes to this roadmap by developing low-resistance word-line metals such as ruthenium, researching alternative memory dielectric stacks, improving channel current, and identifying ways to control the stress that builds up as the number of stacked layers grows.
For DRAM, cell scaling is slowing down, and EUV lithography may be needed to improve patterning. Besides exploring EUV lithography for patterning critical DRAM structures, imec provides the building blocks for true 3D DRAM solutions. This starts with putting the memory array on top of the periphery. Such an architecture requires a semiconductor deposited at low thermal budget for the array transistors, and this is where the low-temperature IGZO (indium-gallium-zinc-oxide) family of transistors enters the scene. Ultimate 3D DRAM implementation will also require these materials to be deposited over topography, which drives the need for atomic layer deposition (ALD) for layer formation.
In the embedded memory landscape, there are significant efforts to understand the so-called memory wall: how quickly can the CPU access data from DRAM or from SRAM-based caches? How do you ensure cache coherency with multiple CPU cores accessing a shared cache? What are the bottlenecks that limit speed, and how can we improve the bandwidth and data protocols used to fetch the data? Imec deploys its system-level simulator platform S-EAT to gain insight into these bottlenecks. This framework also allows for the evaluation of novel memories as SRAM replacements, to understand system performance for various workloads. We are studying various kinds of magnetic random access memory (MRAM), including spin-transfer torque (STT) MRAM, spin-orbit torque (SOT) MRAM, and voltage-controlled magnetic anisotropy (VCMA) MRAM, as potential replacements for some of the traditional L1, L2, and L3 SRAM-based caches. Each of these MRAM memories comes with its own benefits and challenges and may help overcome the memory bottleneck by improving speed, power consumption, and/or memory density.
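The memory-wall questions above are often framed with the textbook average memory access time (AMAT) model: each cache level contributes its hit time, weighted by the probability that an access reaches it. A minimal sketch with hypothetical latencies and miss rates (illustrative numbers, not measurements from S-EAT):

```python
def amat(levels, dram_latency):
    """Average memory access time for a cache hierarchy.

    `levels` is a list of (hit_time_cycles, miss_rate) tuples ordered L1..Ln;
    a miss at each level falls through to the next level, and finally to DRAM.
    """
    total, reach_prob = 0.0, 1.0
    for hit_time, miss_rate in levels:
        total += reach_prob * hit_time   # every access reaching this level pays its hit time
        reach_prob *= miss_rate          # fraction that misses and continues deeper
    return total + reach_prob * dram_latency

# Hypothetical hierarchy: L1 4 cycles / 10% miss, L2 12 / 20%, L3 40 / 50%, DRAM 200.
avg = amat([(4, 0.1), (12, 0.2), (40, 0.5)], 200)
print(f"AMAT: {avg:.1f} cycles")  # 8.0 cycles
```

Such a model makes clear why a denser or faster last-level cache (e.g., an MRAM-based replacement for SRAM) can pay off: lowering the L3 miss rate directly shrinks the expensive DRAM term.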
THE GREAT RISE OF THE EDGE AI CHIP INDUSTRY
Edge AI is one of the biggest trends in the chip industry. As opposed to cloud-based AI, inference functions are embedded locally on the Internet of Things (IoT) endpoints that reside at the edge of the network, such as cell phones and smart speakers. The IoT devices communicate wirelessly with an edge server that is located relatively close. This server decides what data will be sent to the cloud server and what data gets processed on the edge server.
Compared to cloud-based AI, in which data moves back and forth between the endpoints and the cloud server, edge AI addresses privacy concerns more easily. It also offers faster response and reduced cloud server workloads. Just imagine an autonomous car that needs to make decisions based on AI: as decisions must be made very quickly, the system cannot wait for data to travel to the server and back. Due to the power constraints typically imposed by battery-powered IoT devices, the inference engines in these devices also need to be very energy efficient.
Today's commercially available edge AI chips, the chips inside edge servers, offer efficiencies on the order of 1-100 tera operations per second per watt (TOPS/W), using fast GPUs or ASICs for computation. For IoT implementations, much higher efficiencies will be needed. Imec's goal is to demonstrate inference efficiencies on the order of 10,000 TOPS/W. This approach breaks with the traditional Von Neumann computing paradigm, which is based on sending data from memory to a CPU (or GPU) for computation. With analog compute-in-memory, computation is done inside the memory framework, saving much of the power otherwise spent moving data back and forth.
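The principle behind analog compute-in-memory can be sketched digitally: weights are stored as conductances in a crossbar array, an input voltage vector is applied to the rows, and by Ohm's and Kirchhoff's laws each column current is the dot product of the inputs with the stored weights, so the matrix-vector multiply happens in the memory itself rather than in a CPU. A toy model (pure Python; all values are hypothetical and the analog non-idealities are deliberately omitted):

```python
def crossbar_mvm(conductances, voltages):
    """Model an ideal analog crossbar: column current I_j = sum_i V_i * G_ij.

    `conductances` is a row-major matrix of stored weights; `voltages` is the
    input vector driven onto the rows. Real arrays need DACs/ADCs and suffer
    noise and drift; this sketch keeps only the ideal current summation.
    """
    cols = len(conductances[0])
    return [sum(v * row[j] for v, row in zip(voltages, conductances))
            for j in range(cols)]

# 3x2 weight matrix stored as conductances; one input vector applied once:
weights = [[0.5, 1.0],
           [2.0, 0.0],
           [1.0, 3.0]]
currents = crossbar_mvm(weights, [1.0, 2.0, 0.5])
print(currents)  # [5.0, 2.5]
```

Because the whole multiply-accumulate happens where the weights already reside, no weight data crosses a memory bus per operation, which is the source of the projected energy-efficiency gains.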
CONCLUSION
In brief, the highlights are:
- Imec demonstrates the gains of 2D-SOC design and backside interconnects for future high-performance systems.
- Imec carries magnetic domain wall devices closer to industrial reality.
- A bilinear 2D device style for large-scale silicon-based quantum computers.
- Apple's partnership with the imec research program will enable the entire semiconductor value chain to reduce its ecological footprint.
The semiconductor transformation will lift the technologies that depend on it to stellar heights. The future will bring a wealth of innovative products thanks to these future semiconductors.