When Gaudi Met Llama: Intel's Love Story with AI

Today, Intel showcased its third-generation Gaudi AI accelerators at its Vision 2024 event in Phoenix, close to the company's extensive operations in Chandler. The big questions now concern production volumes, pricing, and shipping dates for the new Gaudi 3. The chips are manufactured by Taiwan Semiconductor Manufacturing Co (TSMC), and only Intel and TSMC know how many Gaudi 3 units can be produced.

Intel acquired Habana Labs for $2 billion in December 2019, and the Gaudi series has been central to its AI accelerator efforts ever since. The first Gaudi accelerator had been introduced a few months before the deal, in July 2019, when Nvidia's "Volta" V100 led a much smaller AI market.

Now, with the Gaudi 3, Intel aims to be more assertive in production and sales despite tough competition from Nvidia on accelerator performance. Intel's opening is with developers who work in PyTorch and run Llama LLMs or any of the countless other AI models available on Hugging Face. To establish a strong presence in the AI accelerator market, Intel must also assure customers that the Gaudi 3 shares enough architectural features with the upcoming "Falcon Shores" hybrid GPU/NNP design, expected in late 2024 or early 2025.

Falcon Shores aims to merge the Gaudi line with the Max Series GPU line, taking features from both to create a new GPU design. Pricing for the Gaudi 3 will depend on two things: how its performance compares with Nvidia's "Hopper" H100 GPU, and what those H100 units actually sell for in the market.
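The pricing logic above can be made concrete with a minimal sketch. It assumes that "parity" means equal price per unit of performance, and the function name `parity_price`, its parameters, and the optional discount are all hypothetical, introduced only for illustration. The example call uses normalized placeholder values, not real prices or benchmark results.

```python
def parity_price(h100_price: float,
                 relative_performance: float,
                 discount: float = 0.0) -> float:
    """Price at which a challenger matches the H100 on price/performance.

    h100_price           -- market price of an H100 (any currency or unit)
    relative_performance -- challenger performance divided by H100 performance
                            on the workload of interest (hypothetical input)
    discount             -- extra fraction taken off to win business, 0 <= d < 1
    """
    if h100_price <= 0 or relative_performance <= 0:
        raise ValueError("price and relative performance must be positive")
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in the range [0, 1)")
    # Equal price/performance: price_challenger / perf_challenger
    #                        = price_h100 / perf_h100
    return h100_price * relative_performance * (1.0 - discount)


# Placeholder inputs: H100 price normalized to 1.0, challenger at 0.8x performance.
print(parity_price(h100_price=1.0, relative_performance=0.8))
```

The same function also shows why a falling H100 market price, or a faster Nvidia successor that lowers the relative performance figure, pushes the sustainable Gaudi 3 price down.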

As Nvidia introduces more advanced accelerators, Intel will have to adjust its pricing accordingly. Despite potential timing issues and pressure from competitors such as Nvidia and AMD, the Gaudi 3 is a significant upgrade over its predecessor, the Gaudi 2, particularly in its chip design.

According to Eitan Medina, COO of Habana Labs and now at Intel, the Gaudi 3 has seen a 50% increase in Tensor Processor Cores (TPCs) and a 4x increase in Matrix Multiply Engines (MMEs) compared to the Gaudi 2. The Gaudi 3 is built from two identical chiplets, each of which carries its own share of the compute engines, Ethernet ports, and memory engines. Each chiplet has 48 MB of SRAM, for 96 MB of high-bandwidth SRAM in total. The package also includes eight HBM2E memory stacks, offering 128 GB of capacity and 3.7 TB/sec of bandwidth.
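The memory figures quoted above can be cross-checked with a little arithmetic. The sketch below uses only the numbers from the article. The per-stack values are derived under two assumptions that the article does not state: that capacity and bandwidth are split evenly across the eight stacks, and that 1 TB equals 1,000 GB. The function name is hypothetical.

```python
def gaudi3_memory_summary() -> dict:
    """Derive totals and per-stack figures from the article's Gaudi 3 numbers."""
    chiplets = 2                 # two identical chiplets
    sram_per_chiplet_mb = 48     # SRAM per chiplet
    hbm_stacks = 8               # HBM2E stacks
    hbm_capacity_gb = 128        # total HBM capacity
    hbm_bandwidth_tb_s = 3.7     # total HBM bandwidth

    return {
        "total_sram_mb": chiplets * sram_per_chiplet_mb,
        # Assumption: capacity is divided evenly across the stacks.
        "capacity_per_stack_gb": hbm_capacity_gb / hbm_stacks,
        # Assumption: even split, decimal units (1 TB = 1000 GB).
        "bandwidth_per_stack_gb_s": hbm_bandwidth_tb_s * 1000 / hbm_stacks,
    }


for name, value in gaudi3_memory_summary().items():
    print(f"{name}: {value:g}")
```

Under those assumptions the totals are consistent: two 48 MB chiplets give the quoted 96 MB of SRAM, and 128 GB over eight stacks works out to 16 GB per stack.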

Like its predecessor, Gaudi 3 supports a range of data formats, and it integrates Ethernet directly into the accelerator so that clusters can be built without InfiniBand. An eight-way Gaudi 3 node delivers enough compute to compete closely with eight-way Nvidia Hopper H100 nodes.

As systems scale, Gaudi 3 nodes are connected to one another over the same high-speed Ethernet links to form large networks that keep data flowing efficiently across many accelerators. This network architecture makes it possible to build very large clusters of Gaudi 3 accelerators.

The introduction of the Gaudi 3 accelerators at Vision 2024 shows Intel's continued commitment to advancing AI technology. By pushing accelerator performance and architectural design forward, Intel is challenging the status quo and setting the stage for what comes next. As the industry looks ahead to Falcon Shores and beyond, Intel's role in shaping the future of AI acceleration has never been more apparent.
