Intel Unwraps Lunar Lake for AI PCs: new cores, new GPU, new NPU
Intel might be the last of the big four silicon providers to present this week at Computex, but it definitely isn’t going to be the least vocal. Much of the press and analyst corps has been in Taiwan with Intel for the better part of a full week, going through two days of briefings and talks about the new Lunar Lake product architecture and its plans for release. And today during the company’s keynote, Intel let the details out and began to talk about how it sees Lunar Lake changing the game.
Intel spent multiple days and seemingly 100 different sessions talking to the tech press and media about Lunar Lake, and while I plan to dive into it in more depth in a future story, it’s worth spending a bit of time here on the key points that make Lunar Lake different from Meteor Lake, the currently shipping Core Ultra processor family, and on why Intel is confident that it can take on both Qualcomm and AMD in the AI PC segment that has garnered so much attention.
In short, everything changes with Lunar Lake. New core IP, new power delivery, new GPU, new NPU, new memory system; it’s kind of astounding how different this product is from previous ones. The most visible change is the move to an on-package memory system that supports LPDDR5x, four channels, and up to 32GB of total system memory. This on-package design means that Intel can save a tremendous amount of power on the PHY (up to 40% they claim) while also creating a smaller physical footprint.
The processor itself is broken up into two tiles, a compute tile and a platform controller tile. On the compute tile Intel has built a 4+4 design, with four new Lion Cove P-cores and four new Skymont E-cores. The P-cores have a significant number of architectural changes, including an 18-execution-port design, an 8x wider prediction unit, finer clock intervals, and more. Intel claims this results in a 14% improvement in IPC compared to the Redwood Cove core on Meteor Lake.
The E-cores got even more attention this time around, with a significant upgrade that includes a larger 4MB L2 cache and deeper queuing, all with the goal of providing broader workload coverage than the previous generation. The result is a 68% improvement in single-threaded floating-point performance versus Crestmont.
These are impressive results if they hold, and it means that Intel thinks it has a breakthrough in power and computing efficiency for x86. Clearly the company is targeting the perception that only an Arm-based design like the Snapdragon X Elite can bring the battery life and low-power capabilities to compete with the likes of the Apple M-series of CPUs. We’ll be looking to see if this holds true for video playback, real-world workloads, and other use cases.
Another reason that Intel has confidence in its power story is an improved scheduling system and a new iteration of Thread Director that does more to put and keep threads on the E-cores, and in particular, on FEWER E-cores. There is a point to be made here about the dual nature of the E-core and the hybrid design that Intel has built: on one hand, you can use the E-cores for more multi-threaded performance in less die area on high-performance parts (think higher TDP platforms or desktop systems), and on the other, you can use them for power efficiency, as in the implementation we are seeing on Lunar Lake. In an example Intel highlighted, this combined efficiency resulted in a Teams conferencing workload using 35% less power than with the previous scheduling approach.
Moving to the new GPU, this is the first instance of the new Xe2 Battlemage architecture, and Intel claims that we will see as much as 50% more graphics performance versus Meteor Lake. It adds some interesting new features, like XMX units that accelerate AI functions to a significant degree, offering 67 TOPS of performance. There are new vector units and improved ray tracing units, and overall the expectation is that the GPU on Lunar Lake will be outstanding. There was no information on power or efficiency here, so I do believe that’s an area we’ll want to look at, but the emphasis from Intel on the GPU is strong this time around.
Other tidbits that Intel discussed include an improved video engine, an area where Intel already had industry-leading integration, which adds support for a brand-new video codec called VVC, or H.266, that offers up to a 10% bitrate reduction over AV1 at the same image quality. Intel also integrated solid connectivity improvements with Bluetooth 5.4, Wi-Fi 7, and Thunderbolt 4, all to make sure Lunar Lake is a complete platform package.
The new NPU, now called NPU 4 as it’s the 4th generation of this technology from Intel, scales from 2 neural engines to 6, increases on-chip bandwidth by 2x, and includes 12 SHAVE DSPs that accelerate LLM and transformer operations. The net result is a 48 TOPS NPU, a figure that is obviously intended to clear the 40 TOPS requirement of the Microsoft Copilot+ PC program launched in May.
Intel showed NPU 4 offering up to 2x the performance at iso-power when compared to NPU 3 (a retroactive name for the NPU on Meteor Lake), but also up to 4x the peak performance thanks to the larger number of compute engines, a higher MAC count, a frequency increase, and baseline architecture modifications.
This brings the total platform AI capability of Lunar Lake to 120 TOPS: 48 from the NPU, 67 from the GPU, and the remainder from the CPU cores. That’s an impressive number combined with potentially impressive power efficiency, though even Intel itself will tell you that a TOPS number is wildly ineffective at communicating real-world AI performance. Software, drivers, optimization layers and ISV / developer relations will end up making the difference between the haves and the have-nots in this AI PC race.
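As a quick sanity check on how those headline numbers fit together, here is a minimal Python sketch using only the figures quoted in this article (48 TOPS for the NPU, 67 TOPS for the GPU, 120 TOPS for the platform, and the 40 TOPS Copilot+ bar). The function and variable names are my own, and attributing the leftover TOPS to the CPU cores is an assumption of the sketch, not something it derives.

```python
# Peak TOPS figures as quoted in the article; all names here are illustrative.
NPU_TOPS = 48
GPU_TOPS = 67
PLATFORM_TOPS = 120
COPILOT_PLUS_MIN_NPU_TOPS = 40  # Copilot+ requirement cited in the article


def remaining_tops(platform: int, *engines: int) -> int:
    """Return the part of the platform total not covered by the listed engines."""
    return platform - sum(engines)


def meets_copilot_plus(npu_tops: int) -> bool:
    """Check the NPU-only figure against the Copilot+ threshold."""
    return npu_tops >= COPILOT_PLUS_MIN_NPU_TOPS


# Assumption: whatever the NPU and GPU do not account for comes from the CPU cores.
cpu_tops = remaining_tops(PLATFORM_TOPS, NPU_TOPS, GPU_TOPS)

# Share of the platform total contributed by each engine, in percent.
shares = {
    name: round(100 * tops / PLATFORM_TOPS, 1)
    for name, tops in {"npu": NPU_TOPS, "gpu": GPU_TOPS, "cpu": cpu_tops}.items()
}

if __name__ == "__main__":
    print(f"CPU remainder: {cpu_tops} TOPS")
    print(f"Shares of platform total (%): {shares}")
    print(f"NPU clears Copilot+ bar: {meets_copilot_plus(NPU_TOPS)}")
```

Note that only the NPU figure is compared against the Copilot+ bar here, which is why the 48 TOPS number matters more for that program than the 120 TOPS platform total.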
Intel hasn’t gotten too specific on the timing of system availability, only stating that it would happen in Q3. In my conversations, Intel is adamant that Q3 will see not just some kind of “shipping” announcement or vague availability of a single SKU in China, but that you will be able to get your hands on designs by the end of September, in plenty of time for the holiday shopping season. And with all the interesting debate around which platforms other than the Snapdragon X Elite will have Copilot+ features enabled and running, and when, that availability window will be critically important for Intel to stay relevant and ensure there is not a mind share gap to other silicon platforms.