Nvidia repositions the AI PC with performance and software tools
Despite being the clear leader in the AI space, Nvidia has an interesting problem on its hands with the AI PC movement and the migration of some AI compute from the data center and cloud down to consumer devices. There is no debating the green team’s dominance in the server world, with the H100 shipping today and Blackwell-based accelerators arriving this year, and the company’s meteoric rise in market cap is a clear indicator of that. But with all the noise Microsoft and PC OEMs made last month with the launch of the Copilot+ PC as a totally new category of computing device, Nvidia has seemingly been left out in the cold.
We heard about the brand new Qualcomm Snapdragon X Elite platform powering systems in this AI revolution with its high performance CPU cores and 45 TOPS NPU, shipping in June, and we expect the AMD Strix Point and Intel Lunar Lake platforms to offer similar Copilot+ PC ready silicon later in 2024. But when it comes to AI, Microsoft’s messaging was clear: it sees the future in a lower power, high performance (and thus highly efficient) NPU, not a CPU or GPU. Where does that leave Nvidia today?
To its credit, Nvidia isn’t taking this snub sitting down, and it shouldn’t be embarrassed to point out that nearly all AI happening on the PC, from production applications to community-based experimentation, has been running on consumer class GeForce RTX GPUs. Nvidia has built its own designations for classifying a computer’s ability to run AI, ranging from “light AI” at up to 45 TOPS of performance, through “heavy AI” at up to 1,300+ TOPS, to “cloud-scale AI” that goes well beyond.
The idea is pretty straightforward: a dedicated GPU like an RTX 4080 offers many times more raw AI processing capability than even the best NPU found in the X Elite processor today. Of course, that comes at the cost of additional power consumption and lower power efficiency, but the argument is that for enthusiast users (the ones Nvidia sees as most likely to want these AI capabilities), that kind of power draw isn’t a problem. Either you are on a desktop system where power consumption doesn’t really matter because you’re plugged into the wall, or you want performance, speed, and quality above any other characteristic. If you want to get real work done, do your AI processing on a GPU, claims Nvidia.
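To put a rough number on “many times more,” here is a back-of-envelope comparison using only the tier figures cited above; the exact TOPS of any given GPU or NPU varies with precision and sparsity, so treat this as a sketch rather than a benchmark.

```python
# Rough ratio using the article's own tier figures, not official per-chip specs.
npu_tops = 45      # "light AI" tier: Snapdragon X Elite class NPU
gpu_tops = 1300    # "heavy AI" tier: high-end GeForce RTX class GPU

print(f"Heavy AI vs light AI: roughly {gpu_tops / npu_tops:.0f}x the raw TOPS")
# -> roughly 29x, before accounting for memory bandwidth, precision, or power draw
```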
At Computex this week, as a part of many different announcements being made, the company is showcasing a handful of “GeForce RTX AI Laptops” as part of a category it calls the “RTX AI PC.” (Seems like this won’t cause any confusion down the road, right??)
The four machines highlighted here are all based on the upcoming AMD Strix Point platform and include a discrete Nvidia GeForce GPU. Interestingly, because these platforms will have an NPU capable of supporting 40+ TOPS, they can be both Copilot+ PCs and “RTX AI PCs” at the same time. Nvidia is moving ahead to label any and all PCs and laptops with an RTX GPU as an “RTX AI PC” as they integrate tensor cores for AI processing and can be used TODAY by games, creation tools, and experimental AI applications broadly.
And that application story is the second big item that Nvidia wants to highlight for its AI PC push. No one will deny that Nvidia GPUs have been the target platform for AI on the PC up until the Copilot+ PC announcement, and Nvidia is now trying to formalize that behind the NVIDIA RTX AI Toolkit name. This toolkit aims to simplify application development and ensure peak performance with a combination of model customizations and deployment methods.
Nvidia gave examples of how this toolkit allows developers to deploy AI models faster, with lower memory footprints, and with better quality results, thanks to the optimization work the company is doing. This kind of AI compression and fine tuning is required to run inference locally on machines that have an RTX 4050 mobile GPU, for example, as opposed to those limited few that can afford a desktop RTX 4090.
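To make the idea concrete, here is a minimal sketch of that kind of compression using Hugging Face transformers with bitsandbytes 4-bit quantization as a stand-in for whatever quantization path the RTX AI Toolkit uses; the model ID and memory figures are illustrative assumptions, not Nvidia’s tooling.

```python
# A minimal sketch of 4-bit weight compression so a ~7B model fits in laptop-GPU VRAM.
# Uses transformers + bitsandbytes as a stand-in; this is NOT the RTX AI Toolkit itself.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.2"   # illustrative ~7B model

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # ~4 GB of weights instead of ~14 GB in fp16
    bnb_4bit_compute_dtype=torch.float16,   # run the matmuls in fp16 on the tensor cores
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",                      # spill layers to system RAM if VRAM runs out
)

prompt = "Explain in one sentence why local inference needs quantized weights."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```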
Another part of this AI software advantage that Nvidia is driving home is a new SDK called the AI Inference Manager. The idea is that a single SDK can handle model delivery, plugins, and the ability to check and understand the capabilities of the platform it’s running on, in order to deliver the best performance and output. It even supports hybrid switching between local and cloud inference, utilizing the Nvidia NIM microservices.
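The hybrid local/cloud switch is the interesting part, so here is a hedged sketch of what that routing decision might look like in application code. The endpoint URL, VRAM threshold, and run_local_model() helper are hypothetical placeholders; this illustrates the pattern, not the AI Inference Manager API.

```python
# A sketch of hybrid inference routing: probe local hardware, run on the GPU when the
# model fits, otherwise fall back to a hosted endpoint. Names are placeholders, not
# Nvidia's SDK.
import torch
import requests

CLOUD_ENDPOINT = "https://example.com/v1/generate"   # hypothetical hosted model endpoint
MIN_FREE_VRAM_GB = 6                                 # assumed local model requirement

def run_local_model(prompt: str) -> str:
    # Stand-in for a local pipeline (e.g. the quantized model sketched earlier).
    return f"[local GPU answer to: {prompt}]"

def pick_backend() -> str:
    if torch.cuda.is_available():
        free_bytes, _total_bytes = torch.cuda.mem_get_info()
        if free_bytes / 1e9 >= MIN_FREE_VRAM_GB:
            return "local"
    return "cloud"

def generate(prompt: str) -> str:
    if pick_backend() == "local":
        return run_local_model(prompt)
    resp = requests.post(CLOUD_ENDPOINT, json={"prompt": prompt}, timeout=30)
    resp.raise_for_status()
    return resp.json()["text"]

print(generate("What settings should I lower to hit 60 fps?"))
```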
And from a pure gaming perspective, there is no one in the PC space that has more clout or better relationships with developers than Nvidia. We’ve been witness to impressive demos of AI-based NPCs for a while now, and they keep getting better and more engaging. For now we are still squarely in the “demo” phases of this kind of technology, and it isn’t clear when a game of note will integrate technology like this for mainstream consumers to try, and for the industry as a whole to learn from.
To try to drive more of this conversation and prove that AI and gaming can be about more than just image upscaling, Nvidia is showing a new demo of Project G-Assist, a multi-modal AI system that combines speech recognition, computer vision, and application-level integration to create gaming experiences we haven’t seen before.
Nvidia gave us a sneak peek video during a pre-brief ahead of Computex, with a few examples of how this demonstration can work. Much like the Minecraft demo that Microsoft showed with OpenAI last month, you can talk to a virtual chat assistant, ask it questions about the game you’re playing and the scene you are in, and get results that are contextually relevant. You can ask the AI to automatically optimize your game settings based on the hardware in your machine, using real-time telemetry as an input to the model. You could ask the AI about the best way to optimize your character’s inventory or abilities with a certain goal in mind.
The demo video is worth a watch, as it clearly shows where gaming is headed and also hints at the future of all kinds of AI-assisted interactions with your PC. It’s important to note, though, that this is a demo project, not intended for public release, and Nvidia has no plans to productize it at this time. Nvidia simply needs a vehicle to showcase what you can do when high performance AI is combined with gaming, and few are able to paint that vision better.
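The telemetry-as-context idea is the part that generalizes beyond this one demo, so here is a hedged illustration of how an assistant like this might ground its answers in live system data. The field names and the ask_assistant() call mentioned in the comments are hypothetical, not part of G-Assist.

```python
# A hedged illustration of feeding live game/system telemetry into the model's prompt
# so its advice reflects what the machine is actually doing. All field names are
# hypothetical; this is not Project G-Assist code.
import json

telemetry = {
    "gpu": "GeForce RTX 4070 Laptop",
    "gpu_utilization_pct": 97,
    "vram_used_gb": 7.2,
    "avg_fps": 48,
    "resolution": "2560x1440",
    "settings": {"ray_tracing": "high", "dlss": "off"},
}

question = "Why is my frame rate low, and what should I change?"

prompt = (
    "You are an in-game assistant. Current telemetry:\n"
    f"{json.dumps(telemetry, indent=2)}\n\n"
    f"Player question: {question}"
)

# ask_assistant(prompt) would stand in for whatever local or cloud model backs the assistant.
print(prompt)
```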
A couple of other highlight announcements on the AI PC front from Nvidia include the integration of Nvidia TensorRT acceleration for the ComfyUI generative AI / stable diffusion application, improving out-of-the-box performance quite dramatically. Nvidia showed a data set comparing an RTX 4070 laptop and an RTX 4090 desktop system against two different M3-based Macs, and as you’d expect the Nvidia results are impressive.
This brings TensorRT support to the top two stable diffusion applications, ComfyUI and A1111.
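If you want to reproduce that kind of comparison yourself, the baseline side is easy to script; a hedged sketch is below. It times a fixed batch of generations with the stock PyTorch backend via diffusers, and the model ID, step count, and run count are arbitrary choices. The TensorRT path ships inside ComfyUI and A1111 rather than as a diffusers flag, so you would rerun the same workload in those apps to see the delta.

```python
# A rough baseline benchmark for local Stable Diffusion throughput using the stock
# PyTorch backend in diffusers. Model ID, steps, and run count are arbitrary; rerun the
# same prompt/steps inside ComfyUI or A1111 with TensorRT enabled to compare.
import time
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

prompt = "a watercolor painting of a mountain lake at sunrise"

pipe(prompt, num_inference_steps=25)   # warm-up run so caching doesn't skew the timing

runs = 5
start = time.perf_counter()
for _ in range(runs):
    pipe(prompt, num_inference_steps=25)
elapsed = time.perf_counter() - start
print(f"{runs / elapsed:.2f} images/sec with the stock PyTorch backend")
```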
The other cool project that Nvidia showed me was the integration of ComfyUI into its RTX Remix tool. RTX Remix was built to allow game modders to easily change the look and feel of games by using AI-generated textures and models to update older titles. With SD tools you can upscale a texture, make it more lifelike or style it to your preference. The ability to mod a game like this using simple text prompts is incredibly powerful.
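For a concrete flavor of what “upscale a texture with a text prompt” looks like in practice, here is a minimal sketch using an off-the-shelf diffusion upscaler from the diffusers library. The input filename and prompt are hypothetical, and this is a generic stand-in for the workflow, not the actual RTX Remix / ComfyUI integration.

```python
# A minimal sketch of prompt-guided texture upscaling with a generic diffusion upscaler.
# This is a stand-in for the idea, not the RTX Remix / ComfyUI integration itself.
import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline

pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")

low_res = Image.open("old_game_texture.png").convert("RGB")      # hypothetical source texture

upscaled = pipe(
    prompt="weathered stone wall, photorealistic, high detail",  # the text prompt steers the style
    image=low_res,
).images[0]

upscaled.save("old_game_texture_4x.png")
```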
Signal Through the Noise
Nvidia will not cede the claim to the “AI PC” just because Microsoft and its OEMs are selling the Copilot+ PC starting on June 18. The company has too much invested and too much riding on the fact that it is known as THE company defining the future of AI compute to let a market like this swim past it, even if the financial implications are much smaller than we see in the data center AI world.
And Nvidia has some cards on its side here, for sure. It brought the idea of AI to the PC and to gaming with DLSS, an upscaling model that used AI-based training to improve game performance and utilize the Tensor cores on its GPUs. The world followed, with AMD and even Intel releasing their own AI-based upscalers. Nvidia then created frame interpolation using the same ideas, and the competition is following.
Content creators that are using powerful tools like DaVinci Resolve or Stable Diffusion have long known that running on a high-end GeForce GPU is the best way to get fast results, and when time is money for your professional use cases, that’s critically important. Nvidia has been at the bleeding edge of accelerating content creation through GPU compute and video processing from the very beginning, and application developers are going to continue to use the power of discrete GPUs to accelerate AI functions and features.
One interesting open question that is floating around the rumor mill is about Nvidia entering the world of PC processors again, perhaps combining an Arm-based CPU with its GeForce GPU IP to build a competitor for the Snapdragon X Elite and the x86 options from Intel and AMD. Time will tell if that comes to pass, but it would allow Nvidia to fully compete in this AI PC landscape and leave nothing in the AI world outside its grasp.