Course: Tech Trends

NPUs vs. GPUs vs. CPUs

- Computing got a major upgrade, and it's because AI is revolutionizing processing with the integration of a unique chip: the neural processing unit. Let me explain. There have been several shifts in computing architectures over time, starting with the humble CPU. The CPU is the brain of every computer, translating instructions from higher-level languages into machine code that hardware can understand, while also managing interactions between systems in the computer. Over time, CPUs rapidly gained features and capabilities like multiple cores, faster clock speeds, and improved power efficiency.

With the rise of 3D graphics and computer gaming, the need for specialized processors developed. That's when GPUs, or graphics processing units, appeared as special assistants to the CPUs. The difference between CPUs and GPUs lies in the types of math problems they're customized to solve, like matrix operations, vector calculations, and floating-point arithmetic. Modern CPUs often have 4 to 16 cores, while GPUs can have thousands. Although each GPU core is less powerful than a CPU core, some features of GPUs, like their ability to handle massive amounts of parallel processing, made AI engineers regard GPUs with envious eyes. So they drew their plans to adopt GPUs as the primary way to process AI tasks, and that worked for the most part.

Eventually, just as GPUs were built to offload the graphics tasks they excelled at, neural processing units emerged to excel at AI calculations like matrix multiplication, delivering low latency and high throughput. Google calls its NPUs tensor processing units, Apple calls its version the Neural Engine, and Microsoft recently announced Copilot+ PCs whose NPUs can handle 45 trillion operations per second.

But there's another reason why NPUs are fundamental to AI computing, and that's the rise of edge computing. So far, models like GPT or Gemini process prompts in the cloud, which can be expensive and can expose information on your local machine to the internet. With NPUs, your local machine can run a small language model like Phi or Gemma. Although these are not as capable as large language models, they can do a lot of work while having access to your local machine's context. Small language models can offload more complex tasks to the cloud when necessary while keeping your private information private. As a bonus, with this new architecture, AI can become more efficient, saving energy and providing greater capabilities. That's a win-win for developers as well as the future of computing. What will they think of next?
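To make the parallelism point concrete, here's a minimal sketch in Python, assuming NumPy is installed. The hand-written triple loop works one scalar at a time, the way a single general-purpose core does, while the `@` operator hands the whole matrix product to optimized, data-parallel kernels, which is the style of work GPUs and NPUs are built to accelerate.

```python
import time
import numpy as np

def matmul_serial(a, b):
    """Naive triple loop: one scalar operation at a time, CPU-style."""
    n, k = a.shape
    k2, m = b.shape
    assert k == k2
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i, j] += a[i, p] * b[p, j]
    return out

a = np.random.rand(128, 128)
b = np.random.rand(128, 128)

t0 = time.perf_counter()
serial = matmul_serial(a, b)
t1 = time.perf_counter()
vectorized = a @ b  # one call, dispatched to optimized parallel kernels
t2 = time.perf_counter()

print(f"serial loop: {t1 - t0:.3f}s")
print(f"vectorized:  {t2 - t1:.3f}s")
print("results match:", np.allclose(serial, vectorized))
```

Even on a plain CPU, the vectorized call is dramatically faster; hardware with thousands of cores pushes the same idea much further.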
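The transcript mentions running a small language model like Phi on your own machine. Here's a minimal sketch using the Hugging Face `transformers` library, assuming it is installed and the `microsoft/phi-2` weights can be downloaded. Note that routing the model onto an NPU goes through vendor-specific runtimes and is not shown here; this sketch just demonstrates local inference.

```python
# Minimal local inference with a small language model.
# Assumes: `pip install transformers torch` and enough RAM for
# microsoft/phi-2 (~2.7B parameters).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain in one sentence why NPUs help with on-device AI:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```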
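Finally, the local-first, cloud-fallback architecture the transcript describes can be sketched as a simple router. This is purely hypothetical: `run_local_slm`, `run_cloud_llm`, and the word-count heuristic are illustrative stand-ins, not a real API.

```python
# Hypothetical sketch of the local-first, cloud-fallback pattern.

def run_local_slm(prompt: str) -> str:
    # Stand-in for an on-device small language model (e.g. Phi or Gemma).
    return f"[local answer to: {prompt}]"

def run_cloud_llm(prompt: str) -> str:
    # Stand-in for a cloud-hosted large model; only the prompt leaves
    # the machine, not the local context it was built from.
    return f"[cloud answer to: {prompt}]"

def answer(prompt: str, max_local_words: int = 50) -> str:
    # Keep short, private queries on-device; offload complex ones.
    if len(prompt.split()) <= max_local_words:
        return run_local_slm(prompt)
    return run_cloud_llm(prompt)

print(answer("What's in my downloads folder?"))  # handled locally
print(answer("Write a detailed analysis " + "of chip history " * 20))  # offloaded
```

A real implementation would use a smarter routing signal than prompt length, but the shape is the same: private context stays local, heavy lifting goes to the cloud.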
