The next chip arms race will be to power machine learning
Since the birth of the modern era of computing, an arms race among CPU microprocessor manufacturers, characterized by Moore’s Law, has pushed computer capabilities ever higher. That era’s technology can be summed up as running sophisticated but essentially dumb applications. A new era is now beginning, one that will drive microprocessor manufacturers to support intelligent applications built on deep learning and other recently matured machine learning algorithms.
Deep learning is the umbrella term for a set of techniques for architecting and training neural networks that have made huge leaps in accuracy in recent years. For example, deep learning neural networks are at the root of the most successful technologies for natural language understanding, image recognition, advanced game playing (such as Go), and more. However, training these networks requires a great deal of processing power.
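To make that compute demand concrete, the sketch below is a hypothetical C++ toy example (not any vendor’s code) that trains a single artificial neuron with gradient descent. Even this trivial model loops repeatedly over its data performing multiply-accumulate arithmetic; a deep network multiplies the same kind of work across millions of weights and vastly larger datasets, which is what drives the demand for specialized processors.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Train a single artificial neuron (logistic regression) with gradient descent.
// Real deep networks repeat this kind of multiply-accumulate arithmetic across
// millions of weights and billions of examples, which is why training demands
// so much raw processing power.
int main() {
    // Tiny toy dataset: two inputs per example, binary label (an AND-like rule).
    std::vector<std::vector<double>> x = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
    std::vector<double> y = {0, 0, 0, 1};

    double w[2] = {0.0, 0.0};
    double b = 0.0;
    const double lr = 0.5;  // learning rate

    for (int epoch = 0; epoch < 1000; ++epoch) {
        for (size_t i = 0; i < x.size(); ++i) {
            // Forward pass: weighted sum followed by a sigmoid activation.
            double z = w[0] * x[i][0] + w[1] * x[i][1] + b;
            double p = 1.0 / (1.0 + std::exp(-z));

            // Backward pass: gradient of the loss, then a small weight update.
            double err = p - y[i];
            w[0] -= lr * err * x[i][0];
            w[1] -= lr * err * x[i][1];
            b    -= lr * err;
        }
    }

    std::printf("learned weights: %.3f %.3f  bias: %.3f\n", w[0], w[1], b);
    return 0;
}
```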
Ovum sees the need for faster processing and higher-resolution analysis driving a new arms race for next-generation microprocessors designed to support artificial intelligence (AI) powered applications, bringing new players into the market.
New players are emerging to challenge traditional players
Three new players have emerged in a bid to satisfy demand for AI/machine learning/cognitive computing applications: the startups Knupath and Nervana, and Google, a major player on the software side that is now entering the hardware domain. They join AMD, Intel, and Nvidia. To date, high-end Nvidia GPUs have dominated this nascent market for supporting intelligent applications.
At the Google I/O conference in May 2016, Google CEO Sundar Pichai announced the Tensor Processing Unit (TPU), an application-specific integrated circuit (ASIC) that Google designed for its TensorFlow deep learning framework. According to Google, the device delivers an order of magnitude (roughly 10x) better performance per watt than its nearest competitors, such as GPUs and FPGAs. It achieves this performance-per-watt advantage partly by reducing computational precision (for example, using single precision rather than double precision), which requires fewer transistors per operation. TPUs powered Google DeepMind's AlphaGo, which beat world Go champion Lee Sedol 4 to 1 in a five-game match. The TPU is available to users of the Google Cloud Platform through its machine learning API but is not sold as a separate product.
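As an illustration of the precision trade-off Google describes, the hypothetical C++ sketch below compares single- and double-precision arithmetic: the narrower format halves storage and memory bandwidth per value (and the narrower arithmetic units need fewer transistors), while the rounding error it introduces is generally small enough for neural-network workloads.

```cpp
#include <cstdio>

// Illustration of the precision trade-off: single-precision (float) values use
// half the storage and memory bandwidth of double-precision (double), at the
// cost of a small rounding error that neural networks typically tolerate.
int main() {
    std::printf("bytes per value: float=%zu double=%zu\n",
                sizeof(float), sizeof(double));

    const int n = 1000000;
    double acc64 = 0.0;
    float  acc32 = 0.0f;
    for (int i = 1; i <= n; ++i) {
        double v = 1.0 / i;                  // same mathematical series
        acc64 += v;                          // accumulated in double precision
        acc32 += static_cast<float>(v);      // accumulated in single precision
    }

    // The two sums differ only in the trailing digits; for weight updates and
    // activations that difference is usually negligible.
    std::printf("double sum: %.10f\nfloat  sum: %.10f\n",
                acc64, static_cast<double>(acc32));
    return 0;
}
```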
Knupath is a startup that offers the Hermosa processor, which it describes as “an architecture based on neurological design to deliver acceleration of targeted workloads”. Nervana has designed an ASIC, the Nervana Engine, planned for launch in 2017 to power its deep learning library, Neon.
With custom chips for deep learning emerging from startups, Google likely wants to demonstrate that it is ahead of the curve by going public about deep learning hardware it has already been using for some time.
Incumbent chip players continue to make progress
Earlier this year Nvidia launched its most powerful GPU designed for machine learning applications, the Pascal GPU. AMD has announced the Heterogeneous-Compute Interface for Portability (HIP), a software library and toolset that converts Nvidia CUDA code into portable C++ so that it can also run on AMD GPUs (a minimal sketch of the HIP style appears below). Meanwhile, Intel announced that its next-generation Xeon Phi “Knights Landing” chips will target machine learning workloads, especially the training phase of neural networks. According to Intel’s analysis, optimizing deep learning libraries to run on the Xeon Phi achieves an order of magnitude higher performance than simply running the libraries ‘out of the box’.
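To illustrate what AMD's portability layer targets, here is a minimal, hypothetical HIP C++ sketch (not AMD's own sample code). The runtime calls deliberately mirror the CUDA runtime API, with the cuda prefix swapped for hip, so the same source can be compiled for AMD or Nvidia GPUs; exact API details may differ by HIP version.

```cpp
#include <cstdio>
#include <vector>
#include <hip/hip_runtime.h>

// Element-wise vector addition written against HIP's portable C++ runtime.
// hipMalloc ~ cudaMalloc, hipMemcpy ~ cudaMemcpy, and the kernel syntax is
// the same __global__ style used in CUDA, which is what makes mechanical
// translation of existing CUDA code feasible.
__global__ void vector_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);

    float *da, *db, *dc;
    hipMalloc(reinterpret_cast<void**>(&da), n * sizeof(float));
    hipMalloc(reinterpret_cast<void**>(&db), n * sizeof(float));
    hipMalloc(reinterpret_cast<void**>(&dc), n * sizeof(float));
    hipMemcpy(da, a.data(), n * sizeof(float), hipMemcpyHostToDevice);
    hipMemcpy(db, b.data(), n * sizeof(float), hipMemcpyHostToDevice);

    // Launch one thread per element, 256 threads per block.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    hipLaunchKernelGGL(vector_add, dim3(blocks), dim3(threads), 0, 0,
                       da, db, dc, n);
    hipDeviceSynchronize();

    hipMemcpy(c.data(), dc, n * sizeof(float), hipMemcpyDeviceToHost);
    std::printf("c[0] = %.1f (expect 3.0)\n", c[0]);

    hipFree(da); hipFree(db); hipFree(dc);
    return 0;
}
```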