The Deep Learning Hardware Battle
Originally published on Tractica Blogs
There is an ongoing race among semiconductor companies, established market heavyweights and startups alike, to define the hardware platform that will run compute-intensive deep learning algorithms quickly and efficiently. Until now, NVIDIA has dominated the deep learning market with its graphics processing unit (GPU) chips, which bring massive parallelization; however, field programmable gate arrays (FPGAs) and digital signal processors (DSPs) are starting to catch up. Deep learning is largely characterized by deep neural networks (DNNs) and convolutional neural networks (CNNs), which can become massively complex: Google’s cat recognition neural network back in 2012 had 1 billion connections and ran on 16,000 processors. GPUs achieve the best speed and throughput, roughly 100x that of an FPGA, while FPGAs offer better power efficiency, roughly 50x that of a GPU. This illustrates the tradeoff between GPUs and FPGAs when running high-intensity deep learning algorithms. Microsoft has already extended its use of FPGAs for deep learning algorithms, making up for the performance gap with GPUs through scale by bundling multiple FPGAs together.

A more detailed look at FPGAs versus GPUs for deep learning was covered in an earlier Tractica blog post. Since then, we have seen FPGAs gain more traction, with startups like DeePhi banking on the fact that deep learning workloads change depending on the type of neural network. Another startup, KnuPath, is building a custom DSP chip for deep learning and machine learning applications, with plans to integrate FPGAs on its roadmap. KnuPath is targeting a specific problem area in high-performance computing: sparse matrix-based problems, where the data is mostly zeros and memory access is irregular.
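To make the sparse matrix point concrete, here is a minimal Python sketch (using NumPy and SciPy, not any of KnuPath's actual tooling) contrasting a dense matrix-vector product, which maps cleanly onto GPU-style parallelism, with a sparse one, whose indirect indexing rewards architectures that can be tailored to the access pattern:

```python
import numpy as np
from scipy import sparse

n = 5_000
x = np.random.rand(n)

# Dense case: every element is stored and touched, with regular,
# predictable memory access -- ideal for a GPU's wide parallelism.
dense = np.random.rand(n, n)
y_dense = dense @ x

# Sparse case: 99.9% zeros, stored in compressed (CSR) form. The
# multiply walks sp.indices / sp.data indirectly, an irregular access
# pattern that dense-optimized hardware handles poorly.
sp = sparse.random(n, n, density=0.001, format="csr")
y_sparse = sp @ x
```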
The majority of the hardware focus for deep learning has been on cloud server computing, both for training and execution of deep learning models. Intel has been trying to build up its deep learning capabilities with its recent Nervana acquisition and its Xeon Phi Knights Landing announcement, in an effort to compete with NVIDIA. Google has also entered the market with its Tensor Processing Unit (TPU) hardware platform, which was used for DeepMind’s AlphaGo triumph. In addition, a UK-based startup called Graphcore has recently come out of stealth mode and is building a neural network accelerator called the Intelligence Processing Unit (IPU).
At the same time, the algorithm space is evolving rapidly, with algorithms moving toward more temporal (and memory-intensive) architectures, such as recurrent neural networks (RNNs). Deep learning algorithms that account for context, such as the placement of an object in an image or of a word in a sentence, are expected to see increasing use because of their greater effectiveness. We have also seen a lot of excitement around generative adversarial networks (GANs) and one-shot learning, all of which could require different hardware processing architectures, or a combination of architectures, compared to the currently dominant deep learning method, CNNs. Therefore, the hardware market for deep learning might end up with multiple “bespoke” architectures rather than “one size fits all.”
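Why do temporal architectures stress hardware differently? A minimal vanilla-RNN forward pass (an illustrative NumPy sketch, not any vendor's implementation) shows the issue: the hidden state is carried from step to step, so the time loop is strictly sequential and cannot be parallelized the way a CNN's independent convolutions can:

```python
import numpy as np

def rnn_forward(x_seq, W_xh, W_hh, b):
    """Vanilla RNN: each step depends on the previous hidden state."""
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in x_seq:                            # strictly sequential loop
        h = np.tanh(W_xh @ x_t + W_hh @ h + b)   # h feeds back into itself
        states.append(h)
    return np.stack(states)

# Toy usage with hypothetical sizes: 20 time steps, 8 inputs, 16 hidden units.
T, d_in, d_h = 20, 8, 16
x_seq = np.random.randn(T, d_in)
W_xh = np.random.randn(d_h, d_in) * 0.1
W_hh = np.random.randn(d_h, d_h) * 0.1
states = rnn_forward(x_seq, W_xh, W_hh, np.zeros(d_h))  # shape (T, d_h)
```

Keeping that recurrent state close to the compute, rather than streaming independent work through a wide parallel pipeline, is exactly the kind of constraint that could favor different silicon than CNNs do.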
In conjunction with the rapidly evolving algorithmic space, another shift that is beginning to happen is deep learning execution (while training still remains in the cloud) starting to move to the device, where power consumption comes at a premium. Apple’s recent announcement of neural network libraries for iOS and Samsung’s M1 architecture, with its neural network-based branch predictor, illustrate the growing shift toward neural network processing on devices. The on-device approach has major applications in drones, robots, autonomous vehicles, and smartphones. The biggest drivers for neural networks on devices are image recognition and natural language processing (NLP) performed on the fly.

Qualcomm has shown its Zeroth platform performing on-device deep learning using existing Snapdragon processors. Qualcomm’s focus right now is to wring optimizations out of existing hardware and software. In the long run, however, Qualcomm and other device makers are likely to move toward a dedicated neural network unit (NNU) that runs and accelerates on-device deep learning alongside existing central processing units (CPUs), GPUs, and DSPs. The evolving nature of deep learning algorithms and workloads could also have a role to play in how the processing is distributed between the cloud and the device, and ultimately in which architecture is best suited.
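One reason execution can move on-device while training stays in the cloud is that inference tolerates reduced numerical precision. The sketch below illustrates the general idea with simple 8-bit weight quantization in NumPy; the quantize/dequantize helpers are hypothetical names for illustration, not part of Apple's, Samsung's, or Qualcomm's SDKs:

```python
import numpy as np

def quantize(w):
    """Map float32 weights onto int8 with a single per-tensor scale."""
    scale = np.abs(w).max() / 127.0
    return np.round(w / scale).astype(np.int8), scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# A trained layer's weights: 4x smaller in int8, and integer arithmetic
# draws far less power than float32 -- the resource at a premium on-device.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize(w)
error = np.abs(w - dequantize(q, scale)).max()   # small reconstruction error
```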
While hardware vendors are keen to get ahead of the game in speeding up and powering down server-based deep learning, the ultimate prize lies in defining, and establishing themselves in, the high-volume market for on-device deep learning processors.