Machine Learning Edition GPU vs CPU and their characteristics in ML

GPUs and CPUs in Machine Learning

GPU (Graphics Processing Unit) computing, or what we often call graphics-card computing, has been one of the main themes in Machine Learning over the past decade. The main reason is, of course, that GPU computing enables training and other parts of a Machine Learning workflow to run several times to thousands of times faster than on CPUs (Central Processing Units), or Processors. The main problem is that not all Machine Learning Engineers understand these principles at the hardware level, and many general wisdoms about GPU computing are taken for granted. To explain these principles, I will focus on the main processing elements of both CPUs and GPUs, which are called Cores.


The CPU Cores

These cores are the main 'brains' behind the mathematical computations used in Machine Learning. But the core strategy used in CPUs and GPUs differs, and this is reflected in Machine Learning performance. CPUs are versatile, general-purpose processors, which makes them ideal for a multi-tasking computing environment where we can load files from memory, surf the internet and run various programs, all at the same time. This is why we can open many different programs and perform different tasks at high speed on computers, smartphones and other devices.

Some time ago, CPUs had only one core, and that core could process tasks only one at a time, meaning that certain tasks had to wait for other tasks to finish before they could start. This is called sequential processing.

In modern days, companies have developed CPUs with more of these cores: 2 cores make a Dual Core, 4 cores a Quad Core, 8 cores an Octa Core CPU, and so on. Some of the best CPUs available today have around 64 cores.

One thing to keep in mind is that the individual cores in the latest processors are very powerful, among the best compute units in the industry. The problem only arises when huge amounts of computation must be handled by the small number of cores a CPU offers, typically ranging from 1 to 8 in everyday machines.


The Processing Parallelism

What did this enable? It enabled something called processing parallelism. If I take the example of an Octa-core CPU, it can perform at least 8 tasks simultaneously without making them wait for each other to finish, which can drastically increase speed. Today, when we speak about CPU speed, we are not talking as much about GHz as before, but rather about the number of cores present in the CPU. These cores can further expose threads, which can further improve the parallelism and speed of a CPU.
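A minimal sketch of this idea in Python (the function names and workloads are illustrative, not from the article): the same CPU-bound tasks are first run one after another, then dispatched to a pool of workers. A thread pool is used here only to keep the sketch portable; for true core-level parallelism on CPU-bound work, `concurrent.futures.ProcessPoolExecutor` spreads the tasks across real cores.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def cpu_heavy(n):
    # Illustrative CPU-bound task: sum of squares up to n.
    return sum(i * i for i in range(n))

def run_sequential(workloads):
    # Sequential processing: each task waits for the previous one.
    return [cpu_heavy(n) for n in workloads]

def run_parallel(workloads, max_workers=None):
    # Parallel processing: tasks are handed to a pool of workers.
    # Swap in ProcessPoolExecutor for genuine multi-core execution
    # of CPU-bound work.
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(cpu_heavy, workloads))
```

Both paths produce identical results; only the scheduling of the work across cores differs.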


GPU power and the GPU Cores

But this strategy was actually developed a long time ago. Graphics cards were used mainly for video games for decades, and video games were generally among the most computationally intensive workloads. So GPUs were developed with many simpler cores to meet the computational needs of video games, using the same parallelism principle I mentioned before, but with a much larger number of cores, ranging from a few hundred on the low end to tens of thousands in the most advanced graphics cards. These cores are specialized to perform similar tasks and are less versatile than CPU cores, but their sheer number allows for powerful parallelism. Using thousands of cores simultaneously is what enabled the powerful graphics we have in video games today.
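A rough analogy in Python (NumPy stands in here for the GPU's data-parallel style; the array size is made up): applying one simple operation to many elements at once, instead of looping over them one by one, is the same "many simple workers" principle that GPU cores exploit.

```python
import numpy as np

# A million elements (think: pixels), processed two ways.
x = np.arange(1_000_000, dtype=np.float64)

# Scalar-style loop: one element at a time (shown on a slice only,
# because this is slow in pure Python).
loop_result = [v * 2.0 + 1.0 for v in x[:5]]

# Data-parallel style: one simple operation applied across all
# elements at once, the way thousands of GPU cores each handle
# their own element.
vector_result = x * 2.0 + 1.0

# Both strategies compute exactly the same values.
assert np.allclose(vector_result[:5], loop_result)
```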


GPUs in Machine Learning

Machine Learning training is often composed of:

A. Repetitive tasks such as iterations during training

B. Mathematical Operations which are similar in terms of programming nature

C. Processes which require processing large amounts of data

These features of Machine Learning make it ideal to implement on GPUs, which can use thousands of GPU cores in parallel to speed up the computation. In my experience, GPUs have performed 5-20 times faster than the CPUs I used, but the difference can be even larger.
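A common pattern in practice, sketched here with PyTorch as an assumed framework (the helper name is mine, not from the article), is to run on the GPU when one is available and fall back to the CPU otherwise, so the same training script works in both settings:

```python
def pick_device():
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'.

    Hedged sketch: assumes PyTorch may be installed; degrades
    gracefully when the library or a GPU is absent.
    """
    try:
        import torch  # optional dependency in this sketch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

# Training code can then move its model and tensors with, e.g.,
# model.to(pick_device()) so the same script runs either way.
```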


CPUs are also important in GPU computing

It is general wisdom that Data Scientists choose between GPUs and CPUs. In reality, we do choose GPUs for complex tasks, but most of the time in combination with CPUs. Central Processors still have the versatile, all-purpose nature we need for communicating with different parts of computer memory and performing varied tasks even in GPU-based processes. I generally tend to follow the percentage utilization of my CPU and GPU when performing Machine Learning tasks. CPUs will go to high percentages in CPU computing and GPUs will go to high percentages in GPU computing, but there is an important detail Data Scientists should observe: when performing GPU tasks, what is the usage of the CPU? This helps reveal how much work falls on each of the two processing units. Most of the time the GPU percentage and temperature will go up, while the CPU remains active enough to run the other processes required to support the graphics card.
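A hedged sketch of this kind of monitoring in Python (it assumes the third-party `psutil` package and NVIDIA's `nvidia-smi` command-line tool; both are real but may be absent on a given machine, so each probe returns None when it cannot measure):

```python
import subprocess

def cpu_percent():
    # System-wide CPU utilization via psutil,
    # or None if psutil is not installed.
    try:
        import psutil
    except ImportError:
        return None
    return psutil.cpu_percent(interval=0.1)

def gpu_percent():
    # GPU utilization via nvidia-smi,
    # or None if no NVIDIA GPU/driver is present.
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=utilization.gpu",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
        return float(out.stdout.strip().splitlines()[0])
    except (FileNotFoundError, subprocess.CalledProcessError,
            ValueError, IndexError):
        return None
```

Polling both numbers during training makes the division of labor visible: a busy GPU with a moderately active CPU is the expected picture for GPU-based training.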

Today, new approaches are being deployed in high-performance Machine Learning, like TPUs (Tensor Processing Units), which will be the theme of future editions.


By

Darko Medin, a Statistics and Machine Learning Expert



GPUs and CPUs have been the main topic of this edition, but in the next editions TPUs (Tensor Processing Units) will also be covered as a special topic. Thanks for reading!
