Deep Learning vs Traditional Machine Learning... Which one should I use?
Rohan Chikorde
VP - AIML at BNY Mellon | 17k+ followers | AIML Corporate Trainer | University Professor | Speaker
Over the past several years, deep learning has become the go-to technique for most AI problems, overshadowing classical machine learning. The main reason is that deep learning has repeatedly demonstrated superior performance on a wide variety of tasks, including speech, natural language, vision, and game playing. Yet despite this performance, classical machine learning still has several advantages, and there are specific situations where you’d be much better off using something like a linear regression or a decision tree rather than a big deep network. This article will help you understand when to go with deep learning and when to choose traditional machine learning algorithms.
Choose Deep Learning when ...
Best-in-class performance:
Deep networks have achieved accuracy far beyond that of classical ML methods in many domains, including speech, natural language, vision, and game playing. In many tasks where you have hundreds or thousands of features, classical ML often struggles to compete, and deep learning is likely to work better. That doesn’t mean you should apply deep learning to every business problem.
Scales effectively with data:
Deep networks scale much better with more data than classical ML algorithms. The graph above is a simple yet effective illustration of this. Often, the best advice for improving the accuracy of a deep network is simply to use more data. With classical ML algorithms this quick fix doesn’t work nearly as well, and more complex methods are often required to improve accuracy.
Less need for feature engineering:
Classical ML algorithms often require complex feature engineering. Usually, an in-depth exploratory data analysis is first performed on the dataset. Dimensionality reduction might then be applied for easier processing. Finally, the best features must be carefully selected and passed to the ML algorithm. There’s less need for this with a deep network, since you can often pass the data directly to the network and achieve good performance right away (assuming the other settings, such as hyperparameters, are reasonable). This removes much of the big and challenging feature engineering stage of the process. That said, if you can also apply domain expertise to derive new features, you might get even better results.
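To make the feature engineering point concrete, here is a minimal sketch in plain Python. It assumes a tiny, hypothetical dataset in which the label is 1 when both inputs share the same sign. A simple linear classifier (a perceptron, chosen here only for illustration) cannot separate the raw inputs, but it can once a hand-crafted product feature is added. The function names and the data are assumptions made for this example, not something from a specific library.

```python
def train_perceptron(rows, labels, epochs=50):
    """Train a basic perceptron and return (weights, bias)."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            activation = sum(w * v for w, v in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else 0
            error = y - predicted  # 0 when correct, +1 or -1 when wrong
            weights = [w + error * v for w, v in zip(weights, x)]
            bias += error
    return weights, bias


def accuracy(rows, labels, weights, bias):
    """Fraction of rows the linear classifier labels correctly."""
    correct = 0
    for x, y in zip(rows, labels):
        activation = sum(w * v for w, v in zip(weights, x)) + bias
        predicted = 1 if activation > 0 else 0
        correct += predicted == y
    return correct / len(rows)


# Hypothetical toy data: label is 1 when x1 and x2 have the same sign.
raw = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
labels = [1, 1, 0, 0]

# Hand-crafted feature: the product x1 * x2 is positive exactly when signs match.
engineered = [(x1, x2, x1 * x2) for x1, x2 in raw]

raw_acc = accuracy(raw, labels, *train_perceptron(raw, labels))
eng_acc = accuracy(engineered, labels, *train_perceptron(engineered, labels))

print(f"raw features:        {raw_acc:.2f}")
print(f"engineered features: {eng_acc:.2f}")
```

With the raw inputs, no linear boundary can classify all four points, so the first score stays below 1.0. The engineered feature makes the problem linearly separable. A deep network could in principle learn such an interaction by itself, which is the trade-off this section describes.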
Adaptable and transferable:
Deep learning techniques can be adapted to different domains and applications far more easily than classical ML algorithms. First, transfer learning has made it effective to reuse pre-trained deep networks for different applications within the same domain. For example, in computer vision, pre-trained image classification networks are often used as a feature extraction front-end for object detection and segmentation networks. Using these pre-trained networks as front-ends eases the training of the full model and often helps achieve higher performance in a shorter period of time. Second, the underlying ideas and techniques of deep learning are often quite transferable across domains. For example, once you understand the deep learning theory behind speech recognition, learning how to apply deep networks to natural language processing isn’t too challenging, since the baseline knowledge is quite similar. With classical ML this is much less the case, as both domain-specific and application-specific techniques and feature engineering are required to build high-performance models. The knowledge base of classical ML differs considerably between domains and applications, and often requires extensive specialized study within each individual area.
Choose Machine Learning when ...
Works better on small and medium data:
To achieve high performance, deep networks typically require very large datasets. Pre-trained networks such as ResNet or VGG, for example, were trained on over a million labeled images (ImageNet). For many applications, such large datasets are not readily available and are expensive and time-consuming to acquire. For smaller datasets, classical ML algorithms often outperform deep networks.
Financially and computationally cheap:
Deep networks require high-end GPUs to be trained on big data in a reasonable amount of time. These GPUs are very expensive, yet without them training deep networks to high performance would not be practical. To use such GPUs effectively, you also need a fast CPU, SSD storage, and plenty of fast RAM. Classical ML algorithms can be trained perfectly well on a decent CPU, without the best of the best hardware. Because they aren’t as computationally expensive, you can also iterate faster and try out many different techniques in a shorter period of time.
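One rough way to get a feel for this gap is to count trainable parameters. The sketch below is only a proxy: parameter count is not the same as training cost, and the layer sizes are hypothetical values picked for illustration. It compares a single linear layer with a small fully connected deep network on the same input.

```python
def dense_parameter_count(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the width of each layer, input first, output last.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
    return total


# Hypothetical sizes: 1,000 input features and 10 output classes.
linear_model = dense_parameter_count([1000, 10])
deep_network = dense_parameter_count([1000, 2048, 2048, 2048, 10])

print(f"linear model: {linear_model:,} parameters")
print(f"deep network: {deep_network:,} parameters")
print(f"ratio:        {deep_network / linear_model:,.0f}x")
```

Under these assumed sizes, the deep network has roughly a thousand times more parameters to update on every training step, which is part of why specialized hardware becomes attractive as models and datasets grow.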
Easier to interpret:
Due to the direct feature engineering involved in classical ML, these algorithms are quite easy to interpret and understand. In addition, tuning hyper-parameters and altering the model design is more straightforward, since we have a more thorough understanding of the data and the underlying algorithms. Deep networks, on the other hand, are something of a “statistical black box”: even now, researchers do not fully understand the “inside” of deep networks. Choosing hyper-parameters and designing the network are also quite challenging because the theoretical foundation is still incomplete.
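As a small illustration of what "easy to interpret" can mean, here is a sketch of a one-feature linear regression fitted in closed form with ordinary least squares. The data (floor area versus price) is hypothetical and was constructed to be exactly linear, and `fit_line` is a helper written for this example rather than a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: floor area in square meters and price in arbitrary units.
areas = [50.0, 80.0, 100.0, 120.0]
prices = [2500.0, 3400.0, 4000.0, 4600.0]

slope, intercept = fit_line(areas, prices)

# Each learned number has a direct reading in the units of the problem.
print(f"each extra square meter adds about {slope:.1f} to the price")
print(f"the baseline price at zero area is about {intercept:.1f}")
```

The whole model is two numbers, and each one can be explained to a non-specialist. A deep network fitted to the same data would spread that relationship across many weights with no comparable one-line reading.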
Conclusion
There you have it: a comparison of classical machine learning and deep learning. I hope you enjoyed this article and learned something new and useful. Feel free to like, share and comment.