Black Box Machine Learning may be harmful
Pranab Ghosh
AI Consultant || MIT Alumni || Entrepreneur || Open Source Project Owner || Blogger
Some machine learning solutions are marketed as easy-to-use, point-and-click products that can be treated as a black box. The reason is understandable. By touting ease of use, vendors get a wider reach for their products, and machine learning practitioners get instant gratification.
If you are working on a critical Machine Learning application, taking such a shortcut for the sake of expediency is not advisable. There may be serious consequences.
Unfortunately, Machine Learning is not like most other technologies. It's a deep and complex multidisciplinary field founded on Math, Statistics, Information Theory and Neuroscience. There is no quick and easy way to become an expert.
Selecting the appropriate algorithm for a given problem and data set is not trivial. Once one or more candidate algorithms have been selected, tuning each algorithm through its various configuration parameters is a tedious and painstaking process. The same applies to feature engineering.
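To give a feel for what tuning involves, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the data is synthetic, the algorithm is a toy k-nearest-neighbors classifier, and the names (make_data, knn_predict, cross_val_accuracy, grid_search) are hypothetical, not taken from any product or library. It tries a few values of a single tuning parameter, k, and scores each one with 5-fold cross validation.

```python
import random
from collections import Counter


def make_data(n=200, seed=0):
    """Synthetic two-class data: two overlapping 2-D Gaussian blobs (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 1.5
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k):
    """Majority vote among the k training points closest to `point`."""
    nearest = sorted(
        train,
        key=lambda row: (row[0][0] - point[0]) ** 2 + (row[0][1] - point[1]) ** 2,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, k, folds=5):
    """Mean accuracy over `folds` held-out folds."""
    scores = []
    for f in range(folds):
        test = data[f::folds]  # every folds-th row, starting at offset f
        train = [row for i, row in enumerate(data) if i % folds != f]
        correct = sum(knn_predict(train, p, k) == y for p, y in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


def grid_search(data, candidates):
    """Return (best_k, scores), where scores maps each candidate k to its CV accuracy."""
    scores = {k: cross_val_accuracy(data, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


data = make_data()
best_k, scores = grid_search(data, [1, 5, 15, 45])  # odd values avoid tied votes
for k, acc in scores.items():
    print(f"k={k:>2}  cv_accuracy={acc:.3f}")
print("selected k:", best_k)
```

Even in this toy setting there are several decisions to make: the candidate values of k, the number of folds, and the score used to pick a winner. A real problem multiplies these across many parameters and many algorithms.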
Point and click and voila
A vendor may claim that you can just feed in some training data, click a button and voila, you have a predictive model. Any experienced Machine Learning practitioner will be skeptical and curious about the inner workings of the product under the hood. They will immediately ask the following questions.
- What is the algorithm?
- How was the algorithm selected?
- What are the tuning parameters?
- If the tuning parameters are not exposed to the user, what are they set to?
- What kind of validation technique was used while training the model?
- What kind of feature engineering was done?
- What performance criterion was used to train the model?
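The last question matters more than it may seem. As a small sketch with made-up labels (the two "models" below are hypothetical prediction lists, not output from any real product), consider an imbalanced data set with 95 negatives and 5 positives. A model that always predicts the negative class looks better on accuracy, yet it never finds a single positive case.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive cases that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Made-up labels: 95 negatives followed by 5 positives.
y_true = [0] * 95 + [1] * 5

# Hypothetical model A: always predicts the majority (negative) class.
always_negative = [0] * 100

# Hypothetical model B: 10 false alarms among the negatives, finds 4 of the 5 positives.
careful = [1] * 10 + [0] * 85 + [1] * 4 + [0]

for name, y_pred in [("always_negative", always_negative), ("careful", careful)]:
    print(f"{name:>15}  accuracy={accuracy(y_true, y_pred):.2f}  recall={recall(y_true, y_pred):.2f}")
```

If the black box optimizes accuracy, it would prefer the first model. If missing a positive case is costly, as in fraud or disease detection, the second model is the better one. Without knowing the criterion, you cannot tell which trade-off the product made for you.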
From novice to maverick
Let's consider Machine Learning practitioners at three levels of expertise and see how they might respond to the black box Machine Learning products.
- Novice: They have a shallow understanding of machine learning theory and algorithms. They are not knowledgeable about the nuances of different algorithms. They are more likely to be open to black box solutions.
- Advanced: They have deep knowledge of machine learning theory and different algorithms. They are comfortable selecting an appropriate algorithm for a problem and tuning it. They are likely to be very skeptical about black box solutions.
- Maverick: These people have all the traits of advanced level Machine Learning engineers and more. They have very deep knowledge and are always curious. If the implementation source code is available, they are likely to read it to understand the inner workings of the algorithms. They may have ideas about improving an algorithm and implement their own version. It's difficult to imagine these people using black box Machine Learning solutions.
Ease of use is always a desirable goal for any product. However, there is a limit to how easy a product can be made before you start sacrificing value and quality.