Build vs Buy: If You Are Buying Machine Learning, You Are Doing It Wrong
Dr M Maruf Hossain, PhD
Global Top 100 Innovators in Data and Analytics 2024 | Leading organisational transformation with Data, AI, and Automation | Thought Leadership | Strategy to Execution | Keynote Speaker | ex-IBM, Infosys, Telstra | INTJ
Artificial Intelligence (AI), or more precisely Machine Learning (ML), has been an industry trend for the past 10 years. From buzzword to a new way of automating and making decisions, ML has gone mainstream. I have engaged in conversations with multiple organisations keen to do ML, regardless of whether they had a real use case. Graduate schools are offering lightweight courses to the masses, consulting companies are providing services on adopting ML, and technology companies are building and selling products using ML.
I’m not going to recap the definition of ML; there are already hundreds of sites providing that. Nor am I going to talk about what machines learn or how they learn. Instead, I’d encourage you to think about your own childhood. How did you learn?
Most humans learn through reading, asking questions, observing, and so on. Regardless of whether the learner is human or machine, one thing common to the learning process is the need for samples, or examples. A kid is shown a ball, a car, or the colour red, and they learn to identify the object. A kid with a learning difficulty (unfortunately, 40 years ago, we might not have hesitated to call them a 'dumb kid') might need to be told more than once. In that respect, our computers are the dumbest of all. Despite their speed of execution, they need thousands of samples to learn a single concept. Not only does a model need a large amount of data to learn a concept, it also needs a substantial amount of time and computing power to do so. Practitioners call this process model training.
Often these learnings don’t translate very well. For example, when I first came to Australia, I realised that what we call an ‘orange’ back home is called a ‘mandarin’ here, and an orange is something different altogether. I learned that fact and adjusted my vocabulary. But for a model to absorb a fact this simple, the entire time- and compute-intensive training process has to be revisited. Deep learning does offer follow-on techniques, such as fine-tuning, and there is reinforcement learning for decision-making tasks, but these, too, amount to another time-consuming round of model training.
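To make the retraining point concrete, here is a minimal sketch with made-up texts and labels, using scikit-learn purely for illustration: correcting even a single label mapping in a trained classifier generally means re-running the whole fit, because most trained models expose no way to patch one fact.

```python
# A minimal sketch (made-up texts and labels) of why a "small" fact change
# forces a full retrain: most trained models expose no way to patch one mapping.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "small sweet citrus, easy to peel",   # what I called an 'orange' back home
    "round red fruit with crisp flesh",
]
labels = ["orange", "apple"]

model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)

# New fact arrives: the first description should be labelled 'mandarin'.
# There is no method to patch a single learned mapping -- the only option is
# to correct the training set and re-run the entire (potentially expensive) fit.
labels[0] = "mandarin"
model.fit(texts, labels)  # full retrain, not an incremental update
```

On two toy samples this is instant; on production-scale data the same correction costs the full training budget again.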
Ergo, models trained on one set of data often don’t work very well on another set of data. IBM Watson Health is a prime example of such a multi-million-dollar failure. Its biggest setback was the revelation that its cancer diagnostics tool had been trained not on real patient data but on hypothetical cases provided by a small group of doctors in a single hospital. Synthetic data is usually bad for ML: it rarely generalises well to the population, and matching the population’s distribution is very hard, which creates blind spots that don’t necessarily generalise to all cases. Even when real-world data is used, there is no guarantee the model will not encounter completely new patterns in production, so a model re-training mechanism needs to be in place.
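The distribution-mismatch problem is easy to demonstrate in miniature. Below is a sketch with entirely made-up numbers (not Watson’s actual data): a model trained on a narrow, hand-crafted slice of the feature space scores well there, but carries a blind spot that only shows up on the broader population.

```python
# Sketch: training on a narrow "hypothetical cases" slice hides a regime
# (x0 > 2) the model never sees, so it fails on the wider real population.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def true_label(X):
    # The real-world rule has a twist in the x0 > 2 regime
    # that the narrow synthetic slice almost never exhibits.
    return ((X[:, 0] + X[:, 1] > 0) ^ (X[:, 0] > 2)).astype(int)

# "Hypothetical cases": a tight cluster around the origin.
X_synth = rng.normal(loc=0.0, scale=0.5, size=(1000, 2))
# "Real patients": a wider, shifted population where x0 > 2 actually occurs.
X_real = rng.normal(loc=1.0, scale=2.0, size=(1000, 2))

model = LogisticRegression().fit(X_synth, true_label(X_synth))
print("accuracy on synthetic-like data:", model.score(X_synth, true_label(X_synth)))
print("accuracy on the real population:", model.score(X_real, true_label(X_real)))
```

The first score is near-perfect, the second drops sharply: the blind spot was invisible until the data distribution changed.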
IBM’s failure was compounded by other factors, such as treating ML like any other software and trying to apply a pre-built solution to any problem they could get their hands on. ML doesn’t scale well with that approach; a solution needs business acumen to work. Moreover, marketing hype never outruns accountability. While IBM poured money into marketing, it had neither out-of-the-box results nor a proper mechanism for operationalising machine learning, known as MLOps, to live up to the hype it was creating. Disruptive innovations are always a gamble, and without proper, thorough testing it is hard to quantify their effects.
AutoML to the rescue of the followers, but…
Other companies like Google, which were already developing and selling machine learning models, either added a mechanism to operationalise machine learning to their product suite or repurposed their product entirely into an MLOps solution. Furthermore, to attract more people to their solutions, they added another technically flawed mechanism into the mix: AutoML. AutoML is little more than a bunch of algorithms applied to your data; whichever algorithm produces the best metric on the “test data” you provided is selected as the model. This selection can be tricky, and there are several questions the metric alone cannot answer. Did the test data have good coverage? Could a different model perform better if (a) a thousand more samples were added to the test data, or (b) a different set of data were used as test data? The only upside of AutoML is that it gives a baseline very quickly compared to using a data scientist, but those baseline models are hardly production-worthy.
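As a caricature of that selection process, here is a minimal sketch (illustrative scikit-learn models and a toy dataset, not any vendor’s actual pipeline): fit a fixed menu of algorithms and keep whichever happens to score best on the one test split you supplied.

```python
# A deliberately simplified stand-in for basic AutoML: try a fixed menu of
# algorithms and keep whichever scores best on the provided test split.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

candidates = [
    LogisticRegression(max_iter=1000),
    DecisionTreeClassifier(random_state=0),
    RandomForestClassifier(random_state=0),
]

# "AutoML": rank everything by one metric on one test set, keep the winner.
best = max(candidates, key=lambda m: m.fit(X_train, y_train).score(X_test, y_test))
print("selected model:", type(best).__name__)
# The winning score says nothing about the coverage of X_test, or whether
# the ranking would survive a different split or a thousand more samples.
```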
There are several downfalls to using AutoML. First, it gives a false sense of security: initially it works with the given data and makes organisations comfortable using it, and, just like any automation, the more you rely on it, the more catastrophic the eventual failure. It is also easy to introduce data bugs, and due to AutoML’s sometimes opaque nature, these bugs are very hard to spot. In a neptune.ai blog, Alexandru Burlacu recorded that Google Vision AutoML (beta) was using training data in the validation set and thus reporting 98.8% accuracy on a binary classification problem, while their custom-built solution couldn’t produce more than 69%. Once the problem was fixed, Google Vision AutoML only yielded 67%.
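That leakage bug is easy to reproduce in miniature. The sketch below uses made-up data, not Google Vision AutoML itself: once training rows contaminate the validation set, the reported accuracy inflates, and nothing in the number alone reveals the bug.

```python
# Sketch of a data-leakage bug: training rows leaking into the validation set
# inflate the reported accuracy, and the metric alone cannot expose it.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_informative=5, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=1)

model = RandomForestClassifier(random_state=1).fit(X_train, y_train)

# Buggy "validation" set: half of its rows are actually training data,
# which the forest has effectively memorised.
X_leaky = np.vstack([X_val[:150], X_train[:150]])
y_leaky = np.concatenate([y_val[:150], y_train[:150]])

print("leaky validation accuracy:", model.score(X_leaky, y_leaky))  # inflated
print("clean validation accuracy:", model.score(X_val, y_val))      # honest
```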
Because AutoML fixates on the data used for training and validation, it is always prone to overfitting: it spends so much time and compute on optimisation that it ends up over-optimised for the given data. This is the primary reason AutoML solutions show excellent results at the initial stage and almost always fail in production in the long run.
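A small experiment illustrates this over-optimisation, again with made-up data: the more configurations a search loop tries against one fixed validation set, the more the winning score reflects luck on that set, and a gap typically opens up against genuinely unseen data.

```python
# Sketch: an exhaustive search loop over-optimises for one validation set.
# The best validation score found is typically higher than the same model's
# score on genuinely unseen test data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# flip_y=0.3 injects label noise, so validation scores scatter with luck.
X, y = make_classification(n_samples=400, n_informative=4, flip_y=0.3, random_state=2)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=2)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=2)

best_score, best_model = -1.0, None
for depth in range(1, 20):           # an "AutoML-style" search loop
    for seed in range(30):
        m = DecisionTreeClassifier(max_depth=depth, random_state=seed)
        s = m.fit(X_train, y_train).score(X_val, y_val)
        if s > best_score:
            best_score, best_model = s, m

print("best validation score found:", best_score)
print("same model on unseen test data:", best_model.score(X_test, y_test))
```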
Finally, AutoML often combines multiple algorithms, making the result harder to interpret, which matters most when interpretability and explainability requirements are paramount. And if the model underperforms and needs to be debugged, this drawback can render the models generated by AutoML practically useless.
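To see that interpretability gap concretely, compare a single tree with a stacked blend of the kind AutoML often produces. The sketch below uses a toy dataset and arbitrarily chosen scikit-learn models: the tree exports as readable rules, while the stack has no equivalent view.

```python
# Sketch of the interpretability gap: a lone decision tree can be printed as
# if/else rules, while a stacked blend of models offers no such export.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=300, random_state=3)

tree = DecisionTreeClassifier(max_depth=3, random_state=3).fit(X, y)
print(export_text(tree))  # human-readable decision rules

blend = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=3)),
                ("lr", LogisticRegression(max_iter=1000))],
    final_estimator=LogisticRegression(),
).fit(X, y)
# There is no rule export for `blend`: debugging an underperforming stack
# means reasoning about several base models plus a meta-learner at once.
```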
Only very trivial scenarios are usually fully covered by AutoML, and those are exactly the ones vendors demonstrate while selling their products. Their sales techniques remind me of charlatans selling snake oil. They advertise ML success stories as achievements of their tools, which is like crediting the car for winning the race rather than the driver. Search for “formula one winners” on Google. Do you see Ferrari, McLaren, or any other makers on the list? No, you see the drivers behind the wheel. The same is true for everything, including ML.
What should organisations really do?
While buying pre-built models or AutoML is outright wrong, buying an MLOps solution, or a platform for your MLOps to run on, is indeed a better choice. But buying the platform is not what brings success; the business still has to convert that investment into success. The value must be extracted from those solutions, and that value comes not only from internally owned data but from how well that data is processed and translated into business outcomes.
Before embarking on an ML journey, first do a self-assessment. Here is a quick questionnaire:
If you answer ‘yes’ to the first two questions, then go for a solution. But unless you have also answered ‘yes’ to the third question, you are not ready for ML yet.
Remember, our businesses are neither data-driven nor process-driven; they are value-driven.