Chasing Two Rabbits - Bringing AI into the Enterprise
Alan A. Varghese
Industry Consultant in Wireless, Satcom, IoT, Cable & Optical, Automotive, Semiconductors, Aerospace
At Microsoft’s Inspire 2019 partner conference, CEO Satya Nadella said something that caught my attention. He said that in 2030, tech spend will be 10% of global GDP and will amount to a staggering $14 trillion! But that was not the part that caught my attention. It was what he said immediately after: though the 10% is important, the remaining 90% of global GDP represents the opportunity for the company, as businesses, enterprises, and industries worldwide get digitized.
Digitization is about connecting things and processes to provide insights; it offers the potential for more agile business operations, closer employee engagement, and improved customer experiences. Digitization will generate data at high volume, variety, and velocity, so capabilities such as artificial intelligence (AI) and machine learning (ML) need to be deployed concurrently in order to convert data to insights to action.
The challenges with deploying Artificial Intelligence and Machine learning
As I talk to different enterprises, they say that the challenges with deploying AI and ML are considerable and that many decisions need to be made, starting with determining which areas would be a good fit for AI and ML. Data has to be cleaned and pre-processed, which can take as much as 50-60% of the project time: nulls and outliers have to be dealt with, and the data may need to be rescaled. The data scientist then has to determine which features are important and choose the ML model from a plethora of choices (Bayesian regression, gradient boosting, nearest neighbors, support vector machines…). There are clustering models, classification and regression models, and association rule models; no single model works best on all datasets, and different variables, coefficients, and hyperparameters have to be tried. After choosing a model, hyperparameters will probably have to be tuned, and it may be necessary to reprocess the data.
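To make the pre-processing steps above concrete, here is a minimal sketch for a single numeric column, using only the Python standard library. The function names, the median fill, the 1.5 x IQR clipping rule, and the sample values are my own illustrative assumptions; real projects would choose these per dataset.

```python
from statistics import mean, median, pstdev, quantiles

def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def clip_outliers(values, k=1.5):
    """Clip values lying more than k * IQR outside the quartiles (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def preprocess(values):
    """Nulls first, then outliers, then rescaling."""
    return standardize(clip_outliers(impute_missing(values)))

# Hypothetical column with two nulls and one obvious outlier (400.0).
raw = [12.0, None, 15.0, 14.0, 13.0, 400.0, None, 16.0]
clean = preprocess(raw)
print([round(v, 2) for v in clean])
```

The order matters in this sketch: imputing before clipping keeps the column length intact, and clipping before rescaling stops the single outlier from dominating the mean and standard deviation.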
This process of finding the best ML model is a trial-and-error, iterative method which can take a lot of time (unless just a few parameters need tuning, or a lot of money is spent on high-power processing platforms). The data scientist has to decide how many of these iterations to do, whether to try different model families, whether to further tune hyperparameters, and whether to send data back for preprocessing. Even if the enterprise has enough data scientists and software engineers, these professionals may have to spend hundreds of hours each month building and maintaining these ML projects, and aiming for higher accuracy would mean delaying other enterprise ML projects. Things get worse when models need to be continuously upgraded: in financial applications, for example, there may be new regulations to consider, and in retail, the latest market feedback.
Automating the steps of Machine learning
Wouldn’t it be great if all of these steps could be done automatically, and there existed a service that identifies the best machine learning pipelines? The program would take in the enterprise’s dataset, optimization metrics, and any constraints in time and cost as inputs, and come up with an ML model. This would enable a large number of iterations to be run over many combinations of algorithms and hyperparameters, which would result in the best model found for the input criteria.
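As a rough illustration of the idea, and not of how any particular product works, here is a toy random search in plain Python. It takes a dataset, a metric (validation mean squared error), and two constraints (a trial count and a time budget), then tries two model families with randomly sampled hyperparameters. The names `auto_search` and `SEARCH_SPACE`, the two tiny models, and the synthetic data are all assumptions made for this sketch.

```python
import random
import time

def fit_ridge(train, alpha):
    """One-dimensional ridge regression; returns a predict function."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxy = sum((x - mx) * (y - my) for x, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    slope = sxy / (sxx + alpha)
    return lambda x: my + slope * (x - mx)

def fit_knn(train, k):
    """k-nearest-neighbors regression; returns a predict function."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def mse(predict, data):
    """Optimization metric: mean squared error on held-out data."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

# Each family maps to (fit function, hyperparameter sampler).
SEARCH_SPACE = {
    "ridge": (fit_ridge, lambda rng: {"alpha": 10 ** rng.uniform(-3, 2)}),
    "knn": (fit_knn, lambda rng: {"k": rng.randint(1, 15)}),
}

def auto_search(train, valid, max_trials=50, time_budget_s=2.0, seed=0):
    """Random search over families and hyperparameters within the budget."""
    rng = random.Random(seed)
    deadline = time.monotonic() + time_budget_s
    best = None
    for _ in range(max_trials):
        if time.monotonic() > deadline:
            break  # time constraint reached
        family = rng.choice(sorted(SEARCH_SPACE))
        fit, sample = SEARCH_SPACE[family]
        params = sample(rng)
        score = mse(fit(train, **params), valid)
        if best is None or score < best["mse"]:
            best = {"family": family, "params": params, "mse": score}
    return best

# Synthetic, noisy quadratic data standing in for an enterprise dataset.
data_rng = random.Random(42)
points = [(x / 10, (x / 10) ** 2 + data_rng.gauss(0, 0.1)) for x in range(-30, 31)]
data_rng.shuffle(points)
train, valid = points[:45], points[45:]

best = auto_search(train, valid)
print(best["family"], best["params"], round(best["mse"], 4))
```

Real automated ML services use far smarter search strategies than random sampling, but the inputs and output have the same shape: data, a metric, and constraints go in, and the best model found within the budget comes out.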
Microsoft’s Automated ML
Well, this seems to be what Microsoft’s Automated ML product, available through their Azure Machine Learning service, does; it automates feature engineering, algorithm selection, and hyperparameter tuning to find the best model for enterprise data. Last week I got the opportunity to attend a session on Automated ML at Microsoft’s office in Atlanta. Many thanks to Mark Tabladillo - Data and Artificial Intelligence Scientist at Microsoft - for an excellent presentation explaining the concepts behind Automated ML.
Domain-specific pretrained models for Language, Search, Speech, and Vision applications are supported, as are popular ML frameworks such as TensorFlow, PyTorch, ONNX, and Scikit-learn, on processors such as CPU, FPGA, and GPU. The program takes in input data along with goals and constraints and tests multiple models in parallel, reducing development time. Guardrails can be put in place, such as detection of high-cardinality features, class imbalance, leaky features, and overfitting, as well as missing value imputation.
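To show what a few of these guardrails check for, here is a small standard-library sketch of three of them: high-cardinality features, class imbalance, and missing values. This is my own illustration, not Azure’s API; the function names, the thresholds (`max_unique_ratio`, `min_share`), and the sample rows are hypothetical.

```python
from collections import Counter

def high_cardinality_features(rows, max_unique_ratio=0.5):
    """Flag text columns whose share of distinct values exceeds the threshold."""
    flagged = []
    for column in rows[0]:
        observed = [row[column] for row in rows if row[column] is not None]
        if not observed or not all(isinstance(v, str) for v in observed):
            continue  # only categorical (string) columns are checked here
        if len(set(observed)) / len(observed) > max_unique_ratio:
            flagged.append(column)
    return flagged

def class_imbalance(labels, min_share=0.2):
    """Return each class whose share of the labels falls below min_share."""
    total = len(labels)
    return {
        label: count / total
        for label, count in Counter(labels).items()
        if count / total < min_share
    }

def missing_value_report(rows):
    """Count None entries per column, as a first step before imputation."""
    return {column: sum(row[column] is None for row in rows) for column in rows[0]}

# Hypothetical customer table: a unique ID, a two-valued region, one missing spend.
rows = [
    {
        "customer_id": f"C{i:03d}",
        "region": "east" if i % 2 else "west",
        "spend": None if i == 3 else 10.0 * i,
    }
    for i in range(10)
]
labels = ["churn" if i == 0 else "stay" for i in range(10)]

print(high_cardinality_features(rows))  # ['customer_id']
print(class_imbalance(labels))          # {'churn': 0.1}
print(missing_value_report(rows))       # {'customer_id': 0, 'region': 0, 'spend': 1}
```

A unique ID column like `customer_id` carries no generalizable signal, and a 10% minority class can make plain accuracy misleading, which is why checks like these are worth running before any model search starts.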
Microsoft’s aim with Automated ML seems to be to democratize and scale AI, and to enable developers to rapidly build solutions. Some of the companies using Azure for ML include Accenture, BT, Carnival Cruises, Jabil, Juniper, Lennox, Mercedes-Benz, the NBA, Progressive Insurance, Schneider Electric, Uber, and UPS.
Chasing two rabbits
“The man who chases two rabbits, catches neither” is a quote attributed to the Chinese sage Confucius. Enterprises today feel like they have to chase two rabbits. On the one hand, they realize that their entire industry, supply chain, and ecosystem are getting digitized, and they feel pressure to pursue the same, and quickly. At the same time, in order to manage the data that digitization produces, they realize they need to simultaneously pursue new technologies such as artificial intelligence and machine learning.
With the Automated ML product, Microsoft is helping chase down one of the rabbits. Enterprises can now focus on chasing down the other.
P.S. and Questions:
1) What are your thoughts regarding digitization? Do you feel the pace of digitization in your industry is too fast, and that the RoI is still not clear?
2) Do you feel that, along with digitization, you have to concurrently deploy AI and ML in the enterprise? If so, what are the challenges?
Feel free to comment or message me.
Good observation, Alan. Just to add, regardless of the tools and platforms, we see that building production-grade, enterprise-quality, end-to-end ML projects needs seasoned teams and SMEs. We spend a lot of time evangelizing and educating our customers on operationalizing these ML projects. Oftentimes that becomes the catalyst for success rather than the platform or tool used.