What is AutoML? [Part 1]
Over the past decade, the rapid growth of Machine Learning (ML) and its potentially high Return on Investment (ROI) have driven the adoption of ML-based applications across industries and generated significant demand for ML expertise in all technical fields.
However, 1) the complexity of ML models, 2) the lengthy processes of model development and deployment, and 3) the expensive human resources required (data scientists and ML engineers) have raised questions about how much ROI ML applications actually generate. Automated Machine Learning (AutoML) addresses these concerns by automating the end-to-end ML process.
What is AutoML?
Conventional ML processes consist of three main components: Data Processing, Model Development, and Model Deployment. Data Processing refers to the processes of cleaning and normalizing the data, and selecting the right features to feed into the model. Model Development consists of selecting the family of algorithms, feeding training data into the model, tuning the parameters and finally evaluating the model.
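As a rough illustration of the Data Processing step, the sketch below min-max normalizes each feature and then ranks features by their absolute Pearson correlation with the label. The toy rows, the function names, and the choice of correlation as the selection criterion are all assumptions made for this example; real projects (and AutoML tools) may normalize and select features quite differently.

```python
import math

def min_max_normalize(rows):
    """Rescale every column of `rows` to the [0, 1] range."""
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; map it to 0.0.
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(r) for r in zip(*scaled_columns)]

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def select_features(rows, labels, top_k):
    """Return the indices of the `top_k` features most correlated with the labels."""
    columns = list(zip(*rows))
    scores = [abs(pearson(col, labels)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:top_k])

# Hypothetical toy data: feature 0 tracks the label, feature 1 is constant,
# and feature 2 is unrelated to the label but lives on a much larger scale.
raw_rows = [
    [1.0, 5.0, 300.0],
    [2.0, 5.0, 100.0],
    [8.0, 5.0, 250.0],
    [9.0, 5.0, 150.0],
]
labels = [0, 0, 1, 1]

normalized = min_max_normalize(raw_rows)
selected = select_features(normalized, labels, top_k=1)
print("selected feature indices:", selected)
```

Normalizing first puts all features on a comparable scale, so the large raw values of feature 2 do not dominate any distance-based model trained afterwards.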
Data Processing and Model Development are iterative: practitioners repeat them over 1) various algorithms, 2) various feature sets, and 3) various parameters to find the best model. As a result, they are extremely time-consuming and resource-intensive.
To improve these two processes, AutoML has:
1) Introduced automation to the end-to-end iterative process, from data processing (cleaning/normalization and feature selection) to model search/development (model selection, hyperparameter tuning, and model evaluation).
This automation enables models with various algorithms, features, and parameters to be developed and evaluated in parallel. Hence, candidate models can be evaluated efficiently, without the need for manual exploration by data scientists/ML engineers.
Specific to Deep Learning, AutoML can search for a suitable deep neural network architecture in an automated fashion. This involves multiple iterations to explore the right number of layers and neurons, again without the need for manual exploration by data scientists/ML engineers.
2) Simplified the whole process to reduce the need for deep technical expertise, while keeping model accuracy high.
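The automated search over algorithms, feature sets, and parameters can be sketched in a few lines of plain Python. This is a minimal toy under stated assumptions, not how any particular AutoML package works: the synthetic dataset, the two hand-written classifiers (k-nearest neighbours and nearest centroid), the candidate grid, and every function name here are hypothetical choices made for illustration.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def make_dataset(n, seed):
    """Hypothetical toy data: feature 0 separates the classes, feature 1 is noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        rows.append((rng.gauss(3.0 * label, 1.0), rng.gauss(0.0, 5.0)))
        labels.append(label)
    return rows, labels

def project(row, features):
    """Keep only the selected feature columns of a row."""
    return tuple(row[i] for i in features)

def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0

def centroid_predict(train_x, train_y, x, _unused):
    """Assign x to the class whose mean point is closest."""
    best_label, best_dist = 0, float("inf")
    for label in (0, 1):
        members = [p for p, y in zip(train_x, train_y) if y == label]
        centroid = tuple(sum(c) / len(members) for c in zip(*members))
        dist = math.dist(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Search space: algorithm family -> (predict function, parameter values to try).
ALGORITHMS = {
    "knn": (knn_predict, [1, 5, 15]),
    "centroid": (centroid_predict, [None]),
}
FEATURE_SETS = [(0,), (1,), (0, 1)]

def evaluate(candidate, train, valid):
    """Return (validation accuracy, candidate) for one configuration."""
    name, features, param = candidate
    predict = ALGORITHMS[name][0]
    train_x = [project(r, features) for r in train[0]]
    correct = sum(
        predict(train_x, train[1], project(row, features), param) == label
        for row, label in zip(*valid)
    )
    return correct / len(valid[0]), candidate

def automl_search(train, valid, workers=4):
    """Evaluate every (algorithm, feature set, parameter) combination concurrently."""
    candidates = [
        (name, features, param)
        for name, (_, params) in ALGORITHMS.items()
        for features in FEATURE_SETS
        for param in params
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda c: evaluate(c, train, valid), candidates))
    best = max(results, key=lambda r: r[0])
    return best, results

train = make_dataset(200, seed=0)
valid = make_dataset(100, seed=1)
(best_score, best_candidate), results = automl_search(train, valid)
print("best configuration:", best_candidate, "accuracy:", best_score)
```

Threads keep the sketch short; because these evaluations are CPU-bound, a real search would more likely spread candidates across processes or machines, and would use smarter strategies than trying every combination.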
What are the benefits of AutoML?
In addition to the efficiencies it creates, AutoML aims to provide easy-to-understand, easy-to-use platforms that enable personas other than data scientists and ML engineers to develop and deploy ML models. These personas include product managers, data analysts, software engineers, and business analysts who want to solve a business or technical problem using ML.
In summary, AutoML delivers the following benefits:
- Faster Time-to-Market (TTM): by bringing automation to the end-to-end processes.
- Cost Saving: by reducing the need for expensive technical resources.
- Error Reduction: by removing the manual iterations in development and evaluation.
What are popular examples of AutoML?
One of the first AutoML packages I was introduced to was Auto-WEKA, which can be combined with WEKA. Other popular packages are TPOT, which is built on top of scikit-learn, and MLBox. Specific to Deep Learning, Auto-PyTorch, distributed under the Apache license, and AutoKeras, developed at Texas A&M, are two popular solutions.
In recent years, large corporations such as Amazon, Google, and Microsoft have invested heavily in AutoML as well. In the next article, I would like to introduce the AutoML solutions from these companies.