AutoML: Neural nets that design neural nets
Ibrahim Sobh - PhD
Senior Expert of Artificial Intelligence, Valeo Group | LinkedIn Top Voice | Machine Learning | Deep Learning | Data Science | Computer Vision | NLP | Developer | Researcher | Lecturer
AutoML: how can we automate model selection and hyper-parameter optimization?
Sundar Pichai, CEO of Google, wrote: “Today, designing neural nets is extremely time intensive, and requires an expertise that limits its use to a smaller community of scientists and engineers. That’s why we’ve created an approach called AutoML, showing that it’s possible for neural nets to design neural nets.”
Jeff Dean, Google’s Head of AI, said: “100x computational power could replace the need for machine learning expertise.”
Well, I don’t totally agree!
Machine learning and deep learning practitioners rarely need to design new neural network architectures for their particular problems. What they need instead are effective ways to make existing networks generalize better to their tasks. For example, transfer learning leverages models that have already been pre-trained on larger data sets for related tasks, so there is no need to train a large model from scratch. Practically speaking, most of the time (like 99% of the time) practitioners do not need to search for new network architectures with “neural architecture search”, such as the AutoML service provided by Google.
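To make the transfer-learning workflow concrete, here is a minimal sketch using a Keras pre-trained backbone; the model choice, layer sizes and hyper-parameters are illustrative assumptions, not a prescription:

```python
# Minimal transfer-learning sketch: reuse an ImageNet-pretrained backbone
# and train only a small task-specific head (illustrative settings).
from tensorflow.keras.applications import ResNet50
from tensorflow.keras import layers, models

# Pretrained backbone, frozen so its generic features are kept as-is
backbone = ResNet50(weights="imagenet", include_top=False, pooling="avg",
                    input_shape=(224, 224, 3))
backbone.trainable = False

# Small head for the new task (e.g. 10 classes)
model = models.Sequential([
    backbone,
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5, validation_split=0.1)
```

Only the small head is trained here; unfreezing a few top layers of the backbone for fine-tuning is a common next step when more labelled data is available.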
Neural architecture search
The basic idea is to have a controller Recurrent Neural Network (RNN) that samples building blocks and puts them together to create a new end-to-end architecture (which typically looks like a ResNet). The new network is trained and evaluated to obtain an accuracy on a validation set. This accuracy is used as a reward to update the controller, via policy gradient, so that it generates "better" architectures over time.
Figure from the main paper: "Learning Transferable Architectures for Scalable Image Recognition"
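Here is a highly simplified sketch of the controller loop described above: the controller samples a sequence of architecture decisions, the resulting child network is trained, and its validation accuracy is the reward for a REINFORCE (policy-gradient) update. All names, sizes and hyper-parameters below are illustrative assumptions, and `build_and_train` is a stub standing in for actually training the child network.

```python
import torch
import torch.nn as nn

NUM_DECISIONS = 6      # e.g. filter size, number of filters, skip connections, ...
NUM_CHOICES = 4        # options per decision
HIDDEN = 64

class ControllerRNN(nn.Module):
    """Samples a sequence of architecture decisions, one per RNN step."""
    def __init__(self):
        super().__init__()
        self.cell = nn.LSTMCell(NUM_CHOICES, HIDDEN)
        self.head = nn.Linear(HIDDEN, NUM_CHOICES)

    def sample(self):
        h = torch.zeros(1, HIDDEN)
        c = torch.zeros(1, HIDDEN)
        x = torch.zeros(1, NUM_CHOICES)
        actions, log_probs = [], []
        for _ in range(NUM_DECISIONS):
            h, c = self.cell(x, (h, c))
            dist = torch.distributions.Categorical(logits=self.head(h))
            a = dist.sample()
            actions.append(a.item())
            log_probs.append(dist.log_prob(a))
            x = torch.nn.functional.one_hot(a, NUM_CHOICES).float()
        return actions, torch.stack(log_probs).sum()

def build_and_train(actions):
    # Placeholder: in the real setting, build the child network described by
    # `actions`, train it, and return its accuracy on the validation set.
    return torch.rand(1).item()

controller = ControllerRNN()
optimizer = torch.optim.Adam(controller.parameters(), lr=3.5e-4)
baseline = 0.0
for step in range(100):                        # number of controller updates (illustrative)
    actions, log_prob = controller.sample()
    reward = build_and_train(actions)          # child network's validation accuracy
    baseline = 0.95 * baseline + 0.05 * reward # moving-average baseline reduces variance
    loss = -log_prob * (reward - baseline)     # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```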
Just upload your data and Google’s algorithm will find you an architecture, quick and easy!
We need to understand when it makes sense to use neural architecture search to find or invent new architectures for us. Here are some research examples (all trained to reach comparable accuracy on the CIFAR-10 classification task):
- NASNet: Learning Transferable Architectures for Scalable Image Recognition. NASNet’s basic idea is to search for an architectural building block on a small data set (CIFAR-10) and then stack that block into an architecture for a large data set (ImageNet). The search is computationally intensive: 1800 GPU days (Google used 500 GPUs for 4 days).
- AmoebaNet: Regularized Evolution for Image Classifier Architecture Search. AmoebaNet consists of cells learned via an evolutionary algorithm that can match or surpass human-crafted and reinforcement-learning-designed image classifiers. However, AmoebaNet was even more computationally intensive than NASNet: 3150 GPU days to learn the architecture.
- ENAS: Efficient Neural Architecture Search. Much (much) less expensive than standard neural architecture search: parameters are shared among the candidate (child) models, a form of transfer learning, and the search runs on a single GPU in just 16 hours.
- DARTS: Differentiable Architecture Search. Research released by Carnegie Mellon University (CMU) and DeepMind; the space of candidate architectures is treated as continuous rather than discrete (much more efficient than black-box search), which allows gradient-based optimization. DARTS takes only 4 GPU days (a minimal sketch of this continuous relaxation follows this list).
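To make the continuous-relaxation idea behind DARTS concrete, here is a minimal sketch of one "mixed" edge: instead of picking a single operation, the edge computes a softmax-weighted sum of all candidate operations, so the architecture weights can be learned by gradient descent. The candidate set and sizes are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """A DARTS-style edge: a softmax-weighted mixture of candidate operations."""
    def __init__(self, channels):
        super().__init__()
        # Candidate operations for this edge (a small illustrative set)
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Identity(),
        ])
        # One architecture weight (alpha) per candidate op, learned jointly
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# Usage: after the search, the op with the largest alpha on each edge is kept
# (the discretization step that yields the final architecture).
edge = MixedOp(channels=16)
y = edge(torch.randn(1, 16, 32, 32))
```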
AutoML is not only by Google
Here are some examples of AutoML libraries:
- AutoWEKA: released in 2013; automatically chooses a model and selects hyper-parameters.
- Auto-sklearn: an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
- H2O AutoML.
- TPOT: the Tree-Based Pipeline Optimization Tool.
- Auto-Keras: uses the ENAS algorithm and provides functions to automatically search for architectures and hyper-parameters of deep learning models. It is developed by the DATA Lab at Texas A&M University and community contributors.
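To give a feel for how little code these tools require, here is a minimal auto-sklearn sketch; the dataset and time budgets are illustrative, and the same few lines look almost identical with TPOT or H2O AutoML.

```python
# Minimal auto-sklearn usage: it searches over models and hyper-parameters
# within a time budget and exposes the result as a scikit-learn estimator.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
import autosklearn.classification

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=300,   # total search budget in seconds (illustrative)
    per_run_time_limit=30,         # budget per candidate model
)
automl.fit(X_train, y_train)
print("Test accuracy:", automl.score(X_test, y_test))
```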
As a practitioner, conduct neural architecture search only if you have a novel task that requires special treatment and very high performance on your metrics. In other words, conduct neural architecture search after trying the other options, when you have no other choice.
Best Regards,
Note: This article was NOT written by neural nets ;)