AutoML: Neural nets that design neural nets

AutoML: how can we automate model selection and hyper-parameter optimization?

Sundar Pichai, CEO of Google, wrote: “Today, designing neural nets is extremely time intensive, and requires an expertise that limits its use to a smaller community of scientists and engineers. That’s why we’ve created an approach called AutoML, showing that it’s possible for neural nets to design neural nets.”

Jeff Dean, Google’s Head of AI, said: “100x computational power could replace the need for machine learning expertise.”

Well, I don’t totally agree!

In practice, machine learning and deep learning practitioners rarely need to design new neural network architectures for their particular problems. What they need are effective ways to make existing networks generalize better to their tasks. Transfer learning, for example, reuses models that have been pre-trained on larger data sets for related tasks, so there is no need to train a large model from scratch. Practically speaking, most of the time (like 99% of the time) there is no need to search for new network architectures with neural architecture search, such as the AutoML service provided by Google.
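For illustration, here is a minimal transfer-learning sketch using Keras; the base model, input shape, and number of target classes are assumptions made for this example, not choices from the article.

```python
# A minimal transfer-learning sketch with Keras.
import tensorflow as tf
from tensorflow.keras import layers, models

# Load a model pre-trained on ImageNet, without its classification head.
base = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained weights

# Add a small task-specific head and train only that part.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),  # e.g. 10 target classes (assumption)
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5)  # your own (smaller) data set
```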

Neural architecture search

The basic idea is to have a controller recurrent neural network (RNN) that samples building blocks and puts them together to create a new end-to-end architecture (one that typically looks like a ResNet). The new network is trained to obtain some accuracy on a validation set, and that accuracy is used as a reward to update the controller so that it generates "better" architectures over time. The controller weights are updated by policy gradient.

Figure from the main paper: "Learning Transferable Architectures for Scalable Image Recognition"
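To make the search loop concrete, here is a toy sketch of a controller trained with REINFORCE (policy gradient). The operation list, the simplified controller (a learned distribution per layer rather than a full RNN), and the faked reward function are all illustrative assumptions, not the actual NASNet implementation.

```python
# Toy sketch of the NAS controller loop with REINFORCE (policy gradient).
import torch
import torch.nn as nn

OPS = ["conv3x3", "conv5x5", "maxpool3x3", "identity"]  # candidate building blocks
NUM_LAYERS = 4  # the controller picks one operation per layer

class Controller(nn.Module):
    """Toy controller: one learned categorical distribution per layer."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(NUM_LAYERS, len(OPS)))

    def sample(self):
        dist = torch.distributions.Categorical(logits=self.logits)
        actions = dist.sample()                  # one op index per layer
        log_prob = dist.log_prob(actions).sum()  # log-prob of the sampled architecture
        return actions, log_prob

def train_and_evaluate(actions):
    # Placeholder reward: in real NAS this builds the child network from
    # `actions`, trains it, and returns its validation accuracy.
    return float((actions == 0).float().mean())

controller = Controller()
optimizer = torch.optim.Adam(controller.parameters(), lr=0.1)
baseline = 0.0
for step in range(200):
    actions, log_prob = controller.sample()
    reward = train_and_evaluate(actions)          # validation accuracy as reward
    baseline = 0.9 * baseline + 0.1 * reward      # moving-average baseline reduces variance
    loss = -(reward - baseline) * log_prob        # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("Most likely architecture:",
      [OPS[int(i)] for i in controller.logits.argmax(dim=1)])
```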

Just upload your data and Google’s algorithm will find you an architecture, quick and easy!

We need to understand when neural architecture search is actually worth using to find or invent new architectures for us. Here are some research examples (all aiming for comparable accuracy on the CIFAR-10 classification task):

  • NASNet: Learning Transferable Architectures for Scalable Image Recognition. NASNet’s basic idea is to search for an architectural building block on a small data set (CIFAR-10) and then build an architecture for a large data set (ImageNet) out of that block. The search is computationally intensive: roughly 1800 GPU days (Google used 500 GPUs for 4 days).
  • AmoebaNet: Regularized Evolution for Image Classifier Architecture Search. AmoebaNet consists of cells learned via an evolutionary algorithm, and can match or surpass hand-crafted and reinforcement-learning-designed image classifiers. However, AmoebaNet was even more computationally intensive than NASNet: about 3150 GPU days to learn the architecture.
  • ENAS: Efficient Neural Architecture Search. Much less expensive than standard neural architecture search: the child models share (transfer) parameters with one another, and the search runs on a single GPU in just 16 hours.
  • DARTS: Differentiable Architecture Search. Released by researchers from Carnegie Mellon University (CMU) and DeepMind, DARTS treats the space of candidate architectures as continuous rather than discrete (much more efficient than black-box search), which allows gradient-based optimization. DARTS takes only about 4 GPU days; a minimal sketch of the idea follows this list.
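Here is a minimal sketch of the DARTS idea on a single edge of the search space: every candidate operation is kept, and their outputs are mixed with softmax-normalized architecture weights that can be learned by gradient descent. The operation set and tensor sizes are illustrative assumptions, not the paper's exact search space.

```python
# Minimal sketch of the DARTS continuous relaxation on one edge.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # conv 3x3
            nn.Conv2d(channels, channels, 5, padding=2),  # conv 5x5
            nn.MaxPool2d(3, stride=1, padding=1),          # max pool 3x3
            nn.Identity(),                                 # skip connection
        ])
        # One architecture parameter (alpha) per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)  # continuous relaxation
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# After the search, the edge is discretized to its highest-weighted operation.
mixed = MixedOp(channels=16)
out = mixed(torch.randn(1, 16, 32, 32))
print("output shape:", tuple(out.shape),
      "| chosen op index:", int(mixed.alpha.argmax()))
```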

AutoML is not only from Google

Here are some examples of AutoML libraries:

  • AutoWEKA: released in 2013; automatically chooses a model and its hyper-parameters.
  • Auto-sklearn: an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
  • H2O AutoML: automatic model training and tuning within the H2O platform.
  • TPOT: the Tree-Based Pipeline Optimization Tool, which optimizes machine learning pipelines using genetic programming.
  • Auto-Keras: uses the ENAS algorithm and provides functions to automatically search for architectures and hyper-parameters of deep learning models. It is developed by the DATA Lab at Texas A&M University and community contributors.
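As an illustration of how lightweight these libraries are in practice, here is a minimal TPOT sketch; the data set and search budget are arbitrary choices for the example, not recommendations.

```python
# Minimal AutoML sketch with TPOT.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# TPOT searches over scikit-learn pipelines (preprocessing, model,
# hyper-parameters) with a genetic algorithm.
tpot = TPOTClassifier(generations=5, population_size=20,
                      verbosity=2, random_state=42)
tpot.fit(X_train, y_train)
print("Held-out accuracy:", tpot.score(X_test, y_test))
tpot.export("best_pipeline.py")  # export the winning pipeline as scikit-learn code
```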

As a practitioner, conduct neural architecture search only if you have a novel task that requires special treatment and very high performance. In other words, turn to neural architecture search after trying the other options, when nothing else is left.

Best Regards,

Note: This article was NOT written by neural nets ;)

