AutoML: Neural nets that design neural nets
Ibrahim Sobh - PhD
Senior Expert of Artificial Intelligence, Valeo Group | LinkedIn Top Voice | Machine Learning | Deep Learning | Data Science | Computer Vision | NLP | Developer | Researcher | Lecturer
AutoML: how can we automate model selection and hyper-parameter optimization?
Sundar Pichai, CEO of Google, wrote: “Today, designing neural nets is extremely time intensive, and requires an expertise that limits its use to a smaller community of scientists and engineers. That’s why we’ve created an approach called AutoML, showing that it’s possible for neural nets to design neural nets.”
Jeff Dean, Google’s Head of AI, said: “100x computational power could replace the need for machine learning expertise.”
Well, I don’t totally agree!
Machine learning and deep learning practitioners rarely need to design new neural network architectures for their particular problems. What they need instead are effective ways to make existing networks generalize better to their tasks. For example, transfer learning leverages models that have already been pre-trained on larger data sets for related tasks, so there is no need to train a large model from scratch. Practically speaking, most of the time (like 99% of the time) practitioners do not need to search for new network architectures with “neural architecture search”, such as the AutoML service provided by Google.
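To make the transfer-learning workflow concrete, here is a minimal sketch using a Keras pre-trained backbone; the model choice, layer sizes and hyper-parameters are illustrative assumptions, not a prescription:

```python
# Minimal transfer-learning sketch: reuse an ImageNet-pretrained backbone
# and train only a small task-specific head (illustrative settings).
from tensorflow.keras.applications import ResNet50
from tensorflow.keras import layers, models

# Pretrained backbone, frozen so its generic features are kept as-is
backbone = ResNet50(weights="imagenet", include_top=False, pooling="avg",
                    input_shape=(224, 224, 3))
backbone.trainable = False

# Small head for the new task (e.g. 10 classes)
model = models.Sequential([
    backbone,
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5, validation_split=0.1)
```

Only the small head is trained here; unfreezing a few top layers of the backbone for fine-tuning is a common next step when more labelled data is available.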
Neural architecture search
The basic idea is to have a controller Recurrent Neural Network (RNN) that samples building blocks and puts them together to create a new end-to-end architecture (which typically looks like a ResNet). The new network is trained and evaluated to obtain an accuracy on a validation set. This accuracy is used as a reward to update the controller, via policy gradient, so that it generates "better" architectures over time.
Figure from the main paper: "Learning Transferable Architectures for Scalable Image Recognition"
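Here is a highly simplified sketch of the controller loop described above: the controller samples a sequence of architecture decisions, the resulting child network is trained, and its validation accuracy is the reward for a REINFORCE (policy-gradient) update. All names, sizes and hyper-parameters below are illustrative assumptions, and `build_and_train` is a stub standing in for actually training the child network.

```python
import torch
import torch.nn as nn

NUM_DECISIONS = 6      # e.g. filter size, number of filters, skip connections, ...
NUM_CHOICES = 4        # options per decision
HIDDEN = 64

class ControllerRNN(nn.Module):
    """Samples a sequence of architecture decisions, one per RNN step."""
    def __init__(self):
        super().__init__()
        self.cell = nn.LSTMCell(NUM_CHOICES, HIDDEN)
        self.head = nn.Linear(HIDDEN, NUM_CHOICES)

    def sample(self):
        h = torch.zeros(1, HIDDEN)
        c = torch.zeros(1, HIDDEN)
        x = torch.zeros(1, NUM_CHOICES)
        actions, log_probs = [], []
        for _ in range(NUM_DECISIONS):
            h, c = self.cell(x, (h, c))
            dist = torch.distributions.Categorical(logits=self.head(h))
            a = dist.sample()
            actions.append(a.item())
            log_probs.append(dist.log_prob(a))
            x = torch.nn.functional.one_hot(a, NUM_CHOICES).float()
        return actions, torch.stack(log_probs).sum()

def build_and_train(actions):
    # Placeholder: in the real setting, build the child network described by
    # `actions`, train it, and return its accuracy on the validation set.
    return torch.rand(1).item()

controller = ControllerRNN()
optimizer = torch.optim.Adam(controller.parameters(), lr=3.5e-4)
baseline = 0.0
for step in range(100):                        # number of controller updates (illustrative)
    actions, log_prob = controller.sample()
    reward = build_and_train(actions)          # child network's validation accuracy
    baseline = 0.95 * baseline + 0.05 * reward # moving-average baseline reduces variance
    loss = -log_prob * (reward - baseline)     # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```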
Just upload your data and Google’s algorithm will find you an architecture, quick and easy!
We need to understand when it makes sense to use neural architecture search to find or invent new architectures for us. Here are some research examples (all trained to reach comparable accuracy on the CIFAR-10 classification task):
- NASNet: Learning Transferable Architectures for Scalable Image Recognition. NASNet’s basic idea is to search for an architectural building block on a small data set (CIFAR-10) and then stack that block into an architecture for a large data set (ImageNet). The search is computationally intensive: 1800 GPU days (Google used 500 GPUs for 4 days).
- AmoebaNet: Regularized Evolution for Image Classifier Architecture Search. AmoebaNet consists of cells learned via an evolutionary algorithm that can match or surpass human-crafted and reinforcement-learning-designed image classifiers. However, AmoebaNet was even more computationally intensive than NASNet: 3150 GPU days to learn the architecture.
- ENAS: Efficient Neural Architecture Search. Much (much) less expensive than standard neural architecture search: parameters are shared among the candidate (child) models, a form of transfer learning, and the search runs on a single GPU in just 16 hours.
- DARTS: Differentiable Architecture Search. Research released by Carnegie Mellon University (CMU) and DeepMind; the space of candidate architectures is treated as continuous rather than discrete (much more efficient than black-box search), which allows gradient-based optimization. DARTS takes only 4 GPU days (a minimal sketch of this continuous relaxation follows this list).
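To make the continuous-relaxation idea behind DARTS concrete, here is a minimal sketch of one "mixed" edge: instead of picking a single operation, the edge computes a softmax-weighted sum of all candidate operations, so the architecture weights can be learned by gradient descent. The candidate set and sizes are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """A DARTS-style edge: a softmax-weighted mixture of candidate operations."""
    def __init__(self, channels):
        super().__init__()
        # Candidate operations for this edge (a small illustrative set)
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Identity(),
        ])
        # One architecture weight (alpha) per candidate op, learned jointly
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# Usage: after the search, the op with the largest alpha on each edge is kept
# (the discretization step that yields the final architecture).
edge = MixedOp(channels=16)
y = edge(torch.randn(1, 16, 32, 32))
```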
AutoML is not only by Google
Here are some examples of AutoML libraries:
- AutoWEKA: released in 2013; automatically chooses a model and selects hyper-parameters.
- Auto-sklearn: an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
- H2O AutoML.
- TPOT: the Tree-Based Pipeline Optimization Tool.
- Auto-Keras: uses the ENAS algorithm and provides functions to automatically search for architectures and hyper-parameters of deep learning models. It is developed by the DATA Lab at Texas A&M University and community contributors.
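To give a feel for how little code these tools require, here is a minimal auto-sklearn sketch; the dataset and time budgets are illustrative, and the same few lines look almost identical with TPOT or H2O AutoML.

```python
# Minimal auto-sklearn usage: it searches over models and hyper-parameters
# within a time budget and exposes the result as a scikit-learn estimator.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
import autosklearn.classification

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=300,   # total search budget in seconds (illustrative)
    per_run_time_limit=30,         # budget per candidate model
)
automl.fit(X_train, y_train)
print("Test accuracy:", automl.score(X_test, y_test))
```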
As a practitioner, conduct neural architecture search only if you have a novel task that requires special treatment and very high performance on your metrics. In other words, conduct neural architecture search after trying the other options, when you have no other choice.
Best Regards,
Note: This article was NOT written by neural nets ;)