Deep learning: a new revolution has started, one that resembles how our brain works

Introduction

Since 2012, deep learning has been causing a great revolution in several fields: medicine, finance, manufacturing, the arts, and more. According to DARPA, these networks cover a large range of algorithms that can be classified in two groups: deep learning for data analysis (i.e., supervised and unsupervised classification/prediction) and deep learning for operations (i.e., deep reinforcement learning, as used in AlphaZero).


Figure 1: Neural network zoo (source: https://www.asimovinstitute.org/neural-network-zoo/)

But the common factor among these algorithms is that they are all built from a discrete number of neurons and layers. See Figure 1 for a comprehensive list of discrete deep learning networks. But this has already changed.

 

Introducing neural ordinary differential equations (ODEs)

In December 2018, at NeurIPS, one of the biggest artificial intelligence conferences in the world, one paper out of more than 4,000 submissions stood out as the best: Neural Ordinary Differential Equations (Neural ODEs) [1]. Its idea is revolutionary, and here we explain why.

The authors of the Neural ODE paper argue that, since Microsoft introduced ResNet (residual networks) in 2015 [2], it has outperformed other networks with the simple idea of skipping neural layers. See Figure 2 to understand the ResNet architecture.


Figure 2: ResNet architecture at the neural node level

The reason behind this is that each layer introduces errors, so it makes sense to skip layers. But ResNets are still discrete networks. So it made sense to ask a simple question: what if we get rid of all the layers and consider a deep neural network as a continuous function, where we can approximate a neuron as a derivative at any point of the curve?

Then the authors of the Neural ODE paper realized that the ResNet update resembled Euler's method for ordinary differential equations. The Euler formula is expressed as follows:

h_{t+1} = h_t + f(h_t, θ_t)          (ResNet update: one layer per step)

dh(t)/dt = f(h(t), t, θ)             (continuous limit as the number of layers grows)

Figure 3: Euler discretization of a continuous transformation
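To make the analogy concrete, here is a minimal toy sketch in plain NumPy (illustrative code written for this article, not from the paper): stacking residual blocks performs exactly one Euler step per layer.

```python
import numpy as np

def f(h, theta):
    # Toy "layer" dynamics; in a real ResNet this would be a small
    # neural sub-network (e.g., conv + nonlinearity) with parameters theta.
    return np.tanh(theta @ h)

rng = np.random.default_rng(0)
h = rng.normal(size=4)                              # input state h_0
thetas = [0.1 * rng.normal(size=(4, 4)) for _ in range(10)]

# ResNet forward pass: h_{t+1} = h_t + f(h_t, theta_t) ...
for theta in thetas:
    h = h + f(h, theta)

# ... which is Euler's method with step size 1 for dh/dt = f(h, theta(t)).
# Shrinking the step while adding layers approaches the continuous network.
```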

 

Evidence from Auto Machine Learning (AutoML)

On the other hand, Auto Machine Learning (AutoML), which was launched as a service at Google I/O 2019, corroborates this idea. When the AutoML algorithm searches for the best topology for a convolutional neural network, it delivers a network similar to ResNet, because it skips layers in order to reduce the errors introduced by each layer. See Figure 4 for the details of the convolutional network discovered automatically by AutoML.


Figure 4: Convolutional architecture discovered by Google AutoML (source: Google I/O 2019)

Moreover, it is important to note that architectures discovered by AutoML nowadays outperform traditional hand-designed convolutional networks in terms of accuracy. See Figure 5 for more details.


 Figure 5: AutoML outperforming traditional convolutional networks used for image classification and prediction.

 

The case for ODEs

So there is mounting evidence that an accurate deep learning network should be continuous instead of discrete. A deep neural network can be approximated as an ODE and solved with a traditional ODE solver such as the ones described in [3], [4]. This takes care of the forward pass of the network.
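As an illustration, the authors of [1] released the torchdiffeq library, which wraps ODE solvers as differentiable PyTorch operations. The following minimal sketch (the network size, toy dynamics, and integration interval are illustrative assumptions) treats the forward pass as solving an initial value problem:

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint  # ODE solvers released by the authors of [1]

class ODEFunc(nn.Module):
    """Parameterizes the continuous dynamics dh/dt = f(h(t), t, theta)."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(),
                                 nn.Linear(64, dim))

    def forward(self, t, h):
        return self.net(h)

func = ODEFunc(dim=2)
h0 = torch.randn(16, 2)              # batch of input states ("layer 0")
t = torch.tensor([0.0, 1.0])         # integrate the dynamics from t=0 to t=1
h1 = odeint(func, h0, t)[-1]         # forward pass = solving the ODE
```

Note that depth is no longer a hyperparameter here: the solver itself decides how many evaluation steps it needs to reach the requested accuracy.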

In terms of backpropagation, we can no longer simply backpropagate through every operation of the ODE solver, as that would be memory-expensive. Instead, gradients are computed with the adjoint sensitivity method [5], whose representation is depicted in Figure 6.


Figure 6: Adjoint sensitivity method for neural ODE backpropagation
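Continuing the sketch above, torchdiffeq exposes this method through odeint_adjoint: instead of storing every intermediate solver step, gradients are recovered by solving a second, augmented ODE backwards in time, which keeps memory cost constant in depth. The loss below is a hypothetical toy for illustration:

```python
import torch
from torchdiffeq import odeint_adjoint

# Reuses ODEFunc, h0 and t from the previous sketch.
func = ODEFunc(dim=2)
h1 = odeint_adjoint(func, h0, t)[-1]   # forward pass, adjoint-enabled
loss = h1.pow(2).mean()                # toy loss for illustration
loss.backward()                        # gradients obtained by solving the
                                       # adjoint ODE backwards in time [5],
                                       # not by storing every solver step
```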

Few articles have been written on this topic yet, but this is a real breakthrough: neural ODEs are a big step toward implementing continuous networks, which are more similar to our brains than discrete neural networks.

Moreover, neural ODEs are memory efficient, reduce GPU/TPU consumption depending on the ODE solver used, and reduce the number of hyperparameters to define (e.g., the number of layers), among other benefits.

 Future work

With neural ordinary differential equations, explainable AI becomes more difficult, since the challenge is now to explain how a continuous function arrived at a conclusion.

On the other hand, generative adversarial networks (GANs) could be reimplemented with neural ODEs instead of convolutional neural networks (CNNs) for more accurate image generation, classification, and prediction. This may have a huge impact in fields such as medicine (e.g., cancer and diabetes prediction), drug discovery, and finance (e.g., stock market investments and credit default prediction), among others.

 Conclusion

As we have seen in this article, the field is open to a new kind of research and researchers. A new way of thinking about deep neural networks, making them infinitely deep, has been introduced, and it may really pave the way to artificial general intelligence (AGI). Only the coming years will tell. But the future is exciting.

 

Bibliography

[1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, December 2018.

[2] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition. Computer Vision and Pattern Recognition (CVPR), 2015.

[3] C. Runge. Über die numerische Auflösung von Differentialgleichungen. Mathematische Annalen, 46:167–178, 1895.

[4] E. Hairer, S. P. Nørsett, and G. Wanner. Solving Ordinary Differential Equations I: Nonstiff Problems. Springer, 1987.

[5] L. S. Pontryagin, E. F. Mishchenko, V. G. Boltyanskii, and R. V. Gamkrelidze. The Mathematical Theory of Optimal Processes. 1962.

 
