Deep learning: a new revolution has started, one that resembles how our brain works

Introduction

Since 2012, deep learning has been causing a great revolution in several fields: medicine, finance, manufacturing, the arts, and more. According to DARPA, these networks cover a large range of algorithms that can be classified in two groups: deep learning for data analysis (i.e., supervised and unsupervised classification/prediction) and deep learning for operations (i.e., deep reinforcement learning, as used in AlphaZero).


Figure 1: Neural network zoo (source: https://www.asimovinstitute.org/neural-network-zoo/)

But the common factor among these algorithms is that they are all built from a discrete number of neurons and layers. See Figure 1 for a comprehensive list of discrete deep learning networks. But this has already changed.

 

Introducing neural ordinary differential equations (ODEs)

In December 2018, at NeurIPS, one of the biggest artificial intelligence conferences in the world, one paper out of more than 4,000 submissions stood out as the best: Neural Ordinary Differential Equations (Neural ODEs) [1]. Its idea is revolutionary, and here we explain why.

The authors of the Neural ODE paper argue that, since Microsoft introduced ResNet (residual networks) in 2015 [2], it has outperformed other networks with the simple idea of skipping neural layers. See Figure 2 to understand the ResNet architecture.


Figure 2: ResNet architecture at the neural node level

The reason behind this is that each layer introduces errors, so it makes sense to skip layers. But ResNets are still discrete networks. So it made sense to ask a simple question: what if we get rid of all the layers and consider a deep neural network as a continuous function, where we can approximate a neuron as a derivative at any point of the curve?

Then the authors of the Neural ODE paper realized that the ResNet update resembled Euler's method for ordinary differential equations. The Euler formula is expressed as follows:

h_{t+1} = h_t + f(h_t, θ_t)          (ResNet update: one layer per step)

dh(t)/dt = f(h(t), t, θ)             (continuous limit as the number of layers grows)

Figure 3: Euler discretization of a continuous transformation
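To make the analogy concrete, here is a minimal toy sketch in plain NumPy (illustrative code written for this article, not from the paper): stacking residual blocks performs exactly one Euler step per layer.

```python
import numpy as np

def f(h, theta):
    # Toy "layer" dynamics; in a real ResNet this would be a small
    # neural sub-network (e.g., conv + nonlinearity) with parameters theta.
    return np.tanh(theta @ h)

rng = np.random.default_rng(0)
h = rng.normal(size=4)                              # input state h_0
thetas = [0.1 * rng.normal(size=(4, 4)) for _ in range(10)]

# ResNet forward pass: h_{t+1} = h_t + f(h_t, theta_t) ...
for theta in thetas:
    h = h + f(h, theta)

# ... which is Euler's method with step size 1 for dh/dt = f(h, theta(t)).
# Shrinking the step while adding layers approaches the continuous network.
```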

 

Evidence from Auto Machine Learning (AutoML)

On the other hand, Auto Machine Learning (AutoML), which was launched as a service at Google I/O 2019, corroborates this idea. When the AutoML algorithm searches for the best topology for a convolutional neural network, it delivers a network similar to ResNet, because it skips layers in order to reduce the errors introduced by each layer. See Figure 4 for the details of the convolutional network discovered automatically by AutoML.


Figure 4: Convolutional architecture discovered by Google AutoML (source: Google I/O 2019)

Moreover, it is important to note that architectures discovered by AutoML nowadays outperform traditional hand-designed convolutional networks in terms of accuracy. See Figure 5 for more details.


 Figure 5: AutoML outperforming traditional convolutional networks used for image classification and prediction.

 

The case for ODEs

So there is mounting evidence that an accurate deep learning network should be continuous instead of discrete. A deep neural network can be approximated as an ODE and solved with a traditional ODE solver such as the ones described in [3], [4]. This takes care of the forward pass of the network.
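As an illustration, the authors of [1] released the torchdiffeq library, which wraps ODE solvers as differentiable PyTorch operations. The following minimal sketch (the network size, toy dynamics, and integration interval are illustrative assumptions) treats the forward pass as solving an initial value problem:

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint  # ODE solvers released by the authors of [1]

class ODEFunc(nn.Module):
    """Parameterizes the continuous dynamics dh/dt = f(h(t), t, theta)."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(),
                                 nn.Linear(64, dim))

    def forward(self, t, h):
        return self.net(h)

func = ODEFunc(dim=2)
h0 = torch.randn(16, 2)              # batch of input states ("layer 0")
t = torch.tensor([0.0, 1.0])         # integrate the dynamics from t=0 to t=1
h1 = odeint(func, h0, t)[-1]         # forward pass = solving the ODE
```

Note that depth is no longer a hyperparameter here: the solver itself decides how many evaluation steps it needs to reach the requested accuracy.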

In terms of backpropagation, we can no longer simply backpropagate through every operation of the ODE solver, as that would be memory-expensive. Instead, gradients are computed with the adjoint sensitivity method [5], whose representation is depicted in Figure 6.


Figure 6: Adjoint sensitivity method for neural ODE backpropagation
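Continuing the sketch above, torchdiffeq exposes this method through odeint_adjoint: instead of storing every intermediate solver step, gradients are recovered by solving a second, augmented ODE backwards in time, which keeps memory cost constant in depth. The loss below is a hypothetical toy for illustration:

```python
import torch
from torchdiffeq import odeint_adjoint

# Reuses ODEFunc, h0 and t from the previous sketch.
func = ODEFunc(dim=2)
h1 = odeint_adjoint(func, h0, t)[-1]   # forward pass, adjoint-enabled
loss = h1.pow(2).mean()                # toy loss for illustration
loss.backward()                        # gradients obtained by solving the
                                       # adjoint ODE backwards in time [5],
                                       # not by storing every solver step
```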

Few articles have been written on this topic yet, but this is a real breakthrough: neural ODEs are a big step toward implementing continuous networks, which are more similar to our brains than discrete neural networks.

Moreover, neural ODEs are memory efficient, reduce GPU/TPU consumption depending on the ODE solver used, and reduce the number of hyperparameters to define (e.g., the number of layers), among other benefits.

 Future work

With neural ordinary differential equations, explainable AI becomes more difficult, since the challenge is now to explain how a continuous function arrived at a conclusion.

On the other hand, generative adversarial networks (GANs) could be reimplemented with neural ODEs instead of convolutional neural networks (CNNs) for more accurate image generation, classification, and prediction. This may have a huge impact in fields such as medicine (e.g., cancer and diabetes prediction), drug discovery, and finance (e.g., stock market investments and credit default prediction), among others.

 Conclusion

As we have seen in this article, the field is open to a new kind of research and researchers. A new way of thinking about deep neural networks, making them infinitely deep, has been introduced, and it may really pave the way to artificial general intelligence (AGI). Only the coming years will tell. But the future is exciting.

 

Bibliography

[1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, December 2018.

[2] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition. Computer Vision and Pattern Recognition (CVPR), 2015.

[3] C. Runge. Über die numerische Auflösung von Differentialgleichungen. Mathematische Annalen, 46:167–178, 1895.

[4] E. Hairer, S. P. Nørsett, and G. Wanner. Solving Ordinary Differential Equations I: Nonstiff Problems. Springer, 1987.

[5] L. S. Pontryagin, E. F. Mishchenko, V. G. Boltyanskii, and R. V. Gamkrelidze. The Mathematical Theory of Optimal Processes. 1962.

 
