The true nature of Deep Learning is Geometry

For those who really want to understand what we can expect from Deep Learning and what we cannot, it is fundamental to realize that neural network models are, in fact, just geometric functions.

As summarized by Stéphane Mallat (ENS, CNRS) in his paper [1] about the mathematical principles behind deep convolutional networks: "Multilayer neural networks are computational learning architectures that propagate the input data across a sequence of linear operators and simple nonlinearities."

In other words, mathematically a Deep Learning model is a kind of recursive generalization of two well-known mathematical tricks for solving problems:

  1. extend the dimension of the input data space
  2. apply a nonlinear operator and expect the problem to become linear in this extra dimension (see the sketch just below)
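
To make this lifting trick concrete, here is a minimal NumPy sketch (the two concentric rings and the threshold value are hypothetical illustration data, not taken from the references): two classes that no straight line can separate in 2D become separable by a single linear threshold once the nonlinear feature x1² + x2² is appended as a third dimension.

```python
import numpy as np

# Hypothetical toy data: two concentric rings, not linearly separable in 2D.
rng = np.random.default_rng(0)
n = 200
theta = rng.uniform(0, 2 * np.pi, n)
r = np.concatenate([rng.normal(1.0, 0.1, n // 2),   # inner ring -> class 0
                    rng.normal(3.0, 0.1, n // 2)])  # outer ring -> class 1
X = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
y = np.concatenate([np.zeros(n // 2), np.ones(n // 2)])

# Trick 1 + 2: lift to 3D by appending a fixed nonlinear feature z = x1^2 + x2^2.
Z = np.column_stack([X, (X ** 2).sum(axis=1)])

# In the lifted space, a single linear threshold on z separates the two rings.
threshold = 4.0  # any value between 1^2 and 3^2 works here
pred = (Z[:, 2] > threshold).astype(float)
print("accuracy after the nonlinear lift:", (pred == y).mean())  # ~1.0
```

A neural network layer does essentially the same thing, except that both the lifting map and the separating hyperplane are learned from data rather than chosen by hand.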

Moreover, as stated by François Chollet (Deep Learning @ Google, author of Keras) in a recent extract of his book [2], because the training process consists of gradient descent (or an equivalent), we absolutely need each layer (from the input data space to the output space, including all the internal ones) to be a geometric space that is continuous and smooth enough to compute derivatives on. Globally, this means that Deep Learning models are limited by nature to continuous and differentiable geometric morphings of the input data manifold.
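
As a rough illustration of why this smoothness requirement matters, here is a minimal sketch in plain NumPy (the sin(x) target and the two-parameter "network" are hypothetical, chosen only for illustration): every gradient-descent update is a chain-rule product of derivatives, which only exists because the tanh nonlinearity and the squared loss are differentiable everywhere.

```python
import numpy as np

# Tiny two-parameter "network": y_hat = w1 * tanh(w0 * x), trained by gradient
# descent on a squared loss against a hypothetical smooth target.
rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 100)
y = np.sin(x)

w0, w1 = 0.5, 0.5   # the two trainable parameters
lr = 0.1
for step in range(500):
    h = np.tanh(w0 * x)        # "hidden layer": smooth nonlinearity
    y_hat = w1 * h             # "output layer": linear readout
    err = y_hat - y
    loss = np.mean(err ** 2)
    # Backpropagation is just the chain rule; it requires every factor below
    # to be a well-defined derivative (d tanh(u)/du = 1 - tanh(u)^2).
    grad_w1 = np.mean(2 * err * h)
    grad_w0 = np.mean(2 * err * w1 * (1 - h ** 2) * x)
    w0 -= lr * grad_w0
    w1 -= lr * grad_w1

print(f"final loss: {loss:.4f}")
```

Replace tanh with a hard 0/1 step function and the derivative of the hidden layer is zero almost everywhere, so the gradient gives the optimizer nothing to follow; that is exactly the kind of non-differentiable setting the next paragraph refers to.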

In short, Deep Learning is very efficient at smoothly mapping space X to space Y (e.g. pattern recognition, classification, etc.), but Deep Learning cannot, on its own, solve problems that are not differentiable.

As described by Chollet in [3], that is the reason why "programs that are capable of basic forms of reasoning are all hard-coded by human programmers: for instance, software that relies on search algorithms, graph manipulation, formal logic. In DeepMind's AlphaGo, for example, most of the 'intelligence' on display is designed and hard-coded by expert programmers (e.g. Monte-Carlo tree search); learning from data only happens in specialized submodules".

According to him, the future is then to combine a set of well-defined "algorithmic modules providing formal reasoning, search, and abstraction capabilities, with geometric modules providing informal intuition and pattern recognition capabilities" as building blocks in a new class of more expressive models.

References:

[1] Mallat, Stéphane (ENS, CNRS), "Understanding deep convolutional networks", 2016, Phil. Trans. R. Soc. A 374: 20150203.

[2] Chollet, François (Deep Learning @ Google, author of Keras), "The limitations of deep learning".

[3] Chollet, François (Deep Learning @ Google, author of Keras), "The future of deep learning".
