Machine Learning and Particle Motion in Liquids: An Elegant Link
Marco Tavora Ph.D.
Physicist | Data Scientist | Scientific Writer | Founder at Principia | 45k+ followers
In this article, originally published on Towards Data Science, I argue that viewing stochastic gradient descent as a Langevin stochastic process (with an extra level of randomization introduced via the learning rate) helps explain why the method works so well as a global optimizer.
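As a rough illustration (my own sketch, not code from the article), the Langevin view amounts to adding a Gaussian kick, scaled by the learning rate and a temperature parameter, to each gradient step; in SGD, minibatch noise plays the role of this kick:

```python
import numpy as np

rng = np.random.default_rng(0)

def langevin_step(theta, grad, lr, temperature):
    """One step of unadjusted Langevin dynamics:
    theta <- theta - lr * grad + sqrt(2 * lr * temperature) * xi,  xi ~ N(0, I).
    With temperature = 0 this reduces to plain gradient descent; with
    temperature > 0 the Gaussian kick mimics the minibatch noise of SGD,
    letting the iterate escape shallow local minima."""
    noise = rng.standard_normal(np.shape(theta))
    return theta - lr * grad + np.sqrt(2.0 * lr * temperature) * noise
```

Setting `temperature=0` recovers deterministic gradient descent exactly, which makes the extra stochastic term easy to isolate and study.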
Author/trainer/mentor in computational finance: maths (pure, applied, numerical), ODE/PDE/FDM, C++11/C++20, Python, C#, modern software design
5y A good test would be to take an example where SGD fails (or converges to a local minimum) but where the Langevin SDE does converge to a global minimum. A hard result! There are lots of blogs, but fewer dealing with the numerics. In this thread some of us discuss this problem and propose a solution: https://forum.wilmott.com/viewtopic.php?f=34&t=101662
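A minimal version of such a test (my own toy setup, not taken from the linked thread): on an asymmetric double well U(x) = x⁴ − 4x² + 0.5x, plain gradient descent started near the shallower well stays trapped there, while an Euler–Maruyama discretization of the Langevin SDE can hop the barrier toward the deeper well:

```python
import numpy as np

def grad_U(x):
    # U(x) = x**4 - 4*x**2 + 0.5*x: local minimum near x ~ +1.38,
    # global minimum near x ~ -1.44, barrier near x ~ 0.
    return 4.0 * x**3 - 8.0 * x + 0.5

def gradient_descent(x0, lr=0.01, steps=2000):
    # Deterministic descent: settles in whichever well x0 starts in.
    x = x0
    for _ in range(steps):
        x -= lr * grad_U(x)
    return x

def langevin(x0, lr=0.01, temperature=1.0, steps=50000, seed=0):
    # Euler-Maruyama discretization of dX = -U'(X) dt + sqrt(2T) dW.
    rng = np.random.default_rng(seed)
    x, traj = x0, []
    for _ in range(steps):
        x += -lr * grad_U(x) + np.sqrt(2.0 * lr * temperature) * rng.standard_normal()
        traj.append(x)
    return np.array(traj)

x_gd = gradient_descent(1.4)  # stays in the shallow well (x > 0)
traj = langevin(1.4)          # noise lets the chain cross toward x < 0
```

The temperature and step count here are hand-picked for this potential; whether the chain escapes in a given run depends on the barrier height relative to the temperature (the classic Kramers escape picture).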
Looking for new opportunities.
5y There must also be connections with Hamiltonian or Lagrangian Markov chain Monte Carlo methods. https://statmodeling.stat.columbia.edu/2014/05/20/thermodynamic-monte-carlo/
Experienced Data & Analytics Professional | Proud father | 50+
5y Marco Tavora Ph.D., have you thought about collecting these essays on science + deep learning in a book? It would be great!!
Author/trainer/mentor in computational finance: maths (pure, applied, numerical), ODE/PDE/FDM, C++11/C++20, Python, C#, modern software design
5y Nice! SGD converges to a local minimum; it is a discrete form of a gradient system. The Langevin SDE converges to a global solution. Adding relevant references would be nice!