Online Learning (by Machines)
Gopi Krishna Suvanam
Entrepreneur | Author | AI & Decentralization proponent | Alumnus of IIT-M & IIM-A
Most ML methods are applied in batch mode. For example, suppose a simple ML model needs to learn word frequencies (say, for word prediction): a corpus is assembled and the frequencies are counted over it in one pass.
This is not computationally efficient, because even an incremental update of the data forces us to rerun the whole learning process. Alternatively, we can use a model that stores the count of each word and the total word count separately. Then, when a new word arrives, we simply increment that word's count and the total count; the frequency estimate is just the ratio of the two.
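A minimal sketch of that counter in Python (the class and method names here are illustrative, not from the original post):

```python
from collections import defaultdict

class OnlineWordFrequency:
    """Tracks word frequencies incrementally, without re-scanning the corpus."""

    def __init__(self):
        self.counts = defaultdict(int)  # running count per word
        self.total = 0                  # running total of words seen

    def update(self, word):
        # Each new word touches only two counters: O(1) work per update.
        self.counts[word] += 1
        self.total += 1

    def frequency(self, word):
        # Relative-frequency estimate from the running counters.
        return self.counts[word] / self.total if self.total else 0.0
```

Feeding the model one word at a time, e.g. model.update("the"), keeps the estimates current without ever rerunning the batch count.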
This is a simplistic example, but the idea extends to many types of ML algorithms, including:
1. Exponentially weighted moving averages (EWMA)
2. Gradient descent (see the sketch after this list)
3. Deep learning, etc.
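Gradient descent, for example, becomes an online method when each arriving example triggers one immediate parameter update (often called stochastic gradient descent). A minimal sketch, assuming a linear-regression model with squared loss (the function and data are illustrative):

```python
import numpy as np

def sgd_step(w, x, y, lr=0.01):
    """One online gradient-descent update for linear regression.

    Each incoming example (x, y) adjusts the weights immediately,
    so past data never needs to be revisited.
    """
    error = np.dot(w, x) - y   # prediction error on this one example
    return w - lr * error * x  # step against the squared-loss gradient

# Examples arrive as a stream, one at a time, instead of as a batch.
w = np.zeros(2)
stream = [(np.array([1.0, 2.0]), 3.0), (np.array([2.0, 1.0]), 3.0)]
for x, y in stream:
    w = sgd_step(w, x, y)
```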
The obvious advantage is computational efficiency. But beyond that, if we incorporate a memory window, as in an exponentially weighted moving average (EWMA), much more interesting outcomes emerge. For the word-counting case, the EWMA version of the online update becomes:
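The post's original equation isn't reproduced here, but the standard EWMA recurrence for this case is f(w) ← (1 − α)·f(w) + α·1[current word = w]: every estimate decays, and the word just observed gets a boost. A minimal Python sketch of that recurrence (the names are illustrative):

```python
from collections import defaultdict

def ewma_update(freq, word, alpha=0.1):
    """EWMA word-frequency update: recent words weigh more than old ones."""
    # Decay every existing estimate toward zero...
    for w in freq:
        freq[w] *= (1.0 - alpha)
    # ...then boost the word just observed.
    freq[word] += alpha

freq = defaultdict(float)
for w in ["the", "cat", "the"]:
    ewma_update(freq, w)
```

Decaying every key is O(vocabulary) per word; a production version would decay lazily (e.g. with per-word timestamps), but the explicit loop keeps the recurrence visible.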
Here α is the “memory” parameter, lying between 0 and 1. A high α gives the model very short-term memory; a low α gives it very long-term memory. This kind of behaviour is native to online learning and is typically absent from regular batch-processing algorithms. An application of this idea in the deep-learning space is the famous LSTM (long short-term memory) network. Another very popular application of online learning is the Kalman filter.