How To Use RNNs For Effective Financial Risk Modeling and Stock Forecasting
https://lucenaresearch.com/


Through our research at Lucena, we know it's important to configure deep net infrastructure to accommodate time series data as a trend formation rather than a single point in time. There is a vast difference between how we as humans make decisions and how machine learning does. Neuroscientists have long debated the similarities and differences between how our brains operate and how deep learning works, and although we use similar terms (neurons and tensors), the differences are quite vast. It's quite remarkable that babies learn to distinguish between cats and dogs with just a few observations, while it takes a fairly robust infrastructure with many layers of artificial neurons and many thousands of observations to train a DNN (deep neural network). A fascinating presentation on the subject by Professor Yann LeCun can be found here: https://youtu.be/WUZhLzaD3b8

Some decisions rely on a static state (image classification, for example). When we feed an image to a network, the network relies mainly on the image's final state. In other words, we don't really care how the image was formed over time; only its final state determines whether it contains a cat or a dog. In contrast, when we classify stock price action, the historical formation of the time series trend is extremely valuable.

Static vs. Time Series

Forecasting a stock price based on a data pattern is normally predicated on some form of historical context. To underscore the difference between static and time-formed context, take, for example, the autocomplete feature on our smartphones. Autocomplete works by memorizing the previous sequence of letters in a word or the previous sequence of words in a sentence. For a sentence that starts with "The sky is ____," it would predict with high confidence that the next word is "blue." But if we only gave the deep network the last word, "is," it would have no relevant information from which to discern what comes next.

The very same concept applies to stock data. A traditional artificial neural network may learn to forecast the price of a stock based on several factors:

  • Daily Volume
  • Price to Earnings Ratio (PE)
  • Analyst Recommendations Consensus

While the future price of the stock may heavily depend on these factors, their static values at a single point in time tell only part of the story. A much richer approach to forecasting a stock price is to determine how the trends of the above factors formed over time.
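As a rough sketch of what "trend formation" means as model input, the snippet below slices a daily factor table into overlapping lookback windows. The column names (volume, pe_ratio, analyst_consensus, close) are hypothetical placeholders for the factors above, not Lucena's actual schema:

```python
import numpy as np
import pandas as pd

def build_sequences(df: pd.DataFrame, lookback: int, horizon: int):
    """Turn a daily factor table into (samples, lookback, features) trend windows."""
    features = df[["volume", "pe_ratio", "analyst_consensus"]].to_numpy()
    # Label: forward return `horizon` trading days ahead.
    forward_return = df["close"].pct_change(horizon).shift(-horizon).to_numpy()

    X, y = [], []
    for t in range(lookback, len(df) - horizon):
        X.append(features[t - lookback:t])   # the trailing `lookback` days of each factor
        y.append(forward_return[t])
    return np.array(X), np.array(y)

# e.g. 21-day trend windows predicting the 5-day forward return:
# X, y = build_sequences(daily_factors, lookback=21, horizon=5)
```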

In other words, let the network try a range of timeframes and determine which one provides the highest statistical significance for future trends. Hyperparameters such as the lookback window (training period) and how far into the future we want to forecast can be tuned during a cross-validation period.

However, the more parameters you add to the grid search, the more susceptible you are to overfitting. Not to mention, a trend's timeframe is not necessarily constant: in some cases a 21-day trend is more predictive, while in other cases 63 days may be more suitable.
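One way to frame that tuning step is to treat the lookback window as just another hyperparameter and score each candidate with walk-forward cross-validation. The sketch below assumes the hypothetical build_sequences helper from the earlier snippet and a hypothetical make_model factory that returns a compiled Keras model; it is an illustration of the idea, not Lucena's pipeline:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

def score_lookback(df, lookback, horizon, make_model):
    """Average walk-forward MSE for one candidate lookback window."""
    X, y = build_sequences(df, lookback, horizon)
    errors = []
    for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
        model = make_model(input_shape=X.shape[1:])
        model.fit(X[train_idx], y[train_idx], epochs=10, verbose=0)
        preds = model.predict(X[test_idx], verbose=0).ravel()
        errors.append(np.mean((preds - y[test_idx]) ** 2))
    return float(np.mean(errors))

# e.g. pick the most predictive window among a few candidates:
# best = min([21, 63, 126], key=lambda lb: score_lookback(daily_factors, lb, 5, make_model))
```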

Time Series Data and RNNs to Forecast Stock Prices

An RNN (Recurrent Neural Network) is a deep neural network designed specifically to tackle this kind of problem. It is able to determine on the fly which historical information should be considered or discarded to produce a high-probability classification.

What is a Recurrent Neural Network (RNN)?

A Recurrent Neural Network (RNN) is a deep learning algorithm that operates on sequences (like sequences of characters). At every step, it takes a snapshot of its state before it tries to determine what's next. In other words, it operates on trend representations via matrices of historical states.

RNNs have some form of internal memory, so they remember what they saw previously. This is in contrast to fully connected and convolutional neural nets, which are feed-forward: each layer of neurons serves only as input to the adjacent, subsequent layer in the hierarchy. RNNs, on the other hand, can use the output of a neuron as an input to the very same neuron.

A diagram representing a recurrent neuron. x(t) is the input neuron, A is the cell that determines what information to preserve and what to discard, and h(t) is the output neuron, which feeds back into the network. Credit: https://colah.github.io/posts/2015-08-Understanding-LSTMs/

This diagram can be greatly simplified by unfolding (unrolling) the recurrent instances as follows:


An unrolled recurrent neuron A. Credit: https://colah.github.io/posts/2015-08-Understanding-LSTMs/ 

Not much different than a normal feed-forward network, with one exception: the RNN is able to determine dynamically how deep the network ought to be. In the context of a stock's historical feature values, consider each vertical formation from x(0) to h(0) a historical snapshot of a state (PE ratio today, PE ratio yesterday, and so on).
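As a rough sketch of what the unrolled diagram computes, here is a minimal vanilla recurrent step in NumPy; the hidden size, weight scales, and the 21-day sequence are illustrative assumptions, not parameters from any particular model:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_hidden = 3, 8       # e.g. volume, PE ratio, analyst consensus -> 8 hidden units

W_x = rng.normal(scale=0.1, size=(n_hidden, n_features))  # input -> hidden weights
W_h = rng.normal(scale=0.1, size=(n_hidden, n_hidden))    # hidden -> hidden (the feedback loop)
b = np.zeros(n_hidden)

def rnn_forward(sequence):
    """Fold a sequence of daily factor snapshots into a single hidden state."""
    h = np.zeros(n_hidden)                       # internal memory, updated at every step
    for x_t in sequence:                         # one snapshot per day: x(0), x(1), ...
        h = np.tanh(W_x @ x_t + W_h @ h + b)     # h(t) depends on x(t) and on h(t-1)
    return h                                     # final state summarizes the whole trend

sequence = rng.normal(size=(21, n_features))     # 21 days of factor values
print(rnn_forward(sequence).shape)               # -> (8,)
```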

Taking the Concept of RNNs One Step Further: LSTM (Long Short-Term Memory)

Long Short-Term Memory networks ("LSTMs") are a special kind of RNN, capable of learning long-term dependencies. All RNNs have some form of repeating network structure. In a standard RNN, the repeating structure is rather straightforward: a single activation function feeding into an output layer of neurons. In contrast, LSTMs contain a more robust structure designed specifically to determine which information ought to be preserved or discarded. Common to LSTM architectures is a cell state layer, also called a "conveyor belt".

The cell state ("conveyor belt") of the LSTM infrastructure, tasked with carrying forward the information that should be preserved and dropping what should be discarded.

Instead of having a single neural network layer as in a typical RNN, an LSTM holds multiple components tasked with selectively adding or removing information passed along the "conveyor belt".


A typical LSTM cell holds four components tasked with determining which information is discarded, added, and output to the cell state layer (conveyor belt).

I will not get too deep into the inner structure of an LSTM cell, but it's important to note that the choice of which information flows through is managed by three gates, each using a nonlinear activation such as the sigmoid or tanh function:

  1. Forget Gate
  2. Input Gate 
  3. Output Gate 

Under the hood, RNNs and LSTMs are not much different from a typical multi-layered neural net. The activation functions force each cell's output to conform to a nonlinear representation: sigmoid squashes values to between 0 and 1, and tanh to between -1 and 1. This is done mainly to enable the typical deep net's error-minimization process through backpropagation and gradient descent.
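To make the three gates concrete, here is a single-step LSTM cell written out in NumPy. The dictionary-of-weights layout is an illustrative assumption, not how any particular library stores its parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM update: c is the cell state ("conveyor belt"), h is the output."""
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])  # forget gate: 0..1, what to drop
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])  # input gate: 0..1, what to add
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])  # output gate: 0..1, what to expose
    g = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])  # candidate values, squashed to -1..1
    c = f * c_prev + i * g                                 # update the conveyor belt
    h = o * np.tanh(c)                                     # emit a gated view of the cell state
    return h, c
```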


Key Takeaways About Time Series Data and RNNs:

LSTM cells effectively learn to memorize long-term dependencies and perform well. To the untrained eye, the results may seem somewhat incredible or even magical.

One drawback of RNNs, and LSTMs in particular, is how taxing they are on computational resources. RNNs can be difficult to train and require deep neural network expertise, but they are a perfect match for time series data, as they can "learn" how to take advantage of sequential signals rather than one-time snapshots.
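For completeness, here is a minimal Keras sketch of an LSTM forecaster over the kind of factor sequences described earlier; the 21-day lookback, three input factors, and layer sizes are assumptions for illustration, not Lucena's production model:

```python
import tensorflow as tf

lookback, n_features = 21, 3   # 21 daily snapshots of three factors

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(lookback, n_features)),  # each sample is a trend, not a point
    tf.keras.layers.LSTM(32),                             # gated memory over the 21 steps
    tf.keras.layers.Dense(1),                             # e.g. forecast of the forward return
])
model.compile(optimizer="adam", loss="mse")
# model.fit(X, y, epochs=20, validation_split=0.2)        # X, y as built in the earlier sketch
```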

At Lucena, we have spent significant effort on a robust GPU infrastructure and are extending our AI libraries with new offerings powered by this very same technology.

As always, I welcome your feedback and thoughts! 

Gilles Daquin

AI Agents Development

6y

In what way is it effective?... Like electricity being "effective" for computing? Yes, you can use RNNs, but achieving "effectiveness" is a very complicated topic that I am sure a lot of people would actually be interested to read about with bated breath.

Syed Danish Ali, CSPA

Actuarial Professional, Data Scientist, Futurist

6y

Deep reinforcement learning is a great step towards achieving general AI someday
