Unlocking the Future of Finance: Deep Learning Models for Time Series Forecasting

In my previous article, Predicting Market Moves: Leveraging Time Series to Guide Investment in Alphabet, we explored statistical models like ARIMA and ARIMA-GARCH for forecasting Alphabet's stock prices. While these models have their merits, rapid advancements in computational power and machine learning have opened new avenues for more sophisticated forecasting techniques. Today, we delve into deep learning models—specifically Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, and Gated Recurrent Units (GRUs)—and examine how they stack up against traditional methods.


Deep Learning Models for Time Series Forecasting

Recurrent Neural Networks (RNNs)

Recurrent Neural Networks (RNNs) are specialised artificial neural networks designed for processing sequential data (Guo et al., 2012). Unlike traditional feed-forward networks, which handle each input in a single pass, RNNs incorporate a feedback loop in which the hidden state from one time step is carried into the next. This creates an internal memory that captures temporal dependencies, allowing the network to learn from previous inputs (Ayodele et al., 2021).

While RNNs are effective for sequence prediction by combining current inputs with prior information, they struggle with long sequences due to the vanishing gradient problem. This issue causes the influence of earlier inputs to diminish over time, limiting the network's ability to learn dependencies between distant elements in a sequence (Liu et al., 2023).
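
To make the recurrence concrete, below is a minimal NumPy sketch of a vanilla RNN cell. The dimensions, initialisation, and sequence length are illustrative assumptions for this sketch, not values used elsewhere in this article.

```python
import numpy as np

# Illustrative sizes: 1 input feature, 16 hidden units (assumptions for this sketch).
n_features, n_hidden = 1, 16
rng = np.random.default_rng(42)

W = rng.normal(scale=0.1, size=(n_hidden, n_features))  # maps the current input
U = rng.normal(scale=0.1, size=(n_hidden, n_hidden))    # feeds back the previous state
b = np.zeros(n_hidden)

def rnn_forward(x_seq):
    """Run a vanilla RNN over a sequence shaped (timesteps, n_features)."""
    h = np.zeros(n_hidden)                  # initial hidden state
    for x_t in x_seq:
        h = np.tanh(W @ x_t + U @ h + b)    # h_t = tanh(W x_t + U h_{t-1} + b)
    return h                                # final state summarises the whole sequence

h_final = rnn_forward(rng.normal(size=(60, n_features)))
# Backpropagation through 60 steps multiplies gradients by U (and tanh') 60 times,
# which is exactly why the influence of early inputs tends to vanish.
```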


Figure 1: RNN Architecture (Source: Beniwal et al., 2024)


Long Short-Term Memory (LSTM) Networks

To overcome the limitations of traditional RNNs, Long Short-Term Memory (LSTM) networks were introduced (Hochreiter & Schmidhuber, 1997). LSTMs excel at learning long-term dependencies in sequences by utilising a unique architecture with gating mechanisms that regulate information flow (Zhang et al., 2023).

An LSTM cell comprises:

  • Cell State: Acts as a long-term memory unit, storing relevant information.
  • Input Gate: Controls the flow of new information into the cell state.
  • Output Gate: Determines which information from the cell state is used as output.
  • Forget Gate: Decides what information to discard from the cell state over time.

These gates use sigmoid activations to compute values between 0 and 1, effectively managing what information is retained or forgotten (Gao et al., 2017). LSTMs are particularly effective in handling financial time series data due to their ability to capture long-term dependencies. However, challenges such as non-stationary data, noise, sudden changes, and extended prediction horizons can impact their performance (Fang et al., 2023).
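
As an illustration of how such a network might be assembled in practice, here is a minimal Keras sketch that maps sliding windows of past closing prices to a next-day forecast. The window length and layer width are placeholder assumptions, not the tuned values reported later in this article.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

WINDOW = 60  # illustrative look-back window (an assumption, not the tuned value)

def make_windows(prices, window=WINDOW):
    """Turn a 1-D price series into (samples, window, 1) inputs and next-step targets."""
    X = np.stack([prices[i:i + window] for i in range(len(prices) - window)])
    y = prices[window:]
    return X[..., np.newaxis], y

model = keras.Sequential([
    layers.Input(shape=(WINDOW, 1)),
    layers.LSTM(64),   # gated cell: input, forget and output gates manage the cell state
    layers.Dense(1),   # next-day closing price
])
model.compile(optimizer="adam", loss="mae")

# Usage, assuming `prices` is a scaled 1-D array of closing prices:
# X, y = make_windows(prices)
# model.fit(X, y, epochs=20, validation_split=0.1)
```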


Figure 2: LSTM Architecture (Source: Beniwal et al., 2024)


Gated Recurrent Units (GRUs)

Gated Recurrent Units (GRUs), proposed by Cho et al. (2014), offer a streamlined alternative to LSTMs, with a simplified architecture that improves computational efficiency (Kim et al., 2018). GRUs combine the forget and input gates into a single update gate and eliminate the separate memory cell, modulating information flow directly through the hidden state (ArunKumar et al., 2021).

This simplification results in faster training times while maintaining performance comparable to LSTMs. The update gate controls how much of the previous hidden state is retained versus how much new information is added from the current input, allowing GRUs to effectively learn long-term dependencies (Wang et al., 2019).

However, a potential drawback of GRUs is the absence of a dedicated forget gate. While LSTMs can explicitly discard irrelevant information from their cell state, GRUs rely on the update gate (assisted by a reset gate) to manage information flow, which might limit their ability to handle tasks requiring selective forgetting (Srivatsavaya, 2023).
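
One way to see where the efficiency gain comes from is to compare parameter counts for recurrent layers of the same width. A quick Keras sketch (the input shape and layer width are illustrative):

```python
from tensorflow import keras
from tensorflow.keras import layers

def count_params(cell):
    """Wrap a recurrent layer in a one-layer model and count its trainable weights."""
    model = keras.Sequential([layers.Input(shape=(60, 1)), cell])
    return model.count_params()

# For hidden size n and input size d, an LSTM holds 4*(n*d + n*n + n) weights (four gates),
# while a classic GRU holds 3*(n*d + n*n + n) -- roughly a quarter fewer.
print(count_params(layers.LSTM(64)))  # 4 * (64*1 + 64*64 + 64) = 16896
print(count_params(layers.GRU(64)))   # 12864 with Keras defaults (reset_after adds extra biases)
```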


Figure 3: GRU Architecture (Source: Jing et al., 2021)


Comparison and Evaluation of Time Series Models

Advancements in computational power have paved the way for machine learning techniques like RNNs, LSTMs, and GRUs in financial forecasting (Selvin et al., 2017). These models can handle non-linear relationships and do not require data to be stationary. Deep learning models, particularly LSTMs, have shown promise due to their ability to learn and remember long-term dependencies (Beniwal et al., 2024).

Comparative studies have yielded mixed results. Siami-Namini et al. (2018) found that LSTMs outperformed ARIMA significantly in predicting indices like NASDAQ, reducing error rates by 84%–87%. Conversely, Kobiela et al. (2022) observed that ARIMA outperformed LSTM when predicting NASDAQ data using fewer features.

These contrasting outcomes highlight the importance of considering factors like target population, dataset characteristics, and prediction windows when selecting a model for stock price prediction.


Model Tuning and Implementation

To optimise these models, we used an automated hyperparameter tuning routine with 50 iterations, which explored different combinations of hyperparameters and selected the set that minimised the error metrics (MAE and RMSE).
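
The tuning code itself is not shown here, but a routine of this kind could be sketched with the keras-tuner library, assuming a random search capped at 50 trials. The hyperparameter names, ranges, and model shape below are illustrative assumptions, not the study's actual search space.

```python
import keras_tuner as kt
from tensorflow import keras
from tensorflow.keras import layers

def build_model(hp):
    """Build a GRU forecaster whose width, dropout, and learning rate are tunable."""
    model = keras.Sequential([
        layers.Input(shape=(60, 1)),
        layers.GRU(hp.Int("units", min_value=32, max_value=256, step=32)),
        layers.Dropout(hp.Float("dropout", 0.0, 0.5, step=0.1)),
        layers.Dense(1),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(hp.Choice("learning_rate", [1e-2, 1e-3, 1e-4])),
        loss="mae",
    )
    return model

tuner = kt.RandomSearch(
    build_model,
    objective="val_loss",   # validation MAE, since the model compiles with an MAE loss
    max_trials=50,          # mirrors the 50 tuning iterations mentioned above
    overwrite=True,
    directory="tuning",
)
# tuner.search(X_train, y_train, epochs=20, validation_split=0.1)
# best_model = tuner.get_best_models(num_models=1)[0]
```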

Python Libraries Used

We utilised the following Python libraries to implement and forecast the models:

Table 1: Python Libraries utilised for model implementation


Case Study: Forecasting Alphabet's Stock Price

Hyperparameter Tuning Results

Before presenting the comparative results, here are the hyperparameters obtained through tuning:

Table 2: Tuned Parameters of Forecasting Models for Alphabet Stock Price Prediction


Comparative Performance Metrics

Table 3: Comparative Results of Forecasting Models and Performance Metrics for Alphabet


Figure 4: Comparison of RNN, LSTM, and GRU Model Predictions for Alphabet's Stock Price


Interpreting the Results

The advanced recurrent neural network models exhibit substantially lower RMSE and MAE compared to statistical models, indicating higher accuracy in predicting the time series. This suggests they are better suited for capturing complex, non-linear relationships in the data.

Among the deep learning models, the GRU model shows the most promising results, outperforming baseline models and effectively balancing complexity and predictive power. While the RNN achieved the lowest validation loss, the GRU's superior RMSE and MAE make it the best overall performer.
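
For reference, both metrics are straightforward to compute from a model's predictions. A minimal sketch with scikit-learn and NumPy (the arrays are placeholder values, not this study's results):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Placeholder arrays standing in for actual closing prices and model forecasts.
y_true = np.array([150.2, 151.0, 149.8, 152.3])
y_pred = np.array([149.9, 151.4, 150.1, 151.8])

mae = mean_absolute_error(y_true, y_pred)           # average absolute error, in price units
rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # penalises large misses more heavily
print(f"MAE={mae:.3f}, RMSE={rmse:.3f}")
```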


Forecasting Performance for Mid-Term Unseen Data

To further assess the models, we compared the final forecasts generated by the GRU model (representing deep learning models) and the ARIMA-GARCH model (the best statistical method). We focused on a mid-term forecast horizon covering three calendar months—March to May 2024—amounting to 63 business days.
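
For readers reproducing this setup, a business-day index over such a horizon can be built with pandas. Note that the exact number of trading days also depends on the exchange's holiday calendar, which this sketch does not model:

```python
import pandas as pd

# Weekday-only dates spanning March to May 2024; exchange holidays
# (e.g. Good Friday, Memorial Day) would still need to be dropped
# to match the actual trading calendar.
horizon = pd.bdate_range(start="2024-03-01", end="2024-05-31")
print(horizon[:3], len(horizon))
```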

Figure 5: Mid-term Prediction Chart of Alphabet


Summary of Real and Predicted Values for Alphabet

Table 4: Summary of real and predicted values for Alphabet


Comparative Error Metrics

Table 5: Comparative Table for Mid-term Forecasting of Alphabet


A visual inspection and the error metrics clearly indicate that the GRU model significantly outperforms the ARIMA-GARCH model. This aligns with the findings of Siami-Namini et al. (2018), who highlighted the superiority of LSTMs over ARIMA models. The detailed statistics demonstrate the GRU model's ability to predict values closer to the actual closing prices.


Conclusion: The Future is Deep Learning

Our exploration into deep learning models for time series forecasting reveals that advanced neural networks like GRUs offer superior accuracy over traditional statistical models. The GRU model's ability to capture complex, non-linear relationships makes it particularly effective for stock price prediction.

Key Takeaways:

  • Deep Learning Outperforms Traditional Models: GRU and other recurrent neural networks significantly reduce forecasting errors compared to statistical models like ARIMA-GARCH.
  • Importance of Model Selection: Choosing the right model depends on factors like data characteristics and prediction horizons. Deep learning models excel in capturing non-linear patterns and long-term dependencies.
  • Optimisation is Crucial: Hyperparameter tuning plays a vital role in enhancing model performance. Automated tuning can efficiently identify optimal settings.


Looking Ahead: Embracing Deep Learning for Strategic Advantage

The superior performance of deep learning models has profound implications for investors and businesses. By leveraging these advanced techniques, stakeholders can:

  • Anticipate Market Movements More Accurately: Improved forecasting leads to better investment decisions and portfolio management.
  • Navigate Uncertainty with Confidence: Understanding complex patterns in data helps in risk mitigation and strategic planning.
  • Gain a Competitive Edge: Early adoption of cutting-edge models can differentiate businesses in a rapidly evolving market.


References

ArunKumar, K.E., Kalaga, D.V., Kumar, Ch.M.S., Kawaji, M., Brenza, T.M. (2021) ‘Forecasting of COVID-19 using deep layer Recurrent Neural Networks (RNNs) with Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTM) cells’ Chaos, Solitons & Fractals, Volume 146, 110861. [Online] [Accessed on 9th May 2024] DOI: https://doi.org/10.1016/j.chaos.2021.110861

Ayodele, E., Zaidi, S., Zhang, Z., Scott, J., McLernon, D. (2021) ‘Chapter 9 - A review of deep learning approaches in glove-based gesture classification’ in Machine Learning, Big Data, and IoT for Medical Informatics (Intelligent Data-Centric Systems). Academic Press, pp. 143-164. [Online] [Accessed on 7th May 2024] DOI: https://doi.org/10.1016/B978-0-12-821777-1.00012-4

Beniwal, M., Singh, A., Kumar, N. (2024) ‘Forecasting multistep daily stock prices for long-term investment decisions: A study of deep learning models on global indices’ Engineering Applications of Artificial Intelligence, Volume 129, 107617. [Online] [Accessed on 9th May 2024] DOI: https://doi.org/10.1016/j.engappai.2023.107617

Fang, Z., Ma, X., Pan, H., Yang, G., Arce, G. (2023) ‘Movement forecasting of financial time series based on adaptive LSTM-BN network’ Expert Systems with Applications, Volume 213, Part C, 119207. [Online] [Accessed on 8th May 2024] DOI: https://doi.org/10.1016/j.eswa.2022.119207

Gao, T., Chai, Y., Liu, Y. (2017) ‘Applying long short term memory neural networks for predicting stock closing price’ IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 2017, pp. 575-578. [Online] [Accessed on 8th May 2024] DOI: 10.1109/ICSESS.2017.8342981

Guo, D., Zhang, Y. (2012) ‘Novel recurrent neural network for time-varying problems solving’ IEEE Computational Intelligence Magazine, 7(4), pp. 61-65. [Online] [Accessed on 7th May 2024] DOI: 10.1109/MCI.2012.2215139

Hochreiter, S., Schmidhuber, J. (1997) ‘Long Short-Term Memory’ Neural Computation, vol. 9, no. 8, pp. 1735-1780. [Online] [Accessed on 8th May 2024] DOI: 10.1162/neco.1997.9.8.1735

Jing, C., Xinyu, H., Hao, J., Xiren, M. (2021) ‘Low-Cost and Device-Free Human Activity Recognition Based on Hierarchical Learning Model’ Sensors, 21(7), 2359. [Online] [Accessed on 8th May 2024] DOI: 10.3390/s21072359

Kobiela, D., Krefta, D., Król, W., Weichbroth, P. (2022) ‘ARIMA vs LSTM on NASDAQ stock exchange data’ Procedia Computer Science, Volume 207, Pages 3836-3845. [Online] [Accessed on 9th May 2024] DOI: https://doi.org/10.1016/j.procs.2022.09.445

Liu, X., Du, H., Yu, J. (2023) ‘A forecasting method for non-equal interval time series based on recurrent neural network’ Neurocomputing, Volume 556, 126648. [Online] [Accessed on 7th May 2024] DOI: https://doi.org/10.1016/j.neucom.2023.126648

Kim, P.-S., Lee, D.-G., Lee, S.-W. (2018) ‘Discriminative context learning with gated recurrent unit for group activity recognition’ Pattern Recognition, Volume 76, pp. 149-161. [Online] [Accessed on 9th May 2024] DOI: https://doi.org/10.1016/j.patcog.2017.10.037

Selvin, S., Vinayakumar, R., Gopalakrishnan, E., Menon, V., Soman, K.P. (2017) ‘Stock price prediction using LSTM, RNN and CNN-sliding window model’ Proceedings of the 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, pp. 1643-1647. [Online] [Accessed on 26th April 2024] DOI: 10.1109/ICACCI.2017.8126078

Siami-Namini, S., Tavakoli, N., Siami Namin, A. (2018) ‘A Comparison of ARIMA and LSTM in Forecasting Time Series’ 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL, USA, pp. 1394-1401. [Online] [Accessed on 26th April 2024] DOI: 10.1109/ICMLA.2018.00227

Srivatsavaya, P. (2023) ‘LSTM vs GRU’. Medium. [Online] [Accessed on 9th May 2024] https://medium.com/@prudhviraju.srivatsavaya/lstm-vs-gru-c1209b8ecb5a

Wang, J., Yan, J., Li, C., Gao, R., Zhao, R. (2019) ‘Deep heterogeneous GRU model for predictive analytics in smart manufacturing: Application to tool wear prediction’ Computers in Industry, Volume 111, pp. 1-14. [Online] [Accessed on 9th May 2024] DOI: https://doi.org/10.1016/j.compind.2019.06.001

Zhang, X., Zhong, C., Zhang, J., Wang, T., Ng, W. (2023) ‘Robust recurrent neural networks for time series forecasting’ Neurocomputing, Volume 526, Pages 143-157. [Online] [Accessed on 8th May 2024] DOI: https://doi.org/10.1016/j.neucom.2023.01.037


