Unlocking the Future of Finance: Deep Learning Models for Time Series Forecasting
Hamed Soleimani
Business Analysis & Intelligence | Financial Business Data Analyst | Requirements Gathering & Stakeholder Engagement | SQL & Power BI Expertise
In my previous article, ‘Predicting Market Moves: Leveraging Time Series to Guide Investment in Alphabet’, we explored statistical models like ARIMA and ARIMA-GARCH for forecasting Alphabet's stock prices. While these models have their merits, rapid advancements in computational power and machine learning have opened new avenues for more sophisticated forecasting techniques. Today, we delve into deep learning models—specifically Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, and Gated Recurrent Units (GRUs)—and examine how they stack up against traditional methods.
Deep Learning Models for Time Series Forecasting
Recurrent Neural Networks (RNNs)
Recurrent Neural Networks (RNNs) are specialised artificial neural networks designed for processing sequential data (Guo and Zhang, 2012). Unlike traditional neural networks that handle inputs in a single forward pass, RNNs incorporate a feedback mechanism where the output from one step becomes the input for the next. This creates an internal memory that captures temporal dependencies, allowing the network to learn from previous inputs (Ayodele et al., 2021).
While RNNs are effective for sequence prediction by combining current inputs with prior information, they struggle with long sequences due to the vanishing gradient problem. This issue causes the influence of earlier inputs to diminish over time, limiting the network's ability to learn dependencies between distant elements in a sequence (Liu et al., 2023).
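To ground this, here is a minimal sketch of a one-step-ahead RNN forecaster in Keras. The window length, layer width, and training settings are illustrative assumptions, not the configuration used in this study:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

WINDOW = 20  # illustrative lookback: 20 trading days per input sequence

def make_windows(series, window=WINDOW):
    """Slice a 1-D series into (samples, window, 1) inputs and next-step targets."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X)[..., np.newaxis], np.array(y)

# A single recurrent layer: the hidden state carries information
# forward from one time step to the next.
model = keras.Sequential([
    keras.Input(shape=(WINDOW, 1)),
    layers.SimpleRNN(32),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

# Synthetic stand-in series; substitute scaled closing prices in practice.
prices = np.sin(np.linspace(0, 20, 500)) + np.random.normal(0, 0.05, 500)
X, y = make_windows(prices)
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
```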
Long Short-Term Memory (LSTM) Networks
To overcome the limitations of traditional RNNs, Long Short-Term Memory (LSTM) networks were introduced (Hochreiter & Schmidhuber, 1997). LSTMs excel at learning long-term dependencies in sequences by utilising a unique architecture with gating mechanisms that regulate information flow (Zhang et al., 2023).
An LSTM cell comprises a dedicated cell state alongside three gates:
- a forget gate, which decides what information to discard from the cell state;
- an input gate, which decides what new information to store in the cell state; and
- an output gate, which decides what to expose as the hidden state.
These gates use sigmoid activation functions to compute values between 0 and 1, effectively managing what information is retained or forgotten (Gao et al., 2017). LSTMs are particularly effective for financial time series data because of their ability to capture long-term dependencies. However, challenges such as non-stationary data, noise, sudden changes, and extended prediction horizons can impact their performance (Fang et al., 2023).
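As a sketch, the gate computations inside a single LSTM cell can be written out directly in NumPy (the weights below are random placeholders, purely to show the information flow):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b hold the weights for the
    forget (f), input (i), candidate (g), and output (o) paths."""
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])   # what to forget from the cell state
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])   # what new information to admit
    g = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])   # candidate cell-state update
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])   # what to expose as output
    c_t = f * c_prev + i * g                               # updated long-term memory
    h_t = o * np.tanh(c_t)                                 # updated hidden state
    return h_t, c_t

# Toy dimensions: 1 input feature, 4 hidden units, random placeholder weights.
rng = np.random.default_rng(0)
n_in, n_hid = 1, 4
W = {k: rng.normal(size=(n_hid, n_in)) for k in "figo"}
U = {k: rng.normal(size=(n_hid, n_hid)) for k in "figo"}
b = {k: np.zeros(n_hid) for k in "figo"}
h, c = lstm_step(np.array([0.5]), np.zeros(n_hid), np.zeros(n_hid), W, U, b)
```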
Gated Recurrent Units (GRUs)
Gated Recurrent Units (GRUs), proposed by Cho et al. (2014), offer a streamlined alternative to LSTMs with a simplified architecture that enhances computational efficiency (Kim et al., 2018). GRUs combine the forget and input gates into a single update gate and eliminate the separate memory cell, modulating information flow directly through the hidden state (ArunKumar et al., 2021).
This simplification results in faster training times while maintaining performance comparable to LSTMs. The update gate controls how much of the previous hidden state is retained versus how much new information is added from the current input, allowing GRUs to effectively learn long-term dependencies (Wang et al., 2019).
However, a potential drawback of GRUs is the absence of a dedicated forget gate. While LSTMs can explicitly discard irrelevant information, GRUs rely solely on the update gate to manage information flow, which might limit their ability to handle tasks requiring selective forgetting (Srivatsavaya, 2023).
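The same style of sketch for a GRU step (again with random placeholder weights, purely illustrative) makes the simplification visible: two gates and a single hidden state, with no separate memory cell:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU time step: an update gate (z), a reset gate (r),
    and a single hidden state instead of a separate memory cell."""
    z = sigmoid(W["z"] @ x_t + U["z"] @ h_prev + b["z"])            # how much old state to keep
    r = sigmoid(W["r"] @ x_t + U["r"] @ h_prev + b["r"])            # how much old state feeds the candidate
    h_hat = np.tanh(W["h"] @ x_t + U["h"] @ (r * h_prev) + b["h"])  # candidate hidden state
    return (1 - z) * h_prev + z * h_hat                             # blended new hidden state

# Toy usage with the same dimensions as the LSTM sketch above.
rng = np.random.default_rng(1)
n_in, n_hid = 1, 4
W = {k: rng.normal(size=(n_hid, n_in)) for k in "zrh"}
U = {k: rng.normal(size=(n_hid, n_hid)) for k in "zrh"}
b = {k: np.zeros(n_hid) for k in "zrh"}
h = gru_step(np.array([0.5]), np.zeros(n_hid), W, U, b)
```

Side by side with the LSTM sketch, the GRU computes one fewer gated path per step and carries no cell state, which is where its training-speed advantage comes from.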
Comparison and Evaluation of Time Series Models
Advancements in computational power have paved the way for machine learning techniques like RNNs, LSTMs, and GRUs in financial forecasting (Selvin et al., 2017). These models can handle non-linear relationships and do not require data to be stationary. Deep learning models, particularly LSTMs, have shown promise due to their ability to learn and remember long-term dependencies (Beniwal et al., 2024).
Comparative studies have yielded mixed results. Siami-Namini et al. (2018) found that LSTMs outperformed ARIMA significantly in predicting indices like NASDAQ, reducing error rates by 84%–87%. Conversely, Kobiela et al. (2022) observed that ARIMA outperformed LSTM when predicting NASDAQ data using fewer features.
These contrasting outcomes highlight the importance of considering factors like target population, dataset characteristics, and prediction windows when selecting a model for stock price prediction.
Model Tuning and Implementation
To optimise these models, we used an automated hyperparameter tuning function with 50 iterations, which explored different combinations of hyperparameters and selected the set that minimised the error metrics (MAE and RMSE).
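For illustration, a minimal random-search loop in the same spirit might look like this. The search space, the model builder, and the GRU choice are assumptions for the sketch, not the tuning function used in the study:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.metrics import mean_absolute_error, mean_squared_error

def build_model(units, lr, window):
    """Build a single-layer GRU forecaster for a given width and learning rate."""
    model = keras.Sequential([
        keras.Input(shape=(window, 1)),
        layers.GRU(units),
        layers.Dense(1),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=lr), loss="mse")
    return model

def random_search(X_train, y_train, X_val, y_val, n_iter=50, seed=42):
    """Try n_iter random hyperparameter combinations; keep the lowest-RMSE set."""
    rng = np.random.default_rng(seed)
    best = {"rmse": np.inf}
    for _ in range(n_iter):
        units = int(rng.choice([16, 32, 64, 128]))      # layer width candidates
        lr = float(10 ** rng.uniform(-4, -2))           # log-uniform learning rate
        model = build_model(units, lr, window=X_train.shape[1])
        model.fit(X_train, y_train, epochs=20, batch_size=32, verbose=0)
        pred = model.predict(X_val, verbose=0).ravel()
        rmse = np.sqrt(mean_squared_error(y_val, pred))
        mae = mean_absolute_error(y_val, pred)
        if rmse < best["rmse"]:
            best = {"units": units, "lr": lr, "rmse": rmse, "mae": mae}
    return best
```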
Python Libraries Used
We utilised the following Python libraries to implement and forecast the models:
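The original list does not survive in this version of the article; as a rough indication, a typical stack for this kind of deep learning forecasting work (an assumption, not the author's confirmed toolchain) looks like:

```python
import numpy as np                # array handling and input windowing (assumed)
import pandas as pd               # time-indexed price data (assumed)
import matplotlib.pyplot as plt   # plotting forecasts against actuals (assumed)
from sklearn.preprocessing import MinMaxScaler                        # scaling inputs (assumed)
from sklearn.metrics import mean_absolute_error, mean_squared_error   # MAE / RMSE (assumed)
from tensorflow import keras      # SimpleRNN, LSTM, and GRU layers (assumed)
```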
Case Study: Forecasting Alphabet's Stock Price
Hyperparameter Tuning Results
Before presenting the comparative results, here are the hyperparameters obtained through tuning:
Comparative Performance Metrics
Interpreting the Results
The advanced recurrent neural network models exhibit substantially lower RMSE and MAE compared to statistical models, indicating higher accuracy in predicting the time series. This suggests they are better suited for capturing complex, non-linear relationships in the data.
Among the deep learning models, the GRU model shows the most promising results, outperforming baseline models and effectively balancing complexity and predictive power. While the RNN achieved the lowest validation loss, the GRU's superior RMSE and MAE make it the best overall performer.
Forecasting Performance for Mid-Term Unseen Data
To further assess the models, we compared the final forecasts generated by the GRU model (representing deep learning models) and the ARIMA-GARCH model (the best statistical method). We focused on a mid-term forecast horizon covering three calendar months—March to May 2024—amounting to 63 business days.
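A common way to produce a multi-step forecast like this with a one-step-ahead recurrent model is recursive prediction: forecast the next day, slide it into the input window, and repeat. Here is a minimal sketch, assuming a trained Keras-style model and the most recent scaled window of observations:

```python
import numpy as np

def forecast_recursive(model, last_window, horizon=63):
    """Roll a one-step-ahead model forward `horizon` business days,
    feeding each prediction back in as the newest input."""
    window = np.asarray(last_window, dtype=float).reshape(1, -1, 1)
    preds = []
    for _ in range(horizon):
        next_val = float(model.predict(window, verbose=0)[0, 0])
        preds.append(next_val)
        # Drop the oldest value, append the new prediction.
        window = np.concatenate(
            [window[:, 1:, :], np.array([[[next_val]]])], axis=1
        )
    return np.array(preds)
```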
Summary of Real and Predicted Values for Alphabet
Comparative Error Metrics
A visual inspection and the error metrics clearly indicate that the GRU model significantly outperforms the ARIMA-GARCH model. This aligns with the findings of Siami-Namini et al. (2018), who highlighted the superiority of LSTMs over ARIMA models. The detailed statistics demonstrate the GRU model's ability to predict values closer to the actual closing prices.
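For reference, the two error metrics used throughout this comparison reduce to a few lines of NumPy (the arrays below are placeholders, not the actual Alphabet figures):

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error: penalises large misses more heavily."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return np.sqrt(np.mean((actual - predicted) ** 2))

def mae(actual, predicted):
    """Mean absolute error: average miss in price units."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return np.mean(np.abs(actual - predicted))

# Placeholder values only; substitute the 63-day actual and forecast series.
actual = np.array([150.2, 151.0, 149.8])
gru_pred = np.array([150.5, 150.7, 150.1])
print(f"RMSE: {rmse(actual, gru_pred):.3f}, MAE: {mae(actual, gru_pred):.3f}")
```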
Conclusion: The Future is Deep Learning
Our exploration into deep learning models for time series forecasting reveals that advanced neural networks like GRUs offer superior accuracy over traditional statistical models. The GRU model's ability to capture complex, non-linear relationships makes it particularly effective for stock price prediction.
Key Takeaways:
- Deep learning models (RNNs, LSTMs, and GRUs) delivered substantially lower RMSE and MAE than the statistical baselines on Alphabet's stock data.
- The GRU offered the best overall balance of accuracy and computational efficiency, and outperformed ARIMA-GARCH over the 63-day mid-term horizon.
- Model choice remains context-dependent: dataset characteristics, feature availability, and the prediction window all influence which approach wins.
Looking Ahead: Embracing Deep Learning for Strategic Advantage
The superior performance of deep learning models has profound implications for investors and businesses. By leveraging these advanced techniques, stakeholders can:
- make more informed, data-driven investment decisions;
- capture complex, non-linear market dynamics that traditional statistical models miss; and
- extend forecasts over longer horizons while keeping prediction error low.
References
ArunKumar, K., Kalaga, D., Kumar, C., Kawaji, M., Brenza, T. (2021) ‘Forecasting of COVID-19 using deep layer Recurrent Neural Networks (RNNs) with Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTM) cells.’ Chaos, Solitons & Fractals, Volume 146, 110861. [Online] [Accessed on 9th May 2024] DOI: https://doi.org/10.1016/j.chaos.2021.110861
Ayodele, E., Zaidi, S., Zhang, Z., Scott, J., McLernon, D. (2021) ‘Chapter 9 - A review of deep learning approaches in glove-based gesture classification’ In Intelligent Data-Centric Systems: Machine Learning, Big Data, and IoT for Medical Informatics, Academic Press, Pages 143-164. [Online] [Accessed on 7th May 2024] DOI: https://doi.org/10.1016/B978-0-12-821777-1.00012-4
Beniwal, M., Singh, A., Kumar, N. (2024) ‘Forecasting multistep daily stock prices for long-term investment decisions: A study of deep learning models on global indices’ Engineering Applications of Artificial Intelligence, Volume 129, 107617. [Online] [Accessed on 9th May 2024] DOI: https://doi.org/10.1016/j.engappai.2023.107617
Fang, Z., Ma, X., Pan, H., Yang, G., Arce, G. (2023) ‘Movement forecasting of financial time series based on adaptive LSTM-BN network’ Expert Systems with Applications, Volume 213, Part C, 119207. [Online] [Accessed on 8th May 2024] DOI: https://doi.org/10.1016/j.eswa.2022.119207
Gao, T., Chai, Y., Liu, Y. (2017) ‘Applying long short term memory neural networks for predicting stock closing price’ IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, pp. 575-578. [Online] [Accessed on 8th May 2024] DOI: 10.1109/ICSESS.2017.8342981
Guo, D., Zhang, Y. (2012) ‘Novel recurrent neural network for time-varying problems solving’ IEEE Computational Intelligence Magazine, 7(4), pp. 61-65. [Online] [Accessed on 7th May 2024] DOI: 10.1109/MCI.2012.2215139
Hochreiter, S., Schmidhuber, J. (1997) ‘Long Short-Term Memory’ Neural Computation, vol. 9, no. 8, pp. 1735-1780. [Online] [Accessed on 8th May 2024] DOI: 10.1162/neco.1997.9.8.1735
Chen, J., Huang, X., Jiang, H., Miao, X. (2021) ‘Low-Cost and Device-Free Human Activity Recognition Based on Hierarchical Learning Model’ Sensors, 21, 2359. [Online] [Accessed on 8th May 2024] DOI: 10.3390/s21072359
Kobiela, D., Krefta, D., Król, W., Weichbroth, P. (2022) ‘ARIMA vs LSTM on NASDAQ stock exchange data’ Procedia Computer Science, Volume 207, Pages 3836-3845. [Online] [Accessed on 9th May 2024] DOI: https://doi.org/10.1016/j.procs.2022.09.445
Liu, X., Du, H., Yu, J. (2023) ‘A forecasting method for non-equal interval time series based on recurrent neural network’ Neurocomputing, Volume 556, 126648. [Online] [Accessed on 7th May 2024] DOI: https://doi.org/10.1016/j.neucom.2023.126648
Kim, P., Lee, D., Lee, S. (2018) ‘Discriminative context learning with gated recurrent unit for group activity recognition’ Pattern Recognition, Volume 76, Pages 149-161. [Online] [Accessed on 9th May 2024] DOI: https://doi.org/10.1016/j.patcog.2017.10.037
Selvin, S., Vinayakumar, R., Gopalakrishnan, E., Menon, V., Soman, K. (2017) ‘Stock price prediction using LSTM, RNN and CNN-sliding window model’ Proceedings of the 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, pp. 1643-1647. [Online] [Accessed on 26th April 2024] DOI: 10.1109/ICACCI.2017.8126078
Siami-Namini, S., Tavakoli, N., Siami Namin, A. (2018) ‘A Comparison of ARIMA and LSTM in Forecasting Time Series’ 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL, USA, pp. 1394-1401. [Online] [Accessed on 26th April 2024] DOI: 10.1109/ICMLA.2018.00227
Srivatsavaya, P. (2023) ‘LSTM vs GRU’ Medium. [Online] [Accessed on 9th May 2024] https://medium.com/@prudhviraju.srivatsavaya/lstm-vs-gru-c1209b8ecb5a
Wang, J., Yan, J., Li, C., Gao, R., Zhao, R. (2019) ‘Deep heterogeneous GRU model for predictive analytics in smart manufacturing: Application to tool wear prediction’ Computers in Industry, Volume 111, pp. 1-14. [Online] [Accessed on 9th May 2024] DOI: https://doi.org/10.1016/j.compind.2019.06.001
Zhang, X., Zhong, C., Zhang, J., Wang, T., Ng, W. (2023) ‘Robust recurrent neural networks for time series forecasting’ Neurocomputing, Volume 526, Pages 143-157. [Online] [Accessed on 8th May 2024] DOI: https://doi.org/10.1016/j.neucom.2023.01.037