Data Science with Machine Learning - Stock Price Prediction Part 1
The latest project from my own digital lab is nothing new to the world, but the real fun lies in combining data science with machine learning to produce an output that scales across many spheres of knowledge. Machine learning is a subset of artificial intelligence, and one capability that has really caught my attention is its ability to pick out meaningful patterns from unstructured and complex data sets.
We have always been taught in computer science fundamentals that any output from an information system comes from applying human-created rules to data in order to solve a specific problem. Here the scenario is reversed: it is the art of using data and known solutions to derive the rules behind a problem.
So, let's jump straight to the real data sets that I used: Apple's closing stock price data from Jan 2011 to Jan 2021 [AAPL as listed on NASDAQ]. I classified the data using decision boundaries that I chose myself. These boundaries are flexible and may be set to suit the requirements of whoever is building the machine learning solution. The good thing about stock pricing is that lots of data is available online: prices, fundamentals, global macroeconomic indicators, volatility indices, and so on.
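As a rough illustration of what classifying prices with hand-picked decision boundaries could look like, here is a minimal sketch. The 1% threshold, the label names and the function `label_returns` are all hypothetical choices made for this example, and the prices are made up rather than real Apple data.

```python
# Hypothetical sketch: label each day by its close-to-close return.
# The 1% threshold and the label names are illustrative assumptions.

def label_returns(closes, threshold=0.01):
    """Return one label per day after the first: 'up', 'down' or 'flat'."""
    labels = []
    for prev, curr in zip(closes, closes[1:]):
        change = (curr - prev) / prev  # fractional daily change
        if change > threshold:
            labels.append("up")
        elif change < -threshold:
            labels.append("down")
        else:
            labels.append("flat")
    return labels

# Made-up closing prices, not real Apple data.
sample_closes = [100.0, 102.0, 101.5, 99.0, 99.2]
print(label_returns(sample_closes))
```

Moving the threshold changes how many days fall into each class, which is why the boundaries can be tuned to the needs of each solution.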
Once the data sets are collected and classified, the next step is to train a model on the data. This is a crucial stage, in which the machine learning algorithm actually learns from the data. There are several algorithms to choose from [naive Bayes, neural networks, regression, support vector machines, etc.]; in my case, I used a supervised learning algorithm called linear regression. It reported an accuracy score of 99.997% when trained on the available data sets of Apple closing stock prices.
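To make the idea of fitting a linear regression concrete, here is a minimal sketch of ordinary least squares with a single feature (a day index) in plain Python. It is not the code behind this post: the helper names `fit_line` and `predict` are hypothetical, and the prices are synthetic.

```python
# Minimal ordinary least squares fit of price against a day index.
# Pure Python; the data below is synthetic, not real Apple prices.

def fit_line(xs, ys):
    """Return (slope, intercept) minimising squared error of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept

days = [0, 1, 2, 3, 4]
closes = [10.0, 12.0, 14.0, 16.0, 18.0]  # synthetic: exactly 10 + 2 * day

slope, intercept = fit_line(days, closes)
print(slope, intercept)
print(predict(slope, intercept, 5))  # extrapolate one day past the data
```

A very high score on the training data mostly shows that the line fits the points it has already seen, so it is worth checking the model on data it was not trained on as well.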
The table below is a snapshot of my machine learning solution's predicted price vis-à-vis the real closing stock price of Apple Inc. as at 26 Feb 2021.
Now, after training, the next step is to test and evaluate the trained model. For this, I used an 80/20 ratio: 80% of the data for testing and the remaining 20% for validation. This is a crucial step in finalizing any predictive machine learning tool, as it helps to provide an unbiased evaluation of the final model fit on the training dataset.
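One common way to set up an 80/20 split on time series data is chronologically: fit on the earliest 80% of the observations and hold out the most recent 20% for evaluation. The sketch below assumes that arrangement, which may differ from the exact setup used here. The helper names and the synthetic prices are hypothetical.

```python
# Chronological 80/20 split with an R^2 check on the held-out part.
# Pure Python; data is synthetic and the split convention is an assumption.

def chronological_split(values, train_fraction=0.8):
    """Split a sequence in time order, without shuffling."""
    cut = int(len(values) * train_fraction)
    return values[:cut], values[cut:]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Synthetic prices: an upward trend with a small alternating wiggle.
closes = [10.0 + 2.0 * d + (0.5 if d % 2 else -0.5) for d in range(10)]
days = list(range(len(closes)))

train_days, test_days = chronological_split(days)
train_closes, test_closes = chronological_split(closes)

slope, intercept = fit_line(train_days, train_closes)
test_pred = [slope * d + intercept for d in test_days]

print(len(train_days), len(test_days))
print(r_squared(test_closes, test_pred))
```

Splitting in time order rather than at random keeps future prices out of the data the model is fitted on, so the held-out score is closer to what a real forecast would face.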
Finally, after testing and validating the trained linear regression model, I am coding my machine learning solution to output Apple stock prices up to May 2021. I shall demonstrate the predictive analysis breakdown of the stock time series data in part 2 of this blog. Stay tuned!
Atishay Sookun - 28 Feb 2021
Head Investor Services Operations and Projects
4 years ago: Atishay, we need to speak. I am interested in what you are doing and eager to see the forthcoming results. I studied jump diffusion models for time series prediction and derivatives pricing in my early days. I can recognize talent and a great mind, bro. Keep up the research and let's talk. Cheers
Credit Risk & Monitoring Lead at SBM Bank (Mauritius) Ltd
4 years ago: Interesting read. Can you get into the model coding?
Banking and Operations; Payment Card Specialist; Strategic & Innovative
4 years ago: Can't wait to see the predictive results