Extract stock price data with Python
Python stock price analysis 01: Get stock price data online
Introduction
Stock markets all over the world generate a lot of data every day. The volatility of the market and the complicated rules of trading and investing make it a good place to put data analysis to good use. But before we can analyze any stock market data, we need a simple, efficient way to extract the massive amount of data generated every day.
In today’s tutorial, I will show you how to extract daily stock prices for the stock you want.
Stock Price API
We need to first connect to the API.
The data has to come from somewhere online, so we need someone who provides it as a service. RapidAPI is one of them. I can’t describe everything it does, but I have used it to extract stock price data. Feel free to explore its website, because it offers more than just stock price APIs.
Anyway, once you understand that we’re connecting to somebody’s server that opens up this data for us, you can click on this link to get to where we can extract the data.
Choose an endpoint
We’re going to extract TIME_SERIES_DAILY_ADJUSTED data.
Select your code
Since we’re using Python, we’ll choose the Python code in the code snippet section.
Note that we can input the ticker symbol of our choice. I’ll stick to the default MSFT ticker for now and explain why later.
We do need to change the outputsize parameter to “full”. The default value “compact” returns only the latest 100 data points. That is useful when you refresh your data at least once every 100 trading days, but not when you want the full history.
Then click on Test Endpoint. This runs the code and shows a sample result. Make sure the Time Series tab contains more than 100 keys.
Each key corresponds to one day of stock prices.
Now we can copy the code into our own Python code.
Run the code in Jupyter notebook
Running the code in your Python IDE should return the data as a JSON response.
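The snippet copied from RapidAPI is not reproduced in this article. As a rough sketch only, assuming the endpoint is the Alpha Vantage listing on RapidAPI and that the requests library is installed, the call might look like the code below. The URL, the host, and the helper names build_query and fetch_daily_adjusted are assumptions for illustration; use the values shown in your own code snippet panel.

```python
# Assumed values: copy the real URL, host and key from the RapidAPI code snippet panel.
API_URL = "https://alpha-vantage.p.rapidapi.com/query"  # assumption
API_HOST = "alpha-vantage.p.rapidapi.com"               # assumption


def build_query(symbol="MSFT", outputsize="full"):
    """Return the query parameters for the daily adjusted endpoint."""
    return {
        "function": "TIME_SERIES_DAILY_ADJUSTED",
        "symbol": symbol,
        # "compact" returns the latest 100 data points, "full" the whole history
        "outputsize": outputsize,
        "datatype": "json",
    }


def fetch_daily_adjusted(api_key, symbol="MSFT"):
    """Send the request and return the response object (hypothetical helper)."""
    import requests  # third-party: pip install requests

    headers = {"X-RapidAPI-Key": api_key, "X-RapidAPI-Host": API_HOST}
    response = requests.get(
        API_URL, headers=headers, params=build_query(symbol), timeout=30
    )
    response.raise_for_status()  # fail early on HTTP errors
    return response


# Usage (needs your own key and network access):
# response = fetch_daily_adjusted("YOUR_RAPIDAPI_KEY")
# response.json().keys()
```

Keeping the query parameters in one small function makes it easy to switch between “compact” and “full” without touching the request code.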
Export as a JSON file
The first thing we’ll do is export the data as a JSON file, in case we mess up the data along the way.
Define a path to store our JSON file. Remember to create a folder named “JSON” in the same directory as your Python code.
import os
filepath = f"{os.getcwd()}/JSON"
path = os.path.abspath(f"{filepath}").replace("\\","/")
path
Then write the extracted data to a JSON file.
import json
with open(f"{path}/MSFT_Stock_Price.json", "w") as outfile:
    json.dump(response.json(), outfile)
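If the “JSON” folder does not exist yet, open() raises FileNotFoundError. Here is a minimal sketch, using pathlib and a hypothetical helper named save_json, that creates the folder before writing. The small dictionary is only a placeholder standing in for response.json().

```python
import json
import tempfile
from pathlib import Path


def save_json(data, folder, filename):
    """Write data to folder/filename as JSON, creating the folder if needed."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    target = folder / filename
    with open(target, "w") as outfile:
        json.dump(data, outfile)
    return target


# Placeholder standing in for response.json(); not real market data.
sample = {"Meta Data": {}, "Time Series (Daily)": {}}

# A temporary directory keeps this demo from touching your project folder.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_json(sample, Path(tmp) / "JSON", "MSFT_Stock_Price.json")
    with open(saved) as openfile:
        round_trip = json.load(openfile)
```

In your own notebook you would pass response.json() and a folder next to your code instead of the temporary directory.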
Try to read from the JSON file.
with open(f'{path}/MSFT_Stock_Price.json', 'r') as openfile:
    # Reading from json file
    json_object = json.load(openfile)
json_object
Output:
Data Understanding
# Now json_object is a dictionary
type(json_object)
json_object is a dictionary that contains information about the stock price at different levels.
Check the unique keys in the json_object dictionary.
for key,values in json_object.items():
    print(key)
This result tells us that the data that we get is a multi-level dictionary with 2 main keys “Meta Data” and “Time Series (Daily)”. “Time Series (Daily)” is where it stores our daily stock price data, also in the form of a dictionary.
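To make the nesting concrete, here is a small sketch using a hand-made dictionary with the same two-level shape. The field names such as “2. Symbol” and “4. close” and all the values are illustrative assumptions, not output copied from the API.

```python
# Hand-made stand-in for json_object; field names and numbers are made up.
sample_object = {
    "Meta Data": {"2. Symbol": "MSFT"},
    "Time Series (Daily)": {
        "2000-01-04": {"1. open": "10.5", "4. close": "11.0"},
        "2000-01-03": {"1. open": "10.0", "4. close": "10.5"},
    },
}

top_level_keys = list(sample_object.keys())
daily = sample_object["Time Series (Daily)"]

# Each date maps to another dictionary holding that day's fields.
first_date = next(iter(daily))
first_record = daily[first_date]

print(top_level_keys)
print(first_date, first_record)
```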
Printing our data
for i in json_object["Time Series (Daily)"].items():
    print(i)
Check date range
for key,values in json_object["Time Series (Daily)"].items():
    print(key)
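Printing every key works, but with the full history the list gets long. Assuming the keys are ISO-formatted date strings (YYYY-MM-DD), which sort correctly as plain text, min() and max() give the range directly. The dictionary below is a made-up stand-in for the real data.

```python
# Made-up stand-in for json_object["Time Series (Daily)"].
daily = {
    "2000-01-05": {},
    "2000-01-03": {},
    "2000-01-04": {},
}

# ISO dates (YYYY-MM-DD) compare correctly as strings, so no parsing is needed.
first_day = min(daily)
last_day = max(daily)
n_days = len(daily)

print(f"{n_days} trading days from {first_day} to {last_day}")
```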
Convert into a Data Frame
import pandas as pd
stock_price = json_object["Time Series (Daily)"]
df = pd.DataFrame.from_dict({(i): stock_price[i]
                             for i in stock_price.keys()},
                            orient='index')
df_date = df.reset_index(names="Date")
df_date
Remember that json_object is a multi-level dictionary where “Time Series (Daily)” is a dictionary itself. We can use pd.DataFrame to convert a dictionary into a data frame, but for a multi-level dictionary we use pd.DataFrame.from_dict with orient='index'. The dictionary comprehension loops over each key (date) in the first level, and each nested dictionary becomes one row of the data frame.
The dates in the first level are then used as the index of each row. To change the index into a column, we use reset_index(). This replaces the table index with a number sequence and pushes the original index (the dates) into a column named “Date”.
Output:
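Values loaded from JSON text often arrive as strings, in which case the data frame columns hold text rather than numbers. Here is a minimal sketch, on a made-up sample with assumed column names, that converts the date column with pd.to_datetime and the remaining columns with pd.to_numeric. It also passes the dictionary to from_dict directly and uses rename_axis, which is an alternative to the code above rather than a replacement for it.

```python
import pandas as pd

# Made-up stand-in for json_object["Time Series (Daily)"]; column names are assumptions.
stock_price = {
    "2000-01-04": {"1. open": "10.5", "4. close": "11.0"},
    "2000-01-03": {"1. open": "10.0", "4. close": "10.5"},
}

# The dictionary can be passed directly; orient="index" turns each date into a row.
df = pd.DataFrame.from_dict(stock_price, orient="index")
df_date = df.rename_axis("Date").reset_index()

# Convert text to proper dtypes so that sorting and arithmetic behave as expected.
df_date["Date"] = pd.to_datetime(df_date["Date"])
value_columns = df_date.columns.drop("Date")
df_date[value_columns] = df_date[value_columns].apply(pd.to_numeric)

# Oldest row first.
df_date = df_date.sort_values("Date").reset_index(drop=True)
print(df_date.dtypes)
```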
Export data to a file
Remember to export this data to a file for later use.
filepath = f"{os.getcwd()}/Output"
path = os.path.abspath(f"{filepath}").replace("\\","/")
# Create the "Output" folder if it does not exist yet
os.makedirs(path, exist_ok=True)
df_date.to_csv(f'{path}/MSFT_Output.csv')
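By default, to_csv also writes the row index as an unnamed first column. If you do not want that column, one option is to pass index=False. The sketch below uses a made-up frame and a temporary folder, then reads the file back to check the result.

```python
import os
import tempfile

import pandas as pd

# Made-up frame standing in for df_date.
df_date = pd.DataFrame(
    {"Date": ["2000-01-03", "2000-01-04"], "4. close": [10.5, 11.0]}
)

with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "Output")
    os.makedirs(out_dir, exist_ok=True)  # to_csv does not create missing folders
    csv_path = os.path.join(out_dir, "MSFT_Output.csv")

    # index=False leaves out the 0, 1, 2, ... row index.
    df_date.to_csv(csv_path, index=False)

    # Read it back, parsing the Date column as datetimes.
    restored = pd.read_csv(csv_path, parse_dates=["Date"])

print(restored.columns.tolist())
```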
Conclusion
That’s all for today’s article. I’m splitting this tutorial into multiple parts so that each article stays short and focuses on one topic. In my next article, I will write about how you can modify the API call so that you can specify which stocks’ price data you want to extract.
Video Tutorial
So, stay tuned if you’re interested in learning more about this topic. See you!
About Me
I currently work as a Data Scientist. I provide consultancy, training, and professional services for data analytics problems to clients worldwide. I would love to share my experience as a consultant so that everyone can learn something from it.
Medium: medium.com/@foocheechuan
YouTube: Chee-Chuan