Extract stock price data with Python
Photo by Behnam Norouzi on Unsplash Template by Canva

Python stock price analysis 01: Get stock price data online

Introduction

Stock markets all over the world generate a lot of data every day. The volatility of the stock market and the complicated rules of trading and investing make it a good place to put data analysis to good use. But before we can analyze any stock market data, we need a simple, efficient way to extract the massive amount of data generated every day.

In today’s tutorial, I will teach you how to extract daily stock prices for the stock you want.

Stock Price API

We need to first connect to the API.

We know that we need to get the data from somewhere online, so there must be someone out there providing this service. RapidAPI is one of them. I won’t try to describe everything it does, but I used it to extract stock price data. Feel free to explore its website, because it provides far more than just stock price APIs.

Anyway, once you understand that we’re connecting to somebody’s server that opens up this data for us, you can click on this link to get to where we can extract the data.

Choose an endpoint

We’re going to extract TIME_SERIES_DAILY_ADJUSTED data.

Select your code

Since we’re using Python, we’re going to choose the Python option in the code snippet section.

Note that we can input the ticker symbol of our choice. But I’ll stick to the default MSFT ticker for now. I’ll explain later.

But we’re going to change the outputsize parameter to “full”. The default value, “compact”, only returns the latest 100 data points. That is useful when you only need to refresh data you already have, but not when we want the full history.

Then click on Test Endpoint. This runs the request and shows a sample result. Make sure that the Time Series tab contains more than 100 keys.

Each key corresponds to one trading day’s stock prices.

Now we can copy the snippet into our own Python code.
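For reference, the copied snippet usually looks something like the sketch below. The endpoint URL, the header names, and the build_query and fetch_daily_adjusted helpers are assumptions made for illustration and are not taken from the original screenshot, so use the exact values shown on your own RapidAPI page.

```python
# Sketch only: host, URL and header names are assumptions.
# Copy the real values from the snippet RapidAPI generates for you.
API_HOST = "alpha-vantage.p.rapidapi.com"  # assumed host
API_URL = f"https://{API_HOST}/query"      # assumed endpoint


def build_query(symbol="MSFT", outputsize="full"):
    """Return the query parameters for a daily adjusted time series request."""
    return {
        "function": "TIME_SERIES_DAILY_ADJUSTED",
        "symbol": symbol,
        "outputsize": outputsize,  # "compact" = latest 100 points, "full" = everything
        "datatype": "json",
    }


def fetch_daily_adjusted(api_key, symbol="MSFT", outputsize="full"):
    """Send the request and return the requests response object."""
    import requests  # third-party; imported here so build_query works without it

    headers = {"X-RapidAPI-Key": api_key, "X-RapidAPI-Host": API_HOST}
    response = requests.get(
        API_URL,
        headers=headers,
        params=build_query(symbol, outputsize),
        timeout=30,
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response
```

Calling `response = fetch_daily_adjusted("YOUR_RAPIDAPI_KEY")` keeps the variable name response that the later steps rely on.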

Run the code in Jupyter notebook

Running the code in your Python IDE should return the data in JSON format.

Export as a JSON file

The first thing we’ll do is export the response as a JSON file in case we mess up the data along the way.

Define a path to store our JSON file. Remember to create a folder named “JSON” in the same directory as your Python code.

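import os
# Build the path to the "JSON" folder inside the current working directory
filepath = f"{os.getcwd()}/JSON"
# Normalize to an absolute path with forward slashes (helps on Windows)
path = os.path.abspath(filepath).replace("\\", "/")
path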
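If you would rather not create the folder by hand, it can be created from code. The sketch below is one possible way to do it with pathlib; the ensure_folder helper is a hypothetical name introduced here, not part of the original tutorial.

```python
from pathlib import Path


def ensure_folder(name, base=None):
    """Create `name` under `base` (default: current directory) if it is missing."""
    folder = Path(base or Path.cwd()) / name
    folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return folder.as_posix()  # forward slashes on every platform
```

With this helper, `path = ensure_folder("JSON")` gives a path that can be used in the same way as the one above.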

Then write the extracted data to a JSON file.

import json

with open(f"{path}/MSFT_Stock_Price.json", "w") as outfile:
    json.dump(response.json(), outfile)        

Try to read from the JSON file.

with open(f'{path}/MSFT_Stock_Price.json', 'r') as openfile:
    # Reading from json file
    json_object = json.load(openfile)
json_object        

Data Understanding

# Now json_object is a dictionary
type(json_object)        

json_object is a dictionary that contains information about the stock price at different levels.

Check unique keys in the json_object dictionary.

for key,values in json_object.items():
    print(key)        

This result tells us that the data we get is a multi-level dictionary with 2 main keys, “Meta Data” and “Time Series (Daily)”. “Time Series (Daily)” is where our daily stock price data is stored, also in the form of a dictionary.
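The same check can be done in code, which also confirms that the “full” output really holds more than 100 days. The sketch below runs on a tiny hand-made sample; the inner field names and all values are made up for illustration, and summarize is a hypothetical helper.

```python
# Tiny hand-made sample shaped like the response described above.
# Field names and values are illustrative, not real market data.
sample = {
    "Meta Data": {"2. Symbol": "MSFT"},
    "Time Series (Daily)": {
        "2023-01-04": {"1. open": "101.0", "4. close": "102.5"},
        "2023-01-03": {"1. open": "100.0", "4. close": "101.0"},
    },
}


def summarize(data, series_key="Time Series (Daily)"):
    """Return the top-level keys and the number of daily records."""
    if series_key not in data:
        # The API may return something else (e.g. an error payload) instead.
        raise KeyError(f"{series_key!r} missing; got keys {list(data)}")
    return list(data), len(data[series_key])


top_keys, n_days = summarize(sample)
print(top_keys, n_days)
```

On the real data you would call `summarize(json_object)` and expect the second value to be well above 100.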

Printing our data

for i in json_object["Time Series (Daily)"].items():
    print(i)        

Check date range

for key,values in json_object["Time Series (Daily)"].items():
    print(key)        
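Printing every key works, but with the full history it produces thousands of lines. As a shorter alternative, the sketch below returns only the first and last dates. It assumes the keys are YYYY-MM-DD strings, and date_range is a hypothetical helper name.

```python
def date_range(time_series):
    """Return (earliest, latest) date strings from a dict keyed by YYYY-MM-DD."""
    dates = sorted(time_series)  # ISO dates sort correctly as plain strings
    return dates[0], dates[-1]


# Made-up keys standing in for json_object["Time Series (Daily)"].
sample_series = {"2023-01-04": {}, "2023-01-03": {}, "2022-12-30": {}}
first_day, last_day = date_range(sample_series)
print(f"{first_day} to {last_day}")  # prints "2022-12-30 to 2023-01-04"
```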

Convert into a Data Frame

import pandas as pd
stock_price = json_object["Time Series (Daily)"]
df = pd.DataFrame.from_dict({(i): stock_price[i] 
                           for i in stock_price.keys()},
                       orient='index')
df_date = df.reset_index(names="Date")
df_date        

Remember that json_object is a multi-level dictionary where “Time Series (Daily)” is a dictionary itself. We can use pd.DataFrame.from_dict to convert a dictionary into a data frame. For a multi-level dictionary, we use a dictionary comprehension to loop over each key in the first level (the dates), and orient='index' turns each date into a row with the second-level keys as columns.

The dates in the first level are therefore used as the index of each row. To change the index into a column, we use reset_index(). This replaces the table index with a number sequence and pushes the original index (the date) into a column named “Date”. Note that the names argument of reset_index() requires pandas 1.5 or later.
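One thing to watch for: in the JSON, the prices and dates typically arrive as strings, so they need converting before any calculation. The sketch below shows one way to do that on a made-up sample (field names and values are illustrative). It also passes the dictionary to from_dict directly, which should give the same frame as the comprehension.

```python
import pandas as pd

# Made-up sample with the same shape as json_object["Time Series (Daily)"].
stock_price = {
    "2023-01-04": {"1. open": "101.0", "4. close": "102.5", "6. volume": "2000"},
    "2023-01-03": {"1. open": "100.0", "4. close": "101.0", "6. volume": "1500"},
}

# Passing the dictionary directly: each date becomes one row.
df = pd.DataFrame.from_dict(stock_price, orient="index")

# Move the date index into a column (works on older pandas versions too).
df_date = df.rename_axis("Date").reset_index()

# Convert the string values into proper dates and numbers.
df_date["Date"] = pd.to_datetime(df_date["Date"])
value_cols = df_date.columns.drop("Date")
df_date[value_cols] = df_date[value_cols].apply(pd.to_numeric)

# Oldest day first, with a fresh 0..n index.
df_date = df_date.sort_values("Date", ignore_index=True)
print(df_date.dtypes)
```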

Export data to a file

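Finally, export this data to a file for later use. As with the JSON step, the code below expects a folder named “Output” to already exist in the same directory as your Python code.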

filepath = f"{os.getcwd()}/Output"
path = os.path.abspath(f"{filepath}").replace("\\","/")
df_date.to_csv(f'{path}/MSFT_Output.csv')        
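A slightly more defensive variant is sketched below. It creates the folder when it is missing and leaves out the numeric row index, which to_csv would otherwise write as an extra unnamed column. The export_csv helper is a hypothetical name, not part of the original tutorial.

```python
import os

import pandas as pd


def export_csv(frame, folder, filename):
    """Write `frame` to folder/filename, creating the folder when needed."""
    os.makedirs(folder, exist_ok=True)
    target = os.path.join(folder, filename)
    frame.to_csv(target, index=False)  # skip the 0..n row numbers
    return target
```

For example, `export_csv(df_date, "Output", "MSFT_Output.csv")` writes the file, and `pd.read_csv("Output/MSFT_Output.csv", parse_dates=["Date"])` loads it back with the dates parsed.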

Conclusion

That’s all for today’s article. I would like to split this tutorial into multiple parts so that each article is not too long and focuses on one topic. In my next article, I will write about how you can modify the API call so that you can specify which stocks’ price data you want to extract.

So, stay tuned if you’re interested to learn more about this topic. See you!

Photo by Janko Ferlič on Unsplash

About Me

Currently working as a Data Scientist. I provide consultancy, training, and professional services for data analytics problems to my clients worldwide. Would love to share my experience as a consultant so that everyone can learn something from it.

LinkedIn: https://www.dhirubhai.net/in/foocheechuan/

Medium: medium.com/@foocheechuan

Youtube: Chee-Chuan
