Analyzing Historical Stock Data with Python and Yahoo Finance

Finance and investing are full of data, and historical stock prices are among the most valuable. Analyzing them can help investors make informed decisions and spot market trends. In this article, we'll walk through a Python script that scrapes Yahoo Finance pages to fetch historical stock data and perform some basic analysis. Yahoo's official API is no longer supported, and yfinance and similar libraries do not always work, so we scrape the website itself. All you need is the URL of the page you want, which you then use in your Python code.

Before we dive into the code, you'll need Python installed on your system, along with the third-party libraries pandas, tqdm, requests, and matplotlib. pandas also needs an HTML parser such as lxml for read_html. The datetime and warnings modules ship with Python, so there is nothing to install for them. You can install the rest using pip:

pip install pandas tqdm requests matplotlib lxml

Or you can use Google Colab if you don't have Python on your computer:

https://colab.research.google.com/#create=true

The Python script consists of several key steps:

1. Importing Necessary Libraries:

   We start by importing the required Python libraries: pandas for data manipulation, tqdm for a progress bar, requests for making web requests, warnings to suppress warnings, matplotlib for plotting, and datetime for working with dates and times.

import datetime
import warnings

import matplotlib.pyplot as plt
import pandas as pd
import requests
from tqdm import tqdm

# Silence library warnings so the progress bar output stays readable
warnings.filterwarnings(action='ignore')

2. Setting User-Agent Header and Fetching Stock Symbols:

   We set a User-Agent header to mimic a web browser and send a request to a Yahoo Finance screener URL to retrieve a list of stock symbols, which we store in a DataFrame. The URL used here is for a screener of mega-cap US stocks.

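# Browser-like User-Agent so the request looks like it comes from a web browser
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}

# Screener page for mega-cap US stocks
url = 'https://finance.yahoo.com/screener/unsaved/3dfefd5c-0b0f-49f7-bfd6-64021dfe6458?count=250&offset=0'
response = requests.get(url, headers=headers)

# The first HTML table on the page holds the screener results
stocks = pd.read_html(response.text)[0]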

3. Defining the Date Range:

   We specify the date range for which we want to retrieve stock data, here June 1, 2023 to October 31, 2023. Each date is converted to a Unix timestamp, which goes into the URL used to fetch historical data.

date_string1 = '2023-06-01'
date_string2 = '2023-10-31'

# Parse the strings into datetime objects
date_object1 = datetime.datetime.strptime(date_string1, '%Y-%m-%d')
date_object2 = datetime.datetime.strptime(date_string2, '%Y-%m-%d')

# Convert to Unix timestamps in seconds; naive datetimes are read as local time
timestamp1 = int(date_object1.timestamp())
timestamp2 = int(date_object2.timestamp())
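One detail worth knowing: calling .timestamp() on a naive datetime interprets it in the machine's local time zone, so the same date string can yield different timestamps on different computers. The sketch below shows one way to pin the conversion to UTC using only the standard library. The helper name to_utc_timestamp is made up for this illustration and is not part of the original script.

```python
import datetime

def to_utc_timestamp(date_string):
    """Return the Unix timestamp (seconds) for midnight UTC on the given date."""
    # Parse 'YYYY-MM-DD' into a naive datetime, then mark it explicitly as UTC
    parsed = datetime.datetime.strptime(date_string, '%Y-%m-%d')
    parsed_utc = parsed.replace(tzinfo=datetime.timezone.utc)
    return int(parsed_utc.timestamp())

timestamp1 = to_utc_timestamp('2023-06-01')
timestamp2 = to_utc_timestamp('2023-10-31')
print(timestamp1, timestamp2)  # prints 1685577600 1698710400
```

Whether Yahoo Finance expects UTC or another zone for period1 and period2 is not something this sketch settles; it only makes the conversion reproducible across machines.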

4. Fetching Stock Data for Each Symbol:

   We loop through the stock symbols and fetch each stock's historical data from Yahoo Finance for the specified date range. Each table is cleaned, converted to a numeric DataFrame indexed by date, and stored in a list for later processing.

all = []  # list of [symbol, DataFrame] pairs (note: this name shadows the built-in all())
for stock in tqdm(stocks['Symbol']):
    url = f'https://finance.yahoo.com/quote/{stock}/history?period1={timestamp1}&period2={timestamp2}&interval=1d&frequency=1d'
    response = requests.get(url, headers=headers)
    df = pd.read_html(response.text)[0]
    df.set_index('Date', inplace=True)
    # Drop the footnote row, whose Date cell starts with '*'
    df = df[~df.index.str.startswith('*')]
    df.index = pd.to_datetime(df.index, format='%b %d, %Y')
    # Cells that are not numbers become NaN
    df = df.apply(pd.to_numeric, errors='coerce')
    # Sort by date, oldest first
    df = df.sort_index()
    # Remove the rows that could not be parsed as numbers
    df.dropna(inplace=True)
    all.append([stock, df])
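To see what the cleaning steps do without hitting the network, the sketch below applies them to a small hand-made table. The rows and prices are invented for illustration and only imitate the layout the script expects (a 'Date' column, a 'Close*' column, and a trailing footnote row starting with '*'); the real page layout may differ.

```python
import pandas as pd

# Invented sample rows imitating the scraped table; not real market data
footnote = '*Close price adjusted for splits.'
raw = pd.DataFrame({
    'Date': ['Jun 05, 2023', 'Jun 02, 2023', 'Jun 01, 2023', footnote],
    'Open': ['101.0', '0.24 Dividend', '99.5', footnote],
    'Close*': ['102.5', '0.24 Dividend', '100.0', footnote],
})

df = raw.set_index('Date')
df = df[~df.index.str.startswith('*')]                    # drop the footnote row
df.index = pd.to_datetime(df.index, format='%b %d, %Y')   # text dates -> timestamps
df = df.apply(pd.to_numeric, errors='coerce')             # non-numeric cells -> NaN
df = df.sort_index().dropna()                             # oldest first, drop NaN rows

print(df)
```

Only the two rows with numeric prices survive, in chronological order.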

5. Plotting Stock Price Data:

   We plot the closing prices of the first five stocks in the list, with each series scaled by its first value. This step demonstrates how to use Matplotlib to visualize stock price performance over time.

plt.figure()
for stock, df in all[:5]:
    # Divide by the first close so every line starts at 1
    plt.plot(df['Close*'] / df['Close*'].iloc[0], label=stock)
plt.xticks(rotation=45)
plt.legend()
plt.show()
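Dividing each series by its first value rebases every stock to 1.0 at the start of the period, which is what lets stocks with very different share prices share one chart. A minimal sketch with invented prices (the variable names and numbers are illustrative only):

```python
import pandas as pd

dates = pd.to_datetime(['2023-06-01', '2023-06-02', '2023-06-05'])
# Invented closing prices for two hypothetical stocks at very different price levels
cheap = pd.Series([10.0, 11.0, 12.0], index=dates)
pricey = pd.Series([400.0, 420.0, 380.0], index=dates)

# Rebase each series so the first observation equals 1.0
cheap_rebased = cheap / cheap.iloc[0]
pricey_rebased = pricey / pricey.iloc[0]

print(cheap_rebased.tolist())
print(pricey_rebased.tolist())
```

After rebasing, a value of 1.2 reads as "up 20% since the first day", regardless of the original price level.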


6. Combining Data and Saving as CSV:

   We combine the closing prices of all stocks into a single DataFrame, one column per symbol, and save it as a CSV file. This CSV file can be used for further analysis and modeling.

data = pd.DataFrame()
for stock, df in all:
    data[stock] = df['Close*']  # one column of closing prices per symbol
# Keep only the dates on which every stock has a price
data.dropna(inplace=True)
data.to_csv('stocks.csv')
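The loop above builds the table one column at a time and then drops incomplete rows. A similar result can be had with pd.concat and join='inner', which keeps only the dates present for every symbol. The sketch below uses invented data, and the symbols AAA and BBB are placeholders rather than real tickers from the screener.

```python
import pandas as pd

# Invented closing prices for two placeholder symbols; BBB is missing Jun 02
aaa = pd.Series([10.0, 11.0, 12.0],
                index=pd.to_datetime(['2023-06-01', '2023-06-02', '2023-06-05']))
bbb = pd.Series([50.0, 55.0],
                index=pd.to_datetime(['2023-06-01', '2023-06-05']))
closes = {'AAA': aaa, 'BBB': bbb}

# Align on dates; join='inner' keeps only dates that every symbol has
data = pd.concat(closes, axis=1, join='inner')
print(data)
```

The date that only AAA has is left out, so the resulting table has no missing values to drop.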

Conclusion:

Analyzing historical stock data is a fundamental part of financial analysis and investment decision-making. With Python and data scraped from Yahoo Finance, you can fetch and analyze historical stock prices to gain insights into market trends and make informed investment decisions.

This tutorial has provided an overview of a Python script that fetches historical stock data, processes it, and visualizes it. You can further extend this script to include more advanced analysis, such as calculating returns, volatility, or creating predictive models.
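As a possible starting point for those extensions, the sketch below computes daily percentage returns and their standard deviation (a simple volatility measure) with pandas. The prices and the symbol AAA are invented; with the real data you would instead load the saved file, for example with pd.read_csv('stocks.csv', index_col=0, parse_dates=True).

```python
import pandas as pd

# Invented closing prices for one placeholder symbol
data = pd.DataFrame(
    {'AAA': [100.0, 110.0, 99.0, 108.9]},
    index=pd.to_datetime(['2023-06-01', '2023-06-02', '2023-06-05', '2023-06-06']),
)

# Daily return: today's close relative to the previous close, minus 1
returns = data.pct_change().dropna()

# Sample standard deviation of daily returns, one value per column
volatility = returns.std()

print(returns)
print(volatility)
```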

By mastering these tools and techniques, you'll be better equipped to make data-driven investment decisions and navigate the complex world of finance with confidence. Happy coding and investing!
