Data Analysis Using Python pandas

In this article, I’ll demonstrate some simple data analysis using pandas, the Python Data Analysis Library.

For the data-set, I’ll use the stock price history for Microsoft Corporation (NASDAQ: MSFT) from Jan. 01, 1991 through Dec. 31, 2016. This data-set will be downloaded from Yahoo! Finance using the pandas_datareader module.

  • Import the requisite libraries.
import pandas as pd
from pandas_datareader import data
from datetime import datetime
%matplotlib inline
  • Define the time-frame for the data.
# Get all available data between Jan. 01, 1991 and Dec. 31, 2016
start, end = datetime(1991, 1, 1), datetime(2016, 12, 31)
  • Download data from Yahoo! Finance into a DataFrame.
msft_history = data.get_data_yahoo('MSFT', start, end).round(2).sort_index(ascending=False)
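
The download above needs network access and a reachable Yahoo! Finance endpoint. For following along offline, a small stand-in frame can be useful. The sketch below builds one from made-up prices; the helper name `make_sample_history`, its parameters, and every value in it are hypothetical, and it includes only the `High` and `Low` columns that the rest of this article relies on.

```python
import numpy as np
import pandas as pd

def make_sample_history(start="2015-01-01", end="2016-12-31", seed=0):
    """Hypothetical stand-in for the downloaded data (made-up prices only)."""
    dates = pd.bdate_range(start, end)          # business days, roughly like trading days
    rng = np.random.default_rng(seed)
    close = 40 + rng.normal(0, 0.5, len(dates)).cumsum()  # random-walk price level
    spread = rng.uniform(0.1, 1.0, len(dates))             # daily high/low distance
    frame = pd.DataFrame(
        {"High": close + spread, "Low": close - spread},
        index=dates,
    )
    frame.index.name = "Date"
    # Same post-processing as the real download: round, then newest rows first
    return frame.round(2).sort_index(ascending=False)

sample_history = make_sample_history()
print(sample_history.head())
```

Because the index is a DatetimeIndex, `sample_history.index.year` is available, so the grouping steps later in the article should work on it unchanged.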

Now that we have the data-set, let’s perform some simple data analysis.

Let’s find the highest and lowest prices for MSFT’s stock for each year in the data-set.

  • Group by year and get the highest stock price for each year.
year_highest = msft_history.groupby(msft_history.index.year)['High'].max()
  • Group by year and get the lowest stock price for each year.
year_lowest = msft_history.groupby(msft_history.index.year)['Low'].min()
  • Combine the two series generated above into a single DataFrame.
annual_highest_lowest = pd.concat([year_highest, year_lowest], axis=1)
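
To make the group-and-combine steps concrete, here is a minimal worked example on four rows of made-up prices (not real MSFT data). The names `toy`, `by_year`, `toy_summary` and `toy_summary_alt` are illustrative only. It also shows `agg` with a column-to-function mapping, which is one alternative way to get the same table in a single call.

```python
import pandas as pd

# Made-up prices for illustration only (not real MSFT data)
toy = pd.DataFrame(
    {"High": [10.5, 12.0, 11.2, 15.1], "Low": [9.8, 11.1, 10.4, 14.0]},
    index=pd.to_datetime(["2015-03-02", "2015-09-01", "2016-02-01", "2016-11-01"]),
)

# Group rows by the calendar year of their date index
by_year = toy.groupby(toy.index.year)

# Highest 'High' and lowest 'Low' per year, placed side by side as columns
toy_summary = pd.concat([by_year["High"].max(), by_year["Low"].min()], axis=1)

# Alternative: one aggregation call with a different function per column
toy_summary_alt = by_year.agg({"High": "max", "Low": "min"})

print(toy_summary)
# 2015 -> High 12.0, Low 9.8
# 2016 -> High 15.1, Low 10.4
```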

Thus, we can see the highest and lowest prices of the stock for each year. For example, in 2013, MSFT’s stock traded for as high as $38.98 and for as low as $26.28.
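
Reading off a single year is a label lookup on the year index. The sketch below uses a small table of made-up values (hypothetical, not the real MSFT figures) with the same layout as the combined DataFrame; the name `summary` is illustrative.

```python
import pandas as pd

# Made-up summary table with the same layout: one row per year
summary = pd.DataFrame(
    {"High": [12.0, 15.1, 14.2], "Low": [9.8, 10.4, 12.9]},
    index=[2014, 2015, 2016],
)

# .loc selects by index label, so the year itself is the key
row = summary.loc[2015]
print(f"2015 high: {row['High']}, low: {row['Low']}")
```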

Let’s see a graphical representation.

annual_highest_lowest.plot(grid=True, figsize=(20,10)).legend(loc=2,prop={'size':16})

