Viz 101 - Line Charts
My Visualizations (Viz) 101 series is intended for people who want to work in analytics or are just starting out. I will provide you with a description of many different types of visualizations, what they are best used for, and a technical guide on how to create them in Excel (because pretty much everyone uses Excel and has a copy available) and Tableau (because it's an amazing BI tool and Tableau Public is free). In each article I will walk you through a scenario with real-world, publicly available data to provide you with a framework of how you could answer similar questions on your own using various techniques and visualizations.
Why should you listen to me? Here are my credentials: I have a master's degree in Mathematical Finance. I've used Excel for over a decade and while I have used Tableau for considerably less time, I am a Tableau Desktop Certified Associate. I've been working for the past four years as an analyst and data architect for an academic medical center. I live and breathe data and how it is handled and presented is something that I am passionate about. Good enough? Great. Let's get going.
Line Charts
Line charts are common visualizations used to view things over time. Sometimes line charts can clearly show an upward or downward trend, or with enough data, a cyclical trend (think retail sales every Christmas season).
Scenario and Data Retrieval
I am interested in investing in a particular company. I want to view the stock's performance over a long period of time to see if the stock price has been steadily increasing or decreasing over the past year and if it has outperformed the market.
The first step is going to be determining where I can get historical stock prices. There are multiple places, but one is Yahoo Finance (finance.yahoo.com). If you know the stock ticker for the company you want to evaluate, then you can replace both "TCKR"s in this URL and go to the stock's history directly (https://finance.yahoo.com/quote/TCKR/history?p=TCKR). If you do not know the ticker, then go to finance.yahoo.com and type in the company name in the Quote Lookup box. I am going to look up Keurig Dr Pepper (KDP) in the Quote Lookup.
This will pull up a summary page for Keurig and from it I can navigate to the Historical Data tab where I can view and export historical prices.
On the Historical Data page I have options to change the date range, view prices, dividends, or splits, and change the frequency from daily to weekly or monthly. For my purposes, I want to look at daily data from the past year, so I do not need to change anything. When satisfied with my selections, I click export to get all the data in a .csv file.
Now, if I want to compare this stock to the "market" there are some options. The most well known market indices are the Dow Jones Industrial Average (^DJI), the S&P 500 (^GSCP), and the NASDAQ (^IXIC). Alternatively, I could try to find and use an index based on consumer staples, food and beverages, or non-alcoholic beverages, but for now I am going to compare to the S&P 500. Using the same method I used to lookup the historical stock prices, I do the same for the S&P 500 and download that data. With both files downloaded, I copy one into the other so I have both tables in one file and save it as an Excel (.xlsx) file which will enable me to create a graph and save it.
In reviewing the data, I see that the values look kind of funky as I was expecting to see dates and values in dollars. The first column is a date column, but the next several are stock prices for the open and close and the min and max for the day. There is also a column for an adjusted close (Adj Close), which is considered a more finalized value than the raw closing price for a stock. I am going to use the adjusted close price as my value for each trading day to trend over time. Because that is the only column for price that I want to use, it is the only column I am going to reformat by clicking on the entire column and selecting the '$' in the Number section of the Home ribbon.
I can quickly do the same for the S&P 500 tab I created and then review the data. If I look closely, I will see that some dates are missing like 1/1/2021, 1/2/2021, and 1/3/2021. Well, those aren't there because the stock market does not trade on weekends or holidays, so in this case some missing dates are expected. However, in other data sets it is always a good idea to check through dates for unexpected gaps. After reviewing the data, I am satisfied that I have what I need.
Line Charts in Excel
In my previous post on bar charts I demonstrated how to create a blank chart in Excel and fill it with data. This process can be simplified a bit by selecting the columns of interest and then telling Excel to create the chart. Excel will use the highlighted data to make a chart for me!
To compare this to the S&P 500, I could make another line chart the same way I did here, but comparing two line charts might not work very well. I would rather have both lines on the same chart. Before I do that, I need to sort the S&P 500 data in descending date order to match the KDP data. After sorting, to add another line, or series, to the line chart I simply right-click on the chart and click on Select Data. If you read my previous article, this will be familiar to you because it's how I added the data to the bar chart. To add another series to the chart, all I need to do is click "Add".
I'm going to name the series S&P 500 and then navigate to the tab where I copied the S&P 500 data to and highlight the appropriate range for the adjusted close prices. Then click OK (twice).
Hm...this chart does not look very informative. Why?
Because one series, the S&P 500, has values way larger than the other series. The way the data is displayed on the graph makes it look like there was not much change at all in the KDP stock price, but there is a clear downward trend for the S&P 500. From this, I can't really say for sure if KDP out performed the S&P 500 or not. One way to make this graph more meaningful is to have two different y-axes (one for each series). This will change the scale for KDP and make the movements of that stock price much more apparent.
To add another y-axis to the chart, right-click on one of the lines representing the series and select "Format Data Series".
The Format Data Series pane will open on the right and should be on the Series Options represented by a bar chart. From there, I can select "Secondary Axis" from the Plot Series On options.
Now I can clearly see that KDP had an upward trend while the S&P 500 had a downward trend. This brings up a good point that sometimes the perspective or scale is just as important as the visualization being used.
Now this was a quick way to make the trends of these lines more easily visible, but in some situations it is not feasible. For instance, what if I add another stock that is around $500? Which y-axis would work best? The multiple y-axes may not work well for more stocks and since they are both dollar values it is not really clear which axis is meant for which series.
Another option is to make some calculations for percent changes in stock prices. Then no matter what the dollar value is for the stocks they will all be on the same scale and comparison between them is straightforward.
领英推荐
To set this up, I will use the first adjusted close stock price as a starting point and compare all other adjusted close prices to that first one. To calculate a percent change use the formula 1 + (new - old) / old. In Excel, this amounts to "=1+(F2 - F2) / F2" for the first closing price, which will amount to 1 or 100% because there is no change. For the next row down I will need the formula "=1+(F3 - F2) / F2". However, it will be very tedious to change that formula for each row. If I copy the formula, Excel will keep the same relative cell references, but it will change all of them. Meaning if I copy and paste this formula down one row, Excel will add one to all row numbers. If I copy and paste it to the right, Excel will changes all the 'F's to 'G's. This is not the behavior I want because I always want to reference the oldest (first) stock price in my data. Fortunately, Excel has something called absolute cell references. An absolute cell reference will not change if it is copied and pasted elsewhere. Excel uses $ to denote absolute cell references. Since I do not want the column or the row to change I need to put a $ before the F and before the 253 which makes my formula "=1+(F2-$F$2)/$F$2". (In this case since the column is not changing, I could leave off the $ in front of the 'F's.) Then I change the format to percent and copy it all the way to the top and enter a column header, Percent Original Stock Price.
Now, I can copy the entire column 'H' and paste it on the tab containing my S&P 500 data as long as I put in the 'H' column on that tab (assuming my other data also starts in column 'A'. Then all I need to do is update my line chart to reference the new column instead of the adjusted close price column. The same way I added the S&P 500 to the chart, I can edit the existing the data series by right click on the chart and clicking "Select Data". While here, I renamed the first series KDP and then simply changed the F to an H in the series reference.
Now the graph does not look particularly different, but the axes have updated.
But since I changed the values to be percentages, I can use the same scale and set both series to use the primary axis. With that change I clearly see that one series has a clear upward trend and the other has a clear downward trend. What a different story from where I started!
Now, for a few adjustments. I don't think the 0 is particularly necessary to convey the divergence in these two series. To remove that, I left click on the chart and the y-axis until there is a box highlighting the axis. Then I right-click on it and select "Format Axis".
Then I can change the minimum bounds for the y-axis to 0.4. This will make 100% centered with an upper bounds of 160% and lower bounds of 40%.
Now I want to make a title. Unlike in my previous article, there is no title currently on the chart. To add a new one, I select the chart, then click on the '+' next to the chart on the right side, and check "Chart Title".
I enter the title "KDP Compared with S&P 500 During the past 12 months," but it is immediately apparent that someone else looking at the chart will not know which line is which. I need a legend. The same way I added the Chart Title, I can check the box for "Legend" and Excel will create one for me. If I click on the chevron next to Legend, I can choose the location for the legend. I like bottom for this chart. Now I clearly see that KDP outperformed the S&P 500 during these 12 months.
Line Charts in Tableau
Now, I want to show how to do the same thing, but in Tableau. If you do not have Tableau Public, you can get it for free here. The only catch to Tableau Public is that if you want to save your work, you must publish it to the public. This means you do not want to use Tableau Public for any proprietary, private, or otherwise restricted data. In all my examples, I will be using publicly available data, so that will not be an issue for this series. Alternatively, you may purchase a Tableau Creator license here (for $840 annually) or use a free, 14-day trial here which will allow you to save workbooks without posting them publicly.
The first step in Tableau is connecting to my data set, which is in .xlsx format (Excel). Conveniently, that's the first option to Connect to under To a File. Simply navigate through the File Explorer that opens as you would to open any other file. You've done this before.
Because there are multiple tabs (sheets) in my Excel file, I will need to drag each sheet I want to include to the "canvas". Upon dragging the second sheet to the canvas, Tableau is going to ask me to make a relationship. In this case, the relationship is the date of the stock prices, so I selected "Date" for both tables and Tableau makes the relationship (I know it worked because it is already previewing data for me. If there were a lot of fields, I have a search field I can use to find the fields I need.
Now, I only really want the date and the calculation I did of the percent original stock price. If I click on Open, hold shift, click on Volume, and right-click then I can select hide to put away all those extra fields. Once I've done that to both of the sheets on my canvas, I'm ready to start making my line chart.
To begin making a visualization I click on the worksheet tab "Sheet 1". If you have not used a Business Intelligence (BI) tool before, it can seem a little strange to be working in an environment where the data is not visible, like in Excel. However, creating the visualization will be much simpler in Tableau. One way to make the line chart is to select Date (either one), and both Percent Original Stock Prices. Then click on "Show Me" and select Dual Line. Done.
OK. Fine. It's not quite done because Tableau creates date hierarchies and defaults to year. But this can be easily changed by clicking on the drop down on the "Year(Date)" pill in Columns and changing it to Day. Now we have a similar issue that we already covered in Excel. This graph has two different y-axes and I want them to be the same, so if I right-click one I will have the option to "Synchronize Axis".
Now both axes are the same (I can hide one if I want) and conveniently, KDP is blue and S&P 500 is orange, just like they were in Excel, but wait a minute. In Excel, the S&P 500 had a clearly downward trend. What's going on?
Well, if you were following along I mentioned sorting the S&P 500 dates in descending order. This was an incorrect step (sorry, tricked ya) as both the KDP and S&P 500 data were sorted the same in ascending order from the download. Excel isn't "smart" enough to know that they were sorted differently, it just references the columns and rows that are provided to it for the graph. On the other hand, Tableau reordered the data based on the common field in the relationship that I defined. This is just one example of how Tableau can help prevent mistakes when making visualizations.
Summary
You now have a taste for retrieving online data to answer a question and how to visualize it in Excel and Tableau using a line chart. Here are three things to keep in mind with line charts:
I hope you found this helpful and if you did please consider liking or sharing the article. Stay tuned for future articles in this series.