Viz 101 - Pie Charts
My Visualizations (Viz) 101 series is intended for people who want to work in analytics or are just starting out. I will provide you with a description of many different types of visualizations, what they are best used for, and a technical guide on how to create them in Excel (because pretty much everyone uses Excel and has a copy available) and Tableau (because it's an amazing BI tool and Tableau Public is free). In each article I will walk you through a scenario with real-world, publicly available data to provide you with a framework of how you could answer similar questions on your own using various techniques and visualizations.
Why should you listen to me? Here are my credentials: I have a master's degree in Mathematical Finance. I've used Excel for over a decade and while I have used Tableau for considerably less time, I am a Tableau Desktop Certified Associate. I've been working for the past four years as an analyst and data architect for an academic medical center. I live and breathe data and how it is handled and presented is something that I am passionate about. Good enough? Great. Let's get going.
Pie Charts
Pie charts are great, but often misused, visualization for the parts of a whole. They are great for showing general comparisons of size between large (meaning few) categories such as percent of business revenue.
Scenario and Data Retrieval
I am interested in learning about the types of U.S. small businesses that are minority, woman-owned. The Small Business Administration (SBA) has a database of all small businesses that I can use to obtain this information. There are lots of options for filtering the SBA's data, but I want to leave it fairly open to make sure I get all of the minority, woman-owned small businesses. First, I will select active certifications for woman-owned.
Then any minority-owned.
I want to get all applicable records, so I need to change how many records will be returned from the default 25 to "Show All".
Now, to be able to categorize the businesses, I need to change what fields are exported. By default, only Name and Trade Name of Firm, Contact, Address and City, State Zip, and Capabilities Narrative.
By clicking on "Edit the columns to be displayed," I can browse the 60+ fields the SBA has available. For this, I really only need two. The name of the business and the type of business it is, so I remove all the default fields, add the two I want, and click save.
My export is almost ready, but first I need to select the format I want. In this case, I will take .xls (Excel).
Now all that is left is to click "Search using these criteria" at the bottom of the page and wait a few moments to get the export.
Pie Charts in Excel
In my previous posts on column charts and line charts I used the raw data in the visualizations, but in this case, I will have to aggregate the data first. Aggregating data is a way to describe the whole in a single value such as a count, sum, average, minimum, or maximum. In this case, I will utilize a count to determine how many businesses fall into each type.
Upon looking at the data (something you should always do), two things jump out to me. Some rows have a blank business type while others have more than one. I was hoping for a complete field, but sometimes you just have to work with what you have available.
Because of some businesses having multiple types, there are 11 different types (as opposed to the four options on the SBA website that I was expecting). Having that many types may make the visual too busy and hard to interpret, so I am going to write a calculation to simplify things. For a blank business type, I want to change that to "Unavailable". For any business with more than one business type, I want to change that to "Multiple Types". For everything else, I will just copy the business type over. A formula to achieve this could be like =IF(ISBLANK(C2),"Unavailable",IF(ISNUMBER(SEARCH(",",C2)),"Multiple Types",C2)).
Now, I need to perform the aggregation. There are multiple ways to do this in Excel, but probably the easiest is to use a pivot table. All I need to do is highlight my data, navigate to the Insert tab, and click on PivotTable.
On the right side of the screen my newly created pivot table has fields listed. All I need to do is drag my custom made Business Type field to Rows and then Name of Firm to Values. This will automatically default to a count, which is what I want anyway.
Now my aggregation is complete, and I am ready to visualize this data! Pivot tables have a built-in feature to make using the data displayed in pivot tables in graphs easy. Still on the Insert tab, I can navigate to the Charts section of the toolbar and click on PivotChart. This will open the Insert Chart dialog window. Once there, I can pick the type of chart I want to use, which in this case is a Pie Chart. Then click OK.
Now I have a pie chart. I can easily see that roughly half of the businesses have no type assigned, but the other types are a little harder to interpret, especially Manufacturing and R and D. To help with that, I can add a data label to the graph. If I select the graph, a button appears at the top right with a + symbol in it. I can use this to add any number of chart elements, but I will just add the data labels.
That helps, but I think a percent of the total would be meet my need than the count. All I need to do is modify the calculation on my pivot table. With my pivot table selected, I navigate back to the Values section where I put the Name of Firm field. Upon clicking the dropdown next to Name of Firm, I can select Value Field Settings.
From the Value Field Settings dialog window, I navigate to the Show Values As tab and in the drop down for Show Values as, I select % of Grand Total.
Now my pie chart is showing percentages.
领英推荐
However, the data labels for Construction and Manufacturing are overlapping. I could move the data labels around, but I don't really need the default precision on the percentages. To reduce the number of decimals shown, I simply highlight my values on my pivot table and click the Decrease Decimal button twice.
With that change, my data labels are no longer overlapping. Of course, I should at least update my title to be more meaningful. I also like to right-click on the buttons and hide them, so that my final pie chart looks like this.
Pie Charts in Tableau
Now, I want to show how to do the same thing, but in Tableau. If you do not have Tableau Public, you can get it for free here. The only catch to Tableau Public is that if you want to save your work, you must publish it to the public. This means you do not want to use Tableau Public for any proprietary, private, or otherwise restricted data. In all my examples, I will be using publicly available data, so that will not be an issue for this series. Alternatively, you may purchase a Tableau Creator license here (for $840 annually) or use a free, 14-day trial here which will allow you to save workbooks without posting them publicly.
The first step in Tableau is connecting to my data set, which is in .xls format (Excel). Conveniently, that's the first option to Connect to under To a File. Simply navigate through the File Explorer that opens as you would to open any other file. You've done this before, and so have I.
Because the SBA is exporting this file as a very old version of Excel, when I try to import the file as-is to Tableau, I receive an error. To get around this, I simply open the file in Excel and save as .xlsx. Now that I have successfully imported the data into Tableau, in the data preview I can easily spot the same issues I saw when viewing the data in Excel.
To create a calculated field to consolidate the business types like I did in Excel, I simply navigate to a new worksheet, the select Analysis, and Create Calculated Field.
In the dialog box that pops up, I name the new field and enter the calculation
IF ISNULL([Business Type (Service, Construction, etc)])
THEN "Unavailable"
ELSEIF CONTAINS([Business Type (Service, Construction, etc)],",")?
THEN "Multiple Types"
ELSE [Business Type (Service, Construction, etc)]
END
and click OK.
Now, unlike Excel, Tableau automatically creates a count for me, so I can go straight to making the pie chart, which I can do in four clicks. All I have to do is select the dimension and measure I want (hold CTRL), then use the Show Me drop down to view available chart types and select the pie chart.
If I want to add a data label like I had in Excel, I just need to drag the count field to the Label card and then from the drop down select Quick Table Calculation and Percent of Total to go ahead and change it to the percent.
Now, I have some of the same problem I had in Excel where I can't see each data label. By default, Tableau prevents overlapping, but another issue is how small this graph is, so I'm going to start by just making it bigger. One way to do this is by changing the view setting to fit Entire View.
That helped, but there is still one data label being hidden.
I'll try reducing the number of decimals being displayed as I did in Excel. To do that, I will need to open the format for the label and then change the Numbers format to Percentage and reduce the decimal places to zero (0).
That seems to have done the trick, and of course, I update the title to be more descriptive.
Summary
You now have a taste for retrieving online data to answer a question and how to visualize it in Excel and Tableau using a pie chart. Here are three things to keep in mind with pie charts:
I hope you found this helpful and if you did please consider liking or sharing the article. Stay tuned for future articles in this series.
A seasoned healthcare professional with extensive years of experience in both the academic and non-academic environment
3 年Hi Andrew Thanks so much for sharing such informative information!!!