Boxplots

Boxplots

The newsletter is moved to substack.

Subscribe here: Newsletter new location

Table of contents:

  1. Why Boxplots?
  2. The Anatomy of a Boxplot
  3. Creating a boxplot in Excel
  4. Boxplots in action

Hello Rockets,

Everything I write here Is Not Rocket Science and today, we're going to dive into the world of Boxplots, an essential tool in the arsenal of anyone wanting to read data properly. But don't worry, we're not delving into complex coding today. Instead, we'll explore how to create these insightful charts right in Excel. Boxplots, also known as box-and-whisker plots, are a brilliant way to visually summarize the distribution of a dataset.

1. Why Boxplots?

Boxplots offer a compact and efficient way of displaying data distributions.

They're incredibly useful for understanding how much the data differs, and skewness (if it leans more towards one side) of the data, as well as identifying outliers. This is particularly handy when we want to compare how different groups vary.

2. The Anatomy of a Boxplot

A boxplot consists of a rectangular box and two 'whiskers'. Here's how they look:

Source:
Source: giphy.com

Ok, what have strawberries to do with this? Nevermind, let’s focus on the image schema below. It’s way better.

Source:

  • The box spans from the 1st quartile (Q1) to the 3rdquartile (Q3) of the dataset, representing the interquartile range (IQR), don’t worry about IQR now.
  • The line inside the box shows the median of the data (2nd Quartile).
  • The 'whiskers' extend from the box to the smallest and largest values within 1.5 * IQR from the Q1 and Q3, respectively.
  • Data points outside the whiskers are considered outlier. Easy.

3. Creating a Boxplot in Excel

Creating a boxplot in Excel is straightforward. Follow these steps:

  1. Prepare Your Data: Organize your data in a single column. If you're comparing groups, make sure each group's data is in a separate column.
  2. Insert a Boxplot: Go to the 'Insert' tab, click on 'Insert Statistic Chart', and choose 'Box and Whisker'.
  3. Customize Your Boxplot: Excel will create a basic boxplot, which you can customize. Right-click on the chart elements to modify the box color, add titles, and adjust the axis scales.

Customizing Boxplots in Excel (Excel is notorious for generating ugly visuals by default)

However, Excel allows a fair amount of customization for boxplots:

  • Change Color and Style: Click on the boxplot to bring up the 'Format Data Series' pane, where you can change colors and styles.
  • Adjust Whisker Length: Right-click the boxplot, choose 'Format Data Series', and adjust the 'Quartile calculation' or 'Show outlier points' options.
  • Labeling Outliers: You can manually add labels to outliers by inserting text boxes or using data labels.

4. Boxplots in Action

Let’s take a real example:

We are analyzing prices of all products in the category of specialized skincare from Amazon.de

There is a total a couple thousand products with prices varying from 1€ to 170€+.

Here’s how the output looks (I used Tableau instead of Excel as it is my weapon of choice but I know not everyone has it. Similar results can be obtained with Excel):

Source: Author’s Own Work. Software: Tableau v.2020.4. Timespan of analysis undisclosed for privacy reasons.

What insights can we extract? Quite a few. Here are some:

  1. 50% of the products of Eucerin are priced in between 18.8€ and 32.6€. Between 1st and 3rd quartile. Also called Inter Quartile Range (IQR).
  2. In median, the prices of La Roche Posay are 43.66% higher than CeraVe. The median price of La Roche Posay products is 32.9€ vs. 22.9€ of CeraVe.
  3. 75% of products of Cetaphil have a price tag lower than 34.6€. That is what the 3rd Quartile shows us. Easy. I knew it.
  4. The most expensive product is from La Roche Posay (the highest orange bubble). It is an outlier also.

Wrapping Up

There you have it! A simple guide to creating Boxplots in Excel and how to leverage that to do pricing analysis. Remember, visualizing data is key to understanding it, and tools like Excel make this task accessible to everyone, regardless of your coding ability.

We’ve looked on what boxplots are, when we can use them and how they’re created. I hope it is a bit clearer now.

Until next time, keep learning. ??

要查看或添加评论,请登录

社区洞察

其他会员也浏览了