Beyond Bar Charts and Pie Charts
As the saying goes, "a picture is worth a thousand words."
But drawing a good picture can be harder than telling a good story. And when the story is about data, drawing a good chart can prove to be quite a challenge. Charts are everywhere, yet most are poorly designed, when not flat-out wrong. And while we are taught to draw charts as early as elementary school, we are rarely told how to design them well.
This sorry state of affairs is regularly chronicled in articles denouncing the evils of prevalent yet poorly understood charts like pie charts. In these diatribes, we are told to use bar charts instead, often without a proper account of the unique qualities of pie charts.
In this article, I will outline the main benefits and limitations of both charts.
Then, I will introduce a new kind of chart that combines the best of both.
To support my argument, I will use a very simple dataset: a count of functions by module found in the STOIC Intelligent Spreadsheet. If we plot this dataset on a bar chart, we get something like this:
This bar chart (also known as a frequency chart) is really good at one thing: it makes it easy to compare multiple values. For example, even without looking at the counts displayed at the top of the chart, we can see that Statistics and Probabilities have roughly the same number of functions. Or that Date & Time and Trigonometry have exactly the same number of functions, while Math has fewer of them.
These comparisons are made possible (and easy) because all bars share a common baseline (the horizontal axis), and because the human brain is really good at making such comparisons whenever a common baseline is available (Cf. Graphical Perception).
What this chart is not good at is helping us evaluate the contribution of each value to the whole. For example, how many functions are there in total, and what fraction of this total can be found in the Math module? Or what is the smallest number of modules that would give us at least half of all functions? In other words, a bar chart is great for comparing absolute values, but ineffective for comparing relative values.
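To make these two questions concrete, here is a minimal Python sketch. The counts are hypothetical placeholders chosen for illustration, not the actual figures from the dataset, and the function names `shares` and `smallest_group_reaching` are made up for this example.

```python
# Hypothetical counts per module (placeholders, NOT the article's real data).
counts = {
    "Statistics": 80,
    "Probabilities": 78,
    "Text": 40,
    "Date & Time": 25,
    "Trigonometry": 25,
    "Math": 20,
}


def shares(counts):
    """Return each category's fraction of the grand total."""
    total = sum(counts.values())
    return {name: value / total for name, value in counts.items()}


def smallest_group_reaching(counts, fraction=0.5):
    """Return the fewest categories whose sum reaches `fraction` of the total.

    Taking categories from largest to smallest reaches the threshold
    with the smallest possible number of categories.
    """
    total = sum(counts.values())
    running = 0
    chosen = []
    for name, value in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(name)
        running += value
        if running >= fraction * total:
            break
    return chosen


print(smallest_group_reaching(counts))  # ['Statistics', 'Probabilities']
```

Both answers depend on the grand total, which is exactly the quantity a plain bar chart never shows.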
This is why the pie chart was invented.
Here, we have a half-donut chart (also known as arc chart), because it looks a lot better than a pie chart. In fact, it is not only better looking, it is also more accurate from a theoretical standpoint, but we will leave that for future articles (read this if you cannot wait).
Thanks to the 6 inner ticks that visualize the 0%, 20%, 40%, 60%, 80%, and 100% thresholds, we can easily guess that Math accounts for about 5% of all functions, and that Statistics, Probabilities, and Text almost account for half of all functions, but not quite.
What this chart is not good at is helping us compare values with each other: for example, we can see that Date & Time, Trigonometry, and Math have roughly the same number of functions, but we cannot tell whether these numbers are exactly the same. In other words, a pie chart is great for comparing relative values, but ineffective for comparing absolute values.
Is there a chart that could do both? As you have guessed already, there is one, but before we introduce it, we should present a rarely-used alternative to the pie chart, and before we even do that, we should present a more common alternative: the stacked bar chart.
Clearly, the stacked bar chart is nothing more than the Cartesian equivalent of a donut chart. In other words, while the donut chart uses a polar coordinate system, the stacked bar chart uses a Cartesian coordinate system, but the visualized information is exactly the same.
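The equivalence can be sketched in a few lines of Python: both charts are driven by the same cumulative fractions, and only the final mapping (to a height or to an angle) differs. The function names, the unit bar height, and the 180° sweep for a half-donut are assumptions made for this illustration.

```python
from itertools import accumulate


def cumulative_fractions(values):
    """Return (start, end) fractions of the total for each value, in order."""
    total = sum(values)
    ends = [c / total for c in accumulate(values)]
    starts = [0.0] + ends[:-1]
    return list(zip(starts, ends))


def stacked_bar_segments(values, height=1.0):
    """Cartesian mapping: each fraction becomes a vertical extent."""
    return [(s * height, e * height) for s, e in cumulative_fractions(values)]


def donut_arcs(values, sweep_degrees=180.0):
    """Polar mapping: each fraction becomes an angular extent.

    A sweep of 180 degrees corresponds to a half-donut, 360 to a full one.
    """
    return [(s * sweep_degrees, e * sweep_degrees)
            for s, e in cumulative_fractions(values)]
```

Given the same input, `stacked_bar_segments` and `donut_arcs` return the same segments up to a constant scale factor, which is the sense in which the two charts carry the same information.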
Unfortunately, the stacked bar chart works well only with a limited number of values, and starts deteriorating once values become small, because labels and values become difficult to display. In the previous example, we had to use smaller font sizes to address the issue.
Alternatively, we could offset labels and values with connectors, like we did earlier on the half-donut chart, but these would make the chart more difficult to read. Another option would be to display the stack horizontally instead of vertically, but this would work only with short labels, and some natural languages like German tend to have long ones.
Bottom line: the stacked bar chart is not an attractive option for the dataset at hand. But a less common yet more effective alternative might come to mind: the waterfall chart.
A waterfall chart is constructed from a vertical stacked bar chart by spreading its segments out along the horizontal axis, so that each bar starts at the height where the previous one ends. To make the construction even more explicit, an All sum bar can be added to the left or to the right. When added to the left, the bars follow a decreasing curve, and we get something that actually looks like a waterfall. When added to the right (as in the example above), the bars follow an increasing curve.
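As a rough sketch of this construction, the vertical extent of each waterfall bar can be derived from running sums. The function name `waterfall_bars` and the sample values are hypothetical.

```python
from itertools import accumulate


def waterfall_bars(values):
    """Return the (bottom, top) extent of each bar and of the All sum bar.

    Each bar starts where the previous one ends, so the first bar
    starts at zero and the last bar ends at the grand total.
    """
    tops = list(accumulate(values))
    bottoms = [0] + tops[:-1]
    bars = list(zip(bottoms, tops))
    total_bar = (0, tops[-1] if tops else 0)
    return bars, total_bar


bars, total_bar = waterfall_bars([80, 78, 40])  # illustrative values only
print(bars)       # [(0, 80), (80, 158), (158, 198)]
print(total_bar)  # (0, 198)
```

Passing the values in the given order and drawing the All bar last yields the increasing variant; drawing the All bar first and walking down from the total yields the decreasing one.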
This chart is almost as effective as the pie chart for comparing relative values. Unfortunately, because its bars are much shorter than the ones drawn on a bar chart and do not share a common baseline, it makes comparisons between different values really difficult. For example, we can see that Date & Time, Trigonometry, and Math have roughly the same number of functions, but we cannot tell whether these numbers are exactly the same.
By design, the waterfall chart has the exact same benefits and limitations as the pie chart: it is great for comparing relative values, but ineffective for comparing absolute values. Nevertheless, it is a necessary precursor to the chart that will combine the benefits offered by a bar chart and a pie chart.
Meet the K chart:
The K chart is designed by combining a conventional waterfall chart with what is called a level chart. The latter looks like a bar chart, but uses solid ticks instead of solid bars for displaying values. This design decision allows both charts to be combined without creating too many visual collisions between the bars of the waterfall chart and the ticks of the level chart. On this chart, the left vertical axis visualizes counts of functions per module, while the right vertical axis visualizes sums of counts of functions. In other words, both axes are congruent (they both visualize counts of functions), but use two different scales.
There are many ways to design this chart, but the most effective is by sorting horizontal categories (modules on this dataset) by decreasing order of values (numbers of functions), and by displaying the All bar on the right, thereby drawing the bars of the waterfall chart alongside an increasing curve. This configuration gives the chart its distinctive K shape, which in turn was used to give it a name — and my youngest daughter is named Kaia...
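The configuration described above can be sketched as a small layout computation, independent of any plotting library. This is only an illustration under stated assumptions: the function name `k_chart_layout` is hypothetical, and I assume the left scale runs from zero to the largest value while the right scale runs from zero to the sum, with positions expressed as fractions of the plot height.

```python
def k_chart_layout(counts):
    """Compute tick and bar positions for a K chart, as fractions of plot height.

    Assumptions (not taken from the article): the left axis spans
    0..largest value, and the right axis spans 0..sum of values.
    """
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(value for _, value in ordered)
    if total <= 0:
        raise ValueError("a K chart needs at least one positive value")
    largest = ordered[0][1]

    layout = []
    running = 0
    for name, value in ordered:
        layout.append({
            "category": name,
            # Level chart tick, read against the left axis.
            "tick": value / largest,
            # Waterfall bar, read against the right axis.
            "bar_bottom": running / total,
            "bar_top": (running + value) / total,
        })
        running += value
    return layout, total
```

With categories sorted by decreasing value, the ticks descend from the top left while the bars climb toward the All bar on the right, which is what produces the K shape.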
In most cases, a K chart will work well with a single color, unlike the pie or donut charts, which require multiple colors or tones. Nevertheless, the use of colors on the K chart should not be totally discouraged, for it can serve an important purpose: when the horizontal dimension is nominal (as is the case with our sample dataset), the use of colors can help communicate this fact and distinguish the chart from one that would be produced for an epochal dimension (like a date for example). This is especially important when one tries to design a chart that can be interpreted within a split second, without having to read the titles of axes (our sample charts do not show any, on purpose).
Furthermore, colors (or tones) can play a critical role in the K chart: they help communicate the fact that individual ticks on the level chart correspond to individual bars on the waterfall chart. When using a single color or tone, this relationship is weakened, and the chart becomes a little bit more difficult to interpret. To make a long story short: colors are not bad on their own, it is their careless application that should be avoided. But with the proposed design, the application of colors is very intentional.
All design options are presented in this follow-on article: Variations on the K chart.
By design, the K chart uses the exact same amount of real estate as the bar chart and waterfall chart that it is made of, but conveys a lot more information than each chart can offer individually. Therefore, it is a great candidate to replace them in most instances.
The K chart also offers some unique benefits over the pie chart:
- The display of labels and values is greatly simplified and much more space efficient.
- The horizontal axis can be epochal (temporal with an epoch, like dates).
- The layout can be rotated by 90° (in order to support a portrait output for example).
- There is a clear starting point for reading (left or right depending on language).
- There is a natural place to display the sum of values.
- There is a natural place to display deltas or rates of growth.
- The use of colors is perfectly optional (great for accessibility).
- The horizontal dimension can never be confused for a directional variable.
This leaves the venerable pie charts with only two unquestionable benefits:
- Pie charts are preferable whenever the visualized dimension is a directional variable.
- Humans love round figures (Cf. Why humans love pie charts, Manuel Lima, 2018).
The K chart also offers one major benefit over the bar chart: you cannot use a vertical baseline other than zero. Therefore, you cannot cheat with a K chart like you could with a bar chart. The K chart is a faithful chart. This fact was first observed by my friend Albert L.
The closest known alternative to the K chart is the Pareto chart. Unfortunately, this chart makes use of a line chart to visualize a sum of values for a discrete variable, which is in direct violation of some principles defined by Principia Pictura. Furthermore, the Pareto chart makes it much more difficult to visualize the contribution of each category to the sum. Therefore, the K chart is considered to be a more desirable modern alternative.
The K chart is part of a broad family of charts that I call Univariate Combo Charts. Some of these charts have been drawn for a long time, but they have yet to be studied in a systematic manner by the community of statisticians. We hope to explore some of them through our weekly series of articles on data visualization.
Finally, when evaluating the K chart, one should keep in mind the following:
- A bar chart is always simpler than a K chart, yet conveys less information.
- A pie chart is always simpler than a K chart, yet conveys less information.
- A K chart should be compared to a Pareto chart, not to a single bar chart or pie chart.
References
- Graphical Perception (William S. Cleveland, Robert McGill, 1984)
- Principia Data — Unified Typology of Statistical Variables (Ismaël Ghalimi, 2017)
- Principia Pictura — Unified Grammar of Charts (Ismaël Ghalimi, 2017)
- Revising the Pareto Chart (Leland Wilkinson, 2006)
- Why humans love pie charts (Manuel Lima, 2018)
Senior Product Manager, Analytics @ Piano
4 years ago: Thanks for sharing such insights with the community Ismael. I'm no expert in DataViz but quite keen on this topic and I love to read new ideas. The "K chart" looks quite promising, but I think it will be hard (at least for me) to find ways to use it. Even if the insights are quite different from the ones brought by bubble charts, both visualizations have the same issue: they need a perfect dataset otherwise they have no meaning at all. I'll keep your approach in mind and will try to find good examples that I could use when working on my product analytics.
Sr. BI Developer | Data Analytics and Visualization | Tableau/SQL | Tableau Certified
4 years ago: Ismael Chang Ghalimi - This is very interesting in that it captures a great deal of information into a single view. Your article navigated through the structuring of the visual extremely well. I would be interested to see a version where the grid lines were removed. I look forward to learning how this develops going forward. Thank you for sharing.
Principal Consultant
4 years ago: My take on the K-Chart in Tableau. Less ink and different widths for the bars.
CEO @ STOIC
4 years ago: A friend of mine (Albert L.) made a really good observation about the K chart: by construction, you cannot have a vertical baseline other than 0. Therefore, you cannot cheat with a K chart like you could with a bar chart. In my opinion, this is a major benefit that should be considered very seriously.
CEO @ STOIC
4 years ago: Here is a variation on the K chart courtesy of Bruce Gabrielle from Speaking PowerPoint, produced using Excel. As far as I can tell, he is the second person to have ever produced a K chart, and I really like the twists he gave to our original design. Awesome work Bruce!