Data Visualisation: A Four-Step Process for Success

Did you know that there are about 80 billion neurons in the brain, and 35 billion of them are dedicated to visual processing?

I haven’t counted them myself, I’ll admit. That’s what the research says, though.

In fact, some scientists even believe that our other senses are losing out as our brains try to give more space to our visual powers. There are only so many neurons to go around, after all. They reckon smell will be the biggest loser.

We all know the power of images. That’s why we are inundated with visual stimuli every day from every company on every channel. I’m sure a few images you’ve seen this week will stick with you.

It’s tricky to get one’s point across within all that clutter. And images can confuse, as well as illuminate.

There is therefore an art and a science to data visualisation, neither of which I have mastered.

It is a fascinating topic, however, and I hope you can learn from my stumbles.

We can also become better “consumers” of data by understanding some of the key principles we’ll discuss.

This is part two of the hi, tech. data series. You can check out Part I: Inspiration if you like.

In this edition:

  • The data visualisation process
  • The right story
  • Selecting the visuals
  • Simplify
  • Annotate
  • Further inspiration
  • And lots of dos/do-not-dos along the way

The Process

The process, such as it is, runs a bit like this:

1. The right data for the right story

Yes, yes, you have heard a million times that people are “wired for story” or whatever.

So I’ll keep this part short.

In the 2021 Data Visualisation: State of the Industry survey, the sixth most popular tool was “pen and paper”. There are hundreds of relevant tools out there, and data people still use pen and paper all the time.

I hear quite often that people have loads of data and no idea where to start. This is typically because they are staring at confusing dashboards, rather than coming up with questions to ask of the data.

Sketching out your “storyboard” will help you work out the kinks in your ideas, then figure out which data points will get the ideas across. You’d never know it, but I always sketch out the charts I want to make before I start.

There are different ways into this, depending on where you are with the data at your disposal.

I find that there are two steps:

  1. Working with data to find things out.
  2. Visualising the data to persuade others of what you found.

This edition will focus more on the second one, but we’ll come back to number 1 in a future edition.

So you’ll maybe have an idea of what you need to convey, meaning you can look for the data to substantiate the argument.

Or, you may be in an exploratory mindset where you’ll figure things out within the data set as you go. The story can be framed as a series of questions, with each data-based answer moving the argument along.

Then consider the context and what you want the audience to believe/think/feel by the end.

For practice, these are great data resources:

And the publications I shared in the first data special are excellent for a bit of inspiration.

2. Selecting a visual

A lot of tools, including Excel, Tableau, and Sheets, will auto-suggest charts based on your data. As we’ll see, they don’t always get it right.

There are some general rules that will help you select a chart. This is a classic resource to come back to when you are stuck:

[Image: chart selection guide]

That may be tricky to read depending on your screen size, but the hi-res version is here.

These are helpful guides and they should keep you broadly on track.

Yet we can also think of these options in terms of precision.

  • How precisely do we need to make our point?
  • Which comparison are we really trying to make?
  • Are we portraying a sweeping trend or a specific causal relationship?

We should also consider the audience:

  • Will they spend time interacting with the visuals?
  • Do we need key findings to jump out at them?

Answering those questions can direct us towards the right chart format.

At the more precise end of the scale, we can easily digest a straight line between two points. From an evolutionary perspective, those are the kinds of figures we have adapted to identify out in the wild.

Take this example from The Economist:

[Image: The Economist chart of men’s and women’s tennis prize money by player ranking]

It has a clear point to make: The gap between men’s and women’s prize money increases as player rankings increase.

(I say “increase” in the tennis sense, with 1 being ‘higher’ than 2.)

The chart uses a log scale on the x-axis to separate out the higher rankings, grouping the lower order to convey only a trend.
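To see why a log scale separates out the top rankings, here is a minimal sketch in Python. It only works out where each rank would sit along a linear axis versus a log axis. The ranks are illustrative and are not The Economist’s data, and the function names are my own.

```python
import math

# Illustrative ranks only; these are not taken from The Economist's chart.
ranks = [1, 2, 5, 10, 50, 100]
max_rank = max(ranks)


def linear_position(rank):
    """Fraction of the axis length at which a rank sits on a linear scale."""
    return (rank - 1) / (max_rank - 1)


def log_position(rank):
    """Fraction of the axis length at which a rank sits on a log10 scale."""
    return math.log10(rank) / math.log10(max_rank)


# Map each rank to its (linear, log) position along the axis.
positions = {r: (linear_position(r), log_position(r)) for r in ranks}

for rank, (lin, log) in positions.items():
    print(f"rank {rank:>3}: linear {lin:.2f}, log {log:.2f}")
```

On the linear axis, ranks 1 to 10 are squeezed into roughly the first tenth of the space. On the log axis they take up half of it, which leaves room to see each of the top players individually.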

The eye is led to the more significant figures in the top right, just as the ukiyo-e artists led it in their famous paintings:

[Image: ukiyo-e print]

It’s a well-known artistic trope. The ukiyo-e prints influenced Monet and you can see it in his painting The Thames Below Westminster.

At the higher end of The Economist’s scale, we see the obvious disparity between Djokovic and Barty, but the data points for players in the rest of the top 10 are also visible. This provides essential context to support their argument.

Looking only at the two number 1 players, one could counter that Djokovic won 3 majors in 2021 (and finished 2nd in the other), while Barty won 1 major. Yet that would still not explain the size of the gap — or the similar gaps that keep appearing in the subsequent positions.

For less precision, we can use colour to make a general point. Maps are usually great for this purpose.

The Economist is putting some fantastic work out at the moment, so I’ll use another of theirs as an example.

The image below visualises the most streamed song on Spotify each week, split by language.

The user can see quickly that Spanish-language songs dominate in Spanish-language countries. The publisher has grouped the countries to help the user navigate a vast dataset, too.

[Image: The Economist visualisation of the most streamed song on Spotify each week, by language and country]

You can then hover over specific data points if you want more detail. These visualisations both manage to combine precise detail with wider trends, but they make it look easier than it is.

A warning

I’ll stop off a few times to share some common errors we (and when I say ‘we’, I mean ‘I’) make quite often.

Google Sheets is fond of auto-creating stacked bar charts and I have used them way more than I should have.

(I am reliably informed that this free plug-in will let you create a wider range of alternatives in Google Sheets. They have an Excel version here too.)

Below, I have completely made up some data on the penguin populations at zoos in some European countries.

I know we have Humboldts at London Zoo because here I am taking pictures of them in the rain in 2020:

[Image: photo of Humboldt penguins at London Zoo]

The rest of the figures, I have made up entirely to make a #dataviz point.

You can see straight away that some countries have more penguins than others from the heights of the bars. That is useful for making a broader point, of course.

[Image: stacked bar chart of made-up penguin populations by country and species]

But if you wanted to compare the Gentoo and Humboldt populations of France and Sweden, it would be much harder. We also need to keep looking up at the legend to see which colours represent the species.

We need a flat baseline to make these comparisons. Stacked charts make this very challenging, with the exception of the bottom section.
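A rough sketch of why that is, in Python. The counts below are invented for illustration and are not the figures in my chart. The stacking order and the third species are assumptions too. In a stacked bar, each segment starts where the one below it ends, so only the bottom segment starts at zero in every bar.

```python
# Hypothetical penguin counts per country, listed bottom-to-top as stacked.
# These numbers only illustrate the baseline problem.
stacks = {
    "France": [("Humboldt", 40), ("Gentoo", 25), ("King", 10)],
    "Sweden": [("Humboldt", 15), ("Gentoo", 30), ("King", 20)],
}


def segment_baselines(segments):
    """Return {species: starting height} for one stacked bar."""
    baselines = {}
    running_total = 0
    for species, count in segments:
        baselines[species] = running_total  # segment starts on top of those below
        running_total += count
    return baselines


baselines = {country: segment_baselines(segs) for country, segs in stacks.items()}

for country, starts in baselines.items():
    print(country, starts)
```

Both Humboldt segments start at 0, so their heights can be compared directly. The Gentoo segments start at 40 in one bar and 15 in the other, so the reader has to do mental subtraction before comparing them.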

Here we can see just the Humboldt numbers:

[Image: bar chart showing only the Humboldt numbers by country]

There is a lot more to be said about the use of colours in data visualisations and lord knows I’m not the hombre to say it. I am very colourblind indeed.

Instead, try this excellent post about how to use fewer colours for more impact:

https://blog.datawrapper.de/10-ways-to-use-fewer-colors-in-your-data-visualizations/

And if you like to see people improving data visualisations, Makeover Monday is the place to go:

https://www.makeovermonday.co.uk/

Every week, data people from around the world work on a data set to try and find the best way to display the key findings. It’s more fun than it sounds.

3. Remove the clutter

Look, if you’re coming to me for advice on keeping things simple, you might as well ask a labrador to teach you to play the piano. We’re very cute and we’ve got lots of enthusiasm, but it’s just not our specialty.

[Image: a dog at a piano]

At least the dog knows where the keys are and likewise, I do know that minimalism is more effective for conveying a message. I just don’t know how to do it.

So let’s look at what the experts do instead.

In fact, let’s go one better. Let’s look at how the experts improved on a visualisation I made by simplifying it.

A few weeks ago, I shared some charts about the Beijing Winter Olympics, including one I made to show the male/female split of athletes at the games since 1924. I have added a section of it here for reference:

[Image: section of my pie charts showing the male/female split of Winter Olympics athletes]

If we go back to our chart selector at step 2, it would recommend pie charts. We want to show composition and we want to show a share of the total.

I added a bigger legend at the top and labelled the first female pie segment to draw attention to the key point I wanted to make. (This is an underappreciated step of the process — I typically export charts to Canva and add my own legends/labels there.)

Yet I could/should have refined this further and this week The Economist (who else?) made a more effective version:

[Image: The Economist line chart of the male/female split of Winter Olympics athletes]

They took the same data (who knew they had access to Wikipedia over there?) and used lines instead of pies. This makes the trend even clearer, because what I really wanted to show was the slow-but-sure progression towards a 50/50 split.

Their chart looks very simple but the key lies in the thought process behind it. Too often, we (I) also want to show how much work I have put in. Of course, that’s not what really matters.

Labelling the lines is also a simple, effective practice to improve charts.

But watch out!

Well, you might now be thinking that lines are always the answer. That is often true, but not always.

I put some made-up sales data into the Datawrapper tool to show why.

Let’s imagine the data is for a B2B software company assessing monthly sales over the past two years. Yes, I could have imagined anything in the whole world and that’s what I’m going with. You take the B2B data with the penguin populations around here.

Like Excel, Datawrapper (we’ll cover it in more detail in a future edition) suggests a visualisation for the data you enter. It automatically plotted the chart below and it looks pretty sensible at first glance.

Take a quick look for yourself and judge: in which months are the greatest differences between 2021 and 2020 sales?

[Image: line chart of made-up monthly sales, 2020 vs 2021]

We note deviations quite rapidly.

That gap between the March figures sticks out because 2020’s data has decreased from Feb to Mar, while 2021’s sales increased in the same period.

That causes us to overestimate the difference.

October 2021 stands out for similar reasons.

What about May and July? Could you tell the difference between those two lines, at those points?

If I plot just the difference between the years, you can see that those months do actually report the largest sales increases:

[Image: bar chart of the monthly difference between 2021 and 2020 sales]

Why do we miss this?

Well, we aren’t looking for the vertical difference between the lines. We automatically see the lines as “close” in those months because spatially they are, just on the wrong axis for what we want to judge. The gridlines even guide our eyes towards that conclusion.

It’s an evolutionary thing again. We aren’t equipped to spot these differences because in reality, we didn’t evolve to work in Microsoft Excel. We look to see the outline of shapes, lest they be threats. The threat here — admittedly not of the ‘existential’ variety — is that you may misinterpret my fictional B2B sales data.

Now the bar chart displays the difference accurately, yet it loses the absolute values of the monthly sales. We can tell 2021 was better than 2020 and we can pick out the best months, but that’s all.

The solution? I won’t do it here to save space, but you could combine the two into one chart; or (better) display them as two separate charts, one above the other; or add a table below the line chart containing the data.
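As a minimal sketch of the last option, here is one way to build such a table in Python. The monthly figures are made up for illustration (they are not the numbers behind my charts), and the function name is my own.

```python
# Hypothetical monthly sales; not the data plotted in the charts.
sales_2020 = {"Jan": 100, "Feb": 120, "Mar": 90, "Apr": 110, "May": 105, "Jun": 130}
sales_2021 = {"Jan": 115, "Feb": 125, "Mar": 135, "Apr": 120, "May": 160, "Jun": 140}


def build_rows(previous, current):
    """Pair each month's absolute values with the year-on-year difference."""
    return [
        (month, previous[month], current[month], current[month] - previous[month])
        for month in previous
    ]


rows = build_rows(sales_2020, sales_2021)
# The month with the biggest increase, judged on the difference column.
largest = max(rows, key=lambda row: row[3])

print(f"{'Month':<6}{'2020':>6}{'2021':>6}{'Diff':>6}")
for month, prev, curr, diff in rows:
    print(f"{month:<6}{prev:>6}{curr:>6}{diff:>+6}")
print(f"Largest increase: {largest[0]} ({largest[3]:+})")
```

The table keeps the absolute values and the differences side by side, so the reader does not have to judge the vertical gap between two lines by eye.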

4. Annotate

I was reading a language style guide a little while ago, the evidence of which lies before you in this limpid prose. The author wrote about the “curse of knowledge” and it really stuck with me.

Have you ever hidden an item from someone? Let’s say you put a small object somewhere that, statistically, the other person is highly unlikely to find.

When they enter the room where you have stashed the item, you are likely to think they will find it much quicker than they will. You may even give away the item’s position through your body language.

Because you know where it is, it seems patently obvious to you. You can no longer imagine what it is like not to know.

This is the “curse of knowledge” and it crops up repeatedly in data visualisation. If you go through the process above, whatever you find will be staring you back in the face when you glance at your charts.

There seems to be no need to label the findings. If anything, it feels patronising to the audience.

Yet as we have seen in our examples, the most successful data visualisations really do tell a story. They guide the viewer towards the conclusion and they use every tool at their disposal to do so.

Even if this chart won’t make much sense out of context, you can see that the highlights and annotations help explain the main points. And the mini F1 car at the top is just a nice touch.

[Image: annotated chart with highlights and a mini F1 car at the top]

Or here, to highlight a key moment in what is otherwise a jumble of lines:

[Image: line chart with a key moment highlighted]

A tip to help with this:

Ask someone to look at the chart and tell you what jumps out. If they had to summarise what they see in a sentence, what would it be?

And that’s where this becomes a cyclical process. If that response jars with your intention, labelling the findings could help. Or you may need to use a different kind of chart, or even different data.

In the last annual data visualisation survey, the biggest restriction that respondents reported was “lack of time” and you can see why. Even when you arrive at a conclusion, it may need to be reworked altogether.

Perhaps we can learn just as much from what not to do.

Take this, for example:

[Image: an example of what not to do]

I’m sure you can see about 10 things that are wrong with this one.

Next time: We’ll tackle data visualisation tools!

More inspiration

10 of the best data visualisation examples from history to today (Tableau)

The best stats you’ve ever seen (TED)
