How to use color to see your numbers

The best way to present data is one where the conclusion is clear. If you still need to interpret the data for your audience, you may not have done a good enough job. One way to make the conclusion clear is to color values based on a scale. I particularly like the jet colormap, but more importantly, the tools to color data and improve visualization are widely available and easy to use.

I have been using Microsoft Excel for close to 20 years, and even though I have mostly switched to Numbers, the one feature I still love about Excel is Conditional Formatting. It applies a color scale across a large set of data, which makes trends visible at a glance instead of forcing you to dig into each value. Typically this is done by plotting the data, but sometimes even a regular plot doesn't capture all the nuances. Conditional formatting lets you do this over a large amount of data, quickly and creatively.

First, get some data. I picked some surface temperature data (https://data.giss.nasa.gov/gistemp/). It's an interesting dataset because each value is the difference in temperature relative to the average temperature from 1951 to 1980. The reported values are in Celsius, so multiply by 1.8 to get the changes in Fahrenheit. The data is per month per year. Some people have interpreted the data one way or another; my plan is only to present it.
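As a quick sanity check on that conversion, here is a minimal Python sketch. It is not how the tables in this article were built (those came from a spreadsheet); the function name and the sample values are made up for illustration. The point is that a temperature difference only needs the 1.8 scale factor, with no +32 offset.

```python
def anomaly_c_to_f(delta_c: float) -> float:
    """Convert a temperature *difference* from Celsius to Fahrenheit degrees."""
    # A difference scales by 9/5 (1.8). The +32 offset only applies to
    # absolute temperatures, so it is deliberately left out here.
    return delta_c * 9 / 5


# Made-up anomaly values for illustration, not taken from the GISTEMP files.
sample_anomalies_c = [-0.5, 0.0, 1.0]
sample_anomalies_f = [anomaly_c_to_f(v) for v in sample_anomalies_c]
```

Each entry in sample_anomalies_f is simply 1.8 times the matching Celsius value, so the shape of any trend is unchanged; only the axis labels differ.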

To start, let's look at just a snippet:

That is a lot of numbers. Ugh. Let's add some color based on all the values:

That's a little better, but it is hard to understand the trend because each month has a different trend. Let's color based on each month, so each month will define the minimum and maximum for that color map:
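The difference between coloring on all the values and coloring per month comes down to which minimum and maximum define the scale. The sketch below is a plain-Python illustration of that idea, not the spreadsheet feature itself; the function names and the tiny table are hypothetical, with each column standing in for one month.

```python
def scale_01(values):
    """Map values linearly onto [0, 1] using their own min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no range; park it in the middle of the scale.
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def scale_global(table):
    """One min and max for the whole table (color based on all the values)."""
    flat = [v for row in table for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    return [[(v - lo) / span if span else 0.5 for v in row] for row in table]


def scale_per_column(table):
    """A separate min and max for each column (color based on each month)."""
    columns = [scale_01(list(col)) for col in zip(*table)]
    return [list(row) for row in zip(*columns)]


# Hypothetical table: rows are years, columns are two months with very
# different ranges.
table = [
    [0.0, 10.0],
    [1.0, 20.0],
    [2.0, 30.0],
]
```

With the global scale, the first column is squeezed into the bottom of the color range because the second column dominates. With the per-column scale, both columns span the full range, so the trend within each month becomes visible.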

Much better. What about a different color scheme using white instead of yellow as a middle color:
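A three-color scale with a neutral middle is just two linear blends joined at the midpoint. Here is a minimal sketch of that in Python. The blue and red end colors are my assumption for illustration (the article only specifies the white middle), and the function names are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def diverging_color(t, low=(0, 0, 255), mid=(255, 255, 255), high=(255, 0, 0)):
    """Return an (R, G, B) tuple for a scaled value t in [0, 1].

    The default end colors (blue and red) are assumptions; only the white
    midpoint comes from the text.
    """
    t = min(1.0, max(0.0, t))  # keep t inside the valid range
    if t < 0.5:
        start, end, local = low, mid, t / 0.5
    else:
        start, end, local = mid, high, (t - 0.5) / 0.5
    return tuple(round(lerp(s, e, local)) for s, e in zip(start, end))
```

Swapping mid for a yellow such as (255, 255, 0) gives the other style of scale. White tends to make values near the middle fade into the background, so the extremes stand out more.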

Let's modify the color to have a minimum threshold and maximum threshold:
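Setting a minimum and maximum threshold amounts to clipping each value before scaling it, so everything beyond a threshold gets the end color. A small sketch, assuming hypothetical threshold values chosen by hand (the function name is also mine):

```python
def scale_with_thresholds(value, vmin, vmax):
    """Clip value to [vmin, vmax], then map it linearly onto [0, 1]."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    clipped = min(vmax, max(vmin, value))
    return (clipped - vmin) / (vmax - vmin)
```

For example, with thresholds of -1.0 and 1.0, a value of 5.0 maps to the same color as 1.0. The trade-off is that outliers no longer stretch the scale, but you also can no longer tell them apart from values sitting right at the threshold.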

Now, let's look at some plots. The first plots the values linearly in time, but it is noisy because of the number of samples. The second is a box plot per year:
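A box plot per year collapses that year's twelve monthly values into a handful of summary numbers, which is why it looks less noisy. Below is a rough sketch of the summary using only the Python standard library. The function name and the sample values are hypothetical, and plotting tools may compute quartiles or draw whiskers slightly differently.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return min, quartiles, and max: the numbers a basic box plot draws."""
    # method="inclusive" treats the data as the full population and
    # interpolates between data points; other tools may use other rules.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


# Made-up values grouped by a placeholder year label, for illustration only.
monthly_by_year = {
    "year A": [1, 2, 3, 4, 5],
    "year B": [2, 4, 6, 8, 10],
}
summaries = {year: five_number_summary(v) for year, v in monthly_by_year.items()}
```

Each year then contributes one box instead of twelve points, at the cost of hiding which month produced which value.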

Now, let's get weird and add some crazy color:

Let's make it a surface:

Both of these could still be difficult to understand right away. So let's plot all the data in something more manageable:

Let's color based on column:

What happens if we make a video? Would it provide more impact to the conclusion of the data?

Usually, I end up with multiple versions of the same data, presented in different ways, before I present. I then try to put myself in someone else's shoes, remembering that I am burdened with knowledge, so the way I look at my graphs is different from the way someone else would. That's why I'll end with a plot where the only change is converting the temperature differences from Celsius to Fahrenheit:


By Dr. Robert McKeon Aloe