"Average" Stinks
Your "average" stinks. Doesn't matter what that average is, I know it's rotten to the core.
Average cost per click.
Average user time on page.
Average length of time to close a sale.
Average height of a delivery driver who fills up your vending machine.
Regardless of what you're talking about, if you're talking about "average", you're doing it wrong.
Why?
Because average, as a measurement or a descriptor, sucks.
I'm not the only one saying it.
"Average sucks" gets 83 million hits on Google. A German advertising agency agrees. All the time, people are wondering how to get better than average, because, well, reasons:
When you're average, there's a lot of people who are better than you. Yes, there's a lot of people who are worse than you, too, but in this success-obsessed culture, being average is often seen as being a failure.
And, despite the subtle indications that this post is going to be another one about self-improvement, I'm actually going to talk a bit differently about average and how its use is failing you.
I'm going to talk about why using average, as a descriptive metric, is holding you back. It's actually obscuring your insights, and keeping you from learning what you could from your analyses and your models.
Quick Recap
Okay, I shouldn't have to do this, but just so everyone's on the same page: when I'm using the word "average" here I mean the mean, or the sum of the individual components divided by the number of components. If there are 3 dice rolled, and they are a 1, a 3, and a 6, then the "average" roll is 3.33 ((1 + 3 + 6) / 3 = 3.33).
There are also the "median" and the "mode", other measures of central tendency that sometimes get called the average. But here, we're talking about the mean, since that's what most people have in mind when they say "average".
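To keep the recap concrete, here's a tiny sketch in Python (my stand-in for the spreadsheet math; nothing here is from the original post):

```python
import statistics

rolls = [1, 3, 6]  # the three dice from the example above

print(statistics.mean(rolls))    # 3.333... -- the "average" this post is about
print(statistics.median(rolls))  # 3 -- the middle value once the rolls are sorted
# statistics.mode() exists too, but with no repeated rolls there's no meaningful mode here.
```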
Organizations and individuals all over the world use the average to describe something like the central tendency of a group. They look at the average high or low temperature of the place they're going to vacation next month, so they know what to pack:
The problem is, most of the time, things that are "average" don't actually show up in real life. Like the dice rolls above: even though the average is 3.33, it's actually impossible to get 3.33 on any one specific roll. Sure, roll a die a hundred times and you should get a total pretty close to 350, for an average of 3.5, but on any one specific roll? You will NEVER actually get 3.5.
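If you want to see that play out, here's a quick simulation sketch (Python again, with made-up rolls rather than anything from the post): the average of a hundred rolls lands near 3.5, but no individual roll can ever equal it.

```python
import random

random.seed(42)                                     # reproducible for the example
rolls = [random.randint(1, 6) for _ in range(100)]  # one hundred rolls of a fair die

print(sum(rolls) / len(rolls))  # somewhere close to 3.5
print(3.5 in rolls)             # False -- no single roll is ever 3.5
```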
[side note - don't you hate it when Google doesn't play nice? I have this idea that there's a Far Side cartoon perfectly describing the phenomenon I'm thinking of, but I can't find it. It's a family with the average "0.5 dogs and 1.5 kids", and the dog and one of the kids are drawn as just the half, not the whole thing.]
So in this situation, making some kind of prediction or evaluation based on averages is rather meaningless. Anything that's a quantum, countable action (how many times did someone click on my website?), or that isn't a continuous variable (what are the expected values from rolling 2D20?), probably shouldn't be evaluated using "average".
And yet, we do it all the time.
We Talk About Average Way Too Much. Distribution, Not Enough
So is there a solution? Clearly.
This guy goes into more detail than I'm prepared to right now. The point is, though, that talking about percentiles of a distribution provides vastly more information than simple averages.
Here's a real clear example. These two data sets have exactly the same # of elements and the same average:
Obviously, I manipulated this data set a little bit to prove a point. The first column, Set #1, is just a set of random numbers. The second column, Set #2, is an operation on the first, for all but #20. That last one? Well, I solved for the value that would make the total and average the same (for this example).
The point is, if you were just looking at 2 different data sets, or perhaps at how a data set has changed over time (you're looking at compile times for your program, for example, and checking to see whether your servers are performing better than they were last year), you may be missing crucial data if you only look at the average.
Your average may be exactly the same, or only slightly worse than before, but you may have introduced significant outliers that are being obscured by the measurement. Here's the same data, but with some percentiles added on (some rows omitted for clarity):
[Don't hate, I know that Column 2 is different, I've still got Rand() in the formula...]
With this presentation, it would be obvious that something was systematically different between Set #1 and Set #2, and you'd have a clear indication that more investigation was warranted.
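The spreadsheet itself isn't reproduced here, but here's a rough Python sketch of the same trick with invented numbers: two sets with the same count and the same mean, where the last value of Set #2 is solved for so the totals match, and the percentiles immediately give the game away.

```python
import statistics

# Set #1: twenty reasonably well-behaved values
set_1 = [4, 5, 5, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11]

# Set #2: mostly smaller values, plus one placeholder at the end...
set_2 = [2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 8, 8, 9, 0]
# ...which is solved for so the totals (and therefore the means) match exactly.
set_2[-1] = sum(set_1) - sum(set_2[:-1])

print(statistics.mean(set_1), statistics.mean(set_2))  # 7.5 and 7.5 -- identical
print(statistics.quantiles(set_1, n=4))                # quartiles of Set #1
print(statistics.quantiles(set_2, n=4))                # noticeably lower quartiles for Set #2
print(max(set_1), max(set_2))                          # 11 vs 53 -- the outlier jumps out
```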
Uncovering Outliers
And this example was for a set with very few elements. What happens if you've got a vast data set (something accessing big data, for example, or a time series with daily stock prices stretching back generations across thousands of stocks)? How might looking beyond averages help you identify problems?
Well, one way is to look at results graphically. Here's what I mean.
Again, I have two data sets, much larger than before, but I'm still simplifying for the example. [This is based on an actual issue I encountered while still an actuary.]
Let's assume that you've got some kind of measurement which produces these values:
What's going on here? Average looks good. Minimum, 25th percentile, and 75th percentile look good. Spot-checking one or two values seems right. So how are we getting a max of 11.79? And, importantly, is that a problem?
The reason it's a problem is that the maximum value this expression should ever produce is 10.0. I forced the issue by defining both columns with a formula of Rand()*10, which means I expect the max to be no more than 10.
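Here's what that sanity check looks like in a rough sketch (Python standing in for the spreadsheet, with data invented to mimic the issue): if every value is supposed to come from something like Rand()*10, i.e. uniform on [0, 10), then a max above 10 in the summary statistics is an immediate red flag, even though the mean and quartiles shift only a little.

```python
import random
import statistics

random.seed(1)

# What the model is supposed to produce: Rand()*10, i.e. uniform on [0, 10)
clean = [random.random() * 10 for _ in range(1000)]

# What actually happened in this contrived example: the first 100 values get an extra +2
suspect = [random.random() * 10 + 2 for _ in range(100)] + \
          [random.random() * 10 for _ in range(900)]

for name, data in (("clean", clean), ("suspect", suspect)):
    q1, _, q3 = statistics.quantiles(data, n=4)
    print(f"{name:8s} mean={statistics.mean(data):.2f} "
          f"p25={q1:.2f} p75={q3:.2f} max={max(data):.2f}")
# The mean and quartiles shift only slightly; the max is the clear giveaway.
```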
Hmmm... is it something in the data? Let's look. First, I sorted from smallest to largest (as you might do for a stochastic simulation):
As you can see, there's something strange going on. [You can't really see Set 1, but trust me, it's there.] The strangeness is that jump at the end. Something systematic? Might this suggest there's an error in the model, or in a source data element?
Let's dig further. What if I go back and reorder by item number (or scenario number, in a stochastic simulation)? What does that look like?
Now we can see that there's clearly something going on with Set 2 in the first hundred or so trials.
When I look back through the model, I see that I had a slightly different formula in the first 100 cells. Instead of Rand()*10 (to force it to be a random number between 0 and 10), I had Rand()*10 + 2.
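Here's a sketch of that last step, using the same contrived setup as the sketch above (again my invented data, not the original spreadsheet): sorted smallest-to-largest, the bad values just look like an odd tail, but batch averages taken in simulation order point straight at the first 100 trials.

```python
import random
import statistics

random.seed(1)

# Same contrived setup: the first 100 "cells" accidentally use Rand()*10 + 2
values = [random.random() * 10 + 2 for _ in range(100)] + \
         [random.random() * 10 for _ in range(900)]

# Sorted smallest-to-largest, the problem just looks like a jump at the end...
print(sorted(values)[-5:])  # a handful of values above 10

# ...but in simulation order, batch averages make the first block stand out.
batch = 100
for start in range(0, len(values), batch):
    chunk = values[start:start + batch]
    print(f"trials {start + 1:4d}-{start + batch:4d}: mean = {statistics.mean(chunk):.2f}")
# Expect roughly 7.0 for the first batch and roughly 5.0 for every other batch.
```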
Yes, this was a little bit of a contrived example. But a similar experience actually happened during some stochastic testing of insurance liabilities once. When reviewing results, the average looked reasonable and the same results ordered smallest-to-largest looked reasonable. When we looked at the results in simulation order, though, we saw that there was something different about a set of early results.
It turned out that, for that first batch, the stochastic scenarios were reading the inputs meant for a set of deterministic scenarios, which threw off the model's results.
It didn't take long to correct those inputs, and re-run. But if we hadn't looked at more than just the average, we never would have caught the mistake.
Yeah, But Is It Worth It?
I don't know. Some might find this level of detail too fine-grained for much of their work. But when you're dealing with huge data sets, complex relationships, and razor-thin margins for error, perhaps it's not too precise at all.
Would the company have made different decisions about the insurance portfolio had those erroneous results been incorporated into the regular reporting? Probably not. The magnitude of their error wasn't that great, just like the magnitude in my contrived example wasn't that big. Heck, it wasn't even large enough in the first 100 to move the average. So is it that big a deal?
Well, unfortunately, the answer is the standard: It depends.
Sometimes it will be. Sometimes it won't be. And there's no cut and dried formula to tell when it is and when it isn't worth it to investigate your results for anomalies further.
Some of it comes with experience. Some of it comes from just being curious and following intuition. Some of it comes from your superiors needing to be absolutely sure of every decimal point you can give them, so you do what you're asked without worrying about it.
But, eventually, you'll learn to add your own systems for spotting anomalies. And you'll implement them early enough in your process that you can head off distractions before they appear.
Look, I'm all for taking shortcuts when they're called for. Nobody really needs to take the back roads every time. That's why we built the highways, damn it. That's also why, to be frank, your average stinks. It's a shortcut, and, as I've shown, just using an average (heck, even just using percentiles alone) can keep you from the insights you need to make informed decisions.
Because even with shortcuts, automation, dashboards, and whatever comprehensive views your C-suite is looking for, sometimes it's good to actually get back into that data sandbox and play around a little bit.
Who knows - maybe you'll see me there.