What Your Favorite Business Metrics Won’t Tell You About Your Customers
Daliana Liu
"Technical → Influential" | Ex-Amazon Sr. Data Scientist | 290k Followers | Helping Data Scientists & MLEs Build Influence & Carve Out Their Own Paths
Do you only use metrics like ‘mean’ or ‘ratio’ to make data-driven business decisions? If so, you are probably doing business analytics wrong.
I’ll use two data analytics cases to show you why only focusing on those metrics can be dangerous, and what you should do instead.
Case 1. The Average Value Can't Represent Your Customers
- Why?
We use averages a lot when analyzing product and business performance, but relying on the average alone creates blind spots. There is always variation, whether from different market segments or from pure randomness, and the average hides it.
- Example: How many products do our customers buy on average?
A company wants to understand how many items a customer purchases on average. For New York and LA, it found that the average number of items purchased per customer is the same: 45.
Now, based on the plot below, should we apply the same marketing strategy for the customers in New York and LA?
NO.
In LA (green line), 85% of customers purchased 40–50 items, which means the average (45) represents most customers’ behavior. You might only need one big campaign to target the majority.
In New York, however, the average describes far fewer customers: only about 50% of them. To cover the same 85% of customers, you need the whole range from 10 to 80 items, which you can see in the wide spread of the orange, dumbbell-shaped line.
This means the customers in New York vary more than those in LA, and you probably need multiple campaign strategies for New York, where customer behavior is more diverse.
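To make the comparison concrete, here is a minimal sketch in Python. The two lists are invented for illustration (they are not the data behind the plot); they are built so that both cities share a mean of 45 while differing in spread.

```python
from statistics import mean, pstdev

# Hypothetical per-customer item counts, invented for illustration only.
la = [40, 42, 44, 45, 45, 46, 48, 50, 45, 45]
new_york = [10, 15, 20, 30, 45, 45, 60, 70, 75, 80]


def share_within(values, low, high):
    """Fraction of values that fall inside [low, high], inclusive."""
    return sum(low <= v <= high for v in values) / len(values)


for city, items in [("LA", la), ("New York", new_york)]:
    print(
        city,
        "mean:", mean(items),
        "std dev:", round(pstdev(items), 1),
        "share buying 40-50 items:", share_within(items, 40, 50),
    )
```

The means match, but the standard deviation and the share of customers near the mean do not, which is the kind of difference an average on its own cannot show.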
- What Should We Do?
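Find out the range around the mean by measuring the spread of the data, for example with the variance or standard deviation.
Data scientists typically report a Confidence Interval (CI), which estimates, at a stated confidence level, where the true average lies. (You can construct one in Excel.) Keep in mind that a CI describes the uncertainty of the average itself, not where individual customers fall; to describe customers, report the range that covers a given share of them.
An example of reporting is: the mean number of items purchased per customer in New York is 45, and 85% of customers purchased between 10 and 80 items.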
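One distinction is worth seeing in numbers: a confidence interval for the mean says how precisely the average is estimated, while a percentile range says where individual customers fall. The sketch below assumes a simulated sample of 400 customers (not real data) and uses a normal approximation for the interval; the function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(0)
# Simulated sample (an assumption for illustration): items bought by 400 customers.
items = [max(0, round(random.gauss(45, 20))) for _ in range(400)]


def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the mean of `values`."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(values) / len(values) ** 0.5
    center = mean(values)
    return center - half_width, center + half_width


def central_range(values, coverage=0.85):
    """Range holding roughly the central `coverage` share of the observations."""
    ordered = sorted(values)
    tail = (1 - coverage) / 2
    low = ordered[int(tail * (len(ordered) - 1))]
    high = ordered[int((1 - tail) * (len(ordered) - 1))]
    return low, high


print("95% CI for the mean:", mean_confidence_interval(items))
print("Range covering ~85% of customers:", central_range(items))
```

With a sample this size, the interval for the mean is only a few items wide, while the range covering most customers is far wider. Reporting both tells the reader how reliable the average is and how varied the customers are.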
Case 2. Ratio Metric Can Be Very Sensitive and Unreliable
- Why?
A ratio metric is built from at least two metrics; for example, Click-Through Rate (CTR) is Clicks divided by Views. Because each component varies on its own, the ratio’s variation is harder to reason about, and it generally doesn’t follow a simple, familiar distribution.
- Example
Let’s look at the table below first. You are measuring Click-Through Rate, and from this table it looks like Click-Through Rate increased from Jan to Feb. Sounds great?
Well, actually both Clicks and Views decreased; the rate went up only because Views decreased more. So this increase is probably not what you want.
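The arithmetic behind this effect is easy to reproduce. The monthly totals below are hypothetical, chosen only to show the pattern, and are not the figures from the table.

```python
def ctr(clicks, views):
    """Click-Through Rate: clicks divided by views."""
    return clicks / views


def pct_change(old, new):
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100


# Hypothetical monthly totals, invented for illustration.
jan = {"clicks": 500, "views": 10_000}
feb = {"clicks": 400, "views": 5_000}

print("Clicks change (%):", round(pct_change(jan["clicks"], feb["clicks"]), 1))
print("Views change (%):", round(pct_change(jan["views"], feb["views"]), 1))
print("CTR Jan:", ctr(**jan), "CTR Feb:", ctr(**feb))
```

Clicks fall and Views fall even further, so the CTR rises. Looking at the ratio next to its two components makes the real story visible.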
Now, let’s look at 4 more scenarios to see how Click-Through Rate changes when we hold one variable fixed and change the other. Can we trust the ratio with the same level of certainty in each scenario?
The Left Table shows that if the denominator (Views) is stable, the ratio moves proportionally with the numerator (Clicks), so the uncertainty is easy to estimate and its scale doesn’t change much.
In the Right Table, when the denominator (Views) is large enough, as in the first few rows, the ratio (CTR) is very stable, with only 1–2% uncertainty. However, in the bottom rows, where the denominator is small, the ratio becomes unstable and very sensitive to changes! When this is the case, it's better to monitor Views and Clicks directly and expect a wide range of scenarios when you make decisions.
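As a rough sketch of why a small denominator hurts, the snippet below estimates a margin of error for the CTR at different view counts. It assumes each view is an independent trial with the same click probability and uses the normal approximation for a proportion, which is itself crude when Views are very small; the 5% rate and the view counts are made up.

```python
import math


def ctr_margin(clicks, views, z=1.96):
    """Approximate 95% margin of error for CTR (normal approximation)."""
    p = clicks / views
    return z * math.sqrt(p * (1 - p) / views)


# Hypothetical view counts, all with the same underlying 5% click rate.
for views in (100_000, 10_000, 1_000, 100, 20):
    clicks = round(views * 0.05)
    margin = ctr_margin(clicks, views)
    print(f"views={views:>7}  CTR={clicks / views:.3f}  +/- {margin:.3f}")
```

The same 5% rate is pinned down tightly with many Views and is almost uninformative with only a handful.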
What Should You Do?
- Set a threshold for the minimum acceptable value of the denominator. Because the ratio can vary greatly when the denominator is small, only trust the ratio when the denominator is large enough. If you have to use the ratio to make decisions when the denominator is small, make sure you report a range that covers the fluctuation.
- Monitor the actual values (numerator, denominator) that we use for the ratio calculation. Understand the range of the ratio by simulating different scenarios of the numerator and denominator.
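The two recommendations above can be sketched as follows. The threshold value, the function names, and the simulation settings are all assumptions for illustration; the simulation redraws clicks at the observed rate to see how far the ratio could plausibly move.

```python
import random

MIN_VIEWS = 1_000  # hypothetical threshold; pick one that suits your own data


def reportable_ctr(clicks, views, min_views=MIN_VIEWS):
    """Return the CTR, or None when there are too few views to trust it."""
    if views < min_views:
        return None
    return clicks / views


def simulated_ctr_range(clicks, views, runs=1_000, coverage=0.9, seed=0):
    """Redraw clicks at the observed rate and return the central CTR range."""
    rng = random.Random(seed)
    rate = clicks / views
    ratios = sorted(
        sum(rng.random() < rate for _ in range(views)) / views
        for _ in range(runs)
    )
    tail = (1 - coverage) / 2
    return ratios[int(tail * (runs - 1))], ratios[int((1 - tail) * (runs - 1))]


print(reportable_ctr(2, 40))        # too few views, so no ratio is reported
print(simulated_ctr_range(2, 40))   # wide range for a small denominator
print(simulated_ctr_range(50, 1_000))  # much narrower range
```

Returning `None` below the threshold forces whoever reads the report to look at the raw Clicks and Views instead of a fragile ratio.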
Take-Aways For Your Business Analytics Strategy:
Data analytics is not just calculation; it’s also the measurement of uncertainty.
While summary statistics like mean or some ratio metrics help us ‘Zoom Out’ and see a big picture of data and our business, we also need to ‘Zoom In’ for the range and shape of data, to make sure we understand the uncertainty associated with the metrics.
- A single data point is NOT enough! Create a range around it, and use the variance to estimate the uncertainty or to reveal different segments in the data.
- If your metric is a ratio like Click-Through Rate, analyze different scenarios to see how the metric changes as the denominator and numerator change. Be careful if your denominator is small: the ratio is then more sensitive to changes in the data and might not be reliable!
- Visualize it to make sure we don’t miss any pattern in the data or outliers.
Share or Like this article if you find it helpful for other people!
#business #analytics #datascience #product #marketing
Partner Alliance Marketing Operations at Data Dynamics
1 yr. Understanding the nuances of data variance and uncertainty is crucial for making informed decisions. It emphasizes the importance of not only analyzing summary statistics but also delving deeper into the range and shape of data to grasp its true meaning. A great reminder that a single data point is not enough, and visualizing data can help uncover valuable insights.
Growth Data Science at Square
5 yr. Love this article! Agree that ratios can be very sensitive and we need to zoom in into the numbers sometimes. For large-volume data, such as traffic of an entire website, I feel comfortable using ratios to monitor performance. But for smaller-scale data, such as metrics related to a new web page or a small campaign, I definitely would look at the absolute values as well. Super helpful article!
Instagram Ads
5 yr. Awesome Zhen!