Demand Planning: How to Detect Outliers?

Demand Planning: How to Detect Outliers?

This article is an extract from my book Data Science for Supply Chain Forecast, available here.?

“I shall not today attempt further to define this kind of material (…), and perhaps I could never succeed in intelligibly doing so. But I know it when I see it.” Potter Stewart

In 1964, Potter Stewart was a United States Supreme Court Justice. He wasn’t discussing outliers but whether the movie The Lovers was or wasn’t obscene.

As you work on forecasts you will notice that your dataset will have outliers. And even though I know it when I see it might be the only practical definition, these outliers pose a real threat to supply chains. These high (or low) points will result in over-reactions in your forecast or your safety stocks, ultimately resulting in (at best) manual corrections or (at worst) dead stocks, losses, and a nasty bullwhip effect. Actually, when you look at blogs, books, articles, or software on forecast, the question of outlier detection is often eluded. This is a pity. Outlier detection is a serious business

These outliers pop out all the time in modern supply chains. They are mostly due to two main reasons:

  • Mistakes & Errors These are obvious outliers. If you spot such kind of errors or encoding mistakes, it calls for process improvement in order to prevent these from happening again.
  • Exceptional Demand Even though some demand observations are real, it does not mean they are not exceptional and shouldn’t be cleaned or smoothed. This kind of extraordinary sales is actually not so uncommon in supply chains. Think about promotions, marketing, strange customer behaviors, or destocking. Typically, you might not want to take into account for your forecast the exceptional -80% sales you did last year to get rid of an old nearly-obsolete inventory. 

If you can spot outliers and smooth them out, you will make a better forecast. I have seen numerous examples where the forecast error was reduced by a couple of percents just thanks to outlier cleaning. Actually, the bigger the dataset, the more critical it is to automate this detection and cleaning. Let’s see how we can do this. 

In this article, we will discuss three and a half ideas to spot these outliers and put them back to a reasonable level.

Idea #1 — Winsorization

As we said, an outlier is an exceptionally high or low value. Based on this simple definition, a first idea to detect outliers would be to simply cut down the top x most upper and lowest points of the dataset. Let’s see how this would work on the two (dummy) datasets on the tables below.

No alt text provided for this image
No alt text provided for this image

This first technique will simply decrease the top/down x% values of our historical Demand down to the limit of the Xth percentile.

The Xth percentile is a value below which x% of the observations in a group will fall. For example, 99% of the demand observations for a product will be lower than its 99th percentile.

This technique of simply shrinking the Demand down to a particular percentile is called winsorization. The name comes from Charles P. Winsor, a statistician from the first half of the XXth century.

If we look at the 1st and 99th percentile on our two dummy datasets above, this is what we obtain:

No alt text provided for this image

In this table, we see that in both datasets all the low values would be increased up to 4.4. You can see in the figure below that this cuts a part of our dataset. The high values would be decreased down to 16.6 on the dataset without outliers (see Figure 10.1) and down to 70.9 for the dataset with an outlier (see Figure 10.2). 

You might have noticed that the winsorization didn’t give us round results such as 4 or 5, but instead, we got this 4.4. Actually, as we don’t have an exact value that cuts the dataset by 99%, we make a linear approximation based on the two closest values. This is how we got these numbers instead of round numbers.

No alt text provided for this image
No alt text provided for this image

So, are we happy with this technique? No, we’re not.

  • We have spotted fake outliers on a dataset without outliers.
  • On the dataset with outliers, we haven’t sufficiently reduced the outlier (it went from 100 to 70.9).

Of course, one could simply propose to decrease the higher limit of the winsorization from 99% to 95% to reduce the outlier on dataset #2 further. Still, unfortunately, this would also affect dataset #1. This is not the right solution. One could also propose to remove this lower limit so that we don’t increase our Demand to 4.4. But, what if we have periods with missing Demand? Shouldn’t we clean these as well, if any?

Do It Yourself

Excel You can easily get the different percentiles of a range of cells in Excel by using the formula =PERCENTILE.INC(range, limit). Of course, you’ll have to use this formula once for the upper limit (with a value around 0.95–0.99) and once for the lower limit (with a value around 0.01–0.05).

Python We can easily winsorize our dataset in Python thanks to NumPy. We can compute the different percentiles of an array thanks to the np.percentile(array, percentile) function.

import numpy as np
higher_limit = np.percentile(array, 99)
lower_limit = np.percentile(array, 1)

Note that the percentile function takes a percentile expressed as a value between 0 and 100 and not a ratio (i.e., a value between 0 and 1) like in Excel.

We can then simply cut the array to these lower and higher limits thanks to the function np.clip(array, min, max):

array = np.clip(array,a_min=lower_limit,a_max=higher_limit)

Idea #2 Standard deviation

As we just saw, winsorization wasn’t the perfect way to exclude outliers as it would take out high and low values of a dataset even if they weren’t exceptional per see.

Another approach would be to look at the demand variation around the historical average and exclude the values that are exceptionally far from this average.

Let’s define the demand standard deviation as:

No alt text provided for this image

where n is the number of demand observations we have.

If we assume that our data is normally distributed around the historical mean we can compute the probability for the Demand to be between two thresholds. The exact mathematics involved here is beyond the scope of the article, and unfortunately, more often than not the assumption of normality is not strictly respected. These two thresholds will be centered on the demand average (μ) with a spread of x times the standard deviation (σ) in both directions. The more chaotic the Demand (i.e.σ is significant), the wider the thresholds.

No alt text provided for this image
No alt text provided for this image

For example, we have a 98% probability of being in the range: demand average +/- 2.33 times the standard deviation (as in the figure above). So that if we wanted to remove the top 1% of both high and low values, we would restrict the Demand to μ +/-2.33 σ.

Note that this means we have a 99% probability of being lower than μ + 2.33 σ. And a 99% probability of being higher than μ — 2.33 σ.

If we applied this to our example datasets (see the first two tables), we would get these limits:

No alt text provided for this image

Let’s see how these new normal limits behave compared to the winsorization limits.

No alt text provided for this image
No alt text provided for this image

This is already much better than the results we got with winsorization:

  • On the dataset without outliers (see figure 10.4), we don’t change any demand observation (perfect! — just as we want).
  • On the dataset with an outlier, we don’t change the low-demand points but only the actual outlier (see figure 10.5).

Still, even though we reduce the outlier to a more manageable amount (47.9) than with the winsorization (70.9), it might not be enough yet.

So, are we happy now? Not quite yet.

As you might remember, we assumed the error to be around the historical mean. This is fine for a product with flat Demand, but the actual limitation will arise when you have a product with a trend or a seasonality. For example, on the seasonal table below, the highest (or lowest) points are no longer the outliers you want to remove.

No alt text provided for this image

You can see how winsorization and normalization work on this seasonal Demand in the figure below.

It simply doesn’t make sense: both techniques flag the season peaks as outliers, and they skip the real outlier, which is Y2 M11.

No alt text provided for this image

We will solve this with our next technique.

Do It Yourself

Excel You can compute the standard deviation of a range of cells thanks to the formula =STDEV.P(range). As always, you can calculate the mean thanks to =AVERAGE(range). Once you have these two, you can compute the higher and lower limits thanks to =NORM.INV(percentile, mean, stdev). Typically, you will want the upper percentile to be around 0.99 and the low one around 0.01.

Python You can calculate the standard deviation via np.std(array) for an array-like (e.g., a list, a DataFrame, etc.) or a DataFrame directly via the method .std(). So that if you have a DataFrame df, you can simply type:

m = df.mean()
s = df.std()

We will then once again use the SciPy library to compute the normal probabilities. We’ll then use the .clip method on our DataFrame to cap it to our limits.

from scipy.stats import norm
#Print the probabilities of each demand observation
print(norm.cdf(df.values, m, s).round(2))
limit_high = norm.ppf(0.99,m,s)
limit_low = norm.ppf(0.01,m,s)
df = df.clip(lower=limit_low, upper=limit_high)

Idea #3 Error standard deviation

The second idea we had to flag outliers was to compare each observation against the mean of the Demand. We saw that it didn’t make sense if we had a trend or a seasonality as the difference between an observation and the historical mean wasn’t relevant.

Well, let’s go back to the definition of an outlier: an outlier is a value that you didn’t expect. Just like the Spanish Inquisition in Monty Python shows. That is to say that an outlier is a value far away from your prediction (i.e., your forecast). To spot outliers, we will, therefore, analyze the forecast error and see which periods are exceptionally wrong. To do that, we’ll use the standard deviation approach that we used previously.

Let’s take back the seasonal example we made above. We will compare the historical Demand to a simple (but seasonal) forecast we have for it.

No alt text provided for this image
No alt text provided for this image

If we computed the error we have for such a forecast (which is simply an average of the historical Demand), we would obtain a mean error of 0.4 and a standard deviation of 3.2 (this is, of course, heavily impacted by the error we have for Y2 M11). If we took a 99% confidence interval around this mean, we would shrink forecast errors into -0.4 +/- 2.33 x 3.2 = -8,7. You can see in the figure below how these limits around the forecast perfectly fit the seasonal Demand.

No alt text provided for this image

We can now correct our outlier from Y2 M11. The Demand was 19, but the forecast was 5 for this period. The maximum acceptable value is then 5 + 7 = 12. This means that we can replace the outlier of Y2 M11 (19) by this new value (12).

Conclusion

With this smarter detection method — analyzing the forecast error deviation instead of simply the demand variation around the mean — we will be able to flag outliers much more precisely and reduce them back to a plausible amount. As you can see in the figure above, normalization and winsorization couldn’t achieve any meaningful results for this seasonal Demand.

The fine-tuning of this method, how many standard deviations should you take as a limit? is — of course — left to you to experiment…

Do It Yourself

Python If you have a pandas DataFrame with one column as the forecast and another one as the Demand (the typical output from our exponential smoothing models), we can use this code:

df[“Error”] = df[“Forecast”] — df[“Demand”]
m = df[“Error”].mean()
s = df[“Error”].std()
from scipy.stats import norm
limit_high = norm.ppf(0.99,m,s)+df[“Forecast”]
limit_low = norm.ppf(0.01,m,s)+df[“Forecast”]
df[“Updated”] = df[“Demand”].clip(lower=limit_low,upper=limit_high)
print(df)

Go the extra mile!

If you think back about our idea to analyze the forecast error and make a threshold of acceptable errors, we actually still have a minor issue. The threshold we compute is based on the dataset including the outliers. This outlier drives the error variation upward so that the acceptable threshold is biased and overestimated. To correct this, one could actually shrink the outlier not to the threshold calculated based on the original demand dataset but to a limit calculated on a dataset without this specific outlier. Here’s the recipe:

  1. Populate a first forecast against the historical Demand.
  2. Compute the error, the error mean and the error standard deviation
  3. Compute the lower & upper acceptable thresholds (based on the error mean and standard deviation).
  4. Identify outliers just as explained previously.
  5. Re-compute the error mean and standard deviation but excluding the outliers.
  6. Update the lower & upper acceptable thresholds based on these new values.
  7. Update the outlier values based on the new threshold.

If we take back our seasonal example from above, we initially had a forecast error mean of 0.4 and a standard deviation of 3.22. If we remove the point Y2 M11, we obtain an error mean of -0.1 and a standard deviation of 2.3. That means that now the thresholds are -5.3,5.2 around the forecast. Our outlier in Y2 M11 would then be updated to 10 (instead of 12 with our previous technique).

No alt text provided for this image

Do It Yourself

We’ll take back our code from our previous idea and add a new step to update the error mean and standard deviation values.

df[“Error”] = df[“Forecast”] — df[“Demand”]
m = df[“Error”].mean()
s = df[“Error”].std()
from scipy.stats import norm
prob = norm.cdf(df[“Error”], m, s)
outliers = (prob > 0.99) | (prob < 0.01)
m2 = df[“Error”][~outliers].mean()
s2 = df[“Error”][~outliers].std()
limit_high = norm.ppf(0.99,m2,s2)+df[“Forecast”]
limit_low = norm.ppf(0.01,m2,s2)+df[“Forecast”]
df[“Updated”] = df[“Demand”].clip(lower=limit_low,upper=limit_high)
print(df)

About the author

Nicolas Vandeput is a supply chain data scientist specialized in demand forecasting and inventory optimization. He founded his consultancy company SupChains in 2016 and co-founded SKU Science—a fast, simple, and affordable demand forecasting platform—in 2018. He enjoys discussing new quantitative models and how to apply them to business reality. Passionate about education, Nicolas is both an avid learner and enjoys teaching at universities: he has taught forecasting and inventory optimization to master students since 2014 in Brussels, Belgium. He published Data Science for Supply Chain Forecasting in 2018 and Inventory Optimization: Models and Simulations in 2020.

No alt text provided for this image


Branimir ?u?ek

Supply Chain Analytics | Predictive Replenishment | Demand Planning | Optimization | SQL | Python

5 年

In my work I use winsorize() function from scipy.stats.mstats package. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.winsorize.html#scipy.stats.mstats.winsorize ?I suppose it could be a little faster on large numbers of SKU's sales data and is easy to implement.

回复
Edward Hand, CPIM

Inventory Planning Manager at SAIC

5 年

Great information, thanks! My ongoing struggle with this topic is less detecting the outliers and more what to do with them to ensure the most accurate forecast.

回复
Sergej Kaiser

??Data Engineering & Data Science | GCP Certified Professional Cloud Architect | Cloud Engineer

5 年

In my experience those classical approaches are not well suited to detect outliers in supply chain demand data since its questionable to rely on the assumption normal distributions. In my experienced with supply chain data from different industries, outlier detection based on robust statistics works very well. It seems that robust statistics is not popular or known among supply chain practitioners, even though it has a lot of potential.

要查看或添加评论,请登录

社区洞察

其他会员也浏览了