Better Model Selection with a New STI_Classification Scheme and Improved Forecast Performance Measurement in the S&OP Process

In my career as a demand forecaster and forecast manager, I learned early that a data-centric forecast accuracy and performance evaluation process is essential for improved decision making. Moreover, once I made data quality a priority in the predictive modeling process, it became clear that historical data need to be analyzed on an ongoing basis for quality assurance (before, during, and after creating a baseline forecast for the S&OP process).

As with any predictive modeling approach, when addressing data quality issues for demand forecasting, you will find that

  • as a statistical methodology, much data analysis in demand forecasting is informal and exploratory
  • Exploratory Data Analysis (EDA) is open-ended and iterative in nature
  • the steps may not always be clearly defined
  • the nature of the process depends on what information is revealed at various stages; at any given stage, several possibilities may arise, some of which will need to be explored separately

However, it is important to realize that

  • an understanding of historical data will be enhanced when we can identify key patterns in a time series
  • data visualizations are beneficial for describing the shape or distribution of data patterns, model residuals, and forecast errors
  • assuming unrealistic (non-robust) distributions for performance data can be misleading when assessing the forecast accuracy of selected methods. Most model-centric procedures assume that the underlying error distributions are normal (Gaussian). This exemplifies an uncritical Gaussian mindset commonly found among demand planners and forecast practitioners in supply chain organizations.


Many demand planners and forecast practitioners make oversimplified and misleading use of the arithmetic mean when calculating forecast accuracy measures (as with the MAPE, sMAPE, MAE, MASE, and sMASE) for point forecast performance evaluations. With some EDA, one quickly realizes that it is smarter to create a representative (i.e., more typical) accuracy measure with a median (as in the MdAPE, sMdAPE, MdAE, MdASE, sMdASE), or even a new outlier-resistant HBB TAPE measure to replace the MAPE, as described in my book Change & Chance Embraced (Chapter 4).
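
To make the distinction concrete, here is a minimal Python sketch (with made-up numbers, not data from the book) contrasting the mean-based MAPE with the median-based MdAPE. A single outlying percentage error is enough to drag the mean away from what is typical:

```python
import numpy as np

def ape(actual, forecast):
    """Absolute percentage errors (in %); undefined where actual == 0."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.abs(actual - forecast) / np.abs(actual) * 100.0

# Illustrative numbers: five ordinary months plus one low-volume month
# whose large percentage error dominates the arithmetic mean.
actual   = [120, 135, 150, 160, 140, 10]
forecast = [118, 130, 155, 150, 138, 45]

errors = ape(actual, forecast)
print("APEs :", np.round(errors, 1))
print("MAPE :", round(errors.mean(), 1))      # pulled up by the one outlier
print("MdAPE:", round(np.median(errors), 1))  # a more typical summary
```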

Managing a Model Selection Scheme for Automated Demand Forecasting

As part of a pre-modeling data cleaning protocol, consider a new STI_Classification unsupervised ML scheme I introduced in my recent books and LinkedIn articles (1. A New Classification Scheme for Automating Trend/Seasonal Model Selections in Large-Scale Business Forecasting Applications, posted June 21, 2022; and 2. A New Data-Centric Forecast Model Selection Approach for Sales and Operations Planning Applications, posted July 8, 2022).


I will use the STI_Class approach with two large, well-known monthly datasets from the M3 and M4 forecasting competitions. The classification scheme segments the data into six non-overlapping regions, so that we minimize selection bias by dealing with the whole set at once for evaluation. In this manner, I arrive at six non-overlapping categories in the unit square, where the vertical and horizontal sides are labeled Seasonal Influence Factor (SIF) and Trend Influence Factor (TIF), respectively (a classification sketch follows the list):

SIT#1 (Season > Irregular > Trend)
STI#2 (Season > Trend > Irregular)
IST#3 (Irregular > Season > Trend)
TSI#4 (Trend > Season > Irregular)
ITS#5 (Irregular > Trend > Season)
TIS#6 (Trend > Irregular > Season)
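
The precise SIF and TIF calculations are given in the two articles cited above. As a rough sketch of the idea, the snippet below (an assumption for illustration, not the author's exact formulas) measures each component's influence by its variance share from a classical decomposition and ranks the three components to assign one of the six classes:

```python
import numpy as np
from statsmodels.tsa.seasonal import seasonal_decompose

def sti_class(y, period=12):
    """Assign a monthly series to one of the six STI classes by ranking
    trend (T), seasonal (S), and irregular (I) influences. The influence
    measure used here -- the variance of each decomposition component --
    is an illustrative stand-in for the SIF/TIF definitions."""
    dec = seasonal_decompose(np.asarray(y, dtype=float), model="additive",
                             period=period, extrapolate_trend="freq")
    influence = {
        "T": np.nanvar(dec.trend),
        "S": np.nanvar(dec.seasonal),
        "I": np.nanvar(dec.resid),
    }
    order = "".join(sorted(influence, key=influence.get, reverse=True))
    labels = {"SIT": "SIT#1", "STI": "STI#2", "IST": "IST#3",
              "TSI": "TSI#4", "ITS": "ITS#5", "TIS": "TIS#6"}
    return labels[order]
```

For a strongly trending series with mild seasonality, this would typically return "TSI#4" or "TIS#6"; a stable, strongly seasonal series would land in "SIT#1" or "STI#2".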

Given a monthly dataset (the 1,428 monthly series of the M3 Forecast Competition in this example), I can determine how many series fall in each category: SIT#1 (n = 126); STI#2 (n = 80); IST#3 (n = 210); TSI#4 (n = 351); ITS#5 (n = 190); TIS#6 (n = 471). From this, it is evident that the trend-dominant series (58%) outnumber the seasonal-dominant series (14%). This unbalanced dataset might result in seasonal models being outperformed by nonseasonal models, and there is some evidence of that in the M3. That is why each category needs to be evaluated independently in the S&OP process, with results not simply averaged over all categories.
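
The dominance shares quoted above follow directly from the category counts; a quick tally:

```python
counts = {"SIT#1": 126, "STI#2": 80, "IST#3": 210,
          "TSI#4": 351, "ITS#5": 190, "TIS#6": 471}
total = sum(counts.values())                              # 1428 series

seasonal_dominant = counts["SIT#1"] + counts["STI#2"]     # S ranked first: 206
trend_dominant    = counts["TSI#4"] + counts["TIS#6"]     # T ranked first: 822

print(f"seasonal-dominant share: {seasonal_dominant / total:.0%}")  # ~14%
print(f"trend-dominant share:    {trend_dominant / total:.0%}")     # ~58%
```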


To start, you can see that the seasonal influence in the data is relatively small compared to the trend influence. That suggests trend models are likely to do better in the forecast modeling and performance evaluations.

So, if we look at the two segments where seasonal influence is stronger than trend or irregular (any variation not attributable to trend or seasonality), you can see that they constitute about 14% (= 206/1,428) of the monthly data. The summaries for the 18-month forecasts in the M3 competition are reported here for the MAPE only; the distributions of the other point forecast accuracy measures in the M3 data turn out to look very similar.

The point to note here is that using the arithmetic mean to summarize a very skewed distribution is a badly flawed procedure, for the MAPE as well as for the other point forecast accuracy measures.

Moreover, if we examine the class in which the irregular component is dominant over trend and seasonal, you will note a very wide dispersion (left frame below) in the distribution of the MAPEs. After removing the most extreme MAPEs in the distribution, it is evident that the typical summary MAPEs for the THETA and WINTERS methods are not that different, especially when you consider the MdAPE (more appropriate) versus the MAPE (perhaps misleading). Certainly, no rankings should be considered in this situation. Because this is an irregular-dominant class, the summary measures are higher than the corresponding measures for classes in which the irregular is not the dominant variation.
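
To see how a few extreme series can distort a mean-based comparison, here is a small sketch using simulated, right-skewed per-series MAPEs (not the actual M3 results): two methods with similar typical accuracy, where one happens to draw a handful of extreme values.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated per-series MAPEs for two methods: similar typical accuracy,
# but "method_a" also draws a few extreme values (a long right tail).
method_a = np.concatenate([rng.gamma(2.0, 6.0, 195), [250, 400, 600, 900, 1500]])
method_b = rng.gamma(2.0, 6.5, 200)

for name, mapes in [("A", method_a), ("B", method_b)]:
    print(f"method {name}: mean={mapes.mean():6.1f}  "
          f"median={np.median(mapes):5.1f}  "
          f"10% trimmed mean={stats.trim_mean(mapes, 0.10):5.1f}")
```

The mean suggests a large gap between the two methods, while the median and the trimmed mean show they are essentially comparable, which mirrors the kind of THETA-versus-WINTERS pattern described above.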


In the case of the MdAPE and HBB TAPE, for example, the typical value would never be indeterminate (zero in the denominator of an APE). All this, and more data-centric approaches, can be found in my recent books Change & Chance Embraced: Achieving Agility with Smarter Forecasting in the Supply Chain and Four P's in a Pod: e-Commerce Forecasting and Planning for Supply Chain Practitioners. The books are available online on Amazon websites worldwide, as Kindle e-book, paperback, and hard-cover, along with some five-star reviews, like this one:

If you are a professional forecaster who doesn't know it all, this book will fill in what you need to know. If you are a professional forecaster who knows it all, this book will support your knowledge, give you a reference to support your knowledge, and perhaps show you that you don't quite know it all. If you are becoming a forecaster, this book will show you the way to go and take you along the proper paths. If you are not a professional forecaster, but interested in forecasting, this book will stimulate and satisfy your interest. It shows you the kinds of charts and graphs to look at, so that this is not merely arithmetic, but arithmetic supported by where to apply that arithmetic to take you to a forecast. It's about how to go from the data you have to the forecasts you need. It's not just formulas and methods. It puts you in the right place and tells you what to do and how to do it to get where you need to be. In addition, there are amusing little drawings scattered through the book. Also, there are little historical pieces that give a sense of how these techniques were developed. As a bonus, there are interesting short quotes throughout the book, including this one, which appears on page 55: "Everything should be made as simple as possible, but not simpler," Albert Einstein. That describes this book. If you are a professional forecaster, or would like to be one, this book should be on your shelf, or, better, on your desk.

Takeaways

  • Be cognizant of your data and maintain a data quality protocol throughout the forecasting process.
  • Fully understand the forecasting algorithms and their forecast profiles. The data may change, but the profiles generally do not.
  • A selection bias in the M3 forecasting competition favors the trend algorithms as the “best” performers.
  • Simply averaging a point forecast accuracy measure is a badly flawed procedure, as these measures have very skewed distributions.
  • The MAPE has been skewered in the literature and by the consulting community for having indeterminate values when actuals are zero. As the objective in performance measurement is to determine a representative or typical value, the median alternative tends to be more representative than an arithmetic mean (Gaussian mindset). One should also consider M-estimation as an outlier/unusual-event-resistant approach for determining an ‘average’ performance (see the sketch after this list).
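
As a rough illustration of that last point, here is a minimal sketch of a Huber-type M-estimate of location, implemented by iteratively reweighted averaging; it down-weights extreme APEs rather than discarding them. This is a generic robust-statistics illustration with made-up numbers, not the specific HBB TAPE procedure.

```python
import numpy as np

def huber_location(x, c=1.345, tol=1e-6, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted means.
    Observations within c robust standard deviations of the current
    estimate get full weight; more extreme ones are down-weighted."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    scale = 1.4826 * np.median(np.abs(x - mu))  # MAD-based robust scale
    if scale == 0:
        return mu
    for _ in range(max_iter):
        r = (x - mu) / scale
        w = np.where(np.abs(r) <= c, 1.0, c / np.abs(r))
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

apes = np.array([3, 5, 6, 8, 9, 11, 12, 14, 250, 900])  # made-up APEs
print("mean  :", apes.mean())             # dominated by the two outliers
print("median:", np.median(apes))
print("Huber :", round(huber_location(apes), 1))
```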

Srinivas Chillara

Managing Partner at SwanSpeed Consulting. High-performing cross-functional teams for intelligent systems.

1y

I'm not sure why/how I missed reading this. Very useful and helps pull a few threads together. cheers

Keith Ord

Professor Emeritus at Georgetown University

2y

Time indeed to bury MAPE!

Israel Aloagbaye Igietsemhe

Lead Data Scientist @ketteQ | Supply Chain Analytics | Machine Learning Engineer | Researcher

2y

This is really insightful... Nothing like "one size fits all" when it comes to forecasting. No model would do well for all data, and what you said is very true in practice: "a data-centric forecast accuracy and performance evaluation process is essential for improved decision making."
