How a Profile Analysis Can Improve Forecasting Performance with More Effective Models


In a previous article on forecast accuracy and performance analysis, I explored how exploratory data analysis (EDA) tools can help demand planners and forecast practitioners detect and improve data quality. For example, using just a single series (N2796) from the M3 forecasting competition and a trend/seasonal exponential smoothing algorithm, the first insight from this 'ideal' time series is that bad data will beat a good forecast every time!

Using a new, non-conventional information-theoretic measure of profile accuracy, this article shows how a single, simple distortion in the trend/seasonal pattern can impact the effectiveness of a forecasting Method. I am looking for insights from profile accuracy and profile performance measurement to contrast with traditional point forecast measures. The M3 competition results are based on single one-step-ahead forecasting performances, which should be considered a limitation in getting a credible model selection process for practitioners.

Profile Accuracy and Performance Measurement with 10 Methods from the M3 Forecasting Competition

In a Profile Analysis for accuracy and performance measurement, the Actual Profile (AP) and the Forecast Profile (FP) are encoded into positive fractions by dividing each profile value by the lead-time total. This yields an actual alphabet profile (AAP) and a forecast alphabet profile (FAP) with the same patterns as the respective series they are derived from. Both benchmark forecasts, NAIVE_2 and Naive_LT-2, have comparable patterns encoded as FAPs (shown below).

[Image: forecast alphabet profiles (FAPs) for the NAIVE_2 and Naive_LT-2 benchmarks]
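
To make the encoding step concrete, here is a minimal sketch in Python. The profile values below are made-up placeholders, not the N2796 data; the function simply divides each profile value by the lead-time total so the encoded values sum to one:

```python
import numpy as np

def alphabet_profile(profile):
    """Encode an actual or forecast profile as positive fractions of its
    lead-time total; the encoded values sum to 1 and preserve the pattern
    of the original series. Assumes strictly positive profile values."""
    profile = np.asarray(profile, dtype=float)
    return profile / profile.sum()

# Hypothetical 12-period actual and forecast profiles (illustrative only)
actuals  = np.array([120, 135, 150, 160, 155, 170, 180, 175, 165, 150, 140, 130])
forecast = np.array([118, 130, 148, 165, 150, 172, 178, 170, 168, 148, 142, 128])

aap = alphabet_profile(actuals)    # actual alphabet profile (AAP)
fap = alphabet_profile(forecast)   # forecast alphabet profile (FAP)
```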

Step 1. The profile accuracy is measured by a 'distance' metric between a Forecast Alphabet Profile (FAP) and an Actual Alphabet Profile (AAP).

An accuracy measure for a forecast alphabet profile is given by the Kullback-Leibler divergence D(a|f) = Σ aᵢ ln(aᵢ/fᵢ), summed over the periods of the profile. It can be interpreted as a measure of ignorance or uncertainty about profile accuracy (the closer to zero, the better), which is what we are interested in when assessing lead-time forecasting performance.

When D(a|f) = 0, the alphabet profiles overlap exactly, which is regarded as 100% accuracy.
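
Continuing the sketch above (same hypothetical aap and fap arrays, numpy already imported), the divergence calculation is a one-liner; natural logarithms are assumed here, since the choice of log base only rescales the measure:

```python
def profile_divergence(aap, fap):
    """Kullback-Leibler divergence D(a|f) between an actual alphabet
    profile and a forecast alphabet profile. D(a|f) = 0 when the two
    profiles overlap exactly; larger values mean a less accurate profile.
    Assumes strictly positive encoded values."""
    aap = np.asarray(aap, dtype=float)
    fap = np.asarray(fap, dtype=float)
    return float(np.sum(aap * np.log(aap / fap)))

d_af = profile_divergence(aap, fap)   # the closer to zero, the better
```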

Step 2. Evaluating the effectiveness of a Method compared to a benchmark method. The eminent statistician George Box (1919 - 2013) long ago warned us that "All models are wrong, some are useful", but never defined useful. What, then, are the useful methods? A measure of effectiveness for profile forecasts is needed and can be defined with a proper skill score:

L-Skill Score = 1 – [D(a|Method)/D(a|Benchmark)]

As a proper skill score, the Levenbach L-Skill score is an appropriate measure of Method effectiveness. When the L-Skill score is negative, a Method is not effective. Its range is -∞ < L-Skill score ≤ +1.

A forecasting Method with a positive L-Skill score for a series is doing the right thing. The maximum achievable L-Skill score is +1.
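
A sketch of the skill-score calculation, reusing the profile_divergence function defined above; method_fap and benchmark_fap stand for the encoded forecast profiles of the Method under evaluation and of a benchmark (e.g., a Naive forecast):

```python
def l_skill_score(aap, method_fap, benchmark_fap):
    """L-Skill Score = 1 - D(a|Method)/D(a|Benchmark).
    Positive: the Method improves on the benchmark (effective).
    Negative: the Method is not effective. Maximum achievable: +1."""
    d_method    = profile_divergence(aap, method_fap)
    d_benchmark = profile_divergence(aap, benchmark_fap)
    return 1.0 - d_method / d_benchmark
```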

Takeaways

  • The M3 Naive_2 method is not effective as a benchmark for evaluating effectiveness. It is the least accurate, even compared to Naive_LT-2.
  • Among the ten methods selected from the M3 competition, several have similar trend/seasonal profiles (see the Pegels diagram below), perhaps with different fitting-algorithm implementations (starting values, optimization criteria, parameter estimates, etc.), yet they are not similar in most performance measures.

[Image: Pegels classification diagram of trend/seasonal profiles]

  • The M3 DAMPEN method used the same Gardner-McKenzie algorithm as the PP-Autocast entry, which I used in the competition. M3 DAMPEN has a completely different sMAPE pattern than M3 HOLT (seasonalized linear trend) or M3 HOLT-WINTERS. This is reflected in the Profile Miss/D(a|f) indicators over the hold-out time horizon.
  • The Naive_2 method, as expected, is the least effective among the M3 competition methods. However, it is intended to serve as a benchmark method. The sMAPE skill scores, along with those using the Naive_1 method as the benchmark, are shown in the sMAPE skill score box plots below.

[Image: sMAPE skill score box plots with Naive_1 and Naive_2 as benchmarks]

  • Data Quality May Be More Important than Data Quantity. Why does the M3 DAMPEN method, with exactly the same algorithm as PP-Autocast, appear ineffective? Comparing sMAPE results for N2796 among Methods may give some insight (the symmetric APE calculation is sketched below). The sMAPE patterns for most of the effective Methods look like the 'symmetric APEs' for the highly ranked Method 10 (THETA). The DAMPEN method, on the other hand, differs noticeably at points 6, 7, and 18.
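
For reference, the symmetric APE sketched here is the common M3-competition form on a 0-200% scale (numpy imported earlier); whether the plots in this article use exactly this variant is my assumption:

```python
def symmetric_ape(actual, forecast):
    """Per-period symmetric absolute percentage errors, 200*|A-F|/(A+F),
    as commonly defined for the M3 competition. sMAPE is the mean of
    these values over the hold-out horizon."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 200.0 * np.abs(actual - forecast) / (np.abs(actual) + np.abs(forecast))
```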


Hans Levenbach, PhD is Owner/CEO of Delphus, Inc. and Executive Director, CPDF Professional Development Training and Certification Programs.


Dr. Hans is the author of a forecasting book (Change&Chance Embraced), recently updated with the LZI method for intermittent demand forecasting in the supply chain.


With endorsement from the International Institute of Forecasters, he created the first certification curriculum for demand forecasters (CPDF) and has conducted numerous hands-on Professional Development Workshops for Demand Planners and Operations Managers in multi-national supply chain companies worldwide.


The 2021 CPDF Workshop manual is available for self-study, online workshops, or in-house professional development courses.

Hans is a Past President and Treasurer, and a former member of the Board of Directors of the International Institute of Forecasters.

He is Owner/Manager of these LinkedIn groups:

(1) Demand Forecaster Training and Certification, Blended Learning, Predictive Visualization, and

(2) New Product Forecasting and Innovation Planning, Cognitive Modeling, Predictive Visualization.

I invite you to join these groups and share your thoughts and practical experiences with demand data quality and demand forecasting performance in the supply chain. Feel free to send me the details of your findings, including the underlying data (without proprietary identifying descriptions). If possible, I will attempt an independent analysis and see if we can collaborate on something that will be beneficial to everyone.
