Getting From Confusion To Clarity On Forecast Quality
This article is a section from my upcoming book "An Introduction to Probabilistic Planning and Forecasting", edited for the current context. Readers are invited to comment here. Any comments provided may be used to improve the text of the book; if used, credit will be given, with permission.
In the previous LinkedIn article, the difference between accuracy and numerical precision was explained as it applies to time-series forecasting. That article is a prerequisite for a proper understanding of the current one.

In the previous section, two important differences between accuracy and precision were covered. First, accuracy expresses the distance between forecast and actual values, whilst precision expresses the distance within the forecast itself. This leads naturally to the second difference: precision can be determined at the time of forecasting, whilst accuracy can only be determined after the actual values have been recorded. This section will delve a little deeper into these differences, add a few more important ways in which accuracy and precision complement each other, and show how combining them leads to confusing results.
When we create a forecast, we typically express it in one of three ways:
The values, position, and shape of the distributions can be compared to actual values to determine accuracy. This tells us how good or bad our forecast was. The value range of the distributions determines the precision and expresses our confidence in the forecast. Precision is thus a measure of certainty, or conversely of uncertainty. The more confident we are, the narrower the distribution, and the less we need to buffer. This makes precision the dominant factor in determining safety stock levels, lead times, and reserved capacity. Like precision, these buffers are set before actual demand and requirements are known.

When our level of confidence is understated, in other words when we are overly cautious, the buffers will be larger than needed. This comes at a cost to the business, making it less efficient at serving the customer. The opposite holds when we are overly confident: the buffers will be too small, resulting in problems such as stock-outs, insufficient lead time to supply, or lack of capacity to produce to order. This generally leads to even larger costs from expediting, inter-depot transfers, or sourcing from expensive suppliers. More detrimental still is the loss of service and sales revenue, which, if chronic, will lead to loss of customers and market share in the long run. These all indicate that our forecast was wrong, possibly too narrow or too wide; in other words, inaccurate.

Thus precision impacts upfront costs, whilst accuracy impacts the cost of the response and the missed revenue after the fact, if we got it wrong upfront. Stated differently, precision directly impacts the efficiency of the supply chain, whilst accuracy directly impacts its stability; each impacts the other indirectly. In a perfect world, our accuracy and our confidence in that accuracy are in alignment.
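The link between precision and buffer sizes can be sketched in code. The snippet below is a minimal illustration, not a method from the book: it assumes forecast uncertainty is approximately normally distributed and uses the classic safety factor approach, where the buffer is a multiple of the forecast distribution's standard deviation. The function name and parameters are my own for illustration.

```python
from statistics import NormalDist

def safety_stock(sigma: float, service_level: float = 0.95) -> float:
    """Buffer needed to hit a target service level, assuming forecast
    uncertainty is approximately normally distributed.

    sigma is the standard deviation of the forecast distribution, i.e.
    its width: the less precise (less confident) the forecast, the
    larger sigma, and the larger the buffer.
    """
    z = NormalDist().inv_cdf(service_level)  # safety factor for the target
    return z * sigma

# A more confident (narrower) forecast needs a smaller buffer:
confident = safety_stock(sigma=20)  # narrow distribution
cautious = safety_stock(sigma=80)   # wide distribution
assert confident < cautious
```

Note that nothing in this calculation uses actual demand: the buffer is driven entirely by the width of the forecast distribution, which is exactly why precision, unlike accuracy, can act on costs before actuals are known.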
A number of important observations can be made based on the above:
*) One minor footnote on the lack of precision of judgmental forecasts. If a judgmental forecast is expressed as a large enough number with only a few significant digits, one could assume a range of uncertainty around the last significant digit, from one decimal lower to one decimal higher, that could be considered to drive buffers. This so-called arithmetic precision, however, is both insignificant compared to the other aspects of precision and impractical in use: it only allows expression in multiples of discrete powers of ten, and no judgmental forecaster ever considers the impact of significant digits on the supply chain.
One last, very important difference between accuracy and precision is that the scale of accuracy can be determined, whilst the scale of precision cannot. This means that for accuracy it is possible to determine both a best case and a worst case. Any level of accuracy recorded can then be measured on that scale and judged. How much improvement can be made is known, and this can be used to focus improvement efforts in the right place. A very helpful side effect of this known scale is that it can be rescaled to a relative one ranging from 0% to 100%, where 0% error, or 100% accuracy, is truly achievable. Any such relative accuracy metric would be not only very useful but also very intuitive. On the other hand, whilst the level of precision can be determined, it is unknown how much better it could be. This makes it impossible to express precision on any relative scale where both ends make intuitive sense, and impossible to know how much it can be improved.
The reason we cannot determine the scale of precision is that it expresses uncertainty, and uncertainty depends on how much signal we can isolate from the historical patterns. No matter how much more causal information we find and include, there will always be some residual noise. Much of that noise is out of our control: think of competitor promotions, extreme weather events, product recalls, random consumer behavior, and so forth. We know they exist and that they will have an impact; we just do not know when, where, and how significant any future case will be. Apart from the fact that we do not know all the contributing factors and their impact on the signal, there is also a case of diminishing returns. Even if we could get our hands on more data to increase precision, it may be cost-prohibitive or more effort than it is worth, and we cannot estimate the potential impact until after the cost and effort have been spent. Accuracy, however, is the difference between a forecasted distribution and the actual distribution. Both are known once the actual values are recorded. The best case is when the forecast distribution matches the actual. The worst case is when the forecast distribution and the actual distribution are completely disjoint. With both extremes known, the whole scale is known. An example accuracy metric that exploits this behavior is the Total Percentile Error (TPE), covered in section 13.2. [Also available here on LinkedIn; a newer peer-reviewed version appeared in the Summer 2017 issue of Foresight Magazine, and the Dropbox link also includes working examples in MS Excel.]
Commonly used metrics in the time-series forecasting domain tend to make two simplifications that prevent them from exposing the benefits of precision and accuracy listed above. As explained in the previous section [this article on LinkedIn], they tend to measure only the error relative to a single point of the forecast, rather than the entire distribution. But they also tend to combine precision and accuracy into a single metric. The impact of this latter treatment is shown in figure 6.18.
Figure 6.18: The loss of complementary benefits of accuracy and precision when combined
Metrics that blend accuracy and precision suffer from the worst characteristics of both. They lose precision's benefit of being known at the time of forecasting. They lose accuracy's benefit of being expressible on a known scale. They lose accuracy's impact on supply chain stability, and with it the correlation to the biggest contributing factor to business value. They retain part, but not all, of the value of impacting supply chain efficiency. To sum up: these metrics are unknown until it is too late, expressed on a scale that is not intuitive, with no correlation to business value. This makes combined metrics poor choices for measuring the goodness of a forecast or judging value added. They are, however, very useful for pinpointing specific issues, as mentioned in the previous section and further explored in section 13.2, where a rigorous forecast quality framework is introduced. The metrics in this category include all the commonly used metrics that measure only the error of the point forecast, including MAPE, WMAPE, MSE, RMSE, MAD, MAE, MASE, and FVA. Many of these are further explored in section 13.2, and Appendix B shows a near-exhaustive list of forecast quality metrics, classified and ranked based on their qualities and benchmark performance.
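The scale problem of point-forecast metrics can be seen with the standard definition of MAPE. The sketch below uses the textbook formula; note that it can only be computed once actuals arrive, and that while its best case (0%) is known, it has no upper bound, so a given MAPE value has no intuitive position on a scale.

```python
def mape(forecast: list, actual: list) -> float:
    """Mean Absolute Percentage Error, the textbook point-forecast metric.

    Computable only after actuals are recorded, and unbounded above:
    0% is a known best case, but there is no known worst case, so the
    scale has no intuitive upper end.
    """
    return sum(abs(f - a) / abs(a) for f, a in zip(forecast, actual)) / len(actual) * 100

mape([100, 100], [100, 100])  # 0.0   -- the best case is known...
mape([100, 100], [10, 10])    # 900.0 -- ...but errors can exceed 100% without limit
```

This is one concrete way a blended metric inherits the worst of both worlds: like accuracy it is unknown until after the fact, yet like precision it sits on a scale with no meaningful endpoint to judge it against.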
This excerpt of the book explained how accuracy and precision, when properly defined, complement each other, allowing each to provide clear, actionable information. Traditional error metrics combine accuracy and precision, producing results that are confusing, uncorrelated to business value, and conducive to poor business decisions. In following articles I will present excerpts from the book that cover:
If you are interested in probabilistic planning and forecasting, please consider joining the "Probabilistic Supply Chain Planning" group here on LinkedIn.
Find all my articles by category here, along with a listing of outstanding articles by other authors.