
Getting From Confusion To Clarity On Forecast Quality

This article is a section from my upcoming book "An Introduction to Probabilistic Planning and Forecasting", edited for the current context. Readers are invited to comment here. Any comments provided may be used to improve the text in the book and, if used, credit will be given with permission.

In the previous LinkedIn article, the difference between accuracy and numerical precision was explained, as applied to time-series forecasting. That article is a prerequisite for a proper understanding of the current one.

In the previous section, two important differences between accuracy and precision were covered. First, accuracy expresses the distance between forecast and actual values, whilst precision expresses the distance within the forecast itself. This automatically leads to the second difference, which is that precision can be determined at the time of forecasting, whilst accuracy can only be determined after the actual values have been recorded. This section will delve a little deeper into these and add a few more important ways in which they complement each other, as well as how combining them leads to confusing results.

When we create a forecast, we typically express it in one of three ways:

  • As a series of probability distributions (probabilistic)
  • As a series of values, plus one static residual distribution (e.g. statistical)
  • As a series of values but no residual distribution (e.g. judgmental)

The values, position, and shape of the distributions can be compared to actual values to determine accuracy. This tells us how good or bad our forecast was. The value range of the distributions determines the precision and expresses our confidence in the forecast. Precision is thus a measure of certainty, or the absence of uncertainty. The more confident we are in our forecast, the narrower the distribution, and the less we need to buffer. This makes precision the dominant factor in determining safety stock levels, lead times, and reserved capacity. Like precision, these buffers are set before actual demand and requirements are known.

When our level of confidence is understated, in other words when we are overly cautious, the buffers will be larger than needed. This comes at a cost to the business, making it less efficient to serve the customer. The opposite, when we are overly confident, means the buffers will be too small, resulting in problems such as stock-outs, insufficient lead time to supply, or a lack of capacity to produce to order. This generally leads to even larger costs from expediting, inter-depot transfers, or sourcing from expensive suppliers. More detrimental still is the loss of service and sales revenue, which if chronic will lead to loss of customers and market share in the long run. These problems all indicate that our forecast was wrong, possibly too narrow or too wide, in other words inaccurate.

Thus precision impacts upfront costs, whilst accuracy impacts the cost of the response and the missed revenue after the fact, if we got it wrong upfront. Stated differently, precision directly impacts the efficiency of the supply chain, whilst accuracy directly impacts its stability. Each impacts the other indirectly. In a perfect world, our accuracy and our confidence in that accuracy are in alignment.
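To make the link between precision and buffer levels concrete, here is a minimal sketch in Python. It assumes, purely for illustration, a normally distributed demand forecast and a 95% cycle service level; the numbers and the choice of distribution are hypothetical, not a recommendation from the book.

```python
# Minimal illustration: how the width (precision) of a forecast distribution
# drives the size of a buffer such as safety stock.
# Assumptions (not from the book): normally distributed demand, 95% service level.
from scipy.stats import norm

service_level = 0.95          # target probability of not stocking out in the period
z = norm.ppf(service_level)   # ~1.645 for a 95% service level

mean_demand = 1000            # same point forecast in both cases (hypothetical units)

for sigma in (50, 200):       # narrow (confident) vs wide (uncertain) forecast
    safety_stock = z * sigma  # buffer needed on top of the point forecast
    print(f"std dev {sigma:>4} -> safety stock {safety_stock:6.1f}")

# The point forecast is identical in both cases; only the stated uncertainty
# differs, and the buffer scales directly with it.
```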

A number of important observations can be made based on the above:

  1. Judgmental forecasts express absolute certainty *). This is not only highly unrealistic, it also means they cannot be used to determine buffer levels on their own. If they are used to determine buffer levels, some measure of uncertainty must be calculated, possibly through measurement of the historical accuracy of past judgments (a rough sketch of this idea follows after the footnote below). Without some scientifically sound means of doing so, any buffers based on judgmental forecasts will lead to extreme inefficiency, extreme instability, or both.
  2. Statistical forecasts express uncertainty through their residual distributions. However, few companies measure the accuracy of those distributions, and thus never focus any attention on improving them. This makes using residual distributions to determine buffer levels fraught with risk. The alternative, using historical accuracy, is also problematic because there is no guarantee that the current forecast uses the same algorithm as the historical forecasts. The only way to address the root cause is to start measuring and improving the accuracy of the residual distribution.
  3. Last but not least, if you know the biggest pain points in your supply chain you can determine whether you need to focus on improving precision or accuracy for the biggest impact. If your inventory is ballooning out of control, focus on increasing precision. If your service levels are too low, focus on increasing accuracy.

*) One minor footnote on the lack of precision of judgmental forecasts: if a judgmental forecast is expressed as a large enough number with only a few significant digits, one could assume a range of uncertainty around the last significant digit, from one unit lower to one unit higher, that could be used to drive buffers. This so-called arithmetic precision, however, is both insignificant compared to the other aspects of precision and impractical in use: it only allows expression in multiples of discrete powers of ten, and no judgmental forecaster ever considers the impact of significant digits on the supply chain.
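As a rough sketch of how a measure of uncertainty could be attached to a point forecast, and how the resulting residual distribution could itself be checked, consider the Python fragment below. The data and the simple empirical-quantile approach are hypothetical illustrations of the ideas in items 1 and 2 above, not the method prescribed in the book.

```python
# Sketch: derive an empirical residual distribution from historical forecast errors,
# then check how well calibrated it is. All numbers are made up for illustration.
import numpy as np

rng = np.random.default_rng(42)

# Historical point forecasts and actuals (hypothetical).
hist_forecast = rng.normal(1000, 30, size=200)
hist_actual   = hist_forecast + rng.normal(0, 120, size=200)
residuals     = hist_actual - hist_forecast            # observed historical errors

# 1. Attach uncertainty to a new point forecast using empirical error quantiles.
new_point_forecast = 950
p10, p90 = np.quantile(residuals, [0.10, 0.90])
print(f"80% interval: [{new_point_forecast + p10:.0f}, {new_point_forecast + p90:.0f}]")

# 2. Calibration check: the 90th percentile of the residual distribution should
#    cover roughly 90% of actuals. Checked in-sample here purely for brevity;
#    in practice this would be done on a holdout period.
coverage = np.mean(hist_actual <= hist_forecast + p90)
print(f"observed coverage of the 90th percentile: {coverage:.0%}  (target: 90%)")
```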

One last very important difference between accuracy and precision is that the scale of accuracy can be determined, whilst the scale of precision cannot. This means that for accuracy it is possible to determine both a best-case and a worst-case. Any level of accuracy recorded can then be measured on that scale and judged. How much improvement can be made is known, and can be used to focus improvement efforts in the right place. A very helpful side effect of this known scale is that it can be rescaled to a relative one ranging from 0% to 100%, where 0% error or 100% accuracy is truly achievable. Any such relative accuracy metric would not only be very useful, but also very intuitive. On the other hand, whilst the level of precision can be determined, it is unknown how much better it could be. This makes it impossible to express precision on any relative scale where both ends make intuitive sense. And it makes it impossible to know how much it can be improved.

The reason we cannot determine the scale of precision is that it expresses uncertainty. And uncertainty is based on how much signal we can isolate from the historical patterns. No matter how much more causal information we can find and include, there will always be some residual noise. Much of that noise is out of our control. Think of competitor promotions, extreme weather events, product recalls, random consumer behavior, and so forth. We know they exist and that they will have an impact. We just do not know when, where, and how significant any future case will be. Apart from the fact that we do not know all the various contributing factors and their impact on the signal, there is also a case of diminishing returns. Even if we could get our hands on more data to increase precision it may be cost-prohibitive or more effort than it is worth. And we cannot estimate the potential impact until after the cost and effort have been spent. Accuracy, however, is the difference between a forecasted distribution and the actual distribution. Both are known once the actual values are recorded. The best case is when the forecast distribution matches the actual. The worst case is when the forecast distribution and the actual distribution are completely disjoint. With both extremes known, the whole scale is known. An example accuracy metric that exploits this behavior is the Total Percentile Error (TPE) covered in section 13.2. [also here on LinkedIn, and a newer peer-reviewed version in the summer 2017 issue of Foresight Magazine, link to Dropbox also includes working examples in MS Excel.]
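To illustrate that a distribution-to-distribution error can live on a known 0% to 100% scale, the sketch below computes a standard total variation distance between a forecast histogram and an actual histogram. This is not the Total Percentile Error from section 13.2, merely a simple, well-known measure with the same boundary behaviour: 0% when the two distributions match exactly and 100% when they are completely disjoint.

```python
# Sketch: a distribution-level error with a known scale.
# Total variation distance between two discretised distributions:
#   0.0 when they are identical, 1.0 when their supports do not overlap at all.
# (Illustration only; this is not the book's Total Percentile Error.)
import numpy as np

def total_variation(p: np.ndarray, q: np.ndarray) -> float:
    """Both inputs are probability vectors over the same bins (each sums to 1)."""
    return 0.5 * np.abs(p - q).sum()

forecast = np.array([0.05, 0.20, 0.50, 0.20, 0.05, 0.00])   # hypothetical demand buckets

perfect  = forecast.copy()                                   # actuals match the forecast exactly
shifted  = np.array([0.00, 0.10, 0.40, 0.30, 0.15, 0.05])    # partial overlap
disjoint = np.array([0.00, 0.00, 0.00, 0.00, 0.00, 1.00])    # no overlap with forecast mass

for name, actual in [("perfect", perfect), ("shifted", shifted), ("disjoint", disjoint)]:
    print(f"{name:>8}: error = {total_variation(forecast, actual):.0%}")
# perfect -> 0%, disjoint -> 100%, everything else lands in between on a known scale.
```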

Commonly used metrics in the time-series forecasting domain tend to make two simplifications that prevent them from exposing the benefits of precision and accuracy listed above. As explained in the previous section [this article on LinkedIn] they tend to measure only the error compared to a single point of the forecast, rather than the entire distribution. But they also tend to combine precision and accuracy into a single metric. The impact of this latter treatment is shown in figure 6.18.

[Image: table of the complementary benefits of accuracy and precision, showing how they cancel each other out when combined into a single metric]

Figure 6.18: The loss of complementary benefits of accuracy and precision when combined

Metrics that blend accuracy and precision suffer from the worst characteristics of both. They lose precision's benefit of being known at the time of forecasting. They lose accuracy's benefit of being expressible on a known scale. They lose accuracy's direct link to supply chain stability, and with it the correlation to the biggest contributing factor to business value. They retain part, but not all, of the link to supply chain efficiency. To sum it up, these metrics are unknown until it is too late, sit on a scale that is not intuitive, and have no correlation to business value. This makes them poor choices for measuring the goodness of a forecast or for judging value added. They are, however, very useful for pinpointing specific issues, as mentioned in the previous section and further explored in section 13.2, where a rigorous forecast quality framework is introduced. The metrics in this category include all the commonly used metrics that measure only the error of the point forecast, including MAPE, WMAPE, MSE, RMSE, MAD, MAE, MASE, and FVA. Many of these are further explored in section 13.2, and Appendix B provides a near-exhaustive list of forecast quality metrics, classified and ranked based on their qualities and benchmark performance.
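A small sketch makes the limitation concrete. Two forecasts with identical point values but very different stated uncertainty receive exactly the same score from a point metric such as MAPE, whilst a distribution-aware check separates them immediately. The numbers, the normal distribution, and the 90% interval are invented purely for illustration.

```python
# Sketch: point metrics cannot see precision.
# Two forecasts share the same point values but state very different uncertainty.
# MAPE (a point-error metric) scores them identically; a distribution-aware check does not.
# All numbers are hypothetical.
import numpy as np
from scipy.stats import norm

actual = np.array([100.0, 120.0, 90.0, 110.0])
points = np.array([105.0, 115.0, 100.0, 100.0])   # identical point values for forecasts A and B

mape = np.mean(np.abs(actual - points) / actual) * 100
print(f"MAPE, forecast A: {mape:.1f}%")
print(f"MAPE, forecast B: {mape:.1f}%   # identical: stated uncertainty never enters the metric")

# A distribution-aware view (coverage of a nominal 90% interval) does separate them:
for name, sigma in [("A (narrow)", 5.0), ("B (wide)", 60.0)]:
    half_width = norm.ppf(0.95) * sigma            # assumes a normal forecast distribution
    covered = np.mean(np.abs(actual - points) <= half_width)
    print(f"90% interval coverage, forecast {name}: {covered:.0%}")
```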

This excerpt of the book explained how accuracy and precision, when properly defined, complement each other, allowing each to provide clear, actionable information. Traditional error metrics combine accuracy and precision, leading to results that are confusing, are not correlated to business value, and drive poor business decisions. In following articles I will present excerpts from the book that cover:

If you are interested in probabilistic planning and forecasting, please consider joining the "Probabilistic Supply Chain Planning" group here on LinkedIn.

Find all my articles by category here. That page also lists outstanding articles by other authors.

