The LZI Method - When You Need Something More Realistic Than Croston Methods for Forecasting Intermittent Demand Data
Intermittent demand (ID), also known as sporadic demand, comes about when a product experiences several periods of zero demand. Often in these situations, when demand occurs it is small, and sometimes highly variable in size. There are a number of ways in which intermittent demand comes up in practice, for example, when creating lead time demand forecasts for inventory planning.
The New LZI Method for Forecasting Intermittent Demand
Before making any modeling assumptions for a forecasting problem, we should first thoroughly investigate the underlying patterns in the historical data within the context of a particular application. Data quality in forecast errors and other sources of unusual data should never be ignored in demand forecast modeling and accuracy measurement, especially when forecasting intermittent demand.
The spreadsheet table shows the intermittent demand series for a SKU and location in a population health forecasting application at a NYC area hospital from January 2015 to April 2018. We are asked to make a forecast through December 2019. Similar data may occur in the retail industry at doors (stores) in different regions of the country as a result of a disruptive event, like an earthquake or a pandemic.
The first step is to explore the nature of the inter-demand intervals and the distribution of non-zero demand sizes in relation to inter-demand interval durations. This step is missing from the Croston method, which assumes, for mathematical convenience, that intervals and demand sizes are independent. Starting from Feb 2015, we examine the dependence relationship between intervals and demand size.
A Lag-time Zero Interval (LZI) is defined as the interval duration preceding a demand size. The results are shown below. In this spreadsheet, there are three types of LZI intervals. For example, the first LZI interval has one zero preceding the demand size 211. The next LZI interval has two zeros preceding the demand value 458. The next one has none, and so on. If the process started in Jan 2015 with a new product introduction, for instance, we could add demand size 390 in the one-zero LZI bucket. However, this should not make a material difference in an ongoing intermittent demand forecasting process.
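The bucketing described above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet logic itself: the function name `lzi_buckets` is hypothetical, and the toy series reuses the demand sizes 390, 211 and 458 mentioned in the text while the remaining values (120 and 75) are made-up placeholders. The sketch also assumes that the first nonzero demand is skipped, because the interval preceding it is not observed in the history.

```python
from collections import defaultdict


def lzi_buckets(demand):
    """Group nonzero demand sizes by the number of zero periods just before them.

    Assumption: the first nonzero demand is skipped, since the length of the
    zero interval preceding it is not fully observed in the history.
    """
    buckets = defaultdict(list)
    zeros = 0            # length of the current run of zero-demand periods
    first_seen = False   # becomes True once the first nonzero demand is passed
    for d in demand:
        if d == 0:
            zeros += 1
            continue
        if first_seen:
            buckets[zeros].append(d)
        first_seen = True
        zeros = 0        # a nonzero demand closes the current interval
    return dict(buckets)


# Toy series: 390, 211 and 458 come from the text; 120 and 75 are placeholders.
toy_series = [390, 0, 211, 0, 0, 458, 120, 0, 75]
print(lzi_buckets(toy_series))
```

Each key is an LZI duration (number of preceding zeros) and each value is the list of demand sizes observed after an interval of that length, which is the raw material for studying how demand size depends on interval duration.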
Treating the data in this way differs from a Croston-based method in that the LZI approach does not assume independence between intervals and demand sizes.
A Structured Inference Base (SIB) Model for Forecasting Intermittent Demand
Similar to my previous LinkedIn article on forecast error measurement in forecasting, I now introduce a SIB algorithmic model for forecasting intermittent demand sizes based on their dependence on an LZI distribution.
A location-scale measurement error model can be viewed as a black box model. The nonzero demand size ID in an intermittent demand history can be represented as the output ID = β + σ ε, in which β and σ are unknown constants and the input ε is a measurement error with a known or assumed distribution. This equation can be rewritten as ID = β (1 + (σ/β) ε), which is a scale measurement error model.
If Ln refers to the natural logarithm, then the logarithm of the demand sizes can be turned into a location measurement error model, known as a simple error measurement model. The location parameter is the unknown constant Ln β in the equation Ln ID = Ln β + {Ln (1 + (σ/β) ε)}. The term inside the curly brackets represents the measurement error in this model. Thus, Ln ID = β* + ε* is a SIB location error measurement model with error ε* = Ln (1 + λ ε) and location parameter β* = Ln β. We will call σ/β a shape parameter λ for the ε* distribution.
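As a quick numerical check of this rewriting, the sketch below confirms that Ln (β + σ ε) equals Ln β + Ln (1 + λ ε) with λ = σ/β. The function name `log_location_form` and the values β = 200, σ = 50 are hypothetical, chosen only for illustration. Note that the logarithm requires β > 0 and 1 + λ ε > 0, which holds whenever the demand size is positive.

```python
import math


def log_location_form(beta, sigma, eps):
    """Return (beta_star, eps_star) so that Ln(beta + sigma*eps) = beta_star + eps_star."""
    if beta <= 0:
        raise ValueError("beta must be positive for Ln(beta) to exist")
    lam = sigma / beta                  # shape parameter lambda = sigma / beta
    if 1 + lam * eps <= 0:
        raise ValueError("demand size must be positive to take its logarithm")
    beta_star = math.log(beta)          # location parameter of the log model
    eps_star = math.log(1 + lam * eps)  # transformed measurement error
    return beta_star, eps_star


# Hypothetical values, for illustration only.
beta, sigma = 200.0, 50.0
for eps in (-1.5, 0.0, 0.8, 2.0):
    demand = beta + sigma * eps
    beta_star, eps_star = log_location_form(beta, sigma, eps)
    assert math.isclose(math.log(demand), beta_star + eps_star)
```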
In practice, when there are more inter-demand interval durations, the error variable ε* may need to be represented by a multivariate measurement error for each possible LZI duration in the data, but for now we will assume ε*(τ) will depend on a typical LZI, represented by a single constant τ.
Keeping in mind the pervasive presence of outliers and non-normal variation in the real-world environment, I will proceed without the conventional normal (Gaussian) distribution assumptions. Rather, we will assume a flexible family of distributions for ε*, known as the exponential family. It contains many familiar distributions including the normal (Gaussian) distribution, as well as ones with thicker tails and skewness. There are also some technical reasons for selecting the exponential family, besides its flexibility.
The SIB Model Approach: Phase I - The Data Reduction Step
The SIB approach is algorithmic and data-driven, in contrast to a data-generating model with conventional normality assumptions. The location measurement model is known as a simple model because of its simple structure.
This simple measurement model Ln ID = β* + ε*(τ, λ) shows that the output Ln ID results from a translation of an input measurement error ε*(τ, λ) shifted by a constant amount β*, in which a conditioned measurement error distribution depends on a fixed shape parameter λ and a typical lagtime inter-demand interval τ.
The simple measurement model and its generalizations were worked out over four decades ago and can be found in a 1976 book by D.A.S. Fraser, entitled Inference and Linear Models, and also in a number of Fraser’s academic journal articles dealing with statistical inference and likelihood methods.
Statistical inference refers to the theory, methods, and practice of forming judgments about the parameters of a population and the reliability of statistical relationships.
Starting with the SIB location model, we can analyze the model for demand size ID as follows:
Ln ID_i = β* + ε*_i(τ, λ), for i = 1, 2, ..., n,
where ε*(τ, λ) = {ε*_1(τ, λ), ε*_2(τ, λ), ε*_3(τ, λ), ..., ε*_n(τ, λ)} are now n realizations of measurement errors from an assumed distribution in the exponential family with fixed shape parameter λ and lagtime inter-demand interval τ. What information does the black box reveal about the process? This is where it gets interesting, and may perhaps appear somewhat unfamiliar for those who have been through a statistics course on inferential methods.
If you could have a data detective explore the innards of the black box, you could discover that, based on the data, there is now information about the unknown but realized measurement errors ε*(τ, λ) that leads to a reduction in the dimensionality of the data. This important observation will guide us to the next noteworthy SIB modeling step: a decomposition of the measurement error distribution into two components: (1) a marginal distribution for the observed components with fixed (τ, λ) and (2) a conditional distribution (given the observed components) for a reduced unknown measurement error distribution that does not depend on (τ, λ). So, what are these observed components of the error distribution that we should reveal?
This is the essential insight gleaned from the structure of the SIB model and the observed data. When we select a location measure like the arithmetic mean, median, or smallest value (first order statistic) for location, we can make a calculation that yields observable data about the measurement process. Let’s call this location measure m(ID*), where ID* = Ln ID. Then, with substitution and some simple manipulations, the SIB model becomes (leaving out the (τ, λ) notation):
ID*_i - m(ID*) = ε*_i - m(ε*), for i = 1, 2, ..., n
The modeling step is to note that the left-hand side of each equation can be calculated from the data, so the right-hand side is new information about a realized measurement error ε* and its distribution.
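The reason the unknown location drops out is that the mean, the median and the smallest value all shift by β* when every observation is shifted by β*, that is, m(β* + ε*) = β* + m(ε*). The sketch below checks this numerically. The helper name `residuals`, the simulated Gaussian errors, and the two values of β* are assumptions made only for the demonstration; in practice the errors are never observed directly.

```python
import random
import statistics


def residuals(values, m=statistics.mean):
    """Deviations of each value from a location measure m (mean, median, or min)."""
    loc = m(values)
    return [v - loc for v in values]


rng = random.Random(0)
# Simulated errors, for illustration only (unknown in a real application).
eps_star = [rng.gauss(0.0, 1.0) for _ in range(6)]

for beta_star in (2.0, 5.5):                        # hypothetical location values
    log_demand = [beta_star + e for e in eps_star]  # ID* = beta* + eps*
    for m in (statistics.mean, statistics.median, min):
        observed = residuals(log_demand, m)         # computable from the data
        hidden = residuals(eps_star, m)             # involves the errors only
        assert all(abs(a - b) < 1e-9 for a, b in zip(observed, hidden))
```

Whatever the value of β*, the deviations computed from the data coincide with the deviations of the errors, so they carry information about the error process alone.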
What is known we can condition on. This results in a one-dimensional conditional distribution, given the known error component, and an (n-1)-dimensional marginal distribution for this error component with fixed shape parameter λ and lagtime inter-demand interval τ.
We cannot proceed further analytically at this point, but in today's computing environment, data science (the convergence of statistical learning algorithms and computer algorithms) has made it doable with Markov chain Monte Carlo (MCMC) sampling.
This is somewhat like what a gambler can do knowing the odds in a blackjack game. You can make calculations and inferences from what you observe in the dealt cards. We do the same thing here with our “inference game” black box.
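To give a feel for the MCMC step, here is a minimal sketch rather than the SIB implementation itself. It relies on the standard result for a location model that, given the observed deviations d_i = ID*_i - m(ID*), the unobserved error location t = m(ε*) has conditional density proportional to the product of f(d_i + t) over i, where f is the assumed error density. A generic random-walk Metropolis sampler draws t, and each draw is converted to a value of β* = m(ID*) - t. The function names, the Laplace and Gaussian densities standing in for an exponential family member, the tuning constants, and the five log demand values are all assumptions for illustration, and the error density is treated as fully known (fixed λ and τ).

```python
import math
import random
import statistics


def log_normal_density(e):
    """Log of a standard Gaussian density, up to an additive constant."""
    return -0.5 * e * e


def log_laplace_density(e):
    """Log of a standard Laplace density (thicker tails), up to a constant."""
    return -abs(e)


def sample_location(log_demand, log_f, n_draws=20000, burn_in=2000, step=0.5, seed=0):
    """Random-walk Metropolis draws of beta* in the model ID* = beta* + eps*.

    Conditions on the observed deviations d_i = ID*_i - mean(ID*) and samples
    t = mean(eps*) from a density proportional to prod_i f(d_i + t).
    """
    rng = random.Random(seed)
    m_obs = statistics.mean(log_demand)
    d = [y - m_obs for y in log_demand]      # observed error component

    def log_target(t):
        return sum(log_f(di + t) for di in d)

    t, lp = 0.0, log_target(0.0)
    draws = []
    for i in range(n_draws + burn_in):
        proposal = t + rng.gauss(0.0, step)  # symmetric random-walk proposal
        lp_new = log_target(proposal)
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            t, lp = proposal, lp_new         # accept the proposed move
        if i >= burn_in:
            draws.append(m_obs - t)          # beta* = m(ID*) - t
    return draws


# Hypothetical log demand sizes for a single LZI bucket.
log_sizes = [5.1, 5.4, 6.0, 5.7, 5.3]
draws = sample_location(log_sizes, log_laplace_density)
print(statistics.mean(draws), statistics.stdev(draws))
```

With the Gaussian density the answer is known in closed form (β* is centered at the sample mean with standard deviation 1/√n), which makes it a convenient sanity check before switching to a thicker-tailed density, where no tractable formula is available.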
Takeaway
The SIB model is well documented in two of D.A.S. Fraser’s books along with his peer-reviewed academic journal articles in statistical inference and likelihood methods. Because the end results do not lend themselves to tractable theoretical formulae (except in the case of the normal (Gaussian) distribution), they have not seen much daylight in practice until data science came into its own. (I was trained as a data scientist early in my career, but then it was called applied statistics.) Nowadays, inferential modeling can be dealt with in an empirically rich and fast computing environment. In other words, SIB modeling is a practical approach that was not possible in my grad school days.
With modern computing power, we can now begin to show what we should be doing for intermittent demand forecasting, not just what we could do based on the mathematics of unrealistic normality assumptions.
Hans Levenbach, PhD is Owner/CEO of Delphus, Inc and Executive Director, CPDF Professional Development Training and Certification Programs.
Dr. Hans is the author of a forecasting book (Change&Chance Embraced) recently updated with the new LZI method for intermittent demand forecasting in the Supply Chain.
With endorsement from the International Institute of Forecasters (IIF), he created CPDF, the first IIF certification curriculum for the professional development of demand forecasters, and has conducted numerous hands-on Professional Development Workshops for Demand Planners and Operations Managers in multi-national supply chain companies worldwide.
Hans is an elected Fellow, Past President and former Treasurer, and former member of the Board of Directors of the International Institute of Forecasters.
The 2021 CPDF Workshop Manual is available for self-study, online workshops, or in-house professional development courses.
He is Owner/Manager of these LinkedIn groups: (1) Demand Forecaster Training and Certification, Blended Learning, Predictive Visualization, and (2) New Product Forecasting and Innovation Planning, Cognitive Modeling, Predictive Visualization.
I invite you to join these groups and share your thoughts and practical experiences with demand data quality and demand forecasting performance in the supply chain. Feel free to send me the details of your findings, including the underlying data without identifying proprietary descriptions. If possible, I will attempt an independent analysis and see if we can collaborate on something that will be beneficial to everyone.