Tuning a Statistical Forecast Part 2: Methodology
Forecast Tuning Methodology.
In this second article I will explain the methods that can be used to tune a statistical forecast. The first article in this series presented the various resource options for tuning a Statistical Forecast, and subsequent articles will cover the strategic elements, business decisions, prioritisation, impact and ability, and actual procedures.
The methodology groups the tuning activities in rough order of difficulty, from simple to complex. Tuning generally requires many of the methods to be applied, but this can be very difficult to achieve in a blended manner (that is, multiple methods at the same time). It is easier to pursue tuning by individual methods first, then blend various methods together as your understanding of tuning effectiveness grows.
The 9 Steps
The Methodology of Engine Tuning for accuracy consists of the following 9 steps:
1. History: Clean & adjust to create the best forecast. Difficulty Factor = Easy
Since most Statistical Forecasts use History (Orders, Shipments, Invoices etc.) as the source of the statistical calculations, the first and easiest tuning step of all is to adjust your history. Yes, this means that you are already tuning your forecast!
You should retain the original History for reference and then have adjustment data streams (absolute and/or percentage) that then feed into a Final History data stream. The Final History should show Actual History unless there are adjustments in which case use the absolute adjustments and then the percentage changes.
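The Final History rule above can be sketched in a few lines of Python. This is only an illustration: the function name and arguments are hypothetical, and it assumes that an absolute adjustment is an override value that replaces the actual (rather than a delta added to it), with the percentage change applied afterwards.

```python
from typing import Optional


def final_history(actual: float,
                  absolute_adj: Optional[float] = None,
                  percent_adj: Optional[float] = None) -> float:
    """Return the Final History value for one period.

    Assumption: absolute_adj overrides the actual value; percent_adj
    (for example -10 for minus ten percent) is applied after that.
    """
    # Start from the actual unless an absolute override exists.
    base = actual if absolute_adj is None else absolute_adj
    # Then apply the percentage change, if any.
    if percent_adj is not None:
        base = base * (1 + percent_adj / 100.0)
    return base
```

The original history is never overwritten here: the actual value stays in its own stream and the adjustments are kept separately, so every change remains traceable.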
Why would you want to perform changes? There are a large number of reasons why the history is either incorrect or inappropriate for the Statistical Forecast you would like to have. Some examples are:
2. Decision Process: Lifecycle / Segments / On or Off. Difficulty Factor = Easy
A 'Node' is the combination of Organisation, Product and Location (and could include other Dimensions). Node Processing is the decision of whether or not to forecast that node or combination. What does this actually mean? It means determining how your data should react inside your planning system. You control what gets forecasted and how.
Turning the nodes 'Off' means you don't want a Statistical Forecast to run and turning them 'On' means you do. Artificial Intelligence and Machine Learning Engines can automate some of this activity once you set parameters for the System to react to. Typical practical examples of why you might adjust the node settings are:
Node Processing can also be a crucial exception management pivot. Do you have Forecast for a Customer who is on hold? Do you have no Forecast where a Node has history and is set to active?
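The two exception questions above could be checked with something like the following sketch. The `Node` fields and the `node_exceptions` function are hypothetical names chosen for illustration; a real planning system will hold this data in its own structures.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    key: str               # e.g. "Org|Product|Location" (illustrative format)
    active: bool           # node processing switched 'On'
    on_hold: bool          # customer is on hold
    history_total: float   # total history over the review window
    forecast_total: float  # total forecast over the horizon


def node_exceptions(nodes: List[Node]) -> List[Tuple[str, str]]:
    """Return (node key, reason) pairs for the two checks in the text."""
    found = []
    for n in nodes:
        if n.on_hold and n.forecast_total > 0:
            found.append((n.key, "forecast exists for a customer on hold"))
        if n.active and n.history_total > 0 and n.forecast_total == 0:
            found.append((n.key, "active node with history but no forecast"))
    return found
```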
3. Hierarchy Levels: Edit to drive better usage. Difficulty Factor = Moderate
What level is your Forecast generated at? Some solutions are set at a particular level (say, Organisation, Item & Customer Channel) while others may use hierarchies flexibly using Automation and Machine Learning to select the 'right' level as the data demands.
Tuning the engine to create better forecasts will involve assessing these hierarchies for appropriateness. This approach may not be easy to perform (especially where data is fixed within an Integrated Solution) but it will be worth analysing to confirm if there is a problem or not. If there is a challenge, at least you are aware of it and it can be added to the list of future improvements.
A Flexible hierarchy solution should be analysed regularly and, indeed, used for exception management, since a forecast generated higher in the Forecast Tree indicates a lack of usable lower-level data. The higher a forecast is created, the less accurate the prediction will be, and these combinations should be reviewed using the other options in this list.
4. Engine & Models: In Engines or Plans. Difficulty Factor = Moderate
Solutions vary of course, but it should be possible to select and deselect the models used by your statistical engine. Time Series, Exponential, Intermittent and Regression Models will all create quite different forecasts from the same set of historical data.
Assess the different models that are available to determine the best model selections. Best Fit solutions will offer one model per combination, while more sophisticated systems will use Machine Learning to mix and match.
If there are conflicts with model settings, consider creating independent sets of data where the best models can be set per data segment. For example, you could build 2 engines: one for intermittent data and one for smooth or perhaps one for B2B and one for B2C.
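To make the Best Fit idea concrete, here is a minimal sketch that picks one model per combination by comparing errors on a holdout period. The three toy models, the holdout length, and the use of mean absolute error are all assumptions made for illustration; a commercial engine will have its own model library and selection logic.

```python
def naive(history, horizon):
    # Repeat the last observed value.
    return [history[-1]] * horizon


def moving_average(history, horizon, window=3):
    # Average of the most recent `window` periods.
    recent = history[-window:]
    return [sum(recent) / len(recent)] * horizon


def exp_smoothing(history, horizon, alpha=0.3):
    # Simple exponential smoothing with a flat forecast.
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


def mean_abs_error(actuals, forecasts):
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def best_fit(series, holdout=3):
    """Fit each model on the training part and score it on the holdout."""
    train, test = series[:-holdout], series[-holdout:]
    models = {
        "naive": naive,
        "moving_average": moving_average,
        "exp_smoothing": exp_smoothing,
    }
    errors = {name: mean_abs_error(test, fn(train, holdout))
              for name, fn in models.items()}
    # The model with the lowest holdout error wins for this combination.
    return min(errors, key=errors.get), errors
```

Running `best_fit` separately on each data segment (for example intermittent versus smooth) shows whether the segments really do favour different models before you commit to building separate engines.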
5. Engine Settings: Find the optimal settings. Difficulty Factor = Complex
Engine Parameters define how the models react to data in the system. Parameters will define the length of your forecast horizon, the significance applied by the engine to recent history, what to do with null history, the definition of the moving average and so on. There could be many hundreds of parameters.
Complex solutions should have their parameters properly assessed and tuned for project go live but how long ago was that? Extract the parameters, assess them and create a plan to change and validate.
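As a small illustration of "change and validate", the sketch below tests several values for a single parameter: the weight given to recent history in simple exponential smoothing (often called alpha). The function names, candidate values and error measure are assumptions for this example only; real engines expose many more parameters, and their names will differ.

```python
def one_step_error(history, alpha):
    """Mean absolute one-step-ahead error for simple exponential smoothing."""
    level = history[0]
    total = 0.0
    for value in history[1:]:
        # The forecast for this period is the level known before seeing it.
        total += abs(value - level)
        level = alpha * value + (1 - alpha) * level
    return total / (len(history) - 1)


def tune_alpha(history, candidates=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Score each candidate value and return the best one with all scores."""
    scored = {a: one_step_error(history, a) for a in candidates}
    best = min(scored, key=scored.get)
    return best, scored
```

Keeping the full `scored` dictionary, not just the winner, gives you the evidence needed to justify a parameter change and to validate it again later.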
6. Causals: Maintain to obtain better results. Difficulty Factor = Moderate
Causal Factors are elements that can be defined as having an effect upon demand and can be used to improve the accuracy of the forecast. Causals need to be defined, added into history and also loaded into the future. Broadly speaking, there are two types of causal factor:
Too many causals can create a lot of noise for statistical forecast engines to assess. As a starting point, fewer is better. Only include causals when you know they add value. Verify that the causal data is complete and as correct as possible. For example, if price change is a causal, validate that price data is not missing anywhere in the dataset. If Promotions exist, are post-promotional reviews conducted to validate and correct assumptions?
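A completeness check on causal data can be very simple. The sketch below lists the periods, in history or in the future, where a causal such as price has no value. The function name and the dictionary layout are hypothetical and only meant to show the idea.

```python
def missing_causal_periods(periods, causal_values):
    """Return the periods that have no value for the causal.

    periods: ordered list of period labels covering history and future.
    causal_values: dict mapping period label to a value (or None).
    """
    return [p for p in periods if causal_values.get(p) is None]


# Usage example with made-up data: price is missing for two periods.
periods = ["2023-01", "2023-02", "2023-03", "2023-04"]
price = {"2023-01": 9.99, "2023-02": None, "2023-04": 10.49}
gaps = missing_causal_periods(periods, price)
```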
7. Proport Function (Allocation & Aggregation). Difficulty Factor = Moderate
Proport defines how data is rolled up and down your demand planning hierarchies in the past and the future. An example of proportionality: if one item-location combination has four times as many sales as another, the former combination should receive four times as much of the statistical forecast. Examples of where proport can impact demand planning data:
Typically, historical data will use itself as the proport mechanism, but the Forecast can use itself, or previous approved forecast or last year or annual budget... Select the least compromising weighting method. Try to make adjustments as low as possible.
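The proportionality example (four times the sales, four times the forecast) can be sketched as a weighted split. The function name is hypothetical, and the equal-split fallback for combinations with no weight at all is an assumption of this sketch, not a rule from any particular system.

```python
def allocate(total_forecast, weights):
    """Split a higher-level forecast across combinations by weight.

    weights: dict mapping combination to its weighting value, for example
    historical sales, a previous approved forecast, last year or budget.
    """
    total_weight = sum(weights.values())
    if total_weight == 0:
        # Assumption: with no weights available, split equally.
        share = total_forecast / len(weights)
        return {key: share for key in weights}
    return {key: total_forecast * w / total_weight
            for key, w in weights.items()}


# Item-location A sold four times as much as B, so it receives
# four times as much of the 500-unit forecast.
split = allocate(500, {"A": 400, "B": 100})
```

Swapping the `weights` input between history, last year and budget is a quick way to compare weighting methods before choosing the least compromising one.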
8. Nodal: Refine individual combinations. Difficulty Factor = Complex
Nodal tuning is a term used to describe the maintenance of the previous 7 options per individual data intersection. This feature, if it is available to you, can transform forecast accuracy since each combination can be optimally tuned. The downside of this local tuning approach is that the cost of maintenance can be extremely high.
Nodal tuning should be used when it is proven that a particular set of data performs better with a unique combination of settings and where this data cannot be removed and managed in a separate plan.
9. Strategy & Procedures (Approach to Tuning) - Measure & Adjust
This step really should be the first one, but you need to know the impact and difficulty of the 8 steps before you can really set a strategy and procedure. A strategy defines the purpose of tuning (better accuracy, more trust, greater efficiency etc.) and the plan to achieve the strategic purpose. A procedure defines the way that the methods described above will be applied and assessed in order to deliver against the strategy. Some basic questions to ask:
I expect that existing Demand Planners will be able to work with Methods 1 & 2 immediately as these steps are naturally performed by planners, but are they formally captured and analysed for impact?
A good place to start engine tuning is to capture baseline data (settings and forecast results) and then to try and record changes applied and the results achieved. It will take some time before the correct spreadsheet structure and reporting mechanism will be found to manage procedures. Test it out before embarking on a more complex and thorough tuning journey.
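Before a spreadsheet structure settles, the baseline-and-changes idea can be prototyped in a few lines. The sketch below is an assumption-laden illustration: the class and field names are hypothetical, and the error measure used (weighted absolute percentage error, WAPE) is just one common choice, not one prescribed by this article.

```python
from dataclasses import dataclass, field


def wape(actuals, forecasts):
    """Weighted absolute percentage error: total abs error / total actuals."""
    total = sum(abs(a) for a in actuals)
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / total


@dataclass
class TuningLog:
    baseline_settings: dict   # settings captured before any tuning
    baseline_error: float     # forecast error captured before any tuning
    entries: list = field(default_factory=list)

    def record(self, change: str, new_error: float) -> None:
        """Store one applied change and the result it achieved."""
        self.entries.append({
            "change": change,
            "error": new_error,
            # Negative means the change improved on the baseline.
            "delta_vs_baseline": new_error - self.baseline_error,
        })


# Usage example with made-up numbers.
log = TuningLog({"alpha": 0.3}, wape([100, 100], [80, 120]))
log.record("alpha 0.3 -> 0.5", wape([100, 100], [90, 110]))
```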