Unifying the demand signal in retail

1. Objective

The purpose of this article is to describe the vision for an end-to-end forecasting system for a retailer. The idea is to understand why one might wish to unify the demand signal and, at a high level, what one might build in order to do so.

2. Forecasts in retail

Forecasting and setting a plan are important activities in retail. Indeed, all of retail may be said to revolve around these activities in some shape or form. Forecasts and plans are created in multiple areas of a retailer – financial planning, merchandising, promotion planning, assortment planning, supply chain, store and DC operations, fulfillment and so on.

More often than not, these forecasts and plans are developed independently and reconciled manually and quite laboriously. At the very minimum there are the following reasons for this:

  1. Top-down and bottom-up forecasts differ in their objectives – the former are used in longer-range planning processes, whereas the latter are used in shorter-term execution processes
  2. Organizational structure presupposes a certain style of information flow, so not all information is available to everyone, which hinders the ability to make decisions with a full view of all processes
  3. Forecasts are used at different granularities along multiple dimensions:

  • Location – national, regional, store, and so on.
  • Merch hierarchy – item, class, department, category and so on.
  • Time aggregation – hourly, daily, weekly, monthly, annual.

Some additional factors to consider as we look at forecasts:

  • Time horizon needed for the business process – going out to 6 weeks, a quarter, a year, etc.
  • Units of the forecasts – e.g., units demanded/sold, revenues in dollars, truckloads, boxes, labor hours, slots, etc.
  • Customer hierarchy – individual customer, segments at various levels, etc.
  • Product type – perishable vs. center of store.

3. Some activities that depend on forecasts

In no particular order, the author can think of the following retail processes that need forecasts:

  • Financial planning and analysis – long range, but tracked more frequently
  • Space planning and planogram design
  • National merchandising – e.g., store plans, assortment, demand planning, buying
  • Local merchandising – e.g., promo planning
  • Inventory planning and control – e.g., store replenishment, DC replenishment
  • Digital / ecommerce – e.g., order/slot forecasts, labor plans for fulfillment, offers
  • Retail Media – e.g., number of site visitors, ad placements, shopper eyeballs, clicks
  • S&OP, labor planning at DCs & stores
  • Multi-echelon supply chain & network design
  • Transportation planning and logistics
  • Forecasts for Buying and vendor management
  • Manufacturing forecasts and Own Brands
  • Mass Advertising and marketing – divisional plans
  • Personalized deals and campaigns – incremental sales
  • Loyalty offers – loyalty resulting in incremental sales
  • Operational forecasts – on handhelds at stores or in systems at DCs

4. Challenges with forecasting and planning each activity separately

In order to execute the retailer’s strategy effectively (for example, to reduce out-of-stocks and improve in-stock rates), the various processes listed above (and others!) have to work in harmony.

For instance, imagine these three processes –

  • Personalized marketing provides coupons to several prospective customers.
  • Promotional planning and execution process in corporate or at the division sets a price reduction in motion.
  • A forecast is generated for supply chain and store replenishment to send a certain quantity of each item to each store at a certain time.

It should be self-evident why we would want to run all of these processes in a manner that involves close handoffs and monitoring. It would be even better if we could run these processes off a unified customer demand signal. Otherwise, a location might run out of stock at some times and have excess inventory at other times, or the over- and under-stocks might happen across locations.

5. Unifying the Demand Signal

There are multiple facets to demand planning unification.

In its full generality this is practically impossible: one cannot say with any measure of confidence that “Customer C will buy product P at location L at time T using the in-store / online channel”. The future is inherently unknowable at that fine a grain, and there is not enough data to train models to create projections at this level of detail.

What we can do instead is project this multi-dimensional problem to the dimensions of interest for each business problem. How might one go about doing so?

5.1 Unconstrained customer demand

To begin with, everything should start from the unconstrained customer demand, which represents the possible demand from customers that could be met if no other constraints existed. This is an idealized forecast of “what could be” and the maximum achievable outcome.

From the unified customer demand, one could derive the forecast for each process, given the constraints of that process. For instance, store replenishment has to obey the constraints imposed by inventory availability at the DCs, which in turn is constrained by potential vendor shortages and so on. The application of constraints is usually done by means of an optimization process built on top of the unconstrained forecast.
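As a minimal sketch of applying a constraint on top of an unconstrained forecast, the snippet below rations a limited DC inventory across stores in proportion to their unconstrained demand. The function name, store names and quantities are hypothetical, and proportional rationing is only a simple stand-in for the fuller optimization process described above.

```python
def allocate_constrained(unconstrained, dc_inventory):
    """Scale unconstrained store demand down to what the DC can supply.

    Hypothetical helper: proportional rationing stands in for a real optimizer.
    """
    total_demand = sum(unconstrained.values())
    if total_demand <= dc_inventory:
        # No shortage: every store receives its unconstrained demand.
        return dict(unconstrained)
    # Shortage: every store receives the same fraction of its demand.
    ratio = dc_inventory / total_demand
    return {store: qty * ratio for store, qty in unconstrained.items()}


# Illustrative numbers only.
unconstrained_demand = {"store_A": 120.0, "store_B": 60.0, "store_C": 20.0}
constrained = allocate_constrained(unconstrained_demand, dc_inventory=100.0)
```

A real system would likely add further constraints (pack sizes, truck capacity, service-level priorities), which is where a proper optimization model earns its keep.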

Similarly, several processes in retail are stochastic in nature – while it is possible to estimate the average (mean) forecast, for many items it is practically impossible to predict with any precision the sale of a particular item at a particular location on a particular day. This can only be estimated as a probability distribution which can be used to run a simulation built on top of the forecasts.
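To illustrate what "a simulation built on top of the forecasts" might look like, here is a small Monte Carlo sketch. It assumes, purely for illustration, that daily demand for an item at a store follows a Poisson distribution around the mean forecast; the function names and numbers are hypothetical.

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw one Poisson sample using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def stockout_probability(mean_daily_demand, on_hand, n_trials=10_000, seed=0):
    """Estimate the chance that one day's demand exceeds the units on hand."""
    rng = random.Random(seed)
    stockouts = sum(
        sample_poisson(rng, mean_daily_demand) > on_hand for _ in range(n_trials)
    )
    return stockouts / n_trials


# Same mean forecast, two different stocking decisions.
p_lean = stockout_probability(mean_daily_demand=3.0, on_hand=3)
p_safe = stockout_probability(mean_daily_demand=3.0, on_hand=8)
```

The point of the sketch is that the mean forecast alone (3 units) says little about risk. The simulated distribution is what lets a planner compare stocking 3 units against stocking 8.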

Thus the journey of unifying the demand signal across decision processes requires a careful orchestration of unconstrained forecasts, constrained optimization and simulation of future evolution of the world.

5.2 Many forecasts, from one: Causal Building blocks

While the common demand signal at the heart of all aspects of retail will remain hidden, how does one create projections of that demand to the various required granularities and time horizons while maintaining a certain level of coherence and consistency?

The core idea is to look at the forecasts as a suite of interrelated forecasts built out of a common set of Machine Learning (ML) building block models. To wit, the following building blocks, and possibly others, are relevant to retail processes:

  • Base demand based on regional demographics and socioethnic characteristics
  • Changes or trends in demand that might be at different levels
  • Promotional elasticity of demand and lift for different types of promotions
  • Base price elasticity of demand
  • Calendar factors like the effect of holidays, both fixed and floating
  • Seasonal behavior
  • Internal factors like the location of an item on an endcap or a display
  • External causal factors like the impact of unseasonal weather, back to school, etc.
  • Store specific or region specific factors of local relevance.
  • In store versus online shopping patterns
  • Demand transfer for items out of stock
  • Store clustering
  • Item similarity
  • Allocation of SKUs across a set of items, e.g., own brand and national brand
  • Price interventions
  • Competitor locations
  • Competitor actions, with importance paid to recency
  • Intra-week sales patterns
  • Adjustments for outliers and out of stocks in historical data

Not all of these causal factors impact all forecasts equally, but where they are necessary, the factors should be unified across processes. To take our example from the previous section, the same price elasticity (which drives what is often called the “demand lift” from a price change) should be used for the personalized marketing, promotional planning and store replenishment forecasts in order to create consistency across business processes.
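A minimal sketch of this idea follows: one elasticity estimate, stored once, is reused by two different consumers of the forecast. The constant-elasticity demand model, the item name, the elasticity value and the safety factor are all assumptions made for illustration.

```python
# Hypothetical shared building block: elasticities are estimated once
# and reused by every process that needs a price response.
SHARED_ELASTICITY = {"item_X": -2.0}


def price_lift(base_price, new_price, elasticity):
    """Demand multiplier under a constant-elasticity model (an assumption)."""
    return (new_price / base_price) ** elasticity


def promo_forecast(item, base_units, base_price, promo_price):
    """Units expected by promotion planning during the price reduction."""
    lift = price_lift(base_price, promo_price, SHARED_ELASTICITY[item])
    return base_units * lift


def replenishment_forecast(item, base_units, base_price, promo_price,
                           safety_factor=1.1):
    """Units to ship: same lift as promo planning, plus a safety margin."""
    lift = price_lift(base_price, promo_price, SHARED_ELASTICITY[item])
    return base_units * lift * safety_factor


promo_units = promo_forecast("item_X", 100.0, base_price=4.00, promo_price=3.20)
ship_units = replenishment_forecast("item_X", 100.0, 4.00, 3.20)
```

Because both functions read the same elasticity, a re-estimate changes the promotion plan and the replenishment plan together rather than letting them drift apart.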

5.3 Many forecasts, from one: Unpredictable events and recency

Additionally, one might wish to consider the impact of idiosyncratic events that cannot be anticipated or modeled, e.g., a celebrity posting a recipe on social media. Note that while we cannot model these events, we can certainly react to them.

More generally, forecasts have to be sensitive to recent activity. A forecasting system should be able to do so with minimal lag. At the very least, the system should provide baseline recommendations of actions to inventory managers under a small set of circumstances they can choose from. At the same time, the forecasting system should not create bullwhip effects by reacting to each blip in activity. This is a fine balance to strike and requires careful modeling.
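The balance between responsiveness and stability can be illustrated with simple exponential smoothing, where a single parameter controls how strongly the forecast reacts to the latest observation. This is only a sketch of the trade-off, not a recommendation of the technique; the sales series and parameter values are made up.

```python
def smoothed_forecasts(actuals, alpha, initial_level):
    """Return the next-period forecast produced after each observation.

    alpha near 1 reacts quickly to recent activity; alpha near 0 is stable.
    """
    level = initial_level
    next_period_forecasts = []
    for actual in actuals:
        level = alpha * actual + (1 - alpha) * level
        next_period_forecasts.append(level)
    return next_period_forecasts


# A steady item with a one-off spike in the third period.
sales = [100, 100, 300, 100, 100]
reactive = smoothed_forecasts(sales, alpha=0.8, initial_level=100)
stable = smoothed_forecasts(sales, alpha=0.2, initial_level=100)
```

Right after the spike, the reactive setting forecasts far more than the stable one. If the spike was noise, the reactive forecast would push excess orders upstream; if it was the start of a real shift, the stable forecast would lag. Choosing between them is the modeling judgment described above.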

5.4 Navigating from one forecast to another

It is critical to recognize that the forecast generation processes and models need to operate at the right grain of aggregation to ensure good statistical properties of the forecasts. However, the consumers of forecasts may need to use them at a different grain altogether. This is a perfect application of data science – to be more precise, the use of machine learning techniques from AI – to determine the right level of aggregation or disaggregation of demand.
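As a simple illustration of moving between grains, the sketch below disaggregates a regional forecast to stores using each store's share of historical sales, so that the store forecasts always sum back to the regional number. Proportional shares are a deliberately simple stand-in for the ML techniques mentioned above, and all names and numbers are hypothetical.

```python
def disaggregate(total_forecast, historical_sales):
    """Split an aggregate forecast by each member's share of history."""
    total_history = sum(historical_sales.values())
    if total_history == 0:
        # No history to go on: fall back to an even split.
        even_share = total_forecast / len(historical_sales)
        return {key: even_share for key in historical_sales}
    return {
        key: total_forecast * sales / total_history
        for key, sales in historical_sales.items()
    }


def aggregate(forecasts):
    """Roll member forecasts back up to the aggregate level."""
    return sum(forecasts.values())


regional_forecast = 1200.0
store_history = {"store_A": 300.0, "store_B": 500.0, "store_C": 200.0}
store_forecasts = disaggregate(regional_forecast, store_history)
```

The useful property is coherence: whichever grain a consumer reads, the numbers reconcile, because `aggregate(store_forecasts)` reproduces the regional forecast.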

5.5 Adjustments

A forecasting system should allow an informed user to adjust the outcomes created by the downstream systems that consume the forecasts. For instance, a planner may wish to send two pallets of item X to store Y on date Z for a promotional display at the front of the store. Thus, user interfaces should allow for such adjustments, and quality measurements should account for these adjustments.

5.6 Other components of the forecasting ecosystem

In addition to the Machine Learning model library, recency adjustments, aggregation/disaggregation on demand and human adjustment levers, the forecasting ecosystem must have the following core components:

  • Ability to measure and track divergence of forecast from actuals
  • Optimization of the granularity of forecasting – this may be different for different item/location/time combinations!
  • Automated selection of critical forecast features – perhaps a specific holiday is important for some items but not for others. This is often called feature selection
  • Hyperparameter tuning of the ML models in order to make them robust to small changes in their inputs
  • Selecting the right forecast lag for recency effects
  • Combining multiple forecasts into an ensemble model to improve the signal
  • Learning automated and human feedback from prior planning processes for improving the future forecasts – this ability to learn is what distinguishes an ML / AI system from a traditional rules based engineering system.
  • Scenario generation and simulation capabilities including what-if planning
  • Explanatory systems for human interpretation of decisions taken
  • Root cause analysis of defects in forecasting
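One item in the list above, combining multiple forecasts into an ensemble, can be sketched very simply: weight each model by the inverse of its recent error, so that historically better models count for more. Inverse-error weighting is one common, simple choice and is an assumption here; the model names and values are hypothetical.

```python
def inverse_error_weights(recent_errors):
    """Turn per-model error rates (must be > 0) into weights summing to 1."""
    inverse = {name: 1.0 / error for name, error in recent_errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def blend(forecasts, weights):
    """Weighted average of the individual model forecasts."""
    return sum(weights[name] * forecasts[name] for name in forecasts)


# model_a has been twice as accurate as model_b recently.
recent_errors = {"model_a": 0.10, "model_b": 0.20}
forecasts = {"model_a": 120.0, "model_b": 90.0}

weights = inverse_error_weights(recent_errors)
ensemble_forecast = blend(forecasts, weights)
```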

5. What does a good forecast look like?

We now describe what a good forecast looks like and how to define key metrics of goodness. A good forecast has three key characteristics: low error, no bias, and low churn.

While most people understand the importance of errors, the two other characteristics, i.e., bias and churn, are less well understood by modelers and analysts. Retail practitioners, on the other hand, very much live through bias and churn on a day-to-day basis.

  1. Accuracy and low error rates

  • Comparison to a naïve forecast or a benchmark – in retail this is often previous period sales or a moving average
  • MAPE (mean absolute percent error) is often used: the average over periods of |forecast – sales| / sales
  • wMAPE – it is a good idea to weight MAPE by sales so as to reduce the influence of low sales velocity items
  • MAE (mean absolute error) – error tracked across time (perhaps as an exponentially smoothed moving average)
  • Relative error of the forecasting model to the naïve model – this gives a sense of whether the forecast accuracy is low due to inherent difficulty in modeling or due to model errors

  2. Low or no bias – in other words, the forecast must be neutral over time. This is important because under- and over-forecasting have very different effects. Over-forecasting incurs high carrying costs and perish/shrink/markdown costs. Under-forecasting, on the other hand, can lead to stockouts resulting in lost sales, or to lower margins when inventory is acquired at expensive prices to avoid stockouts. The forecasting system should not favor the objectives of the Marketing department (keep shelves full) or the Finance department (keep shelves empty to save capital) – by now the importance of unifying their forecast signals should have become apparent!

  • MPE (Mean Percent Error) – similar to MAPE but without the absolute value
  • Bias measured across time
  • Relative bias – moving average bias of model vs naïve forecast – again the idea is to understand whether the specific item / location / time is volatile by nature or whether the model is bad

  3. Stability over time / low churn – this is important to measure so that the forecasts don’t change too often, especially when multiple models are used. This is critical for avoiding bullwhip effects in the supply chain, where reacting to every little outlier in the sales patterns causes extreme upstream instability. It is also critical for ensuring that FP&A, merch and supply chain are operating off similar enough forecasts that are not swinging wildly from one process to another. The businesses need to have a clear understanding of what is an acceptable churn level.

  • MAPC (Mean Absolute Percent Churn): the average of |1 – (forecast / last cycle forecast)|
  • Relative Churn – once again we want to compare the churn of the model generated forecast to that of the naïve forecast.
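The metrics listed under the three characteristics above can be written down in a few lines. The sketch below follows the formulas as given (forecast minus sales, so a positive MPE means over-forecasting) and assumes all sales and prior forecasts are non-zero; the function names and sample numbers are illustrative only.

```python
def mape(forecasts, sales):
    """Mean absolute percent error: mean of |forecast - sales| / sales."""
    return sum(abs(f - s) / s for f, s in zip(forecasts, sales)) / len(sales)


def wmape(forecasts, sales):
    """Sales-weighted MAPE: total absolute error over total sales."""
    return sum(abs(f - s) for f, s in zip(forecasts, sales)) / sum(sales)


def mpe(forecasts, sales):
    """Mean percent error (bias): like MAPE but keeping the sign."""
    return sum((f - s) / s for f, s in zip(forecasts, sales)) / len(sales)


def mapc(forecasts, last_cycle_forecasts):
    """Mean absolute percent churn between two forecast cycles."""
    pairs = list(zip(forecasts, last_cycle_forecasts))
    return sum(abs(1 - f / prev) for f, prev in pairs) / len(pairs)


def relative_error(model_forecasts, naive_forecasts, sales):
    """wMAPE of the model divided by wMAPE of the naive benchmark."""
    return wmape(model_forecasts, sales) / wmape(naive_forecasts, sales)


sales = [100, 50, 200, 10]
model = [110, 45, 190, 15]
naive = [90, 60, 180, 20]          # e.g. previous period sales
last_cycle = [100, 50, 200, 12]    # the model's forecast one cycle ago

error = mape(model, sales)
weighted_error = wmape(model, sales)
bias = mpe(model, sales)
churn = mapc(model, last_cycle)
vs_naive = relative_error(model, naive, sales)
```

In this made-up example the slow-moving fourth item (sales of 10) dominates MAPE but barely moves wMAPE, which is the reason given above for weighting by sales. A relative error below 1 means the model beats the naïve benchmark.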

Forecasts should be evaluated at each level of generation for statistical properties. When there is an error in the system generated forecast, the system should alert planners or modelers. Additionally, the system should have a way to determine whether human overrides are effective over the longer term and issue alerts when it detects poor overrides.

Forecasts should be evaluated at each level of consumption before being fed to the business processes. It is very common for aggregate forecasts to behave very differently from granular (disaggregated) forecasts so the three properties listed above should be compared against each other using scatter plots to detect any patterns that often stand out to users with business intuition.

Forecasting systems should have automated tests that compare goodness across time periods. When metrics exceed thresholds, automated alerts should be issued for evaluation by demand planning teams as well as forecast analysts and modelers.
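A threshold check of this kind can be very small. The sketch below compares a set of metric values against configured limits and returns the names of any that are breached; the metric names and threshold values are hypothetical and would be set by the business.

```python
def find_breaches(metrics, thresholds):
    """Return the names of metrics whose magnitude exceeds its threshold."""
    return [
        name
        for name, value in metrics.items()
        if name in thresholds and abs(value) > thresholds[name]
    ]


# Illustrative values for one item/location in the latest period.
latest_metrics = {"wmape": 0.25, "bias": -0.02, "churn": 0.30}
alert_thresholds = {"wmape": 0.20, "bias": 0.05, "churn": 0.15}

breached = find_breaches(latest_metrics, alert_thresholds)
for name in breached:
    print(f"ALERT: {name} = {latest_metrics[name]:.2f} "
          f"exceeds threshold {alert_thresholds[name]:.2f}")
```

Taking the absolute value means a large negative bias (under-forecasting) triggers an alert just as a large positive one does.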

It must be noted that Error, Bias and Churn cannot all be controlled simultaneously – as such, it is a business decision as to which of these is more critical. The forecasting system should provide levers to specify the appropriate combination of preferences.

6. What are the challenges in designing a good forecast?

There are several challenges in designing good forecasts:

  • Data quality (e.g., when someone buys multiple similar items, are the cashiers scanning the same item multiple times, so that we lose SKU-specific information?)
  • Data availability (e.g., is there enough data about hurricanes to train a hurricane model?)
  • Missing data – e.g., if an item is out of stock, do we know what fraction of the day it was out of stock so we can adjust the sales accordingly before training a model? Perhaps the out of stock adjustment might be just as complex as estimating forecasts – if so, we need to be careful not to introduce circularity in the model by feeding its own output as an input!
  • Market changes – this leads to non-stationarity in the models
  • Lead time variability across different business processes
  • Human biases especially when overriding forecasts
  • Subjective choices of customers leading to outliers in the data
  • Lack of ability to do statistical modeling and feature engineering appropriately
  • Excessive focus on forecast accuracy – for planning purposes the stability of forecasts over time is critical
  • Confirmation bias and overfitting – excessive focus on the difference between recent forecasts and actuals can easily lead to bullwhip effects when the noise in day-to-day time series is misinterpreted as signal.
  • Driving convergence between different ML forecasts created by different ML model families – this is not easy as not all ML models are equally explainable!
  • It is not easy to pick “one best forecast”. This involves model blending, which can be rule based or use another machine learning model (e.g., a reinforcement learning model)
  • In rare cases, it is impossible to design a good forecast systematically – in such cases, the prior cycle consensus forecast with human judgement should be elevated to be the best forecast available.
  • It might be challenging to connect an outcome (e.g., out of stocks) to the key driver of said outcome in the upstream processes (e.g., did we purchase too little 5 months back, or did a truck break down last week?)
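The out-of-stock adjustment mentioned under missing data can be sketched in its simplest form: scale observed sales up by the fraction of the day the item was actually on the shelf. This assumes demand is spread evenly through the day, which is rarely exactly true, and the cut-off below which the adjustment is refused is an arbitrary illustrative choice.

```python
def adjust_for_out_of_stock(observed_sales, in_stock_fraction, min_fraction=0.25):
    """Estimate what sales would have been had the item been in stock all day.

    Assumes demand is uniform across the day. Returns None when the item was
    in stock for too little of the day for the scaling to be trustworthy.
    """
    if in_stock_fraction < min_fraction:
        return None
    return observed_sales / in_stock_fraction


# Item sold 30 units but was only on the shelf for three quarters of the day.
adjusted = adjust_for_out_of_stock(observed_sales=30, in_stock_fraction=0.75)

# Item was available for only a tenth of the day: too little to extrapolate.
unknown = adjust_for_out_of_stock(observed_sales=5, in_stock_fraction=0.1)
```

Note that this adjustment uses only observed availability, not the forecast itself, which is one way to stay clear of the circularity warned about above.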

7. References

Author’s personal experience; see, for instance, the papers at Foresight in 2022 or the talk at the AI23 conference.

Conversations over the years with colleagues at Target and principals at specialized forecasting ML firms like Antuit, Ikigai.AI, Vyan.AI, etc.

How good are you at demand forecasting? Test your skill level here! https://www.iiom-web.org/demandforecast/play.php
