An Epic Journey of Exponential Smoothing (Part 1)
Ravi Prakash
Senior Manager, Planning and Business Systems, Johnson and Johnson, APAC, MedTech
Forecasting has always allured human beings. The priestesses of Delphi delivered prophecies in ancient Greece while intoxicated by ethylene gas. We still have parrot fortune tellers practicing it as a profession and a means of livelihood. As we moved forward on our evolutionary journey, the crippling complexities of life inspired us to make the art of forecasting more scientific, so that we could cope with the uncertainties around us.
Around 70 years ago (during the 1950s and 1960s), Professor Charles C. Holt (21 May 1921 – 13 December 2010) and his student Peter Winters first introduced a new method of forecasting. The idea was very simple. When we take yesterday's actual demand (the observed value) as the forecast for today (a naive forecast), we give 100% weightage to the latest value. The other extreme is giving equal weightage to all past observations (the mean, or average). The two gentlemen charted a path that fits itself between these extremes. Since then, more than 30 variants of ETS have been proposed and explored. In this article we will trek along the same path to discover the magic of this powerful method!
Data: XUV 700 sales data since 2022. Although R is a great tool for statistical analysis, we will use Python (I just love it). The article is heavily influenced by the book Forecasting: Principles and Practice by Rob J. Hyndman and George Athanasopoulos. The authors have probably given the best treatment to the topic of forecasting in the form of this book. The approach taken here is: 1) apply the method/model to generate a forecast, and 2) explain the outcomes theoretically. This will be a three-part article.
The chart above shows the last 23 months of sales data for the XUV 700. Do you see a trend, seasonality, etc.? Is this time series stationary? Be ready with your guess; we will revisit this question later in the article.
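For readers who want to follow along in Python, here is a minimal sketch of loading and plotting such a series. The file name xuv700_sales.csv and its column names (month, units) are my assumptions for illustration; the actual data file is not published with the article.

```python
# A minimal sketch: load monthly sales and plot them.
# File name and column names are assumptions, not the article's actual file.
import pandas as pd
import matplotlib.pyplot as plt

sales = pd.read_csv("xuv700_sales.csv", parse_dates=["month"], index_col="month")

ax = sales["units"].plot(marker="o", title="XUV 700 monthly sales")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")
plt.tight_layout()
plt.show()
```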
Simple Exponential Smoothing
This method works best for time series that have no trend and no seasonality, but it requires a value for the level smoothing parameter alpha (and an estimate of the initial level as well). Can you guess it? Since the sales numbers look heavily dependent on the immediate past values, we will go ahead with alpha = 0.8 and compare it with the optimized value calculated by the algorithm. What is being optimized here?
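A sketch of how this comparison can be done with statsmodels, continuing with the sales series loaded in the earlier snippet: one fit holds alpha fixed at 0.8, the other lets the optimizer estimate alpha and the initial level.

```python
# Simple Exponential Smoothing: manual alpha vs. optimizer-chosen alpha.
from statsmodels.tsa.holtwinters import SimpleExpSmoothing

y = sales["units"]  # monthly sales series from the previous snippet

# Fixed alpha = 0.8; initial level set by statsmodels' heuristic.
fit_manual = SimpleExpSmoothing(y, initialization_method="heuristic").fit(
    smoothing_level=0.8, optimized=False
)

# Alpha and the initial level both estimated by minimizing squared errors.
fit_auto = SimpleExpSmoothing(y, initialization_method="estimated").fit()

print("manual alpha   :", fit_manual.params["smoothing_level"])
print("optimized alpha:", fit_auto.params["smoothing_level"])
print("initial level  :", fit_auto.params["initial_level"])
print(fit_auto.forecast(3))  # Dec, Jan, Feb: three identical values
```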
If you are a production planner for the XUV 700, will you trust these numbers? Why is the method generating a constant forecast of around 7600 units for Dec, Jan and Feb? How did it arrive at this number? What is the initial level that the algorithm has worked out? As a planner, it is important to know the answers to these questions in order to reject or adopt this forecast.
Theory (Spoiler Alert)
Ah! This is the hard part, but it is unfortunately unavoidable, as it unravels the mystery behind the forecast of 7600 units per month.
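The two equations of simple exponential smoothing, in the component form used by Hyndman and Athanasopoulos, are:

```latex
% Forecast equation: the h-step-ahead forecast is flat at the latest level.
\hat{y}_{t+h|t} = \ell_t

% Smoothing equation: the level is a weighted average of the latest
% observation and the previous level.
\ell_t = \alpha\, y_t + (1 - \alpha)\,\ell_{t-1}, \qquad 0 < \alpha < 1
```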
These equations appear a little cryptic, so let us read them together. The forecast equation simply says that the forecast for h periods ahead of time t, given the information available up to time t, is always equal to the level at time t. So even if we try to generate a forecast for March 2024, it would still recommend 7600 units in the case of the XUV 700. The smoothing equation is also simple to read: the level at any given time t is a function of the observed (actual) value at time t and the level at time t-1. Feeling dizzy? Let us break it down further.
Go back to Jan 2022, when we had just recorded the first sales figure for the XUV 700 and wanted to generate a forecast for Feb 2022.
In order to do so, we would need values for alpha and the initial level (L0). Back then we had no data to guess or calculate alpha, so we could not have used this approach; but we are in Nov 2023. We can 'randomly' assign a value to alpha (within the constraint 0 < alpha < 1) and to the initial level (there are guidelines for this as well). You may not have the right values of alpha and level, but you will still get a forecast for the next period. You repeat this exercise recursively and then calculate the sum of squared errors (SSE, shown below), which is simply the sum of the squared differences between the observed and forecasted values over all observations. You select the parameter values that give the minimum SSE! Very simple! Try it in an Excel sheet.
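The sum of squared errors referred to above, with the one-step-ahead forecast written as in the earlier equations, is:

```latex
\mathrm{SSE} = \sum_{t=1}^{T} \left( y_t - \hat{y}_{t|t-1} \right)^2
```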
Sometimes you may not use the entire data set but split it into train and test sets. The algorithm simply runs a 'simulation' (a numerical search) to get the optimum values of the parameters (just two in this case), as there is no closed-form solution like the one in linear regression.
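As a sketch of what that 'simulation' amounts to, here is a crude grid search over alpha and the initial level L0, continuing with the y series from the earlier snippet. A real optimizer (as inside statsmodels) searches more cleverly, but the idea is identical.

```python
import numpy as np

def sse(alpha, l0, series):
    """Sum of squared one-step-ahead errors for given alpha and initial level."""
    level, total = l0, 0.0
    for actual in series:
        total += (actual - level) ** 2                 # error vs. forecast (= previous level)
        level = alpha * actual + (1 - alpha) * level   # smoothing equation update
    return total

# Try many (alpha, l0) pairs and keep the one with the lowest SSE.
best = min(
    ((a, l0)
     for a in np.linspace(0.01, 0.99, 99)
     for l0 in np.linspace(y.min(), y.max(), 50)),
    key=lambda p: sse(p[0], p[1], y.values),
)
print("alpha, L0 with minimum SSE:", best)
```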
Still not clear? Open an Excel sheet and try to manually calculate the different values month by month, as in the table below.
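If you prefer Python to Excel, the sketch below reproduces the same month-by-month table. The starting values (alpha = 0.8 and an initial level equal to the first observation) are illustrative choices, not the optimized ones.

```python
# Recompute the level and one-step-ahead forecast month by month,
# exactly as you would in an Excel sheet (uses y from the earlier snippet).
alpha = 0.8           # illustrative, not the optimized value
level = y.iloc[0]     # illustrative initial level L0

rows = []
for month, actual in y.items():
    forecast = level                                  # forecast = previous level
    level = alpha * actual + (1 - alpha) * level      # smoothing equation update
    rows.append((month.strftime("%b %Y"), actual, round(forecast, 1), round(level, 1)))

print(f"{'Month':<10}{'Actual':>8}{'Forecast':>10}{'Level':>8}")
for month, actual, forecast, lvl in rows:
    print(f"{month:<10}{actual:>8}{forecast:>10}{lvl:>8}")
```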
The only caveat is that you should not try to find the optimum values of alpha and the initial level by hand; it is time consuming, and a guess can only give you a starting point. Hopefully things are crystal clear now. Let us wrap it up here. In the next article we will look at the advanced variants of ETS! Till then, Merry Christmas and Happy New Year!