Speeding up Forecasting with Excel
This time, I′m sharing a quick read on my journey developing my first Short Term Forecasting Tool with Microsoft Excel. This is intended for those people learning about forecasting, hoping anyone can find this useful.
The basic idea was very simple yet ambitious. I was spending too much time removing outliers. I was first taught to graph the weeks, compare them and normalize or remove any data points that were either too high or too low within the data set. If you look at the screenshot below, you′ll see the outliers highlighted and circled in red.
This process is fine in general because it works, but it really takes a long time. Imagine if you have to forecast for 50 or maybe more LOB′s.
At some point I was somewhere around on this situation and I thought: Well, there has to be an easier / faster way. Of course there was!
1st attempt: I created some formulas using Standard Deviation that would remove precisely the weeks that had outliers. Unfortunately, my formulas were removing the outliers and additional history so If I had 2 or 3 outliers on an 8-week data set at times I was actually left with 5 or even 4 data points and then the forecast would sometimes be too high or too low.
Therefore, and with some degree of disappointment I concluded this approach failed.
2nd attempt: I tried determining outliers by calculating the Standard Error of a Regression with Excel formulas =STEYX(Y,X) SLOPE(Y,X), INTERCEPT(Y,X) Initially I thought this was it, but when I tried to forecast for a lot of LOB′s I noticed that sometimes I was getting more outliers than I should have. This failed because when calculating the Intercept, it seems like randomly I would get a different value than the 1 that let me determine what data points were outliers. So, I would end up with 5 outliers on a 8 week data set. What??? Yes, 5 Outliers! Check the screenshot!
3rd and last attempt: (Until now at least) I tried removing outliers by using quartiles and the interquartile range to determine what data points where outliers based on the data set. See the steps next:
- 1- Calculate 1st Quartile. Formula =Quartile(YourDataRange,1)
- 2- Calculate 3rd Quartile. Formula =Quartile(YourDataRange,3)
- 3- Calculate IQR (Interquartile Range) Formula is =Quartile3-Quartile1
- 4- Calculate Lower Bound. Formula = Quartile1-(1.5*IQR)
- 5- Calculate Upper Bound. Formula = Quartile3+(1.5*IQR) These last 2 will get you a range based on your data points and anything below the Lower Bound or above the Upper Bound will be an outlier.
- 6- Now you just need a formula for each data point to compare against the Lower and Upper Bounds and remove them where applicable so that you can forecast. Check the screenshot.
Conclusion: Don′t feel bad if you are still graphing all data points to remove your outliers. You are not alone and that′s exactly why I′m sharing this brief article, for anyone learning that can benefit from my journey on removing outliers more quickly.
Thank you and please, feel free to comment / share your experiences too.
Disclaimer: The contents on this article express my opinion and not that of any company nor do I show data from any company.
Senior Manager - Global Workforce Planning and Analytics | Project Management I Six Sigma Black Belt l Outsourcing Workforce Management (WFM) Services | Driving Business Transformation and Optimization towards Excellence
3 年Excellent
Lead Data Analyst at the Australian Institute of Business
3 年The next step I've found in the process is to use an if(and) formula to confirm if the datapoint is in the quartile location. If it is, return the value, otherwise return #NA and let the graph reference the formulas. I only recently learnt you can let excel graphs ignore #NA values and smooth the graph out for you giving you a cleaner graph and a more accurate trendline value. WFM is still a learning space for me, but out of everything i'd say excel is the biggest opportunity
WFM = Customer Experience + Employee experience | WFM Adviseur a.i. @ Achmea
3 年Hey Daniel, you are actually providing very useful content for many people working on forecasting in excel. In many WFM tools they are far from what you really need, luckily there are exceptions!
I make my boss look good! | Customer experience and management polymath | Business engineer | Relentless efficiency monster and 360° collaborator | Trilingual - Fluent in tech, commercial and layman's
3 年Nice work and explanation Daniel... The question one always must ask when time series forecasting and ‘cleansing’ historical data is “is this an outlier or the start of a new trend?” One should always try to understand the drivers behind the ‘outlier’ and document for future reference any data ignored in any particular modelling. . Its not unheard of to eventually find a pattern to outliers.. Have a fun day...
Learning & Organization Development Specialist || Employee Development and Business Excellence Coach || Life Coach for Mental Health, Career Progression and Capability Development
3 年Deepak Parmani