Practical Time Series Analysis Review
SUNY - State University of New York
Many of us are "accidental" data analysts. We trained in the sciences, business, or engineering and then found ourselves confronted with data for which we have no formal analytical training. Professionals motivated by the constant search for deeper answers will enjoy the subjects and analyses proposed by Tural Sadigov and William Thistleton in the Practical Time Series Analysis course, offered by the State University of New York. They turn it into a pleasant journey from top to bottom, reviewing everything from the basics of statistics to the most complex concepts without losing their sense of humor.
My point in writing this article is to share knowledge and quality tools for forecasting with time series inside companies, out of my love for the topic and the importance of generating reliable measures when managing business information. It isn't hard to find critics of data-driven metrics: the concepts behind a scientific approach demand much more caution and reading, and they often contradict beliefs rooted deep in managers' minds.
Understanding how to correctly fit a Triple Exponential Smoothing or a SARIMA model by hand AND by software will not change anybody's life on its own, but it does bring a complementary view to the table of whoever is doing the analysis. Understanding what goes on underneath the packages and their assumptions lets you choose a model more wisely, evaluate an answer more critically and, because of that, better explain and forecast a phenomenon.
The course navigates the architecture of R programming through simple examples and a statistical, scientific approach, introducing matrix calculus and programming around the Yule-Walker equations, which I believe are the key to implementing the time series forecasting methods presented.
Each chapter runs two explanations in parallel: on one side, the scientific logic that tells us why the theories and methods exist; on the other, a light technical approach, not necessarily ready to use, that will require further reading. But for those who come to learn and implement solutions on a daily basis, the most valuable part of this course is the clear explanation of how to interpret Noise, Innovations (new disturbances affecting future values), Trends and Seasonalities inherent in a time series - like a musical composition, multiple instruments coming together to play one song.
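As an illustration of that "one song, many instruments" idea, a classical decomposition splits a series into exactly those voices. A minimal sketch in R, using the built-in AirPassengers monthly series (my choice of example, not the course's):
- plot(decompose(AirPassengers)) # splits the series into trend, seasonal and random (noise) components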
Topics of the course:
- Data Visualization
- Basic Statistics Review
- Noise Versus Signal
- Random Walk vs Purely Random Process
- Time plots, Stationarity, ACV, ACF, Random Walk and MA processes
- Stationarity
- Series, Backward Shift Operator, Invertibility and Duality
- AR(p) and the ACF
- Difference equations and Yule-Walker equations
- Partial Autocorrelation
- Yule-Walker in matrix form and Yule-Walker estimation
- AIC and model building
- ARMA Processes
- ARIMA and Q-statistic
- SARIMA processes
- Forecasting
A Practical Example
There is no point in studying such material without a practical application, and one of the most challenging parts of this data science view is implementing solutions. A solution can be a forecasting model with automated updates. Nowadays we can take a statistically fitted model from a specific study and turn it into a machine learning project, with the model running live, connected to an ERP, a database, or a cloud-based data management system such as AWS.
The value of a good model comes from approaching the problem correctly. For a time series analysis, this means variance control and the neutralization of trends and seasonalities - the three components one must understand systematically to fit these models correctly.
Variance control means checking whether the variance of the time series rises and falls at different moments, and applying a technical treatment to regularize it. As we will see while building the model, a log transform (or a "log-return" calculation) stabilizes the variation, bringing the series toward homoscedasticity.
Neutralizing the trend of a time series basically means checking whether there is any systematic increase or decrease as time evolves. If there is a trend, the technical remedy is to difference the dataset, which brings the values back to a central level from the beginning to the end of the series.
Seasonality can be seen as a pattern that repeats itself with regular periodicity - quarterly, semiannually, annually - responding to specific seasons of the year, month, or week. The treatment adds terms that specifically address this kind of variation, bringing into the calculation the important innovations that persist at every step forward through the time series.
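To make the three treatments concrete, here is a minimal sketch in R, assuming a generic monthly time series object x with an annual seasonal pattern (the same sequence is applied to the souvenir data below):
- x_log <- log(x) # log transform stabilizes a variance that grows with the level of the series
- x_detrended <- diff(x_log) # first difference removes the systematic trend
- x_stationary <- diff(x_detrended, lag = 12) # lag-12 difference removes the annual seasonality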
It isn't my objective here to present the theory behind the models the course covers - the course can be taken by anyone. What I do want is to show a piece of the possibilities that open up when you acquire this kind of knowledge.
Here I will build a model to predict a time series with a SARIMA - Seasonal Autoregressive Integrated Moving Average - model, which brings together all the concepts described above. The dataset was downloaded from the Time Series Data Library and contains the sales of a souvenir shop in Australia.
CODE AND RUN
Read the data in RStudio or a Jupyter notebook:
- SUV <- read.csv('monthly-sales-for-a-souvenir-sho.csv') # here it is in my notebook
Transform the dataset into a time series:
- suv <- ts(SUV$Sales)
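As an optional variation, the frequency and start date can be declared directly in ts(), so the time plots are labelled in calendar time. A sketch, assuming the series is monthly and starts in January 1987, as in the Time Series Data Library entry:
- suv <- ts(SUV$Sales, start = c(1987, 1), frequency = 12) # monthly series with an explicit start date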
Call the analytical packages:
- library(astsa)
- library(forecast)
Create plots to start the component analysis:
- par(mfrow=c(2,2))
- plot(suv, main='Monthly sales for a souvenir shop', ylab='', col='blue', lwd=1)
- plot(log(suv), main='Log-transform of sales', ylab='', col='red', lwd=1)
- acf(suv, main = "ACF")
- pacf(suv, main = "PACF")
OUTPUT:
In the Monthly Sales for a Souvenir Shop plot we can see a clear increase in the variance of the peaks over time - heteroscedasticity - so we will use the log transform to adjust this variance consistently. In the log-transform plot we still see a strong trend, which must be treated.
In the ACF - Autocorrelation Function - we can analyse the moving average behaviour that is intrinsic to the untreated data. Each spike in the plot is a "lag", and these lags will support our decision about which model to choose. The moving average part captures how innovations - disturbances - extend into future periods, the information carried from a specific period into the following ones. From the waves the spikes draw in the graph we can assume there is seasonality in the time series, apparently with a period of 12: a large spike at lag 1 and another around lag 12, fading slowly without an abrupt cut.
With the PACF - Partial Autocorrelation Function - we analyse the autoregressive characteristics of the time series. The partial autocorrelation measures the information carried from one time period to another: information we want to model in order to extract a good forecast, and whose presence we must measure, since it will guide us toward the best model.
Create plots to start the component analysis of a treated Souvenir TS:
- par(mfrow=c(2,2))
- plot(diff(log(suv)), main='Differenced Log-transform of sales', ylab='', col='brown', lwd=1)
- plot(diff(diff(log(suv)),12), main='Log-transform without trend and seasonality', ylab='', col='green', lwd=1)
- acf(diff(diff(log(suv))), main = "ACF of Diff Log-transform")
- pacf(diff(diff(log(suv))), main = "PACF of Diff Log-transform")
Let's talk about the "Log-transform without trend and seasonality" plot: this series has been treated with the log transform and two differencings, which brought stability and a better-behaved variability - homoscedasticity. That matters because our model requires stationarity in the series - the absence of trend and seasonality - to guarantee that it works as we need.
In the ACF we can assess the moving average part of our model, built in two pieces: the non-seasonal Moving Average (MA) and the Seasonal Moving Average (SMA). Reading the plot, the MA order could be 1 or 2, and the SMA order 1.
From the PACF we infer that the Autoregressive (AR) order could be 1, 2, 3, or 4, and the Seasonal Autoregressive (SAR) order 1.
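If the spikes are hard to read off the plots, the same information can be printed numerically. A minimal sketch, using the same differenced series as the plots above; the values around lags 1-4 and lag 12 are the ones that suggest the candidate orders:
- acf(diff(diff(log(suv))), lag.max = 36, plot = FALSE) # prints the autocorrelations by lag
- pacf(diff(diff(log(suv))), lag.max = 36, plot = FALSE) # prints the partial autocorrelations by lag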
These orders drive the model-building process, and they follow the parsimony principle: we do not necessarily pick the best-fitting model if the gain in simplicity outweighs the loss in explanatory power. In other words, we don't want to overfit the model to the time series.
You can see that we haven't yet applied any statistical forecasting procedure. The building process described above leads to a model called SARIMA, a powerful one with a long history of experimentation behind it in the scientific community. This approach is used in a wide variety of firms, depending largely on the creativity and technical level of their financial teams.
Define the differencing components and the seasonal period:
- d = 1
- DD = 1
- per = 12
Build a loop to analyse multiple models:
- for(p in 1:2){ # non-seasonal AR order, p-1
- for(q in 1:2){ # non-seasonal MA order, q-1
- for(i in 1:2){ # seasonal AR order, i-1
- for(j in 1:4){ # seasonal MA order, j-1
- if(p+d+q+i+DD+j<=10){ # cap on the total number of model components
- model<-arima(x=log(suv), order = c((p-1),d,(q-1)), seasonal = list(order=c((i-1),DD,(j-1)), period=per))
- pval<-Box.test(model$residuals, lag=log(length(model$residuals))) # test the residuals for autocorrelation
- sse<-sum(model$residuals^2) # sum of squared errors, a measure of fit quality
- cat(p-1,d,q-1,i-1,DD,j-1,per, 'AIC=', model$aic, ' SSE=',sse,' p-VALUE=', pval$p.value,'\n')
- }}}}}
OUTPUT:
0 1 0 0 1 0 12 AIC= -11.60664 SSE= 3.432906 p-VALUE= 0.0001365566
0 1 0 0 1 1 12 AIC= -16.09179 SSE= 2.97756 p-VALUE= 3.149952e-05
0 1 0 0 1 2 12 AIC= -17.58234 SSE= 2.301963 p-VALUE= 0.0002456591
0 1 0 0 1 3 12 AIC= -16.41016 SSE= 2.35266 p-VALUE= 0.0003392283
0 1 0 1 1 0 12 AIC= -13.43083 SSE= 3.214065 p-VALUE= 4.083839e-05
0 1 0 1 1 1 12 AIC= -17.76362 SSE= 2.399746 p-VALUE= 0.0001916565
0 1 0 1 1 2 12 AIC= -15.99095 SSE= 2.349898 p-VALUE= 0.0002477783
0 1 0 1 1 3 12 AIC= -14.74777 SSE= 2.302026 p-VALUE= 0.0004504596
0 1 1 0 1 0 12 AIC= -27.78538 SSE= 2.643277 p-VALUE= 0.1742478
0 1 1 0 1 1 12 AIC= -34.54538 SSE= 2.233424 p-VALUE= 0.2730783
0 1 1 0 1 2 12 AIC= -33.6145 SSE= 2.109473 p-VALUE= 0.2830597
0 1 1 0 1 3 12 AIC= -32.19273 SSE= 1.87789 p-VALUE= 0.270042
0 1 1 1 1 0 12 AIC= -32.33192 SSE= 2.360507 p-VALUE= 0.2584529
0 1 1 1 1 1 12 AIC= -34.0881 SSE= 1.842013 p-VALUE= 0.2843225
0 1 1 1 1 2 12 AIC= -32.1017 SSE= 1.856343 p-VALUE= 0.28516
1 1 0 0 1 0 12 AIC= -27.07825 SSE= 2.6747 p-VALUE= 0.2297852
1 1 0 0 1 1 12 AIC= -34.98918 SSE= 2.209442 p-VALUE= 0.4633806
1 1 0 0 1 2 12 AIC= -33.38623 SSE= 2.159413 p-VALUE= 0.4515457
1 1 0 0 1 3 12 AIC= -31.54519 SSE= 2.121635 p-VALUE= 0.4390829
1 1 0 1 1 0 12 AIC= -32.64858 SSE= 2.340077 p-VALUE= 0.4022223
1 1 0 1 1 1 12 AIC= -33.48894 SSE= 2.125764 p-VALUE= 0.4442665
1 1 0 1 1 2 12 AIC= -31.52137 SSE= 2.093124 p-VALUE= 0.4463098
1 1 1 0 1 0 12 AIC= -26.17089 SSE= 2.624281 p-VALUE= 0.2507442
1 1 1 0 1 1 12 AIC= -33.30647 SSE= 2.201798 p-VALUE= 0.4110138
1 1 1 0 1 2 12 AIC= -31.68924 SSE= 2.151774 p-VALUE= 0.3820812
1 1 1 1 1 0 12 AIC= -31.10127 SSE= 2.323818 p-VALUE= 0.3492746
1 1 1 1 1 1 12 AIC= -32.69913 SSE= 1.824189 p-VALUE= 0.3091943
Let's take a closer look at what each column means:
- Components of the model (p, d, q)(P, D, Q)s, where (p, d, q) are the non-seasonal components, (P, D, Q) the seasonal components and s the seasonal period;
- AIC - Akaike Information Criterion: an indicator of the quality of the model that penalizes overly complicated models. AIC is our main criterion for model selection (see the sketch after this list);
- SSE - Sum of Squared Errors: when choosing a model, we want the one with the least error possible, without overfitting;
- p-VALUE: we want a high p-value, consistent with the hypothesis that the residuals are not autocorrelated.
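Rather than scanning the printed output by eye, the same loop can collect every candidate into a data frame and sort it by AIC. This is a sketch under the same loop structure as above; the results object and its column names are my own additions:
- results <- data.frame() # container for the candidate models
- for(p in 1:2){ for(q in 1:2){ for(i in 1:2){ for(j in 1:4){
- if(p+d+q+i+DD+j<=10){
- model <- arima(x=log(suv), order = c((p-1),d,(q-1)), seasonal = list(order=c((i-1),DD,(j-1)), period=per))
- pval <- Box.test(model$residuals, lag=log(length(model$residuals)))
- results <- rbind(results, data.frame(p=p-1, d=d, q=q-1, P=i-1, D=DD, Q=j-1, AIC=model$aic, SSE=sum(model$residuals^2), p.value=pval$p.value))
- }}}}}
- results[order(results$AIC), ][1:5, ] # the five candidates with the lowest AIC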
Now that we know the criteria for choosing our model, let's see the result:
Model: SARIMA (1,1,0,0,1,1)12
1 1 0 0 1 1 12 AIC= -34.98918 SSE= 2.209442 p-VALUE= 0.4633806
It brought the best result once the parsimony principle is taken into account: the lowest AIC in the table, reached by one of the simplest candidates, together with a healthy residual p-value.
FIT SARIMA MODEL:
- model<- arima(x=log(suv), order = c(1,1,0), seasonal = list(order=c(0,1,1), period=12))
PLOT THE FORECAST:
- plot(forecast(model))
OUTPUT
OUTPUT THE VALUES:
- forecast(model)
    Point Forecast     Lo 80     Hi 80     Lo 95     Hi 95
85        9.600019  9.373968  9.826071  9.254303  9.945736
86        9.786505  9.533944 10.039066  9.400246 10.172764
87       10.329605 10.025423 10.633786  9.864399 10.794810
88       10.081973  9.746705 10.417240  9.569225 10.594720
89       10.008096  9.638604 10.377587  9.443007 10.573184
90       10.181170  9.783094 10.579245  9.572365 10.789974
91       10.439372 10.013362 10.865383  9.787845 11.090900
92       10.534857 10.083237 10.986477  9.844164 11.225551
93       10.613026 10.136886 11.089165  9.884833 11.341218
94       10.664526 10.165207 11.163846  9.900883 11.428170
95       11.096784 10.575248 11.618321 10.299163 11.894406
96       11.877167 11.334355 12.419979 11.047007 12.707326
97        9.932756  9.330373 10.535139  9.011491 10.854022
98       10.112194  9.475681 10.748707  9.138731 11.085657
99       10.658829  9.980844 11.336814  9.621940 11.695718
100      10.409423  9.696788 11.122058  9.319542 11.499304
101      10.336437  9.588666 11.084207  9.192821 11.480052
102      10.509064  9.728749 11.289378  9.315676 11.702452
103      10.767491  9.955449 11.579532  9.525580 12.009401
104      10.862863 10.020524 11.705202  9.574617 12.151109
105      10.941088 10.069390 11.812786  9.607940 12.274235
106      10.992560 10.092516 11.892605  9.616061 12.369060
107      11.424833 10.497280 12.352385 10.006263 12.843402
108      12.205208 11.250954 13.159461 10.745803 13.664613
The values forecasted by the model are presented with their probabilities of occurrence. First, the red line in the forecast plot represents the POINT FORECAST. Second come the 80% interval (Lo 80; Hi 80), shown as the darker grey area, and finally the 95% interval (Lo 95; Hi 95). It is worth noting that these probabilistic forecasts play the same role as the targets used in a budgeting process, which are usually stated as a single fixed value - a questionable practice from a prediction standpoint, where it is better to forecast scenarios.
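Because the model was fitted on log(suv), the table above is on the log scale. A small sketch, assuming the forecast package loaded earlier, makes the horizon and interval levels explicit and maps the point forecast back to sales units:
- fc <- forecast(model, h = 24, level = c(80, 95)) # 24 months ahead with 80% and 95% intervals
- exp(fc$mean) # point forecasts back-transformed (approximately) to the original sales scale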
Fit model for residual analysis:
- model <- sarima(log(suv), 1,1,0, 0,1,1, 12)
Call the residual analysis:
- model$ttable
The image above shows the standardized residuals, which look like white noise; the ACF of the residuals shows no significant spikes (none crosses the blue dashed line); the Q-Q plot shows only a slight departure from normality in the tails; and in the Ljung-Box panel no point has a significantly small p-value. This means that our model is coherently built and its residuals are random.
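For anyone who prefers to run the diagnostics by hand rather than reading them off sarima()'s panel, here is a minimal sketch that reproduces the same checks on a refit of the chosen model:
- fit <- arima(x=log(suv), order = c(1,1,0), seasonal = list(order=c(0,1,1), period=12)) # same model, refit with stats::arima
- acf(residuals(fit), main = "ACF of residuals") # should show no significant spikes
- qqnorm(residuals(fit)); qqline(residuals(fit)) # check the normality of the residuals
- Box.test(residuals(fit), lag = 12, type = "Ljung-Box") # a high p-value indicates no residual autocorrelation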
Where Can Expectations Be Nurtured?
With this article I tried to share some knowledge and a way to acquire it. It isn't trivial to build expertise in forecasting methods, nor to contribute to the debate with reliable information. Companies are full of opinions and unproven theories about what drives their accomplishments and, I think, it falls to the data scientists to analyse such arguments and search for evidence, putting the hypotheses to the test.
While developing metrics and analyses, most companies need to automate processes, and operationalizing budgeting and forecasting is not an easy task: it requires a lot of expertise to achieve results, because while all of these models are being implemented, firms are changing the way they work, and the expertise accumulated through data-driven critique tends to spread horizontally.
I presented here a SARIMA model, a mixed model that evolved from two different families: moving average and autoregressive models. Recent studies combine these models with other types, including machine learning approaches, which is the reason I took the course in the first place - to review how such models are constructed. You can check the M4 Competition for studies of this kind and the best-performing approaches, and also the Uber Engineering and Amazon Engineering blogs for more practical examples. Inside these companies, parametric models like the one used here handle local forecasting, while neural networks and other machine learning methods glue and compile the results into a macro model - a mixed-model approach that achieves better accuracy.
There is a universe to explore, and the implementation of forecasting models must be done by very meticulous professionals, to guarantee that the tests produce reliable information for the firm and multiply the understanding of the business without black boxes. For that purpose this course has been a great use of my time. And as a professor once stated, "test your models exhaustively, because once it's running it's your neck they will pull if it fails".
Experimentation is the key to innovation; be creative and hungry for knowledge, and let curiosity be your guide.
Cordially,
Willians Bernardino
Economist, Master's candidate in Economics and Computational Econometrics at UFRGS. Researcher at e-Compfin, the Computational Finance Group at UFRGS. FP&A Specialist - MBA, FGV-Mngmnt. Works on the thyssenkrupp Elevator Financial Control team for Latin America.