Correlation, causation and vector autoregressions
Vector autoregression (VAR) should be your first go-to statistical model.
Assume you have observed a vector time series. It is better to consider the vector case immediately, with vector size larger than 2. The one-dimensional case does not allow making the key point: the stratification of dependency ("correlation") and causation between the vector constituents. In the bivariate case, one is always tempted to rush into identifying the independent and dependent elements of the pair, sinking into the swamp of regressions. You are not immune to it even in the general vector case, always at risk of being sucked by peer gravity into the black hole of machine learning, with its equally premature feature engineering and other cool yet embarrassing stuff.
VAR is, abstractly, just AR in matrix/vector form:
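X_t = c + B_1 X_{t-1} + \dots + B_p X_{t-p} + \varepsilon_t

where X_t is the observed vector, the B_i are the autoregressive coefficient matrices, and \varepsilon_t is the vector of innovations.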
One can immediately generalize to VAR(I)MA by introducing another weighted sum, this time over the lagged vector innovations. Such a model is notoriously difficult to identify and estimate in practice. Hence we stick with VAR, in reality even with VAR(1).
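For reference, the VARMA form adds that moving-average sum (using \Theta_j for the MA coefficient matrices):

X_t = c + \sum_{i=1}^{p} B_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \Theta_j \varepsilon_{t-j}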
There are at least two reasons to consider such a model. Firstly, the time series of X's is not a sample yet, i.e. the order of the X's may matter. One needs to prove that it doesn't. The easiest and most practical way to prove it is simply to estimate the VAR. If the estimated B matrices are not statistical zeros, then the chances are that the AR part of the model has pulled enough of the time dependence out of the data and your innovations have no residual serial dependency, i.e. they are a sample and not a time series any more. You can then do historical MC/bootstrap on the sample of innovations even without modelling them any further.
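A minimal sketch of this check, assuming the observations sit in a pandas DataFrame X (an illustrative name) and using statsmodels:

```python
import numpy as np
from statsmodels.tsa.api import VAR

# X: T x k pandas DataFrame holding the observed vector time series
results = VAR(X).fit(1)           # VAR(1), as discussed above

print(results.summary())          # are the B-matrix entries statistical zeros?

# Portmanteau (whiteness) test: any residual serial dependency
# left in the innovations?
print(results.test_whiteness(nlags=10).summary())

# If the innovations pass as an i.i.d. sample, a historical bootstrap
# is just resampling rows of the residuals with replacement.
eps = results.resid.to_numpy()
rng = np.random.default_rng(0)
boot = eps[rng.integers(0, len(eps), size=1000)]
```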
Secondly, the key added value of VAR is the stratification of causation and "correlation". The autoregressive B matrices capture causation in its purest form: the past affects the future. This is not the causation one is warned not to confuse correlation with: that causation is instantaneous. If need be, it can be somewhat introduced at this level via error-correction (ECM) dynamics, which are a bit more challenging to estimate:
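\Delta X_t = c + \Pi X_{t-1} + \Gamma_1 \Delta X_{t-1} + \dots + \Gamma_{p-1} \Delta X_{t-p+1} + \varepsilon_t

in standard vector error-correction notation, where \Pi encodes the equilibrium relations pulling the components together and the \Gamma_i capture the short-run dynamics.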
Going back to basic VAR: once causation is taken out by the B matrices, the cross-component statistical dependency of the innovation vectors is what captures the residual "correlation". That is the correlation not to be confused with causation. If there is no serial dependency left in the innovations, then you can sample from them directly, or you can estimate a parametric distribution from their sample, given that they are now i.i.d. This is helpful if you want to sample a larger number of innovations than observed, if you want a sample different from the observations, or if you want to use some other parametric analytical methods.
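A minimal sketch of the parametric route, assuming a multivariate Gaussian is acceptable (for financial innovations a heavier-tailed choice, e.g. a multivariate Student-t, may well be needed):

```python
import numpy as np

# eps: T x k array of VAR innovations, i.i.d. after the checks above
mu = eps.mean(axis=0)
cov = np.cov(eps, rowvar=False)

# Parametric sampling: draw as many innovation vectors as needed,
# no longer limited to (or equal to) the observed sample.
rng = np.random.default_rng(42)
sample = rng.multivariate_normal(mu, cov, size=10_000)
```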
As mentioned before, modelling the vector innovations directly using the matrix version of MA may be too daunting. A half-way approach is to first deal with them component-wise, before modelling the cross-component dependency if still necessary. In other words, once you have estimated the VAR component-wise, you can analyse the time series of the implied innovations for each component individually. This may be very useful if the components are diverse, e.g. equity returns vs credit spreads. You may need a GARCH-like approach to handle time clustering of variance in one component's time series but not in the others. Only after that do you re-imply the innovation vectors and see whether you still need to do anything with them.
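A component-wise sketch, assuming the arch package for the univariate GARCH step (any univariate GARCH implementation would do):

```python
import pandas as pd
from arch import arch_model

# resid: pandas DataFrame of VAR innovations, one column per component
standardized = {}
for col in resid.columns:
    # GARCH(1,1) on this component's innovations only
    fit = arch_model(resid[col], vol="GARCH", p=1, q=1,
                     mean="Zero", rescale=False).fit(disp="off")
    # Strip the volatility clustering: standardized residuals should be
    # much closer to i.i.d. for this component.
    standardized[col] = fit.resid / fit.conditional_volatility

# Re-imply the innovation vectors and inspect cross-component dependency
std_eps = pd.DataFrame(standardized)
```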
Dependency via causation, modelled by the B matrices, is the "physics" of the model. It may therefore allow an interpretation far beyond the mere ability to partly explain variance. Furthermore, instead of the linear operators that the B matrices are, you can invent whatever non-linear operator you like, or attempt to estimate it non-parametrically using your favourite universal approximator. This has been done in the past, and it is called non-linear (V)AR. Tests have been invented (BDS) to check for non-linearity in time series. Whether all that adds material value in solving practical dynamic and optimal control problems in math finance remains to be seen.
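For the BDS test, statsmodels ships an implementation; a minimal sketch on a single component's innovations (eps_i is an illustrative name):

```python
from statsmodels.tsa.stattools import bds

# eps_i: 1-D array of innovations for one component
stat, pvalue = bds(eps_i, max_dim=2)
# Small p-values hint at remaining (possibly non-linear) structure
# that the linear VAR has not captured.
print(stat, pvalue)
```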
The chart at the head of the presentation shows the PCA of the innovation correlation for the time series of iTraxx Crossover with different maturities. In the one on the left, the four AR(1) processes are estimated independently and then the correlation between the residuals is PCA'ed. In the one on the right, a four-dimensional VAR(1) process is estimated and the correlation between the residuals is likewise PCA'ed. The VAR-based approach uncovers something that your traditional three yield-curve factors do not.
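A sketch of the two pipelines being compared, assuming the four maturity series sit in a pandas DataFrame X (illustrative name):

```python
import numpy as np
from statsmodels.tsa.api import VAR
from statsmodels.tsa.ar_model import AutoReg

def pca_of_corr(resid):
    # Eigen-decomposition of the residual correlation matrix,
    # eigenvalues sorted in descending order
    vals, vecs = np.linalg.eigh(np.corrcoef(resid, rowvar=False))
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Left panel: independent AR(1) per maturity, then PCA of residual correlation
ar_resid = np.column_stack(
    [AutoReg(X[col], lags=1).fit().resid for col in X.columns]
)

# Right panel: joint VAR(1), then PCA of residual correlation
var_resid = VAR(X).fit(1).resid.to_numpy()

print(pca_of_corr(ar_resid)[0])   # factor variances, component-wise AR route
print(pca_of_corr(var_resid)[0])  # factor variances, VAR route
```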