The Curious Case of Using Correlation to Measure Dependency Between Variables
Boolean Algorithmic Trading
A proprietary quantitative investment management firm managing a hedge fund strategy using AI with statistical methods
Data science is science deployed in a heuristic manner, and depending on how it is deployed, the same data set can yield different answers. Among the most popular yet most often misinterpreted methods are measures of dependence.
The Pearson and Spearman correlation coefficients are the most commonly used measures of the relationship between variables. The Pearson correlation coefficient assesses the linear relationship between variables, while the Spearman correlation coefficient measures the monotonic relationship.
In a linear relationship, the variables change together at a constant rate. In a monotonic relationship, by contrast, the variables tend to move in the same relative direction, but not necessarily at a constant rate. In a non-monotonic relationship, the function is neither increasing over its entire domain nor decreasing over its entire domain, i.e. it increases on some intervals of its domain and decreases on others.
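The three cases above can be illustrated with a minimal sketch. It assumes NumPy and SciPy are available; the simulated data and the variable names (y_monotonic, y_non_monotonic) are illustrative choices, not taken from any real data set.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=2000)

# Assumed toy relationships, chosen only to illustrate the definitions.
y_monotonic = np.exp(x)       # always increasing, but not at a constant rate
y_non_monotonic = x ** 2      # decreasing for x < 0, increasing for x > 0

# Index [0] is the correlation coefficient; index [1] would be the p-value.
pearson_mono = pearsonr(x, y_monotonic)[0]
spearman_mono = spearmanr(x, y_monotonic)[0]
pearson_nonmono = pearsonr(x, y_non_monotonic)[0]
spearman_nonmono = spearmanr(x, y_non_monotonic)[0]

print(f"monotonic:     Pearson={pearson_mono:.3f}  Spearman={spearman_mono:.3f}")
print(f"non-monotonic: Pearson={pearson_nonmono:.3f}  Spearman={spearman_nonmono:.3f}")
```

In this setup Spearman should report a perfect rank relationship for the monotonic case while Pearson understates it, and both coefficients should sit near zero for the non-monotonic case even though y is completely determined by x.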
Pearson’s product-moment correlation coefficient and Spearman’s rank-order correlation coefficient are dominated by the bulk of observations around the mean and thus fail to describe dependence between extreme events. The former quantifies only one particular form of dependency, namely linear dependency. The latter measures broader forms of dependency (including non-linear but monotonic ones), yet it still falls short of giving information about the structure of that relationship.
With a copula, however, you can separate the joint distribution into two contributions: the marginal distributions of each variable by itself, and the copula that combines these into a joint distribution.
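This separation is usually stated as Sklar's theorem: a joint distribution H(x, y) can be written as C(F(x), G(y)), where F and G are the marginals and C is the copula. A minimal sketch of the idea follows, assuming a Gaussian copula and two arbitrarily chosen marginals (a Student-t and an exponential); the parameter values are illustrative assumptions only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
rho = 0.7  # assumed dependence parameter of the Gaussian copula
cov = np.array([[1.0, rho], [rho, 1.0]])

# Step 1: draw correlated standard normals.
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=5000)

# Step 2: map each column through the normal CDF. The result has uniform
# marginals, so it carries only the dependence structure (the copula).
u = stats.norm.cdf(z)

# Step 3: impose any marginals via their inverse CDFs (assumed choices).
x = stats.t.ppf(u[:, 0], df=4)            # heavy-tailed marginal
y = stats.expon.ppf(u[:, 1], scale=2.0)   # skewed, non-negative marginal

# Rank correlation depends only on the copula, not on the marginals.
rank_corr_uniforms = stats.spearmanr(u[:, 0], u[:, 1])[0]
rank_corr_outputs = stats.spearmanr(x, y)[0]
print(rank_corr_uniforms, rank_corr_outputs)
```

Because the inverse CDFs are increasing, swapping in different marginals in step 3 changes the shape of x and y but leaves their rank dependence untouched, which is the sense in which the two contributions are separate.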
In summary, while correlation coefficients measure the overall strength of the association, they give no information about how that strength varies across the distribution. They work best with normally distributed data, but distributions in financial markets are most often non-normal. The copula approach addresses this: it is a useful method for deriving joint distributions given the marginal distributions. Through the choice of copula, one can assess in which parts of the distributions the variables are more strongly associated than in others.
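To make the role of the copula choice concrete, the sketch below compares a Gaussian copula with a Student-t copula built from the same correlation parameter, and estimates how often both variables are extreme together. The helper upper_tail_dependence, the 0.99 threshold, and the parameter values are hypothetical choices made for illustration.

```python
import numpy as np
from scipy import stats

def upper_tail_dependence(u, v, q=0.99):
    """Empirical P(V > q | U > q) for samples on the uniform scale."""
    in_tail = u > q
    return np.mean(v[in_tail] > q)

rng = np.random.default_rng(7)
n, rho, df = 200_000, 0.5, 3  # assumed sample size and copula parameters
cov = np.array([[1.0, rho], [rho, 1.0]])

# Gaussian copula: correlated normals pushed through the normal CDF.
z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
u_gauss = stats.norm.cdf(z)

# Student-t copula: divide the same normals by a shared chi-square mixing
# variable, then push through the t CDF.
w = rng.chisquare(df, size=n) / df
t_samples = z / np.sqrt(w)[:, None]
u_t = stats.t.cdf(t_samples, df=df)

gauss_tail = upper_tail_dependence(u_gauss[:, 0], u_gauss[:, 1])
t_tail = upper_tail_dependence(u_t[:, 0], u_t[:, 1])
print(f"Gaussian copula: {gauss_tail:.3f}   t copula: {t_tail:.3f}")
```

Under these assumptions the t copula should show noticeably more joint extremes than the Gaussian copula, a difference that a single correlation coefficient does not reveal.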
The copula therefore has applications in finance: in portfolio Value-at-Risk (VaR), to map extreme events such as jump risk and default risk, and in option pricing, to capture skewed or asymmetric distributions. It can also help identify spurious correlation observed in the data and improve derivatives pricing models.