Adding an extra model-validity ratio to your regression results.

Keywords: regression, performance, evaluation, passive paralleling, risk, Sharpe, Treynor, hit ratio, volatility, algorithms.

Determining whether your model performed well due to luck or to skill is sometimes hard, and many statistical tests have to be run before we can say that we "may" have something good (i.e. predictive in a constant and consistent manner). Data science and machine learning do not deal only with predicting variables from other variables, but prediction is one of their main disciplines. In this article, we present an intuitive yet very simple ratio that gives an idea of whether the forecasts given by a model are due to luck or to some sort of predictive power. Its inputs are, however, somewhat different from those of the average performance ratio.

Take the Sharpe ratio, for example: it requires a risk-free rate, an average (or final) return on the portfolio, and the standard deviation of those returns. A ratio greater than 1 means that the excess return achieved was indeed worth it, as the risk taken to achieve it was less than one unit of risk per unit of excess return. Another example is the Treynor ratio, which is the same as the Sharpe ratio but uses beta instead of the standard deviation, and hence measures systematic risk as opposed to total risk. So, what is the Passive Paralleling ratio I am presenting?
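For reference, these two classic ratios can be sketched in a few lines of Python. The return series, risk-free rate, and beta below are illustrative assumptions, not data from this article:

```python
import numpy as np

def sharpe_ratio(returns, risk_free=0.0):
    # Excess return per unit of total risk (standard deviation of returns)
    excess = np.mean(returns) - risk_free
    return excess / np.std(returns)

def treynor_ratio(returns, beta, risk_free=0.0):
    # Excess return per unit of systematic risk (beta versus the market)
    excess = np.mean(returns) - risk_free
    return excess / beta

# Hypothetical daily returns, purely for illustration
rng = np.random.default_rng(0)
returns = rng.normal(0.001, 0.01, 250)

print(sharpe_ratio(returns, risk_free=0.0002))
print(treynor_ratio(returns, beta=1.2, risk_free=0.0002))
```

Note that a daily Sharpe ratio is usually annualized (multiplied by the square root of the number of trading days) before being compared to the "greater than 1" rule of thumb.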

Intuition

Imagine the simple example of trying to predict random-like financial variables such as equity returns. You have a model you think is good, but it has yet to prove its worth. This model has the potential to work across different market regimes, so you decide to test it over several periods (for simplicity, we will select short ones). You choose to:

  • Train the model from 2000-2001 and test on 2002
  • Train the model from 2002-2003 and test on 2004
  • Train the model from 2004-2005 and test on 2006
  • Train the model from 2006-2007 and test on 2008
  • Train the model from 2008-2009 and test on 2010
  • Train the model from 2010-2011 and test on 2012
  • Train the model from 2012-2013 and test on 2014
  • Train the model from 2014-2015 and test on 2016
  • Train the model from 2016-2017 and test on 2018

For each test year, you want to check your hit ratio, i.e. whether your bets were correct more than 50% of the time (assuming daily long-short trade signals throughout each year). It is extremely easy to get either 45% or 55% accuracy due to luck, but we want a model that consistently gets it right more than 50% of the time; after all, if it did not, we might as well toss a coin each day. Let us benchmark our results against a passive strategy that buys on every trading day (100% long), compare the hit ratio of our model with the hit ratio of the passive strategy, and average the results over the testing periods.

We can take the average of the hit ratios obtained from our model and compare it to the average hit ratio of a passive strategy that is either 100% long or 100% short all of the time.
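The comparison above can be sketched as follows. The signal and return arrays are made up for illustration; a real test would use the out-of-sample years listed earlier:

```python
import numpy as np

def hit_ratio(signals, returns):
    # Fraction of days where the trade direction matched the sign
    # of the realized return
    return np.mean(np.sign(signals) == np.sign(returns))

# Hypothetical daily data: +1 = long signal, -1 = short signal
model_signals = np.array([1, -1, 1, 1, -1, -1, 1, -1])
realized = np.array([0.5, -0.2, -0.1, 0.3, -0.4, 0.2, 0.1, -0.3])

long_only = np.ones_like(model_signals)   # buys every trading day
short_only = -long_only                   # sells every trading day

print(hit_ratio(model_signals, realized))  # model accuracy
print(hit_ratio(long_only, realized))      # passive long benchmark
print(hit_ratio(short_only, realized))     # passive short benchmark
```

In a full evaluation, `hit_ratio` would be computed once per test year for each strategy and the per-year results averaged, as described above.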

Data necessary for calculation

The first step, after calculating the hit ratio (correct forecasts divided by total trades taken) of each strategy (long-only, model, and short-only), is to average these hit ratios over the specified test periods. The model's average hit ratio is then compared to the long-only average, and the same is done on the short side.

Let us assume the following simulated results:

(Table: simulated hit ratios per test year for the model, the long-only strategy, and the short-only strategy.)

The model was correct 53.56% of the time on average, while going long every day would have been correct 51% of the time (and going short every day 49% of the time).

Formula

The formula below is not written in exact mathematical form, but it should sum up what we are trying to evaluate. A ratio greater than 1 indicates model superiority (a decent ratio is around 1.05 and a really good one is around 1.10 - 1.15), while a ratio less than 1 indicates a bad model. Note that the short side's hit ratio is simply 1 minus the long side's hit ratio (assuming no zero returns), which acts as a deflator of the results.

(Image: the Passive Paralleling ratio formula, comparing the model's average hit ratio to the average hit ratios of the passive long-only and short-only benchmarks.)

As said, intensive statistical tests and other techniques still have to be applied to check whether the model is good. The Passive Paralleling ratio is a quick shortcut (although not that quick) and a sort of sanity check on the model's average predictive power, which also serves as a gauge of the model's stability over time. It answers the following question:

Have we been consistently providing forecasts that add value or are we better-off going long or going short on all of our trades?

