Measuring the value-added of algorithmic trading strategies

Standard performance statistics are insufficient and potentially misleading for evaluating algorithmic trading strategies. Metrics based on prediction errors mistakenly assume that all errors matter equally. Metrics based on classification accuracy disregard the magnitudes of errors. And traditional performance ratios, such as Sharpe, Sortino, and Calmar, are affected by factors outside the algorithm, such as asset class performance, and rely on the assumption of normally distributed returns. Therefore, a new paper proposes a discriminant ratio (‘D-ratio’) that measures an algorithm's success in improving risk-adjusted returns versus a related buy-and-hold portfolio. Roughly speaking, the metric divides the strategy's annual return by a value-at-risk measure that does not rely on normality, and then divides this ratio by the equivalent ratio for the buy-and-hold portfolio. The metric can be decomposed into the contributions of return enhancement and risk reduction.

For the full post and references to the underlying paper, please view the Systemic Risk and Systematic Value site.

Popular algorithm performance metrics

“We reviewed 190 articles presenting either several ML and DL algorithms aiming at predicting future asset returns or RL algorithms proposing investment strategies. The performance metrics found in the analysed articles are very diverse…

Error-based metrics estimate the performance of an algorithm by measuring the error in prediction between the effective return computed ex-post and the value predicted by the algorithm. These metrics include mean squared error (MSE), mean absolute error (MAE) and evolutions thereof…
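As an illustration of this class of metrics, here is a minimal sketch (not the authors' code) that computes MSE, RMSE and MAE for hypothetical predicted versus realized daily returns:

```python
# Minimal sketch (hypothetical data, not from the paper): error-based metrics
# for a return forecast.
import numpy as np

y_true = np.array([0.012, -0.004, 0.007, -0.011, 0.003])   # realized returns
y_pred = np.array([0.008, 0.001, 0.005, -0.006, -0.002])   # model forecasts

errors = y_pred - y_true
mse = np.mean(errors ** 2)        # mean squared error
mae = np.mean(np.abs(errors))     # mean absolute error
rmse = np.sqrt(mse)               # root mean squared error

print(f"MSE={mse:.6f}  RMSE={rmse:.6f}  MAE={mae:.6f}")
```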

Accuracy-based metrics measure the accuracy of the class assigned by the algorithm to the predicted return compared to the class of the effective return computed ex-post. The classification can be binary with two classes (positive expected return vs negative expected return, or investment vs no investment) or more complex…These metrics are based on confusion matrices…and include…accuracy, F1, precision or recall.
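A similarly minimal sketch, again with hypothetical data, derives the confusion-matrix counts from the sign of predicted and realized returns and computes accuracy, precision, recall and F1:

```python
# Minimal sketch (hypothetical data): classification metrics on the sign of the
# predicted vs. realized return.
import numpy as np

y_true = np.array([0.012, -0.004, 0.007, -0.011, 0.003])
y_pred = np.array([0.008, 0.001, 0.005, -0.006, -0.002])

true_up = y_true > 0          # "positive return" class, ex post
pred_up = y_pred > 0          # class assigned by the algorithm

tp = np.sum(pred_up & true_up)      # predicted up, was up
fp = np.sum(pred_up & ~true_up)     # predicted up, was down
fn = np.sum(~pred_up & true_up)     # predicted down, was up
tn = np.sum(~pred_up & ~true_up)    # predicted down, was down

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

print(f"accuracy={accuracy:.2f}  precision={precision:.2f}  recall={recall:.2f}  F1={f1:.2f}")
```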

Investment-based metrics measure the results derived from an investment strategy proposed by the algorithm with buy-hold-sell signals. These metrics can be subdivided into [two types].

  • Result-based metrics measure either the monetary results, the realized return or the risk supported to generate the return (volatility, maximum drawdown, etc.) but do not adjust one by the other.
  • Risk-adjusted return-based metrics…also referred to as risk/return-based metrics, consider simultaneously the return and the risk of the investment strategy and measure how efficient the algorithm is at generating a return under the constraint of risk and at optimizing the risk/return profile. Metrics primarily differ by the way they assess the risk. This class of metrics includes the Sharpe, Sortino or Calmar ratios (illustrated in the sketch after this list)…”
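The sketch below, under common annualization conventions (252 trading days and a zero risk-free rate, as in the paper's empirical exercise), illustrates how Sharpe, Sortino and Calmar ratios can be computed from a series of daily strategy returns:

```python
# Minimal sketch (assumed conventions): annualized Sharpe, Sortino and Calmar
# ratios from daily strategy returns, zero risk-free rate.
import numpy as np

def sharpe_sortino_calmar(daily_returns, periods_per_year=252):
    r = np.asarray(daily_returns, dtype=float)
    ann_return = np.mean(r) * periods_per_year
    ann_vol = np.std(r, ddof=1) * np.sqrt(periods_per_year)
    downside = r[r < 0]
    ann_downside_vol = np.std(downside, ddof=1) * np.sqrt(periods_per_year)

    # maximum drawdown of the cumulative wealth curve
    wealth = np.cumprod(1 + r)
    peak = np.maximum.accumulate(wealth)
    max_drawdown = np.max(1 - wealth / peak)

    sharpe = ann_return / ann_vol
    sortino = ann_return / ann_downside_vol
    calmar = ann_return / max_drawdown
    return sharpe, sortino, calmar

rng = np.random.default_rng(0)
print(sharpe_sortino_calmar(rng.normal(0.0004, 0.01, 1260)))  # 5 years of synthetic data
```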

Why popular performance metrics are misleading

“Error-based metrics are among the most popular ones with 187 occurrences in the 190 reviewed articles. Error-based metrics are used in any domain as soon as regressions are involved, but for the specific task considered, error-based metrics suffer from two severe weaknesses:

  • While they can easily be applied to regression algorithms, they are less applicable with classification algorithms and are inapplicable with reinforcement learning, making a comparison between several types of algorithms impossible
  • Error-based metrics will equally consider all errors and will not differentiate an error that triggers a bad decision (a mis-investment resulting in a negative return or a missed opportunity with no investment when the asset has led to a positive return) from an error that has no adverse consequence, leading to a positive return or to a non-investment that avoided a negative return…All errors are not equal; error-based metrics miss this critical element. Error-based metrics could lead to severe misevaluation of the performance of algorithms.”

“Accuracy-based metrics…focus on a different criterion: the right or wrong classification or the right or wrong investment decision. But accuracy-based metrics might miss the magnitude of the relative gain from a good decision versus the magnitude of a loss from a bad decision.”

Insights from an empirical exercise

“We prove the inefficiency of the error-based and accuracy-based metrics…We apply several AI regression algorithms: (i) multi-layer perceptron (MLP), (ii) Long Short-Term Memory neural networks (LSTM), (iii) residual neural networks (ResNet), (iv) Support Vector Machine (SVM) and (v) a decision tree-based algorithm ‘eXtreme Gradient Boosting’ (XGB) to 28 stocks of the Dow Jones. We use different hyper-parameters with each algorithm to generate 980 series of daily returns. We use a 20-year history of daily prices: 15 years are used to train our algorithms and 5 years (1260 days) for testing as out-of-sample data.”

“We compute the MSE, RMSE, MAE (mean absolute error) and MAPE (mean absolute percentage error) of the regressions. We benchmark each of the 980 series with the ‘back-trading’ of a perfectly informed agent that invests when the return is positive and doesn’t invest when the return is negative or zero. We compute R, R2, accuracy, F1, precision & recall and the Matthews correlation coefficient.”

“We apply the following investment strategy: if the predicted return of the next day is positive, we invest for one day, otherwise we take no open position. In each case, the model integrates direct transaction costs of 0.10% per transaction applied to the value of the transaction. From that investment strategy and assuming a risk-free rate of 0.0%, we compute the annual return (RoI), the volatility (Vol), the yearly maximum drawdown (MDD) in percentage of the investment and the Sharpe, Sortino and Calmar ratios.
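A minimal sketch of this trading rule, using hypothetical return series rather than the study's data, applies the one-day holding rule and a 0.10% cost whenever the position changes:

```python
# Minimal sketch (assumed mechanics, not the authors' code): invest for one day
# when the predicted return is positive, stay flat otherwise, charging 0.10% of
# transaction value on each position change.
import numpy as np

def strategy_returns(realized, predicted, cost=0.001):
    realized = np.asarray(realized, dtype=float)
    position = (np.asarray(predicted) > 0).astype(float)   # 1 = invested, 0 = flat
    prev_position = np.concatenate(([0.0], position[:-1])) # yesterday's position
    trades = np.abs(position - prev_position)              # 1 when the position changes
    return position * realized - cost * trades

rng = np.random.default_rng(1)
realized = rng.normal(0.0003, 0.012, 1260)                  # synthetic realized returns
predicted = realized + rng.normal(0.0, 0.012, 1260)         # noisy forecast of them
daily = strategy_returns(realized, predicted)
print(f"annualized RoI ≈ {np.mean(daily) * 252:.2%}")
```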

With the error-based metrics, we expect a negative correlation with the return, Sharpe, Sortino and Calmar ratios: the lower the error, the better the expected result. Against expectations for efficient metrics, correlations are positive, except between MAPE and the risk/return performance metrics, but not significantly different from 0 at the 5% significance level, as illustrated by the p-values. MAPE is the only metric whose correlation is negative and significantly so.”
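The kind of test described above can be sketched as follows, using purely synthetic per-series metric values (not the paper's results) and a Pearson correlation with its p-value:

```python
# Minimal sketch (synthetic data): correlating an error metric with a risk-adjusted
# performance metric across many strategy series, with a p-value for the correlation.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(2)
mape_per_series = rng.uniform(0.5, 2.0, 980)        # stand-in error-metric values
sharpe_per_series = rng.normal(0.2, 0.5, 980)       # stand-in Sharpe ratios

corr, p_value = pearsonr(mape_per_series, sharpe_per_series)
print(f"corr={corr:.3f}  p-value={p_value:.3f}")    # expect ≈ 0 for unrelated data
```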

The issues with Sharpe and Sortino ratios

Sharpe and Sortino ratios suffer from two important issues…

  • The Sharpe ratio quantifies risk using the standard deviation of excess returns, and Sortino by using the standard deviation of the negative excess returns. They assume that returns are normally distributed, with no skewness and a kurtosis around 3. If a portfolio’s return does not follow a Gaussian distribution, then the classical return volatility is no longer an effective measure of risk, and these ratios could underestimate the risk…
  • [Sharpe and Sortino ratios] do not allow the performance of different algorithms to be compared over different assets or over different time periods. The results of Sharpe and Sortino are influenced by the return of the underlying asset.

Proposal for a new algorithm performance metric

“The objective of…[trading] algorithms…is to optimize the expected return of investments under the constraint of the risks generated by the investment. Our analysis will therefore focus on the ability of metrics to provide a good proxy for the ability of an algorithm to achieve the objective of improving the risk-adjusted return.”

“We propose a new performance metric that improves the risk measurement and which has the ability to compare the efficiency of algorithms over time and across assets.

  • The use of Cornish-Fisher Value-at-Risk significantly improves the measure of risk, compared to the volatility of returns…Value-at-risk (VaR)…offers a way to address skewness and kurtosis of the asset returns distribution with the Cornish-Fisher expansion (CF expansion). The CF expansion accounts for the 4 moments of the distribution: the return, the volatility, the skewness and the kurtosis. It offers an easily implementable parametric form that improves risk measurement. Cornish-Fisher VaR (CF-VaR) is an effective and easy-to-implement approach to dealing with non-Gaussian distributions…There is no easy and parametric way to improve the risk measurement [beyond] CF-VaR. (A minimal sketch of the CF-VaR calculation follows this list.)
  • If we combine the asset return with the CF-VaR, we can easily define a return-to-VaR ratio equal to return divided by CF-VaR. This ratio outperforms Sharpe, Sortino or Calmar ratios as it better captures the effective risk accepted to generate the effective return.
  • We propose to define a new risk-adjusted return ratio as ‘Discriminant ratio’ or ‘D-ratio’ (D), which solely focuses on the added value of the algorithm. To achieve this objective, we divide the Return-to-VaR ratio of the algorithm by the Return-to-VaR ratio of the Buy & Hold. If the D-ratio is greater than 1, the algorithm outperforms the Buy & Hold strategy; if the D-ratio is smaller than 1, it underperforms the Buy & Hold strategy.
  • To adequately address the situation where the return of the buy & hold and the return of the algorithm are of opposite sign…We propose to correct our D-ratio for the difference in sign by [subtracting the risk-adjusted return ratio of the buy-and-hold strategy from that of the algo strategy and dividing the difference by the absolute value of the risk-adjusted return ratio of the buy-and-hold strategy].
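Before turning to the overall formula below, here is a minimal sketch of the Cornish-Fisher VaR, using the standard fourth-order Cornish-Fisher adjustment of the Gaussian quantile; the confidence level alpha is an assumption, not taken from the paper:

```python
# Minimal sketch (assumed formulation): Cornish-Fisher VaR of a daily return series.
# The expansion adjusts the Gaussian quantile for observed skewness and excess kurtosis.
import numpy as np
from scipy.stats import norm, skew, kurtosis

def cornish_fisher_var(returns, alpha=0.99):
    r = np.asarray(returns, dtype=float)
    mu, sigma = np.mean(r), np.std(r, ddof=1)
    s = skew(r)                          # third standardized moment
    k = kurtosis(r)                      # excess kurtosis (0 for a Gaussian)
    z = norm.ppf(1 - alpha)              # left-tail Gaussian quantile, e.g. -2.33
    z_cf = (z
            + (z**2 - 1) * s / 6
            + (z**3 - 3 * z) * k / 24
            - (2 * z**3 - 5 * z) * s**2 / 36)
    return -(mu + z_cf * sigma)          # reported as a positive loss figure

rng = np.random.default_rng(3)
print(cornish_fisher_var(rng.standard_t(df=4, size=1260) * 0.01))  # fat-tailed example
```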

The overall formula is:

D-ratio = 1 + (R[algo] – R[B&H]) / Abs(R[B&H])

where

R[algo] is the risk-adjusted return ratio (return-to-VaR) of the algorithm

R[B&H] is the risk-adjusted return ratio of the buy-and-hold strategy
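A minimal sketch of the sign-corrected D-ratio, wiring together the return-to-VaR ratios of the algorithm and the buy-and-hold benchmark; it reuses cornish_fisher_var from the sketch above, and the example series are purely synthetic:

```python
# Minimal sketch (assumed wiring): D-ratio built from return-to-VaR ratios.
import numpy as np

def return_to_var(daily_returns, periods_per_year=252, alpha=0.99):
    ann_return = np.mean(daily_returns) * periods_per_year
    return ann_return / cornish_fisher_var(daily_returns, alpha)

def d_ratio(algo_returns, bh_returns):
    r_algo = return_to_var(algo_returns)        # R[algo]
    r_bh = return_to_var(bh_returns)            # R[B&H]
    return 1 + (r_algo - r_bh) / abs(r_bh)      # sign-corrected D-ratio

rng = np.random.default_rng(4)
bh = rng.normal(0.0003, 0.012, 1260)                  # buy-and-hold daily returns
algo = np.where(rng.random(1260) < 0.55, bh, 0.0)     # stylized strategy, invested ~55% of days
print(f"D-ratio ≈ {d_ratio(algo, bh):.2f}")
```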

The D-ratio can be decomposed to assess whether the added value of the algorithm is more linked to the improved expected return or to the risk reduction ability.

D-ratio = D-return ratio * D-VaR ratio

where

D-return ratio = D-ratio / D-VaR ratio

D-VaR ratio = CF-VaR[B&H] / CF-VaR[algo]

[The] D-return ratio evaluates the ability of the algorithm to increase the expected return. If D-return is above 1, the algorithm outperforms the buy & hold strategy for its expected return. Otherwise, the Buy & Hold strategy is return-wise more efficient than the algorithm.

If D-VaR is above 1, the algorithm outperforms the buy & hold strategy for its risk management, as the CF-VaR of the Buy & Hold is greater than the CF-VaR of the algorithm.
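The decomposition can be sketched as follows, reusing cornish_fisher_var and d_ratio from the previous sketches; by construction the product of the two components reproduces the D-ratio:

```python
# Minimal sketch (assumed wiring): decomposing the D-ratio into return and risk components.
def d_var_ratio(algo_returns, bh_returns, alpha=0.99):
    # >1 means the algorithm carries less CF-VaR than buy & hold
    return cornish_fisher_var(bh_returns, alpha) / cornish_fisher_var(algo_returns, alpha)

def d_return_ratio(algo_returns, bh_returns):
    # >1 means the algorithm's added value comes from return enhancement
    return d_ratio(algo_returns, bh_returns) / d_var_ratio(algo_returns, bh_returns)

d = d_ratio(algo, bh)   # series from the previous sketch
print(d, d_return_ratio(algo, bh) * d_var_ratio(algo, bh))   # the two values should match
```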
