Backtesting a Trading Strategy
We have spent much effort earlier convincing you that you should backtest every strategy that comes your way before trading it. Why, then, would we recommend against backtesting some strategies? The fact is that some published strategies are so obviously flawed that it would be a waste of time to even consider them. Given what you now know about the common pitfalls of backtesting, you are in a good position to judge whether a strategy is worth backtesting without even knowing its details. We will look at a few examples here.
Example 1: A strategy that has a backtest annualized return of 30 percent, a Sharpe ratio of 0.3, and a maximum drawdown duration of two years.
Very few traders (as opposed to “investors”) have the stomach for a strategy that remains “under water” for two years. The low Sharpe ratio coupled with the long drawdown duration indicates that the strategy is not consistent. The high average return may be just a fluke, and it is not likely to repeat itself when we start to trade the strategy live. Another way to say this is that the high return is likely the result of data-snooping bias, and the long drawdown duration makes it unlikely that the strategy will pass a cross-validation test. Do not bother to backtest strategies with a high return but a low Sharpe ratio. Also, do not bother to backtest strategies with a maximum drawdown duration longer than what you or your investors can possibly endure.
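To make the two screening statistics in Example 1 concrete, here is a minimal Python sketch of how an annualized Sharpe ratio and a maximum drawdown duration might be computed from a series of per-period returns. The function names, the 252-periods-per-year convention, and the zero risk-free rate default are assumptions for illustration, not something specified above.

```python
import math
from statistics import mean, stdev


def annualized_sharpe(returns, periods_per_year=252, risk_free_per_period=0.0):
    """Annualized Sharpe ratio of a list of per-period returns.

    Assumes returns are simple (not log) returns and that the risk-free
    rate is quoted per period; both are illustrative assumptions.
    """
    excess = [r - risk_free_per_period for r in returns]
    volatility = stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("returns have zero volatility; Sharpe ratio is undefined")
    return math.sqrt(periods_per_year) * mean(excess) / volatility


def max_drawdown_duration(returns):
    """Longest run of consecutive periods spent below a previous equity high."""
    equity = 1.0
    high_water_mark = 1.0
    current_run = 0
    longest_run = 0
    for r in returns:
        equity *= 1.0 + r
        if equity >= high_water_mark:
            # New high (or back to the old one): the drawdown is over.
            high_water_mark = equity
            current_run = 0
        else:
            current_run += 1
            longest_run = max(longest_run, current_run)
    return longest_run
```

With daily returns, a result of roughly 504 from `max_drawdown_duration` would correspond to the two years "under water" described in Example 1.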
Example 2: A long-only crude oil futures strategy returned 20 percent in 2007, with a Sharpe ratio of 1.5.
A quick check of the total return of holding the front-month crude oil futures in 2007 reveals that it was 47 percent, with a Sharpe ratio of 1.7. Hence, this trading strategy is not in any way superior to a simple buy-and-hold strategy! Moral of the story: We must always choose the appropriate benchmark to measure a trading strategy against. The appropriate benchmark for a long-only strategy is the return of a buy-and-hold position, which means judging it by the information ratio rather than the Sharpe ratio.
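As a rough illustration of the benchmark comparison, the sketch below computes an information ratio: the mean of the strategy's returns in excess of the benchmark, divided by the standard deviation of those excess returns, then annualized. The function name and the 252-periods-per-year default are assumptions made for this example.

```python
import math
from statistics import mean, stdev


def information_ratio(strategy_returns, benchmark_returns, periods_per_year=252):
    """Annualized information ratio of a strategy against a benchmark.

    Both inputs are per-period simple returns covering the same dates.
    """
    if len(strategy_returns) != len(benchmark_returns):
        raise ValueError("strategy and benchmark must cover the same periods")
    # Active return: what the strategy earned beyond simply holding the benchmark.
    active = [s - b for s, b in zip(strategy_returns, benchmark_returns)]
    tracking_error = stdev(active)
    if tracking_error == 0:
        raise ValueError("zero tracking error; information ratio is undefined")
    return math.sqrt(periods_per_year) * mean(active) / tracking_error
```

A long-only strategy that earns less than buy-and-hold on average, as in Example 2, comes out with a negative information ratio here even though its Sharpe ratio looks respectable.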
Example 3: A simple “buy-low-sell-high” strategy picks the 10 lowest priced stocks at the beginning of the year and holds them for a year. The backtest return in 2001 is 388 percent.
The first question that should come to mind upon reading this strategy is: Was the strategy backtested using a survivorship-bias-free stock database? In other words, does the stock database include those stocks that have since been delisted? If the database includes only stocks that have survived until today, then the strategy will most likely pick those lucky survivors that happened to be very cheap at the beginning of 2001. With the benefit of hindsight, the backtest can, of course, achieve a 388 percent return. In contrast, if the database includes delisted stocks, then the strategy will most likely pick those stocks to form the portfolio, resulting in almost 100 percent loss. This 100 percent loss would be the realized return if we had traded the strategy back in 2001, and the 388 percent return is an inflated backtest return that can never be realized. If the author did not specifically mention that the data used include delisted stocks, then we can assume the backtest suffers from survivorship bias and the return is likely to be inflated.
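The effect of survivorship bias on a "lowest priced stocks" rule can be seen in a toy sketch. The six-stock universe below is entirely made up for illustration, and treating a delisted stock as a total loss (an end price of 0.0) is a simplifying assumption. The same selection rule is run once on the full universe and once on the survivors only.

```python
from statistics import mean

# Hypothetical universe: (ticker, price at start of year, price at end of year).
# An end price of 0.0 stands in for a stock delisted during the year, which is
# a simplifying assumption (the delisting is treated as a total loss).
UNIVERSE = [
    ("AAA", 0.50, 0.00),
    ("BBB", 0.75, 0.00),
    ("CCC", 1.00, 5.00),
    ("DDD", 2.00, 3.00),
    ("EEE", 20.00, 22.00),
    ("FFF", 35.00, 30.00),
]


def lowest_priced_return(universe, n):
    """Equal-weighted one-year return of the n lowest-priced stocks."""
    picks = sorted(universe, key=lambda row: row[1])[:n]
    return mean(end / start - 1.0 for _, start, end in picks)


# A database that kept only the stocks still listed at year end.
survivors_only = [row for row in UNIVERSE if row[2] > 0.0]

biased_return = lowest_priced_return(survivors_only, 2)
realistic_return = lowest_priced_return(UNIVERSE, 2)
print(f"survivors only: {biased_return:.0%}, full universe: {realistic_return:.0%}")
```

On the survivors-only data the rule picks the cheap stocks that happened to recover, while on the full universe it picks the ones that were delisted, which is the gap between the inflated backtest return and the realized return described above.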
When Should You Avoid Backtesting a Trading Strategy?
Example 4: A neural net trading model that has about 100 nodes generates a backtest Sharpe ratio of 6.
My alarms always go off whenever I hear the term neural net trading model, not to mention one that has 100 nodes. All you need to know about the nodes in a neural net is that the number of parameters to be fitted with in-sample training data is proportional to the number of nodes. With at least 100 parameters, we can certainly fit the model to any time series we want and obtain a fantastic Sharpe ratio. Needless to say, it will have little or no predictive power going forward due to data-snooping bias.
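To get a feel for how quickly the parameter count grows, the sketch below counts the weights and biases of a fully connected feed-forward network. The architecture (one hidden layer, every node with a bias term) and the input sizes are assumptions chosen for illustration; the example in the text only says the model has about 100 nodes.

```python
def count_parameters(layer_sizes):
    """Number of weights and biases in a fully connected feed-forward net.

    layer_sizes lists the number of nodes per layer, inputs first,
    e.g. [10, 100, 1] for 10 inputs, 100 hidden nodes, and 1 output.
    """
    return sum(
        (n_in + 1) * n_out  # n_in weights plus one bias per receiving node
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical model: 10 input features, 100 hidden nodes, 1 output signal.
n_params = count_parameters([10, 100, 1])
trading_days_per_year = 252  # assumed size of one year of daily in-sample data
print(n_params, "parameters vs", trading_days_per_year, "daily observations per year")
```

For a fixed number of inputs, the count grows in proportion to the number of hidden nodes, and under these assumptions it exceeds the number of daily observations in several years of data.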
Example 5: A high-frequency E-mini S&P 500 futures trading strategy has a backtest annual average return of 200 percent and a Sharpe ratio of 6. Its average holding period is 50 seconds.
Can we really backtest a high-frequency trading strategy? The performance of a high-frequency trading strategy depends on the order types used and the execution method in general. Furthermore, it depends crucially on the market microstructure. Even if we have historical data of the entire order book, the profit from a high-frequency strategy is still very dependent on the reactions of other market participants. One has to question whether there is a “Heisenberg uncertainty principle” at work: The act of placing or executing an order might alter the behavior of the other market participants. So be very skeptical of a so-called backtest of a high-frequency strategy.
Because life is too short to backtest every strategy we hear about, we believe that knowing the typical backtesting pitfalls will help you choose which strategies to backtest.