Past performance, so they say, is no guide to the future. However, we don’t have a data set from the future, so we usually have to make do with historical data. It’s not perfect, but it’s better than nothing.
The aims of quantitative analysis are, simultaneously, to achieve attractive investment returns and to avoid losses. Most investors prefer a relatively smooth ride, with no nasty surprises.
With a good understanding of the limitations of historical data, we can learn a lot from it. But most humans have a cognitive bias towards expecting current trends to continue. In very simplistic terms, people whose only experience of a particular investment is of it going up will discount the possibility of it going down.
So a historical return pattern that meets our requirements is useful to demonstrate that a manager or investment has the potential to perform, but by itself it is not sufficient to persuade us that it WILL perform.
After all, it is far too easy to fake investment prowess through cherry-picking or just plain luck. So peer comparisons are vital to discover whether, given the particular strategy and market, the manager performed well, or at least better than his peers.
What is enough data? Sometimes this is not obvious. For a diversified short-term hedged equity manager, getting the bet right 55% of the time over a year might be enough, because he might have placed hundreds or thousands of trades, so the average is drawn from a large sample and a small edge becomes statistically distinguishable from luck. Conversely, a macro manager might have only one theme in a year, and he would need a much higher hit rate to avoid volatile returns.
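To make the sample-size point concrete, here is a minimal sketch in Python, using the normal approximation to the binomial; the hit rates and trade counts are illustrative, not drawn from any real manager:

```python
import math

def hit_rate_zscore(hits: int, trades: int, null_p: float = 0.5) -> float:
    """Z-score of an observed hit rate against a coin-flip null,
    using the normal approximation to the binomial. The standard
    error shrinks like 1/sqrt(trades), so a small edge needs a
    large sample before it stands out from luck."""
    observed = hits / trades
    se = math.sqrt(null_p * (1 - null_p) / trades)
    return (observed - null_p) / se

# A 55% hit rate over 1,000 trades is about 3 standard errors
# above chance, which is hard to dismiss as luck.
print(f"1,000 trades: z = {hit_rate_zscore(550, 1000):.2f}")  # ~3.16

# The same 55% over 20 bets is well inside the noise.
print(f"   20 trades: z = {hit_rate_zscore(11, 20):.2f}")     # ~0.45
```

The same 55% edge that is statistically compelling over a thousand trades is indistinguishable from noise over twenty, which is why the manager with one theme a year needs a far higher hit rate.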
Data analysis can provide important clues as to how, when and why managers placed trades. Did he have the right positions for the wrong reasons? What are the real sensitivities of the portfolio – and was the manager open and honest in telling you what they are?
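One common way to check stated sensitivities against realised ones is a simple factor regression of the manager’s returns on candidate market factors. The sketch below is an illustration under assumptions, not a complete methodology: it assumes you have aligned monthly return series, and the data here is simulated.

```python
import numpy as np

def factor_betas(manager_returns, factor_matrix):
    """Estimate realised factor sensitivities by ordinary least
    squares: manager return ~ alpha + betas . factor returns."""
    n = len(manager_returns)
    X = np.column_stack([np.ones(n), factor_matrix])  # first column = alpha
    coefs, *_ = np.linalg.lstsq(X, manager_returns, rcond=None)
    return coefs[0], coefs[1:]  # (alpha, betas)

# Illustrative data: 36 months of returns for a manager whose true
# exposures are 0.6 to factor 1 and -0.2 to factor 2.
rng = np.random.default_rng(0)
factors = rng.normal(0.0, 0.03, size=(36, 2))
manager = 0.002 + factors @ np.array([0.6, -0.2]) + rng.normal(0.0, 0.01, 36)

alpha, betas = factor_betas(manager, factors)
print(f"alpha = {alpha:.4f}, betas = {np.round(betas, 2)}")
# A manager who claims to be market-neutral but shows a large
# equity beta here has some explaining to do.
```

In practice you would regress against real factor series (equity indices, credit spreads, rates) and look hard at any beta that contradicts the manager’s stated style.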
Is the manager concealing higher volatility with structural smoothing, or with monthly reporting that hides intra-month swings? If Value at Risk (VaR) tells us that 98% of the time the investment will stay within a given loss bound, what happens in the 2% of the time that it doesn’t? Sometimes the data that isn’t there tells us as much as what is there.
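Both concerns lend themselves to quick checks on the return series itself. A minimal sketch, again with simulated data: historical VaR with the expected shortfall beyond it (what actually happens in the 2%), and a lag-1 autocorrelation, whose persistence above zero is a classic symptom of return smoothing.

```python
import numpy as np

def var_and_shortfall(returns, level=0.98):
    """Historical VaR at the given confidence level, plus the
    expected shortfall: the average loss on the occasions the
    VaR threshold is breached."""
    losses = -np.asarray(returns)
    var = np.quantile(losses, level)
    return var, losses[losses >= var].mean()

def lag1_autocorrelation(returns):
    """Lag-1 autocorrelation of the return series. Genuinely
    marked-to-market returns show little serial correlation;
    persistently positive values suggest smoothing."""
    r = np.asarray(returns)
    return np.corrcoef(r[:-1], r[1:])[0, 1]

# Ten years of simulated monthly returns, for illustration only.
rng = np.random.default_rng(1)
monthly = rng.normal(0.008, 0.02, 120)

var98, es98 = var_and_shortfall(monthly)
print(f"98% VaR: {var98:.2%}  expected shortfall beyond it: {es98:.2%}")
print(f"lag-1 autocorrelation: {lag1_autocorrelation(monthly):.2f}")
```

Neither number proves anything by itself, but a fat expected shortfall or a strongly positive autocorrelation tells you exactly where to start digging.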