The doubts swirling around academic finance mirror a broader “replication crisis” in scientific research that dates back to the mid-2000s. If a study’s findings are valid, then other researchers should be able to duplicate them. In fields including psychology and medicine, it turns out that many papers don’t pass the test. The same scrutiny applied to financial research has been no less forgiving.
“Our field is not ‘special’ compared to other fields and does not warrant a free pass,” Campbell Harvey, a professor at Duke University and former editor of the top-tier Journal of Finance, wrote in a paper this August titled “Be Skeptical of Asset Management Research.” He concludes that half the empirical research findings in finance – numbering more than 400 supposedly market-beating factors – are false.
Harvey, who is also an adviser to asset-management companies including Research Affiliates LLC, pins the blame on the distorting power of incentives. Authors need to publish to be promoted, tenured and paid more. That encourages them to tweak their choices around data and methodology to produce eye-catching findings that pass the test of statistical significance — a practice known as “p-hacking” in the jargon of statisticians.
A result that is statistically significant isn’t necessarily meaningful or reliable. The p-value, for which a common significance threshold is 0.05, merely describes how likely a result at least as extreme as the one observed would be if pure chance were at work. The more you test and tweak your variables, the greater the likelihood you will eventually produce a finding that looks convincing but is actually just a fluke. In a 2015 article, FiveThirtyEight carried an interactive graphic that vividly demonstrated the absurdity of the p-hacking business. The chart examines the question of whether the U.S. economy performs better when Republicans or Democrats are in power, using data going back to 1948. The joke is that readers can produce a statistically significant result (to a p-value of 0.05, enough for inclusion in an academic journal) showing that either hypothesis is correct, depending on which variables they select.
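The arithmetic behind this fluke factory is simple multiple-testing math. As a rough sketch (the function names and simulation parameters below are illustrative, not from the column): if each test on pure noise has a 5% false-positive rate, then running 20 independent tests gives roughly a 64% chance that at least one comes back “significant.”

```python
import random

# With k independent tests on pure noise, each at significance level
# alpha, the chance of at least one false-positive "finding" is
# 1 - (1 - alpha)^k.
def familywise_false_positive_rate(k, alpha=0.05):
    return 1 - (1 - alpha) ** k

# Monte Carlo check: under the null hypothesis a p-value is uniformly
# distributed on [0, 1], so draw k uniform p-values per trial and count
# the trials where any falls below alpha.
def simulate(k, alpha=0.05, trials=100_000, seed=42):
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(trials)
        if any(rng.random() < alpha for _ in range(k))
    )
    return hits / trials

print(familywise_false_positive_rate(20))  # ~0.64
print(simulate(20))                        # close to the analytic value
```

In other words, a researcher who quietly tries 20 variable combinations is more likely than not to stumble on a publishable “result” even when nothing real is there.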
Academics can game the system, then, and indeed many of the practices that fall under the rubric of p-hacking qualify as research misconduct, according to Harvey. There may be a more benign explanation in at least some cases, though. It’s possible, and in fact quite likely, that finance researchers believe in their theories and advance them in good faith; they’re just misled by their fallible human brains.
We are hard-wired to see patterns, build stories and find causal relationships in our environment. There are strong evolutionary reasons for this. It’s a complex world out there, and humans are deluged with haphazard and indiscriminate information. Recognizing patterns from incomplete data is a way of simplifying our habitat, making it manageable — and sometimes staying alive.
Imagine a gazelle in the savanna. It hears a rustling in the grass, and flees. It’s only the wind. Inaccurate pattern recognition has caused it to mistake the sound for a cheetah. Still, this gazelle is going to survive longer than one that hasn’t learned to run when the grass rustles. The downside of seeing a pattern that isn’t there is drastically less than the consequence of not seeing a pattern that is there. Harvey himself is well aware of the evolutionary argument (the gazelle example comes from one of his keynote speeches). He uses it to explain why finance research turns up so many factors that don’t work: We have an evolutionary disposition toward false-positive errors (Type I, in statistics jargon) rather than false-negative ones (Type II).
Nassim Nicholas Taleb wrote a whole book about the phenomenon. In “Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets,” he tartly criticized the tendency of financial journalists to offer causal explanations for market moves that were in essence simply noise. While statistically robust, Taleb’s critique lacks a practical appreciation of the evolutionary imperatives facing real-time financial news reporters, who might not have jobs for long if all they can write is that the market made an insignificant move for no reason.
Human beings tell stories because, among other reasons, they are an effective way to simplify the world and share knowledge within our ultra-social species. Any narrative constructed under time pressure is inevitably an approximation using the best knowledge currently available, and can be expected to be refined as more becomes known. But if the first drafters of history deserve some slack, surely academic researchers, who are subject to far greater rigor, have no such excuse?
Not really. There may be a crisis in the quality of academic finance research (Harvey’s contention is disputed by some) but this is not demonstrated unequivocally by the fact that many findings fail to hold up. Science doesn’t proceed by establishing unchallengeable truths and certainties, but by repeatedly overturning itself. Any hypothesis is provisional, accepted as correct only until a better one comes along with superior explanatory and predictive power. As the physicist Richard Feynman said: “We are trying to prove ourselves wrong as quickly as possible, because only in that way can we find progress.”
In that light, the revelation that so many finance research findings are phony should be cause for celebration. We’re now smarter about markets than we were before those findings were debunked. Consumers of studies on how to beat the market have a powerful interest in advancing this process. If a strategy doesn’t work, they’re going to lose money. As incentives go, that’s a big one. In the meantime, Harvey’s advice on asset management research is probably sound. Be skeptical.
This column does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners.
©2021 Bloomberg L.P.