By David A. Shaywitz
Special to The Washington Post
Tuesday, April 22, 2008
Be careful what you wish for. That is the unexpected lesson of the past decade of biomedical research, which has been characterized by an overwhelming abundance of interesting things to study and powerful ways to study them.
A pioneer of this era, MIT geneticist Eric Lander, speaks eloquently of the "global view of biology," meaning that scientists now have extraordinary tools to study not only individual genes, but also multiple genes at the same time. Rather than immediately investing all their resources in a few favorite genes (the traditional approach), modern researchers first can survey thousands of initial candidates, then identify and ultimately direct their attention to the most important players and pivotal networks.
But we are increasingly discovering that this global perspective comes at an unexpectedly steep price: We're making a lot more mistakes. Or, at least, we seem to be having a lot of trouble picking out the rare, meaningful signal from the deafening noise in the background.
Typically, scientists accept a result as significant if a finding at least that striking would turn up by chance alone less than 5 percent of the time. But the catch is that as you start to make a large number of comparisons by examining thousands of genes, a result appearing by chance becomes progressively more likely, to the point where such false-positive results are all but guaranteed.
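The arithmetic behind that claim can be sketched in a few lines of Python. This is only an illustration, assuming every comparison is independent, no real effect exists, and the conventional 5 percent threshold applies; the function name is made up for this example, and real genes are rarely fully independent of one another.

```python
def chance_of_false_positive(num_comparisons, alpha=0.05):
    """Probability that at least one of several independent tests looks
    significant purely by chance (assumes no real effects are present)."""
    # Each test stays quiet with probability (1 - alpha); all of them
    # staying quiet together has probability (1 - alpha) ** num_comparisons.
    return 1 - (1 - alpha) ** num_comparisons


for n in (1, 20, 100, 1000):
    print(f"{n:>5} comparisons: {chance_of_false_positive(n):.3f}")
```

Under these assumptions, 20 comparisons already carry roughly a 64 percent chance of at least one spurious hit, and by 100 comparisons the chance exceeds 99 percent.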
The consequence has been a boon for scientists -- most experiments yield enticing (read: publishable) results -- but a bane (of sorts) for science.
Scientific journals are littered with studies reporting "disease genes" or "molecular signatures" that are likely red herrings. To make matters worse, these results are typically packaged together as a tidy narrative, a post-hoc rationalization explaining how the newly identified genes fit perfectly into the biological process under investigation.
A recent review of 85 published genetic mutations proposed to be associated with heart attacks demonstrated a validation rate of zero: There was insufficient evidence to suggest that any of the originally published associations reflected more than chance alone.
Some genetic researchers (including Lander, who trained as a mathematician) recognize these pitfalls and apply appropriately rigorous statistical correction, leading to more-durable results that have been independently replicated and verified (the gold standard of science). But other investigators, particularly in other areas of biology, are still lagging.
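What such a correction looks like in practice can be sketched with the Bonferroni method, one of the simplest options. Nothing here says which correction any particular researcher uses; this is an assumed example, and both the function name and the p-values are invented for illustration.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which p-values remain significant after dividing the
    threshold by the number of comparisons (Bonferroni correction)."""
    if not p_values:
        return []
    # With 5 comparisons, the usual 0.05 threshold tightens to 0.01.
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Hypothetical p-values for five candidate genes (not real data).
example_p_values = [0.001, 0.02, 0.04, 0.30, 0.008]
print(bonferroni_significant(example_p_values))
```

In this made-up case, three of the five genes would pass the uncorrected 5 percent test, but only two survive the stricter corrected threshold. Bonferroni is deliberately conservative; less strict alternatives exist, at the cost of letting more false positives through.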
The problem of errors associated with multiple comparisons turns out to afflict much of contemporary science: the study of thousands of proteins or metabolic factors, even the study of brain activity using MRI. Compare enough discrete brain regions, and differences are bound to emerge.
Sexy? Yes. Publishable? Often. Valid? Not necessarily.
The multiple-comparison problem may also be afflicting science as a whole.
There are now more scientists in the world than ever before, publishing more papers in more journals. Many scientists work on related problems, and it is likely -- inevitable, even -- that similar experiments are being done in many labs at the same time. Thus, even if the odds of an individual false positive result are only one in 20, the probability rockets upward as more researchers do the same study.
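A small simulation gives a feel for how fast that probability climbs. It is a rough sketch, assuming each lab runs the same study independently, the effect under study does not exist, and each lab uses the one-in-20 threshold; the function name and the number of trials are arbitrary choices for this example.

```python
import random


def simulate_labs(num_labs, alpha=0.05, trials=10_000, seed=0):
    """Estimate how often at least one of num_labs independent labs
    reports a 'significant' result when no real effect exists."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    hits = 0
    for _ in range(trials):
        # With no real effect, each lab's p-value is uniform on [0, 1),
        # so it falls below alpha with probability alpha.
        if any(rng.random() < alpha for _ in range(num_labs)):
            hits += 1
    return hits / trials


for labs in (1, 5, 10, 20):
    print(f"{labs:>2} labs: {simulate_labs(labs):.2f}")
```

With 10 labs the estimate should land near 40 percent, in line with the exact value 1 - 0.95^10; the simulation simply makes the same point without the algebra.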
Now, you might think that multiple researchers pursuing a common question would provide a safety net of sorts, a healthy counterbalance to the problem of false positives, but often this isn't the case. Scientists (and scientific journals) have little appetite for negative results, and this pervasive bias effectively buries much of this important information.
While the truth might eventually emerge as researchers meet at conferences and engage in informal discussions, erroneous results are almost never retracted, even though recent estimates suggest that more than half of published results are not reproducible.
Of course, this tendency of random results to be reified by publication bias is not unique to science; the popular press routinely extols the latest hot hedge fund or brilliant money manager, rarely considering whether this success could simply be the result of chance.
In a slightly different way, the pharmaceutical industry (where I have experience) also has fallen victim to the seductive charms of overabundance. For years, drug companies worked on a relatively small number of targets; as a result, there was a large knowledge base around these targets and in-depth understanding of how they functioned in the body. After the genetic revolution, companies have found themselves awash in possible drug targets, many of which initially exist only as a DNA sequence and a database identification number.
According to a 2001 study by Lehman Bros., the average drug target was associated with only eight publications, down from several hundred a decade or so before. This shallow biology represents a significant challenge in the development of novel drugs (especially if, as expected, at least four of these eight reports are inaccurate or unreliable).
Equally challenging for the industry, emerging data suggest that many complex diseases (such as Type 2 diabetes) are likely to result from the integrated effect of hundreds or even thousands of subtle genetic variants, rather than from a single causative mutation, making the selection of suitable drug targets an even greater problem.
Before we throw the baby out with the bath water here, let's be clear about a few things: Scientifically, we are better off now than we were four years ago and much better off than we were four decades ago. The powerful new techniques of global biology have opened up captivating new vistas and permitted a level of analysis and insight our scientific predecessors could not have imagined. As we used to say in graduate school, "More is more."
But we also must come to terms with the unique challenges associated with these powerful modern approaches. Just as researchers must adopt rigorous new methods of statistical analysis and temper their enthusiasm with caution, all of us, as consumers of scientific information, must balance the hope we place in global biology with the skepticism this field has surely earned.
In the words of another era: Trust, but verify.
David A. Shaywitz, an endocrinologist and stem cell scientist by training, writes frequently about health and science. He is a management consultant in New Jersey. Comments: firstname.lastname@example.org.