About the authors
Stephen Soumerai is professor of population medicine and research methods at Harvard Medical School and the Harvard Pilgrim Health Care Institute.
Ross Koppel teaches research methods and statistics in the sociology department at the University of Pennsylvania and is a senior fellow at the Leonard Davis Institute of Health Economics.

Breathless reporting on badly designed studies can mean harmful policy choices. (iStock)

Journalists play essential roles in translating research on health and health policy for the public, government officials and even scientists. But with a proliferation of media outlets competing for readers’ attention and university press offices and academic journals seeking news headlines, accuracy often suffers.

As researchers who focus on health care, we see news coverage of badly designed studies constantly. And we’re concerned that breathless reporting on bad science can result in costly, ineffective and even harmful national policies.

Few journalists seem able to understand flawed research design, a principal cause of untrustworthy research.

Take the coverage earlier this year of federally sponsored “accountable care organizations” (ACOs): medical groups and hospitals that aim to improve health-care quality and reduce costs by rewarding physicians for staying under their assigned budget, charging them for exceeding it and paying them to meet performance goals, like ordering certain lab tests. Promising as they might sound, the best available evidence shows these systems don’t work. Despite this, the U.S. spends billions on such programs, and new, bipartisan national legislation (“MACRA”) will expand them even further.

Programs like these are propped up by poor studies that gain prominent media headlines. A case study in the extreme was reported by 63 newspapers, wire services, and TV and radio networks, all celebrating the “success” of a well-known Blue Cross-Blue Shield payment model similar to ACOs. But the Health Affairs study on which all of these exaggerated news stories were based has deep design flaws. To measure the impacts of the program, for example, the study simply compared physicians volunteering for it to those who did not. We have known for decades that physicians who volunteer for studies are the ones who are already meeting quality standards. In this study, the doctors participating in the program had higher quality ratings than non-participants before joining the payment program.

The worst violation of research design in the study was its lack of a baseline measure of health. This means we can’t know the participants’ health before the program, so measurement of “change” was impossible.

One account of the study, in the Boston Globe, did not report a key study conclusion: that the program failed to reduce health-care costs after many years. But it did exaggerate a tiny, clinically insignificant 1.2 percent bump in quality scores. It proclaimed that the program actually improved the health of the poor, while even the study’s authors stated the exact opposite: that the policy failed to meet that goal.

The Globe headline claimed that the “New system bolsters poor patients’ health.” The story then quoted “experts” who sang the program’s praises. Apart from the realism displayed by the study’s first author, Zirui Song, the accolades were overwhelming. A Blue Cross senior vice president (the program head) stated: “We saw those disparities close.” A Harvard faculty member and ACO expert said, “This … certainly supports the use of alternative payment models.”

There are endless examples of bad research designs producing flawed findings, followed by uncritical media reports touting the results. Another untrustworthy study that was hyped by the media claimed that ambulances with better-trained staff and more lifesaving equipment caused more deaths on the way to the hospital than minimally equipped ambulances with less-trained staff. But the authors confused cause and effect: Ambulance dispatchers send the better-equipped ambulances to dying patients in an effort to save them before transporting them to the hospital. These patients are already more likely to die on the way to the hospital than patients in basic ambulances; they are five times as likely to have life-threatening conditions. Researchers can’t “control” for an assumption that violates basic logic.

Yet news outlets bought into the false claims, with headlines like: “Advanced life support ambulance transport increases mortality.” Such distortions have potentially life-threatening consequences for patients and policy. Another headline proclaimed, “Contrary to what most people might think, critically ill patients actually do better when transported in the more basic ambulances …” These and other stories seem to reproduce news releases, whose goal is to make headlines, not to accurately inform readers about science.

Another case involves the tens of thousands of biased studies of health information technology, including electronic health records. Conducted over decades, the studies supported a part of the 2009 economic stimulus package mandating that hospitals and doctors spend trillions of dollars on new systems (the HITECH law). But almost all the studies supporting such systems used the weakest designs: simply comparing, for example, death rates at hospitals using health information technology with those at hospitals that don’t. The fatal flaw? The “high tech” hospitals also had younger and richer patients, more experienced doctors and higher volumes of procedures, and were more likely to be teaching hospitals, with extensive resources and expertise. Even more damning, all of the randomized controlled trials found that health IT had no effect on outcomes. But for many years, encouraged by health IT vendors, the media reported the false technological promises. Here is just one example of a headline reporting one of these biased studies: “Federal Investment in Electronic Health Records Likely to Reap Returns in Quality of Care, Study Finds.” In the same story, the first national coordinator of health information technology, David Blumenthal, is quoted: “These results support the expectation that federal support of electronic health records will generate quality returns on our investments.”

Today, scientific and professional opinion has acknowledged health IT’s drawbacks. Experts even admit to patient harm caused by software errors and to the failed promise of the national program. Most electronic health record programs do not talk to one another, so the goal of a compatible (interoperable) and coordinated electronic health record system remains unattainable.

The path forward for science journalism will not be easy or obvious. The scientific process does not straightforwardly lend itself to reporting: Researchers may have weeks or months to structure their papers, which include complex statistical analyses and dense scientific jargon. Journalists often have only hours to convey the findings, and newspaper editors are generally not aware of scientists’ failure to acknowledge important limitations of their research — even fatal flaws that can debunk their studies. Moreover, the media aren’t the only uncritical interpreters of science. Even health experts routinely misunderstand research design or tout the findings of flawed studies. This is why journalists and scientists must work together to develop tools to better interpret the implications of research. Groups like the Association of Health Care Journalists could offer the structure and resources to rethink science and health reporting.

In the meantime, some specific reforms would help. The government must abandon the misleading research designs used in the studies of far too many health policies. Currently, the federal Centers for Medicare and Medicaid Services often handpicks doctor and hospital volunteers, a kind of “all-star team,” and compares them to uninterested physicians who are destined to do more poorly, no matter how elegantly scientists analyze the biased data. We need more trustworthy studies of MACRA, the massive national quality of care policy that started in January. Randomized controlled trials, which randomly assign doctors to interventions and controls and are the gold standard for experimental research, should be used to test any policy supported by significant public resources.
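For readers who want to see why the “all-star team” comparison misleads, here is a minimal simulation sketch in Python. Everything in it is an illustrative assumption rather than data from any study discussed here: the quality scores, the opt-in probabilities and the function name `simulate_gaps` are all hypothetical. The program is given no true effect at all, yet comparing volunteers to non-volunteers still shows a large apparent “benefit,” while random assignment shows a gap near zero.

```python
import random
import statistics


def simulate_gaps(n_doctors=10_000, true_effect=0.0, seed=0):
    """Return (volunteer_gap, randomized_gap) in mean quality score.

    All numbers are made up for illustration; true_effect=0.0 means the
    hypothetical program does nothing.
    """
    rng = random.Random(seed)

    # Hypothetical baseline quality score for each doctor.
    baseline = [rng.gauss(70, 10) for _ in range(n_doctors)]

    # Volunteer design: doctors who already score well are assumed to be
    # far more likely to opt in (80 percent versus 20 percent).
    volunteer = [rng.random() < (0.8 if q > 70 else 0.2) for q in baseline]

    # Randomized design: a coin flip decides who gets the program.
    randomized = [rng.random() < 0.5 for _ in baseline]

    def gap(assigned):
        # Follow-up score = baseline + program effect (if assigned) + noise.
        treated = [q + true_effect + rng.gauss(0, 2)
                   for q, a in zip(baseline, assigned) if a]
        control = [q + rng.gauss(0, 2)
                   for q, a in zip(baseline, assigned) if not a]
        return statistics.mean(treated) - statistics.mean(control)

    return gap(volunteer), gap(randomized)


if __name__ == "__main__":
    volunteer_gap, randomized_gap = simulate_gaps()
    print(f"Volunteers vs. non-volunteers: {volunteer_gap:+.2f} points")
    print(f"Randomized program vs. control: {randomized_gap:+.2f} points")
```

Under these assumptions, the volunteer comparison credits the program with a difference that existed before it started; only the randomized comparison reflects the program’s (nonexistent) effect.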

Health care already consumes almost a fifth of America’s GDP, two or three times every other nation’s costs, despite our often inferior outcomes. It is vital that all stakeholders (journalists, scientists, policymakers, editors and the public) have a better understanding of basic research design and data interpretation. During this unusual moment in history, when politicians seek out whichever facts suit their ideology, the role of good science, and of good reporting, could not be more vital.