The Washington Post

Opinion: Judges are terrible at distinguishing good science from bad. It’s time we stopped asking them to do it.

If you’ve been reading The Watch for a while, the headline above won’t surprise you. But a new law review article by Paul Giannelli, a professor emeritus at Case Western Reserve University School of Law, shows just how terrible the courts have really been on the issue.

Giannelli, who served on President Barack Obama’s now-disbanded National Commission on Forensic Science, looks at how the courts have handled six forensic fields for which there is little to no supporting scientific research (or which, in some cases, scientific research has discredited): bite-mark comparison, arson, microscopic hair analysis, firearms and toolmark analysis, fingerprint analysis and comparative bullet-lead analysis. These fields range in scientific credibility and probative value from little to none (bite-mark comparison and bullet-lead analysis) to possibly valuable, though to an extent that is still unproven (fingerprint analysis).

Giannelli’s article sums up the dearth of scientific research in each of these fields, then comments on how the courts have handled challenges to their use in criminal trials. We’ve discussed the story of bite-mark evidence here on numerous occasions. It is arguably the least scientifically credible field of forensics still used, and yet to this day, not a single court in the United States has upheld a challenge to keep it out of evidence. But the courts haven’t done much better with the other fields.

Here’s Giannelli on hair and fiber analysis:

In 1995, a federal district court in Williamson v. Reynolds observed: “Although the hair expert may have followed procedures accepted in the community of hair experts, the human hair comparison results in this case were, nonetheless, scientifically unreliable.” The court also noted that the “expert did not explain which of the ‘approximately’ 25 characteristics were consistent, any standards for determining whether the samples were consistent, how many persons could be expected to share this same combination of characteristics, or how he arrived at his conclusions.”
Williamson, who was five days from execution when he obtained habeas relief, was subsequently exonerated by DNA testing. The Williamson opinion — perhaps the only thorough judicial analysis of microscopic hair comparisons — was all but ignored by other courts. In Johnson v. Commonwealth (1999), the Kentucky Supreme Court upheld the admissibility of hair evidence, taking “judicial notice” of its reliability and thus implicitly finding its validity indisputable. Other courts echoed Johnson, not Williamson. Indeed, ten years after Williamson was decided, a 2005 decision by the Connecticut Supreme Court observed (correctly) that “[t]he overwhelming majority of courts have deemed such evidence admissible.”

It was only a few years ago that the FBI finally admitted that its hair and fiber analysts had overstated the certainty of their claims in virtually every case in which they had testified. Those analysts also trained countless other state and local analysts across the country in the same methods.

On ballistics and toolmark analysis, Giannelli notes that in 2005, two federal courts pointed out in the cases of U.S. v. Green and U.S. v. Monteiro that these fields are entirely subjective. There are no error rates or consistent standards or criteria for determining that only a certain gun could have fired a certain bullet or that only one particular screwdriver could have made the pry marks on a door. He writes that one of the courts . . .

. . . concluded that the theory on which the expert relied was “tautological.” The Association of Firearm and Toolmark Examiners (AFTE), the leading organization of examiners, proposed the theory. Under this theory, the examiner may declare an identification if (1) there is “sufficient agreement” of marks between the crime scene and test bullets and (2) there is “sufficient agreement” when the examiner says there is.

So did other courts follow suit? You probably know the answer. Most courts continued to allow this testimony into evidence. Some courts at least recognized the problem, but their solutions were largely meaningless. Again from Giannelli:

Other courts took an important, but still limited, step of restricting examiner testimony by precluding the expert from making gross overstatements such as declaring a match to the exclusion, either practical or absolute, of all other weapons. Similarly, some courts forbade experts from testifying that they hold their opinions to a “reasonable degree of scientific certitude.” That term has long been required by courts in many jurisdictions for the admission of expert testimony. Incredibly, the phrase has no scientific meaning and the claim of certainty is unsupported by empirical research. Thus, it is grossly misleading. Indeed, the National Commission on Forensic Science rejected it. Still other courts went off on a quixotic tangent, substituting the phrase “reasonable degree of ballistic” certitude. Changing “scientific certainty” to “ballistic certainty” merely underscores the courts’ scientific incompetence.
However, even these modest limitations were rejected by other courts. For example, in United States v. Casey, the district court declined “to follow sister courts who have limited expert testimony based upon the 2008 and 2009 NAS reports and, instead, remains faithful to the long-standing tradition of allowing the unfettered testimony of qualified ballistics experts.”

The story is similar for each of the remaining fields. The excerpt below, which is also about toolmark identification, could just as easily apply to almost any field of forensics. The core elements are the same: You have an old guard of forensic examiners who aggressively defend their methods. You have scientists warning that the old guard’s claims are being presented to jurors as science despite the absence of scientific research to support their methods and analysis. And you have the judiciary straggling behind, with perhaps one judge here or there willing to speak out, but for the most part blindly devoted to precedent and the concept of “finality” and therefore, absent DNA evidence, unwilling to entertain the possibility that, for all these years, the courtroom’s guardians of science might have gotten it wrong.

For years, an entrenched forensic discipline vigorously guarded its turf by rejecting the conclusions of the outside scientific community. It published a journal which was “peer-reviewed” by other members of its discipline. The journal, which is advertised as “the Scientific Journal” of AFTE, was not generally available until 2016. The discipline claimed to be a “science” but did not hold itself to the normative standards of science. The AFTE “Theory of Identification” is “clearly not a scientific theory, which the National Academy of Sciences has defined as ‘a comprehensive explanation of some aspect of nature that is supported by a vast body of evidence. . . . .’ More importantly, the stated method is circular.”
Only recently, after two NAS reports, have some courts begun to limit misleading testimony. Many have not. Thus, the courts’ competence to deal with flawed research remains extant. The one bright spot came in Williams v. United States, in which Judge Easterly wrote in a concurring opinion: “As matters currently stand, a certainty statement regarding toolmark pattern matching has the same probative value as the vision of a psychic: it reflects nothing more than the individual’s foundationless faith in what he believes to be true.”

Even bad arson science, which has been widely debunked and dealt with appropriately by at least some courts (particularly in Texas, after the publicity surrounding the execution of Cameron Todd Willingham), hasn’t been adequately addressed. There are still jurisdictions where the old junk-science theories about arson are allowed into court. And at least outside of Texas, even in jurisdictions that have barred the old theories from new cases, there has been little to no effort to revisit the countless prior cases where the flawed theories were used and to assess whether they’ve helped send innocent people to prison or to death row.

In fact, of all the dubious forensic fields Giannelli reviews, only bullet-lead composition has finally been rejected by most courts. That it and the others were ever allowed in is already a damning indictment of the courts’ ability to distinguish good science from bad. That is, even if the courts had allowed the bad stuff in, realized it years later, prohibited it going forward, reviewed all the old cases to assess what damage may have been done and set free those who had been wrongly convicted, you could still make a convincing argument that letting the bad stuff in to begin with shows that they failed in their duty, and that it was a bad idea to entrust these decisions to the courts in the first place.

But it’s quite a bit worse than that. The fact is, judges continue to allow practitioners of these other fields to testify even after the scientific community has discredited them, and even after DNA testing has exonerated people who were convicted because practitioners from those fields told jurors that the defendant, and only the defendant, could have committed the crime. In the few fields where the courts have finally admitted that they got it wrong, for the most part there has been little effort to systematically review all of the cases that those mistakes may have affected. It has largely been left to defense attorneys and nonprofit legal groups to find those defendants and file claims on their behalf.

Of course, none of this should be surprising. We don’t ask judges to perform regression analyses. We don’t ask them to design sewer systems, hit fastballs or compose symphonies. We know they aren’t qualified to do any of those things. Judges are trained to perform legal analysis. No one goes to law school to become a scientist. Few go to medical school or enroll in a Ph.D. program in the sciences because they have a penchant for law. The two fields represent two entirely different ways of thinking, are governed by two entirely different epistemologies and employ two nearly incompatible methods of analysis. And yet for some reason, we have decided that when it comes to the critically important issue of assessing the validity of expert testimony that could send someone to prison, or to the execution chamber, we will defer to the scientific knowledge of . . . judges.

The result has been catastrophic for the interests of justice but also entirely predictable.