[Figure: screenshot of graphs from the LaCour and Green article]

A few months ago we reported on a recently published article, “When contact changes minds: An experiment on transmission of support for gay equality,” by Michael LaCour and Donald Green, that was being talked about a lot in political science. LaCour and Green had claimed that a brief conversation with a political canvasser was enough to effect a huge change in people’s attitudes on gay rights (see graphs above).

As I wrote at the time:

What stunned me about these results was not just the effect itself—although I agree that it’s interesting in any case—but the size of the observed differences. They’re huge: an immediate effect of 0.4 on a five-point scale and, after nine months, an effect of 0.8.

A difference of 0.8 on a five-point scale . . . wow! You rarely see this sort of thing. Just do the math. On a 1-5 scale, the maximum theoretically possible change would be 4. But, considering that lots of people are already at “4” or “5” on the scale, it’s hard to imagine an average change of more than 2. And that would be massive. So we’re talking about a causal effect that’s a full 40% of what is pretty much the maximum change imaginable. Wow, indeed. And, judging by the small standard errors (again, see the graphs above), these effects are real, not obtained by capitalizing on chance or the statistical significance filter or anything like that.
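For what it’s worth, the arithmetic in that passage can be checked in a few lines of Python. This is only a back-of-the-envelope sketch: the variable names are mine, and the “plausible maximum” of 2 is the rough guess from the quoted post, not something estimated from any data.

```python
# Back-of-the-envelope check of the effect-size arithmetic quoted above.
# All names are illustrative; the numbers are the ones stated in the post.

SCALE_MIN, SCALE_MAX = 1, 5   # the five-point attitude scale
reported_effect = 0.8         # claimed effect after nine months

# Largest change that is possible in theory: everyone moves from 1 to 5.
theoretical_max = SCALE_MAX - SCALE_MIN

# Assumption taken from the post: many respondents already sit at 4 or 5,
# so an average change of more than about 2 is hard to imagine.
plausible_max = 2.0

share_of_theoretical = reported_effect / theoretical_max
share_of_plausible = reported_effect / plausible_max

print(f"Share of theoretical maximum: {share_of_theoretical:.0%}")
print(f"Share of plausible maximum:   {share_of_plausible:.0%}")
```

The second ratio is the “full 40%” mentioned above; measured against the theoretical maximum of 4, the same effect would be 20%.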

Not to spoil the suspense or anything, but what really happened was that the data were faked by first author LaCour. Co-author Green (my colleague at Columbia) had taken his collaborator’s data on faith; once he found out, he firmly retracted the article. Details at Retraction Watch.

It would be easy to criticize Green for not looking at the data more carefully, but . . . that’s easy to say after the fact. In all my collaborations, I’ve never even considered the possibility that I might be working with a Diederik Stapel. And, indeed, in my previous post on the topic, I expressed surprise at the published claim but no skepticism.

Ironically, LaCour benefited (in the short term) from his strategy of completely faking it. If he’d followed the usual strategy of taking real data and stretching out the interpretation, I and others would’ve been all over him for overinterpreting his results, the garden of forking paths, etc. But, by doing the Big Lie, he bypassed all those statistical concerns. Note my comment above, “judging by the small standard errors (again, see the graphs above), these effects are real, not obtained by capitalizing on chance . . .” It was easy for me to be skeptical of the claim that subliminal smiley-faces influence political attitudes, given that the data used as evidence didn’t strongly support the hype (which, to be clear, did not come from the author of the study, who made only modest, reasonable claims from his experiment). But those were real data. In LaCour’s case, he was able to shape the data to get devastatingly strong evidence.

The message, I suppose, is that the next time I see a stunningly large effect, I should be aware of the possibility that someone is faking the data.

The funny thing was, I did come up with a story as to how this implausible effect could’ve occurred:

Public opinion on same-sex marriage and other gay-rights issues has been very fluid during the past 15 years, especially so during the period of the survey. Lots of Californians were going to change their opinion to be more favorable to gay marriage, and average opinions were moving steadily in this direction. The experimental condition kicked people faster along this path. . . . That is, I see the effect of the treatment not as shifting people’s attitudes but rather as changing the timing of attitude shifts that were in the process of occurring.

When I shared this theory with Green, several months ago, he agreed that it made sense but noted that the data did not show such a large shift in the control group, which suggested that not everyone was shifting in that way. In retrospect, of course, everything makes sense, given that the data were fabricated.

It’s an interesting aspect of science, that we can work hard and come up with stories for anything. Indeed, I published a political science paper several years ago that I later retracted, not because the data were faked, but because we had miscoded one of the variables, and it completely destroyed our analyses and conclusions. So these things happen. Another example would be the “cold fusion” fiasco from 1989, where physicists jumped to explain the stunning experimental results. Lots of theoretical ideas, then it turned out the claimed results never happened. But that’s the way theory goes: you stretch your ideas to explain the unexpected, then see what happens next.

A bit of good news from this case is how quickly it was resolved. The article came out in December 2014, the fraud was uncovered in May 2015, and the paper has already been retracted. And the incentives are all in line for us to be more careful about such claims in the future. The article appeared in the journal Science, which, along with Nature and PNAS, is sometimes called a “tabloid” because of its pattern of publishing dramatic but fishy claims (at least in social science; I can’t comment on the contents of these journals in biology, chemistry, physics, etc.). Maybe this event will shake them into being a little less about flash and more about substance.