By Jon Cohen
Sunday, November 2, 2008
January, you may recall, was a rough month for the pollsters. All the polls showed Sen. Barack Obama poised to follow up his big win in the Iowa caucuses with a knockout blow to Sen. Hillary Rodham Clinton in the New Hampshire primary. But he lost, sending the 13 firms that did public pre-election polls there scrambling for explanations.
Could polling be similarly embarrassed this month, misjudging the last chapter of this epic presidential election? Thoughts of the Granite State jolt me and my fellow pollsters awake in the dead of night during these final days.
Sen. John McCain certainly says that the polls are misleading, arguing that most surveys have "consistently shown me much further behind than we actually are," as he put it last Sunday on NBC's "Meet the Press." Of course, blaming the polls is standard operating practice for trailing candidates, their partisans and contrarians everywhere. And of course, we won't make the same mistakes that led George Gallup to declare that Thomas Dewey had Harry Truman beat in 1948: We won't use outdated sampling techniques, and we won't assume that the race is over and stop polling. Nor is our polling window as absurdly small as the four nights between Iowa and New Hampshire. Even so, could McCain be onto something? In the latest Washington Post-ABC News tracking poll of likely voters, Obama has a nine-point lead, larger than Dewey's five-point margin in the late October Gallup poll in 1948. Other reputable national polls this year show similar or even larger Obama leads. But could we still make big mistakes? Can the polls be trusted?
As the polling director of The Washington Post, I get that question just about every day, even in less intense periods than this. Some question the scientific basis of polling, refusing to believe that interviews with hundreds or thousands of randomly selected respondents could accurately represent the opinions of many millions. Others see basic bias (refreshingly, these accusations come from all sides) or point out new wrinkles, such as the growing number of adults in the United States who have only cellular phones that pollsters mostly don't call. Added to the mix this year is a lingering skepticism about the accuracy of polling contests between white and black candidates -- doubts that persist despite decades of data suggesting that these polls perform no worse than others -- and heightened concerns about the way we define "likely voters."
I don't consider any of these fatal or even very serious problems, but that doesn't mean I'm immune to pollster's paranoia. We could all be wrong -- at least theoretically.
Simply put, we may be wrong about who is likely to vote on Tuesday. One of the trickiest parts of political polling is determining which of the people interviewed in pre-election surveys will really vote. It's relatively easy for us to identify such sharply delineated groups as the population of all adults living in the United States or even all registered voters, but the pool of actual voters is a group that exists at a single point in time, on Election Day (plus those casting ballots early and by mail).
Even a few days away from an election, that group remains an unknown population. Not everyone who says that they will vote will actually do so, in part because when asked about their intentions, people want to sound like good citizens. So pollsters develop models to whittle down their samples to account for people's tendency to overstate these things.
These "likely voter models" vary widely from pollster to pollster. This year, the Gallup Organization is publishing two models. Its "traditional" model factors in respondents' reports of whether they voted in previous elections to determine who is a "likely" voter. But Gallup's new, "expanded" model drops this requirement, putting more young and minority voters into the "likely voter" category.
In Washington Post-ABC polling, we ask a series of questions about whether and how people plan to vote, whether they have voted before and how much they know about the voting process. We then feed all this information into a range of models, corresponding to different levels of turnout. We report a single model, but only after assessing the quality and impact of all of them. Likely voter modeling is a craft, bolstered by science.
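For the curious, the general idea of running one sample through several turnout scenarios can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the three yes/no questions, the equal-weight scoring, the made-up respondents and the `likely_voters` helper are all hypothetical, and none of it is the actual Post-ABC or Gallup model.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    candidate: str             # stated preference (hypothetical labels)
    plans_to_vote: bool        # says he or she intends to vote
    voted_before: bool         # reports voting in a past election
    knows_polling_place: bool  # stand-in for knowledge of the process


def score(r):
    """Hypothetical equal-weight score: one point per 'yes' answer."""
    return int(r.plans_to_vote) + int(r.voted_before) + int(r.knows_polling_place)


def likely_voters(sample, turnout):
    """Keep the highest-scoring share of the sample for an assumed turnout."""
    ranked = sorted(sample, key=score, reverse=True)  # stable sort keeps ties in order
    keep = max(1, round(len(ranked) * turnout))
    return ranked[:keep]


def support(voters, candidate):
    """Percentage of the retained respondents backing a candidate."""
    return 100 * sum(r.candidate == candidate for r in voters) / len(voters)


# Made-up respondents, purely for illustration.
sample = [
    Respondent("X", True, True, True),
    Respondent("Y", True, True, False),
    Respondent("X", True, False, False),
    Respondent("Y", False, False, False),
]

for turnout in (0.5, 0.75, 1.0):
    voters = likely_voters(sample, turnout)
    print(turnout, round(support(voters, "X"), 1))
```

Even in this toy version, the candidate's share moves as the assumed turnout changes, which is why the choice of model matters.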
I'm also often asked about the rising use of cellphones. The number of people ditching their home telephones has spiked in recent years, with the highest percentages among young adults and nonwhites. Does this affect the polls? Probably not -- or at least not yet. The exclusion of cellphone users appears to have no more than a minimal effect on the results. But even if these voters turn out to have been systematically underrepresented in this year's polls, that would actually mean that Obama had an even larger lead, because these voters overwhelmingly back him over McCain. And both the Gallup and Washington Post-ABC tracking polls interview complementary samples of voters who have only cellphones to make sure that we're not missing something. (Few state polls include cellphone interviews.)
Others who doubt this year's polls raise the question of a "Bradley effect." This syndrome gets its name from the bid by Tom Bradley, then the African American mayor of Los Angeles, to become governor of California in 1982. He headed into Election Day with a big seven-point lead in the last publicly released poll, only to lose narrowly. Some attributed this startling result to a quiet form of racism that revealed itself only in the voting booth, and the 1982 case has been trotted out ever since to cast doubt on the accuracy of pre-election polls in contests between white and black candidates.
But there is good reason to doubt that racism was the cause of Bradley's defeat, and decades of polling in other campaigns with black candidates should mute some of the skepticism. No "Bradley effect" has shown up for years, and a new analysis by one Harvard University researcher, Daniel Hopkins, shows that any such effect that existed in African American politicians' contests in the late 1980s and early 1990s has now disappeared.
There is also the possibility of a pre-election "bandwagon effect." Post-election surveys frequently overstate support for the winner, and with 70 percent of respondents in a recent Gallup poll saying that Obama is headed for victory on Tuesday, perhaps voters are beginning to overstate the likelihood that they'll vote for him. (There's no precedent for something like this happening, but hey, it's been a weird year.)
I worry more about a basic concern: whether we are getting a truly random sample of opinion. Pollsters bank on the fundamental notion that the people who answer our calls are similar to those who don't, and we have reams of data justifying that assumption. But what if the people who pause to take a pollster's question are significantly different from those who don't?
After all, fewer and fewer people have been taking our calls over the years. The Pew Research Center, which has done extensive research into declining survey-response rates, has found that poorer, less educated whites -- who tend to hold somewhat less positive views toward African Americans -- are also harder to get on the phone than those who have higher incomes and more formal education. My fellow pollsters and I give this a pretty academic name, "differential nonresponse," but it's a live, practical concern.
Despite my list of worries, a few things remain clear to me: Not all polls are created equal. We've been bombarded with polls that fell far short of the methodological rigor required for a good survey. If you mix in bad polls with the good ones, as happens all over the Web, you just may get dodgy results.
I also remind myself that humility is built into my field's DNA. The mathematics of the "random sample" on which all polling is based says that about five times out of 100, a poll's result will land outside its stated margin of error. Call it the pollster's law of averages.
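For readers who want the arithmetic behind that figure, here is a rough sketch in Python of the textbook 95 percent margin of error for a simple random sample. The sample sizes are illustrative assumptions, not the size of any particular poll, and real surveys add adjustments (weighting, design effects) that this ignores.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error, in percentage points.

    Assumes a simple random sample; z=1.96 corresponds to the
    conventional 95 percent confidence level, and proportion=0.5
    is the worst case (an evenly split population).
    """
    return 100 * z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Illustrative sample sizes only.
for n in (500, 1000, 2000):
    print(n, round(margin_of_error(n), 1))
```

The formula depends on the number of people interviewed, not on the size of the population, which is how a sample of 1,000 can stand in for many millions of voters.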
But these seem to be topics for another day. The polls appear to be in general agreement that Obama is ahead; the only question is by how much. And this time, the pollsters' findings are being reinforced by the work of two other groups of campaign obsessives: the political scientists who use models drawn from past election results to forecast the next one (the one professor whose forecast had McCain headed for victory has "adjusted" his model), and the reporters out there knocking on doors and interviewing voters.
That reassures me because it suggests that 2008 is not like 1980, a year in which some late polls showed a close race between Jimmy Carter and Ronald Reagan. But shoe-leather reporting -- such as the 50-state roundup, led by David S. Broder, that The Post published the Sunday before the election -- found that Reagan was surging. Today, all the indicators -- not just the polls -- suggest that Obama is the candidate with momentum. If that changes over the final days, quality polls will still be the single best gauge of why things shifted.
Of course, if the trustworthy polls continue showing Obama ahead and McCain wins, it would be a monumental failure for political scientists, reporters and pollsters alike -- an indictment worse than New Hampshire, worse even than 1948. I think the quality pollsters have done a good, professional job this year. I don't think we'll get bitten. Even so, I'll be a little worried until it's all over. I'm not sure what kind of night I'll have on Tuesday. But I'm sure I'm going to have a nervous one on Monday.
Jon Cohen is The Washington Post's polling director.