The easiest way to get into this is with an analogy.
Let’s say you have a pool of 100 voters. A poll shows — accurately — that 50 of them like Candidate A, 40 like Candidate B, and 10 are undecided. A poll of all of these voters would show a 50-to-40 advantage for Candidate A. If the vote broke down exactly this way, with the undecided voters deciding not to vote, the final result would be an 11-point victory for Candidate A (since only 90 votes would be cast).
But let’s say those voters don’t go to the polls at an even rate. Let’s say supporters of Candidate A are lukewarm about him and only 60 percent of them turn out. Let’s say supporters of Candidate B are slightly more excited and turn out at 75 percent — meaning 30 of the 40 original supporters cast a vote. Suddenly, the 10-point margin is gone.
Then there’s the undecided population. Let’s say 7 of those 10 go to the polls, having made up their minds after the poll was conducted and, in doing so, split for Candidate B 4-to-3. Suddenly, Candidate B wins, with 34 of the 67 votes cast — a 1.5-point margin of victory.
Simply because of who turned out, our poll was 11.5 points away from being correct. It wasn’t wrong because it misjudged how people felt about the candidates; it was wrong because it failed to accurately predict who would actually vote.
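The arithmetic in that analogy can be checked with a short sketch. The variable names and the `margin` helper are ours, invented for illustration; the numbers are the ones from the hypothetical 100-voter pool above.

```python
def margin(votes_a, votes_b):
    """Candidate A's lead over Candidate B, in percentage points of votes cast."""
    return 100 * (votes_a - votes_b) / (votes_a + votes_b)

# What the poll found among 100 voters.
poll_a, poll_b, undecided = 50, 40, 10

# Scenario 1: every decided voter turns out and the undecideds stay home.
even_turnout_margin = margin(poll_a, poll_b)

# Scenario 2: uneven turnout (60 percent for A, 75 percent for B),
# plus 7 late-deciding voters who split 3 for A and 4 for B.
votes_a = poll_a * 60 // 100 + 3
votes_b = poll_b * 75 // 100 + 4
uneven_turnout_margin = margin(votes_a, votes_b)

# Distance between the poll's 10-point lead for A and the actual result.
poll_miss = (poll_a - poll_b) - uneven_turnout_margin

print(f"Even turnout:   A by {even_turnout_margin:.1f}")
print(f"Uneven turnout: B by {-uneven_turnout_margin:.1f} ({votes_b} of {votes_a + votes_b} votes)")
print(f"Poll miss:      {poll_miss:.1f} points")
```

Nothing about voter preference changes between the two scenarios. Only the turnout inputs move, and the margin flips.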
There are, to oversimplify a bit, two important parts to polling. The first part is getting a sense for how people plan to vote. The second part is matching those results to who actually will vote. In the 2012 election, for example, many national polls assumed that the electorate would be made up of more voters who planned to vote for Mitt Romney. Those estimates were wrong, and President Barack Obama easily won reelection.
How’d they get it wrong? Pollsters use a number of different indicators to figure out who will vote: historic results, enthusiasm indicated by people in the poll, etc. During last year’s election, the New York Times did an interesting experiment in which they gave four pollsters the raw data from a survey they’d conducted; that is, they gave the pollsters just the first part of the poll as above. Each pollster then estimated who would turn out and evaluated the raw data in light of that. The result was a spread of 5 points among the pollsters.
This matters right now because Monday saw two new polls released on the Senate race in Alabama. One poll, from Emerson College, showed Republican Roy Moore with a 9-point advantage in the race. The other, from Fox News, showed a 10-point margin, but for the Democrat, Doug Jones. That’s a 19-point swing in polls that concluded one day apart.
In fact, recent polls in Alabama have been all over the map. Here are the results collected by RealClearPolitics, with the site’s average of recent polls also shown.
Most show Moore with an advantage, but a varying one. Not all show that, though.
You probably noticed that vertical gray bar labeled “SurveyMonkey” that includes nine different dots. Those dots aren’t nine different polls; they’re nine different interpretations of one poll conducted by the polling firm. (There are actually 10 results. The two that showed a tie are overlapping circles on the chart above.)
Here, from a great write-up of their experiment, are the 10 different results using 10 different turnout estimates. (Think of it as an internal re-creation of that Times experiment.)
You can see all the levers at play there. Registered voters vs. likely voters. Only looking at people who said they’re certain to vote vs. including those who said they probably would. Assuming the electorate looks more like 2014 vs. assuming it looks more like 2016. The end result is a 19-point spread — the same spread as seen between Emerson’s poll and Fox News’s.
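To see how one of those levers works in isolation, here is a minimal sketch of applying different likely-voter screens to a single set of responses. Every number in it is made up for illustration; these are not SurveyMonkey’s data, and the screen names and the `margin_under` helper are our own assumptions, not any pollster’s actual method.

```python
# HYPOTHETICAL raw responses: counts by candidate and by how likely
# each respondent said they were to vote. Not real poll data.
RAW = {
    "A": {"certain": 30, "probable": 12, "unlikely": 8},
    "B": {"certain": 30, "probable": 6, "unlikely": 4},
}

# Each screen names the groups a pollster might count as the electorate.
SCREENS = {
    "all registered voters": ("certain", "probable", "unlikely"),
    "certain or probable": ("certain", "probable"),
    "certain only": ("certain",),
}

def margin_under(groups):
    """Candidate A's lead in points, counting only the given groups."""
    a = sum(RAW["A"][g] for g in groups)
    b = sum(RAW["B"][g] for g in groups)
    return 100 * (a - b) / (a + b)

results = {name: round(margin_under(groups), 1) for name, groups in SCREENS.items()}
spread = round(max(results.values()) - min(results.values()), 1)

for name, lead in results.items():
    print(f"{name}: A {lead:+.1f}")
print(f"spread across screens: {spread} points")
```

The same responses produce anything from a tie to a double-digit lead, depending only on who is assumed to show up.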
Normally, there are a lot of ways to figure out which levers are the proper ones to flip. Last November was a presidential general election, meaning pollsters were warranted in assuming a higher-than-normal turnout. Things like that. But as we noted last week, there are a lot of reasons that this race in Alabama defies any easy way of determining who’s going to vote.
- It’s a special statewide election with few precedents that can inform estimates.
- It’s happening two weeks before Christmas.
- It’s happening during a year in which there’s an active backlash against a very unpopular president.
- The Republican candidate has been battered by serious allegations of sexual misconduct, layered on top of decades of controversial behavior.
Those are some of the questions at hand. Different pollsters will weight these factors differently, and their results will diverge accordingly.
Remember: Even during 2016, an easier-to-predict presidential election, four different pollsters took the same raw data and came up with different interpretations of it. Unlike in that experiment, the recent polls in Alabama aren’t even using the same raw data, which compounds the variation in results.
For example, the Fox News poll found a huge advantage for the Democrat, Jones, among those contacted by cellphone.
“Alabama voters who were interviewed on cellphones are +30 for Jones, while the race is roughly even among all others,” Fox’s Dana Blanton wrote. “The fact that traditional, high-quality probability samples, like the Fox News Poll, include both landline and cellphone numbers may be why these polls show Jones doing relatively well compared to automated or blended polls.”
In other words, polls that exclude cellphones might underestimate overall support for Jones. The Emerson poll, for example, was landline only. (Calling cellphones is more complicated than calling landlines because of a federal law that mandates hand-dialing.)
For what it’s worth, the Post-Schar School poll completed at the end of November didn’t find the same chasm between those reached by cellphone and those called on landlines. On landlines, Moore and Jones were tied; on cellphones, Jones had a 4-point lead.
That different pollsters are coming up with a wide range of results is bad for anyone trying to predict who will win Tuesday’s election. But it’s good news in one respect: It suggests a diminished likelihood of “herding,” in which pollsters weight their results against what the broader environment of polls seems to show.
Whatever the outcome Tuesday, we will know two things about the polls that preceded it. First, that some polls, misfiring on turnout estimates, will have shown a very incorrect result. And, second, that this will be used to disparage polling in general as flawed and inaccurate.
Scott Clement contributed to this article.