Specialists have done plenty of postmortems. That includes an extensive evaluation by a task force of the American Association for Public Opinion Research and a New York Times Upshot summary of various investigations. Many, naturally, have called for more and better state polls.
But we propose a promising alternative: Use a large, high-quality, national pre-election survey to make state-level predictions. Our technique proves highly accurate in predicting the 2016 election and previous presidential elections back to 2000.
Here’s how the technique works
The technique, explained in a paper we presented recently at AAPOR’s annual conference, relies on “multilevel regression with post-stratification,” or MRP. It’s a way of using data sets that cover a large area — like a country — to make estimates in just a portion of that area — like a state. That makes it perfect for using national data to project election outcomes at the state level.
MRP works best when groups behave predictably. American elections fit the bill. Voter turnout and vote choice among demographic groups — especially racial and ethnic groups — are similar across state lines, especially when we account for factors such as age, sex and education.
In the election context, a good, big poll lets us reliably estimate voter turnout and vote choice for specific groups at a quite granular level — say, 18- to 29-year-old non-college-educated Hispanic men. (Our model used 10,200 such demographic combinations in all.) We combine these estimates with census data telling us how many individuals in each of these groups reside in any given state and add factors such as previous election turnout and vote preferences. The end product: state-level estimates of turnout and vote for the current election.
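To make the post-stratification step concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the model described above: the cell names, turnout rates, vote shares and population counts are all invented, and the cell estimates are held constant across states, whereas a real MRP model would fit them with a multilevel regression and adjust for state-level factors such as past turnout and vote.

```python
# Hypothetical cell-level estimates, as if produced by a national poll model.
# "turnout" is the share of the group expected to vote; "dem_share" is the
# Democratic share of the vote among those who do. All values are invented.
cell_estimates = {
    "young_noncollege_hispanic_men": {"turnout": 0.40, "dem_share": 0.65},
    "older_college_white_women": {"turnout": 0.75, "dem_share": 0.55},
    "older_noncollege_white_men": {"turnout": 0.65, "dem_share": 0.30},
}

# Hypothetical census-style counts of adults in each cell, by state.
state_cell_counts = {
    "State A": {
        "young_noncollege_hispanic_men": 100_000,
        "older_college_white_women": 200_000,
        "older_noncollege_white_men": 200_000,
    },
    "State B": {
        "young_noncollege_hispanic_men": 300_000,
        "older_college_white_women": 100_000,
        "older_noncollege_white_men": 100_000,
    },
}


def poststratify(estimates, cell_counts):
    """Weight cell-level estimates by a state's population in each cell."""
    population = 0
    expected_voters = 0.0
    expected_dem_votes = 0.0
    for cell, count in cell_counts.items():
        est = estimates[cell]
        voters = count * est["turnout"]  # expected voters from this cell
        population += count
        expected_voters += voters
        expected_dem_votes += voters * est["dem_share"]
    return {
        "turnout": expected_voters / population,
        "dem_share": expected_dem_votes / expected_voters,
    }


state_results = {
    state: poststratify(cell_estimates, counts)
    for state, counts in state_cell_counts.items()
}

for state, result in state_results.items():
    print(f"{state}: turnout {result['turnout']:.1%}, "
          f"Democratic share {result['dem_share']:.1%}")
```

The two invented states share identical group-level behavior and differ only in demographic makeup, which is enough to give them different turnout and vote estimates. That is the core idea behind projecting state results from national data.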
Our model accurately predicted Trump’s victory — and the results of previous elections
We experimented with this technique using the 2016 ABC News and ABC News/Washington Post tracking poll that we produced for ABC News, consisting of 9,485 interviews from Oct. 20 to Nov. 6. The results are striking. Our model, based on pre-election polling, correctly predicts the 2016 winner in 49 of the 50 states (as well as the District of Columbia), missing only Michigan, a state President Trump won by just 10,704 votes out of 4.8 million cast.
Our model also puts Trump ahead in the electoral college — at a time when the election forecasters were predicting a Hillary Clinton win with anywhere from 71 to 98 percent confidence. At the national level, our model estimates the popular vote within four-tenths of one percentage point.
Some of this surely is luck. There were a lot of very close state races, and even though our method projects the right winner in all cases but one, our state-level vote estimates are not exact. But we also do very well when we apply MRP to previous elections, using ABC News or ABC News/Washington Post tracking polls from those contests. Our models correctly predict 49 of the 50 states and D.C. for 2012; 47 for 2008; 48 for 2004; and 47 for 2000. We predict the correct popular vote winner, and the correct electoral college winner, for all except 2000.
We outperform other prominent models
For the 2016 election, other models did less well than ours. YouGov, using its own MRP model, and SurveyMonkey correctly predicted the outcomes in 43 states; the New York Times Upshot in 45; and FiveThirtyEight and HuffPost in 46. All those picked the wrong electoral college winner. Further, getting technical just for a moment, our root mean square errors, a measure of accuracy, are much lower, including just 2.5 points on the vote margin in the battleground states, vs. 3.9 to 5.5 in others’ estimates.
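For readers unfamiliar with root mean square error, the short Python sketch below shows how it is computed on vote margins. The margins are invented for illustration and are not the figures behind any of the estimates cited here; the function name `rmse` is likewise just a label chosen for this example.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length lists of numbers."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical margins in percentage points (Democrat minus Republican)
# for four imaginary battleground states. These are NOT real results.
predicted_margins = [2.0, -1.0, 4.0, -3.0]
actual_margins = [-1.0, -1.0, 1.0, 1.0]

print(f"RMSE on the margin: {rmse(predicted_margins, actual_margins):.2f} points")
```

Because errors are squared before averaging, one badly missed state raises the figure more than several small misses do, which is why a lower value signals estimates that are consistently close.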
The reason may be that other estimates relied on poorly executed or poorly timed state polls, non-probability online samples and/or less effective pre-election polling techniques. We had a high-quality, random-sample data set to work with; the ABC/Post poll is one of only six surveys rated “A+” by FiveThirtyEight, out of 372 it evaluated.
Our results raise questions about exit poll data
Our MRP estimates also are a useful counterpoint to the exit poll sponsored by a consortium of national news organizations. Nationally, our model finds an older and less-educated electorate than indicated by the exit poll, and a notably larger margin for Clinton among white college graduates. For instance, Trump was +3 points among college-educated white men in our estimate, vs. +14 in the exit poll; and Clinton won college-educated white women by 18 points in our estimate, vs. +7 in the exit poll. Given challenges in conducting and weighting the exit poll, our results are worth a good look.
Our method doesn’t solve everything
Like any method, MRP, along with our specific application of it, has its limitations. It relies on the predictive power of demographic variables. We need separate models for everything we’re predicting. And we need a good, big data set, and such data sets are expensive.
Most important, MRP doesn’t help with the main reason we conduct pre-election polls — to understand the substance of voters’ choices by measuring their predispositions, policy preferences, and views of candidate attributes. We’ll ultimately know who wins an election. But without thoughtful polling that goes well beyond the horse race, we’ll never know why.
Whatever the research goal, a clear lesson of the 2016 election is that relying on state polls is a risky enterprise. Some are done poorly, and we may not have good ones where we need them. The beauty of MRP is that given a large, high-quality data set, it predicts all states — and does so very accurately.
Gary Langer is president of Langer Research Associates.
Chad Kiewiet de Jonge, PhD, was a senior research analyst at Langer Research Associates during the 2016 election season.