For example, in February, the Hummel/Rothschild fundamental model gave the Democratic candidate a 48 percent probability of winning — and now gives the Democratic candidate a 53 percent probability of winning, because of Obama’s surging presidential approval numbers. But whether it’s 48 percent or 53 percent, that would be a relatively tight election.
That result, however, is only relevant if we imagine a generic Democratic candidate running against a generic Republican candidate, which clearly does not apply to this year’s election. In this cycle, the likely candidates (as of this writing, in all likelihood, Hillary Clinton and Donald Trump) generate plenty of controversy. Polling and prediction markets, on the other hand, examine the actual Democratic and Republican candidates. And while the data aren’t extremely clear, our models suggest that the likely Democratic candidate is in firm control.
What do the prediction markets say?
Take prediction markets. According to PredictWise, a website that aggregates data from multiple prediction and betting markets, the eventual Democratic nominee has a 74 percent probability of winning the general election. That’s anything but a tight election, and it is an extremely high likelihood so early in an election cycle. At this time in 2012, Obama had about 60 percent; in 2008, when the Democratic nominee hadn’t yet been decided, the eventual Democratic candidate also had about 60 percent. In both cases, the markets were correct.
But clearly, prediction markets and fundamental-based forecasts disagree this cycle.
Why do we see such a big difference this time around?
What do the polls say?
To understand, we looked at polling data. Hundreds of polls have asked Americans about the hypothetical presidential matchups between Clinton or Bernie Sanders and Trump, Ted Cruz or John Kasich. Hypothetical polls taken this early in the election cycle have a very weak track record in predicting elections. Historically they start to gain significance about this time in the election cycle, as each party’s actual nominee is becoming clear.
However, that’s not as true for the Republican race this year. Prediction markets still give Cruz, the trailing candidate, a 10 percent chance of winning the Republican nomination. Given what we know about people’s inability to think clearly about uncertainty, these what-if polls might mean something very different. We make a rough translation of the polls into a probability of victory by considering the uncertainty inherent in the vote-share estimate. By that logic, if a survey of the entire voting population one day before Election Day showed a 5-percentage-point lead for the Democratic candidate, a Democratic victory would be close to certain, and the probability of victory would reflect that certainty.
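To make that translation concrete, here is a minimal sketch of one common way to turn a polling lead into a probability of victory. It is not our actual model: it assumes the error around the estimated lead is roughly normal, and the function name `win_probability` and the standard-error values are illustrative assumptions only.

```python
import math


def win_probability(lead_pts: float, std_error_pts: float) -> float:
    """Probability that the true lead is above zero, assuming the
    estimated lead has normally distributed error (an assumption)."""
    if std_error_pts <= 0:
        raise ValueError("std_error_pts must be positive")
    z = lead_pts / std_error_pts
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical inputs: the same 5-point lead under two levels of uncertainty.
near_census = win_probability(5.0, 0.1)  # near-complete survey on election eve
early_poll = win_probability(5.0, 8.0)   # noisy poll taken months in advance

print(f"near-census: {near_census:.3f}, early poll: {early_poll:.3f}")
```

The same 5-point lead yields near certainty when the uncertainty is tiny and a much more modest probability when the uncertainty is large, which is why early matchup polls deserve caution.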
One telltale sign that these hypothetical polls are unreliable is how much variety we see in their results. Polls taken in April show Clinton leading Trump by as few as 5 percentage points and as many as 13. Other polls taken during roughly the same weeks and among very similar groups of people vary by more than 5 percentage points. Given what we know about how few swing voters there really are, such extreme variations over short periods of time are likely more noise than anything else.
Nonetheless, matchup polls in the last month or so show Clinton and Sanders with comfortable leads against Trump or Cruz.
We tried a different kind of polling.
To develop a more realistic estimate of voting sentiment, we teamed up with the mobile-based polling application Pollfish. Each week since January, we have asked a cross-section of voters a generic voter intention question: “Who are you most likely to vote for in the upcoming presidential election?” The possible answers were: definitely Republican candidate, likely Republican candidate, likely Democratic candidate, definitely Democratic candidate.
We wanted to know: Does our poll track the hypothetical matchups, the fundamentals, or neither?
The technical stuff. We adjust for the coverage and non-response bias inherent in mobile opt-in polls by combining the advanced geo-location of mobile platforms with Big Data on all likely voters in the United States. We’ve developed a dynamic statistical model that yields the probability of each sub-demographic group voting Democratic and separates noise from substantive movement. We then weight each group’s probability by that group’s share of likely voters. Finally, we make the same translation we used for the hypothetical matchup polls to create probabilities of victory.
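The weighting step can be sketched in a few lines. This is only an illustration of the general idea (weighting group-level estimates by each group's share of likely voters), not our production model; the group names, probabilities and shares below are made up, and `Cell` and `poststratify` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str            # label for the sub-demographic group
    p_democratic: float  # modeled probability the group votes Democratic
    voter_share: float   # group's share of the likely-voter population


def poststratify(cells: list[Cell]) -> float:
    """Weighted average of group probabilities, using voter shares as weights."""
    total_share = sum(c.voter_share for c in cells)
    if total_share <= 0:
        raise ValueError("voter shares must sum to a positive number")
    weighted = sum(c.p_democratic * c.voter_share for c in cells)
    return weighted / total_share


# Made-up numbers purely for illustration.
cells = [
    Cell("group A", 0.62, 0.30),
    Cell("group B", 0.48, 0.45),
    Cell("group C", 0.55, 0.25),
]
estimate = poststratify(cells)
print(f"Estimated Democratic share: {estimate:.4f}")
```

Because the weights come from the likely-voter population rather than from the sample, a group that is overrepresented among mobile respondents does not dominate the overall estimate.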
Here’s what we found.
The generic voter intention polling is steadily Democratic and trending upward for the Democratic candidate. It shows the Democratic candidate doing much better than the fundamental models would predict, and very close to this year’s market-based prediction, but worse than the hypothetical matchup polls suggest.
Our new data convince us of two things.
First, the actual Democratic candidate is doing better compared with the actual Republican candidate than the generic Democratic candidate is compared with the generic Republican candidate.
Second, while the hypothetical matchup polls are showing a landslide, Pollfish’s voter intention question indicates a strong but not insurmountable lead for the Democratic candidate. That lines up relatively closely with the prediction markets’ forecast.
The probable Democratic candidate appears to be starting the general election with a comfortable lead. Apparently, the Republican candidate — Trump or whoever it might be — isn’t doing as well as a generic (or more standard) candidate would be doing.
Tobias Konitzer is a Ph.D. candidate in communication at Stanford University.