The Washington Post

Opinion: Why early-voting data is an awful election predictor

By David Byler

Georgia Democratic gubernatorial candidate Stacey Abrams speaks to supporters at a campaign event as early voting begins. (Megan Varner/Getty Images)

The early-voting season has begun, and according to a count by University of Florida professor Michael McDonald, some 9 million Americans have already cast their ballots, either in person or by mail. More than 40 million have requested mail ballots.

As these early-vote totals grow, some data analysts are slicing and dicing the numbers, claiming that the early vote gives predictive insights into November’s results.

Be skeptical.

Election forecasts grounded in the early vote are plagued by problems: They have a poor track record, they lend themselves to conflicting interpretations, and the pandemic’s aftereffects muddy the data. Better election prediction tools exist. Rely on those instead.

Early-voting data is, at best, redundant — and at worst misleading

In 2020, for instance, early-voting predictions correctly forecast a close race between Donald Trump and Joe Biden in Florida, North Carolina and Arizona — as well as easy Biden victories in Oregon and Colorado. And in October 2018, Republicans led the early vote in several Senate races that eventually went for the GOP.

But in both years, poll-driven forecasts suggested these same outcomes before these early-vote analyses flooded the internet. So when the early-vote returns align with what the polls are already telling us, at best they can be used to confirm what we’re seeing in survey data.

And, at their worst, early-vote-driven predictions have been wildly misleading.

Some saw a Latino turnout surge in the 2016 early-vote data. Latino turnout barely budged that year, increasing by just two percentage points. Others used the data to forecast a Hillary Clinton victory in Iowa — a state where Trump led in the polls and ultimately prevailed by nine points. Similarly, some commentators argued that the early vote looked “good” for Clinton in North Carolina — but then Trump, who also led in polls there, won the state. The results were similarly uninspiring in 2014.

This year might be different. Maybe this time, the early vote will catch some trend that polls didn’t. But polls — with all their flaws — are typically more reliable predictors of the eventual result.

Early-voting data is too subjective

The second problem: Early voting is too vague to be useful.

Early-voting data reveals how many people voted — along with some demographic breakdowns of who exactly is voting — but doesn’t say how they voted. So analysts pair the data with subjective judgment calls: We decide whether the vote total for a key demographic “looks good” compared with some past election, or whether each party is hitting its “benchmark” in a state.

Bias seeps into these decisions. As my former RealClearPolitics colleague Sean Trende has argued: A right- or left-wing statistician can look at the messy static of early-vote data, selectively zoom in on one demographic group, cherry-pick races and find ways to craft an analysis that appears positive for his or her side. Nonpartisan analysts and good-faith partisans often unknowingly commit this sin. We want to find new insights, so — like a tea leaf reader or an astrologer — we read signs and omens into data that has no predictive power.

All data requires interpretation. But on the spectrum from art to science, early-voting data is too much like art to be reliable.

Early voting is a moving target

The third problem with the early vote: In many states, there’s no valid benchmark for a “good” or “bad” vote total for either side.

The 2018 election, the most recent midterm, provides little useful data. Much of the country increased its capacity for early voting during the pandemic, and many voters learned how to cast ballots early or by mail for the first time. So if the 2022 early-vote totals look “high” for either side compared with 2018, it might simply be an aftereffect of the pandemic.

Comparisons to 2020 present different problems. In 2020, Trump spent much of the campaign demonizing mail voting and pushing Republicans to vote on Election Day. It worked — Republican voters showed up on Election Day, while covid-fearing Democrats voted early.

But with Trump out of the White House, it’s not clear how many Republicans will maintain that habit — or exactly what a “good” GOP early-vote total is this year.

Some states established robust early-voting systems before the pandemic, while others rolled back some of the expansions they made during the pandemic. Predictions grounded in early-voting data from previous elections might be more reliable in those places. But in many states, voting laws and habits are a moving target — and comparing 2022 to 2020 or 2018 is fruitless.

Polls are still the best available predictor

If you want to know who will win the election, the best sources are still polls, poll-driven election models and expert analyses. None of these tools is perfect — but despite all the attention that is paid to the elections they get wrong, they have a better track record than the early vote.

If you feel as though you can’t trust the polls, the forecasts or the experts, there’s one foolproof method for knowing who is going to win: waiting until Nov. 8, when the results start coming in.