How election modeling can help us understand who might win
Election models help close the gap between the vote count at a moment in time and the final result, based on trends they spot in the available data.
How America counts votes and determines the winner of an election is now ripe fodder for baseless conspiracy theories that undermine voters’ confidence in the democratic process. The 2020 presidential election kicked distrust in the U.S. electoral system into overdrive.
There are real reasons counting the American vote is confusing. The process is decentralized, and different states count different ballots in different ways and at different times, including before, on and after Election Day.
For voters following election night news, raw vote totals reported by traditional news sources can be misleading. Depending on how much of the vote has been reported at a given moment, they can show one candidate leading when another ultimately ends up winning.
For all these reasons, The Washington Post is turning to election modeling — a complex but powerful mathematical tool — to help us understand how the vote is trending in real time. These models use complete results in a handful of counties and precincts, delve into those areas’ demographics, and estimate what the vote could end up looking like in similar counties or precincts. As we’ll explain below, we’re not making official race call projections or declaring winners solely based on the model’s data.
Let’s travel to Voteland, a fictional state heading to the polls to pick its new representative. Votelanders will decide between candidates from the Purple Party and the Yellow Party.
Votelanders are spread unevenly across seven distinct counties.
Urbanopolis and Urbanville are the most populated, diverse counties in Voteland.
Suburbia and Suburbana are middle-size counties next to the main metropolises.
Ruraltown, Ruralville and Ruralboro are less populous than the cities and their surrounding suburbs.
To provide some context as votes start coming in, fictional Voteland’s premier news publication — The Voteland Post — has created a statistical model, a set of formulas to estimate potential election outcomes for each party while the votes are being counted.
The Voteland Post’s election model is powered by a variety of data including regional demographics, household income and education level. The model looks at changes in vote results compared to previous elections and identifies patterns to estimate how other demographically similar places across Voteland may be voting.
In the real world, The Washington Post uses a similar model, developed by principal data scientist Lenny Bronner, to understand an election and to estimate the votes still to be counted. The Post mainly uses precinct-level information to power its model. Unfortunately, many states do not provide these detailed results in real time, meaning The Post might fall back to county data, a less precise option, or skip modeling the race altogether.
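The core idea, learning from places that have finished counting and projecting onto similar places that have not, can be sketched in a few lines of Python. This is a toy illustration only, not The Post’s actual model: the function name, the county shares and the simple "average swing" rule are all assumptions made for this example.

```python
from statistics import mean


def estimate_purple_share(previous, reported, kind):
    """Toy estimate of Purple's vote share in every county.

    previous: county -> Purple share in the last election
    reported: county -> Purple share where counting is complete (at least one)
    kind:     county -> "rural", "suburban" or "urban"
    """
    # Swing = change in share since the last election, per reported county.
    swings = {county: reported[county] - previous[county] for county in reported}
    overall_swing = mean(swings.values())

    estimates = {}
    for county, prev_share in previous.items():
        if county in reported:
            estimates[county] = reported[county]
            continue
        # Prefer the swing seen in counties of the same kind;
        # otherwise fall back to the swing across all reported counties.
        similar = [s for c, s in swings.items() if kind[c] == kind[county]]
        swing = mean(similar) if similar else overall_swing
        estimates[county] = min(1.0, max(0.0, prev_share + swing))
    return estimates


# Hypothetical numbers for three of Voteland's counties.
previous = {"Ruraltown": 0.70, "Ruralville": 0.66, "Ruralboro": 0.68, "Urbanopolis": 0.40}
kind = {"Ruraltown": "rural", "Ruralville": "rural", "Ruralboro": "rural", "Urbanopolis": "urban"}
reported = {"Ruraltown": 0.65, "Ruralville": 0.63}

estimates = estimate_purple_share(previous, reported, kind)
print(round(estimates["Ruralboro"], 2))    # 0.64
print(round(estimates["Urbanopolis"], 2))  # 0.36
```

A real model would weigh demographics such as income and education rather than a single county label, and would attach uncertainty to each estimate.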
Election night in Voteland
Let’s check back in on Voteland’s seven counties, where polls have just closed.
It is important to note that the Purple Party won the last election in fictional Voteland, and The Voteland Post’s model takes that into account. The Purples won by performing well in their stronghold (the rural counties), staying competitive in the suburban areas and not falling too far behind the Yellows in the cities.
The rural counties are usually the first to complete their count because fewer people live there and there are fewer ballots to tally. With enough votes in, the model begins to project how the vote could turn out in other counties.
Purple takes the lead in the rural vote reported so far.
Now that it has enough data, the model can start estimating outcomes. On this chart, counted votes are represented by the solid bars on the left. The gradients, or fuzzy bars, on the right show the range of where the vote tally could end up according to the model.
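One rough way to picture the fuzzy bars is as a low and a high projection for the ballots still outstanding. The sketch below is an assumption-laden illustration, not how The Post computes its ranges: it takes the smallest and largest swings observed in completed, similar counties as the bounds, and it assumes the number of outstanding ballots is already known. The function and parameter names are hypothetical.

```python
def projected_range(counted, outstanding, prev_share, swings):
    """Return (low, high) final vote totals for one party in one county.

    counted:     votes already counted for the party (the solid bar)
    outstanding: estimated number of ballots still to be counted
    prev_share:  the party's share here in the last election
    swings:      share changes observed in completed, similar counties
    """
    if not swings:
        raise ValueError("need at least one observed swing")
    # Apply each observed swing to the previous share, kept within [0, 1].
    shares = [min(1.0, max(0.0, prev_share + s)) for s in swings]
    low = counted + outstanding * min(shares)
    high = counted + outstanding * max(shares)
    return low, high


# Hypothetical county: 1,000 Purple votes counted, 2,000 ballots outstanding.
low, high = projected_range(1000, 2000, 0.40, [-0.05, -0.03])
print(round(low), round(high))  # 1700 1740
```

In this sketch the gap between the low and high totals is proportional to the number of outstanding ballots, so the range narrows as more votes are counted.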
The model notices fewer votes than in the last election. It anticipates Purple will win the demographically similar rural counties but with a smaller margin this time.
Some votes have been reported in suburban and urban counties. The model expects Purple might also underperform in these regions because the party is winning the rural counties by a smaller margin than in past elections. Still, the range of possible outcomes is wide at this early stage.
As expected, Purple wins rural areas. While Yellow is behind in counted votes right now, the model believes it is more likely to win the election. Purple is receiving fewer votes than expected in suburban and, especially, urban areas.
To be sure, there is still quite a bit of overlap in the fuzzy bars, meaning the model still sees plausible outcomes in which either party wins more votes.
The more votes that are reported, the less uncertainty there is. Despite Purple currently leading the count, the model shows that Yellow will likely get more votes overall as urban counties, where more votes are out, finish tallying.
Preliminary results in cities show a solid lead for the Yellow Party as Purple is still underperforming. Looking at the model, Voteland Post readers can understand that the remaining ballots probably favor Yellow.
As anticipated by the model, Yellow wins the election thanks to its solid lead in the urban counties of this fictional state and Purple’s underperformance across the state.
Before officially telling readers who is projected to win, The Posts — in Voteland and in Washington — carefully weigh available data. The Washington Post does not call winners and losers based solely on the election model. A team of editors and data experts assesses results from the Associated Press and Edison Research, state boards of election and the expected votes model to determine when The Post will report projected race outcomes.