Some of the findings aren’t exactly groundbreaking. Trump supporters, for example, are more likely to invoke the candidate’s name, use the campaign-slogan hashtag #MakeAmericaGreatAgain and reference Sean Hannity’s website, Hannity.com. Clinton supporters, on the other hand, are more likely to mention Clinton, tag posts as #ImWithHer, or link to Huffington Post and CNN.
Other predictive terms are perhaps less obvious, though they make sense given each candidate’s rhetoric on the campaign trail. Tweeting the words “Islam,” “liberal,” “illegal” and “corrupt” suggests you will cast a vote for Trump, as do mentions of the websites for Reddit and the BBC, the researchers say. Terms that suggest you lean toward Clinton include the National Rifle Association acronym “N.R.A.,” the Southern plural “y’all,” and the words “humanity” and “rights,” as well as the hashtags #easychoice and #basketofdeplorables.
A full list of predictive terms, hashtags and websites can be found here.
Northwestern computer science professor Larry Birnbaum and a graduate student debuted Tweetcast during the 2012 contest between President Obama and Mitt Romney. The algorithm was deployed a year later during the Norwegian parliamentary election, and has also been used to recommend local bars, restaurants and cultural activities to Twitter users based on their activity.
The concept of analyzing social media to gauge public sentiment has been around almost as long as the platforms themselves, and gained momentum as Twitter, Facebook and other networks emerged as popular places to gather news and express opinions. For all its promise, however, the technology is still inexact. Text analytics software has become increasingly sophisticated, but detecting positive or negative emotion from words alone remains a challenge. What’s more, active users on social media do not necessarily reflect the population at large.
Still, the Northwestern researchers analyzed the Twitter activity of 80,000 users in each of the 50 states to predict whether Clinton or Trump is more likely to take those electoral votes. They then compared their predictions against those at FiveThirtyEight.com, the statistics juggernaut known for its high degree of accuracy.
As with other recent polls, the Tweetcast results tilted heavily in Clinton’s favor — perhaps too heavily. The researchers predicted Trump would best his Democratic rival in just seven states: Mississippi, Alabama, Georgia, Louisiana, Texas, South Carolina and Florida. They even predicted wins for Clinton in conservative strongholds, such as Kentucky, Tennessee and Oklahoma.
Other polls suggest Clinton is unlikely to carry those red states and several others come Election Day, though more of them may be in play this election cycle than in years past. Birnbaum admits polling purists will find much to pick apart in the Tweetcast predictions. For starters, Twitter’s user demographics are not reflective of the overall voting population.
“This is our estimate [of] if Twitter voted, this is what the vote would be. We can see pretty clearly it’s not what the polls say,” Birnbaum said. Still, he expects additional work to improve the accuracy of the state-level predictions will take place between now and 2020.
“It’s clear we have a lot more work to do there,” he said.