Is Twitter becoming a new public health database?

The latest evidence: A group of researchers has found that analyzing tweets can accurately predict the prevalence of heart disease.

In fact, the researchers say, Twitter can serve as a better predictor of coronary heart disease rates than smoking, diabetes, obesity, income and education combined. The findings from the University of Pennsylvania were published this week in the journal Psychological Science.

The research is part of a larger effort to incorporate big data into science, rather than relying on the time- and cost-intensive process of collecting representative samples and conducting surveys. A previous study found that Twitter can be an especially good way to track the flu, and other research has shown that examining people's Wikipedia reading habits can accurately forecast the spread of influenza and dengue.

Using Twitter as a tool to measure public health can help policymakers more quickly and effectively target campaigns and measure their results, said the study's lead author, Johannes Eichstaedt, a graduate student and founding research scientist of the university's World Well-Being Project.

For this study, researchers mined geo-tagged public tweets sent from about 1,300 U.S. counties between 2009 and 2010, and used word filters and algorithms to sort the tweets by topics, such as hate, hostility and boredom. Then, the researchers looked at coronary heart disease death rate data from the Centers for Disease Control and Prevention.

It turns out that the tweets conveying negative emotions, such as anger and anxiety, correlated with higher rates of heart disease deaths; the opposite was true for positive emotions.
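For readers curious what such an analysis involves, here is a minimal sketch in Python of the general idea: score each county by how often its tweets use negative words, then check how closely those scores track death rates. It is an illustration only, not the researchers' method. The word list, the county names, the tweets and the death rates below are all invented for the example, and the helper names `negative_rate` and `pearson` are hypothetical.

```python
from math import sqrt

# Hypothetical word filter; the study's actual word lists are not given in this article.
NEGATIVE_WORDS = {"hate", "angry", "bored", "tired"}


def negative_rate(tweets):
    """Return the fraction of words in a county's tweets that are in NEGATIVE_WORDS."""
    words = [w.strip(".,!?").lower() for t in tweets for w in t.split()]
    if not words:
        return 0.0
    return sum(w in NEGATIVE_WORDS for w in words) / len(words)


def pearson(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented toy data: three imaginary counties, not real tweets or CDC figures.
county_tweets = {
    "A": ["I hate this traffic", "so bored and tired"],
    "B": ["great day at the park", "I hate rain"],
    "C": ["lovely weekend with friends", "feeling great"],
}
county_death_rates = {"A": 60.0, "B": 45.0, "C": 30.0}

counties = sorted(county_tweets)
rates = [negative_rate(county_tweets[c]) for c in counties]
deaths = [county_death_rates[c] for c in counties]

# A value near +1 means more negative language goes with higher death rates.
r = pearson(rates, deaths)
print(f"correlation between negative language and death rate: {r:.2f}")
```

A correlation like this describes a pattern across counties. On its own, it says nothing about whether any individual who tweets angrily is at risk.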

While it's already known that being irritable and hostile fills you with stress hormones that can lead to heart disease, these researchers say they can now point to tweets as a way to capture those psychological traits. "It's a pretty aggressive action to be cursing, to be dropping the f-bomb on Twitter," Eichstaedt said. "This sort of hints at the behavior that these people" engage in.



But the people sending all those negative tweets aren't the ones dying from heart disease; the median age of Twitter users is below that of the U.S. population, Eichstaedt said -- and on average, Twitter users aren't at risk of developing heart disease.

Rather, Eichstaedt said, tweets can represent the overall negativity a community is feeling, partly as a result of environmental factors that make everyone stressed out -- the same kind of environmental factors connected to higher risks for heart disease. "These people are the canaries of the psychological profile of their communities," Eichstaedt said.

And then, there's the impact that grumpiness in a community (as expressed in these tweets) has on other people.

Think of the pink slime in "Ghostbusters 2," which served as the physical manifestation of New Yorkers' negative energy. As people became meaner, the goo became more powerful.

"Certainly, hostility and anger is very likely to spread person to person," Eichstaedt said. "So even if we both live in the most beautiful neighborhood in New York City, and I'm really, really angry and I'm on the road with you, you will get some of that anger."

Many counties, particularly in the Midwest, were left out of the study because they didn't have enough tweets or health data, but the area the researchers examined still represents 88 percent of the U.S. population.

Next, researchers will look more closely at how positive emotions can protect people from heart disease, and what such optimistic tweets can tell us about people's physical health.


Heart disease mortality rates, as predicted by Twitter. Green represents fewer deaths and red represents more deaths. (Courtesy of University of Pennsylvania)

Heart disease mortality rates, per the CDC. Green represents fewer deaths and red represents more deaths. (Courtesy of University of Pennsylvania)