There have been a number of efforts to track the flu with social media, including the recently criticized Google flu tracker. Now scientists from Johns Hopkins University and George Washington University say their approach, which uses Twitter, has proven highly accurate at the task.
During the 2012-2013 flu season, the technique was 93 percent accurate when compared to actual national flu data collected by the Centers for Disease Control and Prevention, and 88 percent accurate when applied in New York City. It also predicted whether the number of flu cases would increase or decrease from week to week with 85 percent accuracy, according to a study published in December in the journal "PLOS One."
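One way to read the week-to-week figure is as the share of weeks in which the estimate and the official count moved in the same direction. The Python sketch below illustrates that reading only; the function name and the sample numbers are hypothetical, and the study may have computed its figure differently.

```python
def direction_accuracy(estimated, reported):
    """Share of week-to-week changes whose direction (up or not) matches.

    Both arguments are equal-length lists of weekly values, oldest first.
    """
    if len(estimated) != len(reported) or len(estimated) < 2:
        raise ValueError("need two equal-length series of at least two weeks")

    matches = 0
    for week in range(1, len(estimated)):
        estimate_rose = estimated[week] > estimated[week - 1]
        reported_rose = reported[week] > reported[week - 1]
        if estimate_rose == reported_rose:
            matches += 1
    return matches / (len(estimated) - 1)


# Hypothetical numbers, invented for illustration (not from the study).
tweet_estimates = [10, 14, 13, 18, 17]
official_counts = [1.0, 1.5, 1.6, 2.1, 1.9]
print(direction_accuracy(tweet_estimates, official_counts))
```

In this made-up example the two series agree on direction in three of the four weekly changes.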
Because the CDC's data collection from hospitals and physicians involves a time lag, a system that uses Twitter might be able to reveal a spike in flu cases more quickly, said David Broniatowski, an assistant professor in the George Washington University's Department of Engineering Management and Systems Engineering. Broniatowski did much of the research with a team of colleagues while he was at Johns Hopkins.
"We’re actually able to track at the municipal level," he said. "We can provide them with the data about what the flu is like in their city. That allows them to do surge planning."
The main problem with using Google searches or tweets to determine flu incidence is that people use both to discuss the flu, especially after news coverage of it, rather than only to complain about symptoms and exposure or to mention medication. Broniatowski said his researchers developed an algorithm that separated chatter from useful information.
They did that by putting 10,000 tweets on Amazon Mechanical Turk and paying people to determine whether each tweet was an actual complaint about exposure to the flu rather than just a reference to it. They then applied the algorithm to a much larger group of tweets. (Disclosure: Amazon CEO Jeff Bezos owns the Washington Post.)
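The article does not say which algorithm the researchers used. As a rough illustration of the general idea (learn from human-labeled tweets, then label new ones automatically), here is a minimal Python sketch that swaps in a simple naive Bayes word-count classifier. The function names and the four sample tweets are invented for illustration and are not taken from the study.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(labeled_tweets):
    """Count words per label. Label True means the tweet reports illness."""
    word_counts = {True: Counter(), False: Counter()}
    tweet_counts = Counter()
    for text, label in labeled_tweets:
        tweet_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    return word_counts, tweet_counts, vocabulary


def predict(model, text):
    """Return True if the tweet looks more like a report than chatter."""
    word_counts, tweet_counts, vocabulary = model
    total_tweets = sum(tweet_counts.values())
    scores = {}
    for label in (True, False):
        total_words = sum(word_counts[label].values())
        # Start from the share of training tweets carrying this label.
        score = math.log(tweet_counts[label] / total_tweets)
        for word in tokenize(text):
            if word in vocabulary:
                # Add-one smoothing so unseen words do not zero the score.
                count = word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return scores[True] > scores[False]


# Hypothetical hand-labeled tweets standing in for the 10,000 real ones.
labeled = [
    ("i have the flu and a fever", True),
    ("stuck in bed with the flu", True),
    ("news report on flu season", False),
    ("flu shot clinic opens today", False),
]
model = train(labeled)
print(predict(model, "in bed with a fever"))
print(predict(model, "news report on flu shot"))
```

A real system would need far more labeled examples and better text handling than this toy version, which only shows how labels from paid workers can train a filter that is then run on a much larger set of tweets.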
It's also possible to determine a tweet sender's location much of the time.
"Real-time tools such as our system," the researchers wrote, "have the potential to enable clinicians to anticipate the need for surges in influenza-like illness up to two weeks in advance of existing data collection strategies. Early knowledge of an upward trend in disease prevalence can inform patient capacity preparations and increased efforts to distribute the appropriate vaccine or other treatment."
Still unclear, Broniatowski said, is whether a Twitter-based method will prove accurate in rural areas, where fewer people use the service.