In Chicago, just 32 food inspectors — called sanitarians — are responsible for auditing the city’s more than 15,000 restaurants.
Traditionally, sanitarians are assigned beats, or groups of restaurants, that they inspect a few times a year, depending on each restaurant’s assessed risk level: how complex its menu items are and how likely its ingredients are to trigger food poisoning. Today, the city is experimenting with a new technology to guide where those inspections should occur, based on factors such as current weather, nearby construction and past health code violations.
“We started thinking: How do we use predictive analytics and data to flip how we do that business?” said Jay Bhatt, chief innovation officer at the Chicago Department of Public Health, during a keynote speech at a recent predictive analytics conference in Washington. The department has been testing the food inspection model for the past few months.
Chicago is among a handful of cities trying to modernize their inspection protocol. Others include New York, whose Department of Health and Mental Hygiene is testing software that scans online reviews from Web sites such as Yelp, flagging mentions of potential food-poisoning incidents. In July, IBM unveiled an application for public health officials that processes data such as retail records and food-poisoning reports to trace incidents back to particular contaminated products.
Chicago’s predictive model is still in the pilot phase, said public health deputy commissioner Brian Richardson. Until the algorithm is more refined, the city will continue to deploy sanitarians based on traditional risk classification. But he noted that the health department is applying a similar predictive model to inspections for other public health risks, such as lead-paint exposure in residential buildings.
Currently, the software aggregates information from various publicly available data sources — records of building- and sanitation-code violations, demographic characteristics of nearby residents and lists of restaurants with liquor licenses, among others. It analyzes about 10 years’ worth of historical data, across about 13 variables, to determine which factors most strongly predict inspection failures. For instance, fluctuations in weather that might cause ingredients to rot were more strongly correlated with failure than a restaurant’s location or a history of past violations.
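To make the idea of ranking predictive factors concrete, here is a minimal sketch, not the city's actual model. It assumes each past inspection is a record of numeric variables plus a pass/fail outcome, and it ranks variables by the strength of their simple correlation with failure. The function names, the two variables and the toy records are all invented for illustration; the department's real system draws on about 13 variables and may use a different statistical method entirely.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


def rank_predictors(records, outcome_key):
    """Sort variables by absolute correlation with the outcome, strongest first."""
    outcomes = [r[outcome_key] for r in records]
    names = [k for k in records[0] if k != outcome_key]
    scores = {
        name: pearson([r[name] for r in records], outcomes) for name in names
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Invented toy records (NOT real inspection data): a hypothetical temperature
# swing, a count of past violations, and whether the inspection failed (1) or not (0).
TOY_INSPECTIONS = [
    {"temp_swing": 12, "past_violations": 1, "failed": 1},
    {"temp_swing": 2, "past_violations": 3, "failed": 0},
    {"temp_swing": 10, "past_violations": 0, "failed": 1},
    {"temp_swing": 3, "past_violations": 2, "failed": 0},
    {"temp_swing": 11, "past_violations": 2, "failed": 1},
    {"temp_swing": 1, "past_violations": 1, "failed": 0},
]

ranking = rank_predictors(TOY_INSPECTIONS, "failed")
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

Correlation alone does not establish which factors cause failures, and a production model would need far more data and validation than this sketch suggests.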
In tests covering several hundred restaurants, the software has helped inspectors identify 4 percent more critical violations of the health code than before they used the system, Bhatt said.
In its early stages, the analytic system is limited. In the case of food poisoning, it can only analyze the incident reports that restaurant patrons actually file. Last year, Bhatt’s team discovered that citizens are often more likely to post on social media Web sites about a bad meal than they are to file a formal report with the city. Bhatt’s team has developed a machine-learning program to mine Twitter for tweets including words linked with food poisoning, such as “vomit.” The department then responds to the people who posted the comments, encouraging them to file a formal report. It has since collected a few hundred additional reports through Twitter.
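The tweet-mining step can be pictured with a much simpler stand-in: a plain keyword filter rather than the machine-learning program Bhatt's team built. The sketch below flags posts that contain words linked with food poisoning. Only "vomit" comes from the department's description; the other terms, the function names and the sample posts are assumptions made up for this example.

```python
import re

# Only "vomit" is taken from the article; the other terms are illustrative guesses.
SYMPTOM_TERMS = ("vomit", "food poisoning", "stomach ache")


def mentions_food_poisoning(text, terms=SYMPTOM_TERMS):
    """Return True if the text contains a word starting with any listed term."""
    lowered = text.lower()
    # \b anchors the match at a word start, so "vomit" also catches "vomiting".
    return any(re.search(r"\b" + re.escape(term), lowered) for term in terms)


def flag_posts(posts, terms=SYMPTOM_TERMS):
    """Keep only the posts that mention a symptom term, preserving order."""
    return [post for post in posts if mentions_food_poisoning(post, terms)]


# Invented sample posts, not real tweets.
SAMPLE_POSTS = [
    "I was vomiting all night after that burrito",
    "Great deep dish tonight!",
    "Pretty sure I got FOOD POISONING downtown",
]

flagged = flag_posts(SAMPLE_POSTS)
```

A filter this crude would also flag jokes and unrelated mentions, which is one reason a trained classifier and human review, as the department describes, are likely to be needed in practice.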
Eventually, Chicago’s predictive system may assess these tweets and posts from other publicly available social media sites, Richardson said. Currently, reports found through Twitter are sent directly to food inspectors, who then decide whether each one warrants a follow-up inspection.
Chicago’s approach, and others like it, are a “relatively new phenomenon in investigating foodborne diseases, and it’s not yet clear how well it works in practice,” said Laura Gieraltowski, an epidemiologist in the Centers for Disease Control and Prevention’s Division of Foodborne, Waterborne and Environmental Diseases.
Still, such approaches “could be useful in some contexts but as a supplement to other public health efforts to track foodborne disease,” she said.
Despite its limitations, the system can help the city allocate its limited funding and staff to the areas most likely to be affected, Richardson said, to “solve the problem where it hurts.”
For instance, Chicago uses a similar software model, based on publicly available data, to identify homes at risk for lead-paint exposure, especially those likely to be occupied by women and young children. Inspectors can then prioritize those homes for inspections and outreach, he said.
Clarification: This story has been edited to remove a suggestion that the effort to mine Twitter was funded by a $1 million grant. That money is being used for a larger effort.