Wired published an interesting profile Tuesday morning that exemplifies what’s possible when smart people have access to good data. The piece showcases the work of a man named Fred Trotter who has accessed reams of buried Medicare data via a Freedom of Information Act request and is uncovering some potentially valuable information. Already, the article explains, he has built a “Doctor Social Graph” by analyzing some “60 million relationships between doctors, and how often they refer patients to one another.” His next mission is to build a doctor rating system based on data he’s uncovered about credentials, nursing home inspections and other relevant info.
Elsewhere, companies such as Palo Alto, Calif., startup Apixio are trying to make hospitals more efficient by using semantic analysis to connect the dots between patient charts, electronic medical records, billing data and whatever other information hospitals generate. (We covered Apixio in early 2011, although the company has significantly expanded its services since then.) In health care, everyone seems to have their own way of doing things, as Apixio natural-language-processing scientist Vishnu Vyas told me recently, so “the variety of the data becomes as important as the volume of the data.”
Linda Drumright, GM of the Clinical Trial Optimization Solutions group at IMS Health, agreed. She explained that her company is able to do its job because it has access to mountains of data from pharmacies, insurance claims, medical records, partners and other sources. All told, it houses 17 petabytes of data spread across 5,000 databases. Her division’s clients, which generally include pharmaceutical and biotech companies running patient trials, need all this data in order to ensure their trials will actually be successful.
One recent customer wasn’t able to recruit test subjects fast enough, she noted, and IMS helped it comb through its criteria for whom to include in or exclude from the trial, only to find “that the patient population they were looking for didn’t exist.” As IMS went back and began eliminating criteria and iterating on the design, it realized that the trial never should have begun in the first place.
There are a million ways to think about how to use this data, Drumright said, and as more customers begin to fully understand what they can do with it, her goal is to “make this information accessible in a way where it’s easy at the point where it’s needed, and consumable where it’s needed.”
The key to curing cancer might be more data
But whatever Trotter, Apixio, IMS and others accomplish will have been made possible because they have access to some valuable datasets, albeit not always with great ease. Many individuals who’d like to improve the health care system, if not our health generally, aren’t so lucky. Take, for example, the world’s genetic researchers. It’s very possible that the information they need to discover the medical Holy Grail, a cure for cancer, is locked in gene sequence data that very few people will ever see.