Sharon Begley is the senior science writer at STAT. Her most recent book is “Can’t. Just. Stop.: An Investigation of Compulsions.”
Over the past few years, artificial intelligence has stormed into health care like a consultant from hell, promising to improve quality, cut costs, increase productivity, eliminate diagnostic errors, catch impending strokes before they happen, distinguish benign blips on a mammogram from breast cancer and generally usher in a halcyon era of medicine.
Or so claim the more ardent boosters of AI. Open “Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again” to almost any page, and you’ll read author Eric Topol’s encomiums of deep learning, in which computers ingest zettabytes of big data and, using a layered algorithmic formula to analyze it, spit out answers about whether, for example, a CT scan indicates cancer or an MRI is evidence of depression. Already the technology, in at least some applications — such as analyzing retinal scans for signs of diabetic retinopathy and skin lesions for hints of melanoma — provides diagnoses and treatment recommendations. Through deep learning, advocates say, AI soon will be able to discover new drugs, construct personalized diets based on individuals’ genetics and other data, and one day make hospital stays obsolete.
Topol — a cardiologist, director of the Scripps Research Translational Institute and paid adviser to two AI health companies — isn’t advocating handing over medicine to the machines completely. Instead, he argues that using AI will, paradoxically, make medicine more humane. For one thing, he writes, the technology could make medicine more efficient, allowing doctors to spend more time with patients. “The rise of machines has to be accompanied by heightened humaneness — with more time together, passion and tenderness — to make the ‘care’ in healthcare real,” he notes. It’s a noble wish, endorsed by veteran doctor Abraham Verghese in a praising foreword.
Give Topol credit for timing. Hardly a week passes without another study describing a triumph for AI in medicine. Among the promises heralded in these pages: diagnosing some forms of melanoma “even better than board-certified dermatologists,” identifying heart rhythm abnormalities as expertly as cardiologists and scrutinizing pathology slides for signs of cancer as well as experienced pathologists.
Medical AI systems use a similar process to the one that teaches driverless cars to recognize pedestrians, other vehicles and stop signs. Just as Waymo’s algorithms are based on what millions and millions of people look like — in shadow and light, tall or short, running or walking — in order not to run them over, so AI systems in medicine are trained on images and what they mean. For example, they suck up thousands of retinal images that mean diabetic retinopathy and thousands that don’t, and images of thousands of moles that mean melanoma and thousands more that don’t, in each case learning which features indicate the presence (or absence) of serious disease. The result, in an ideal world, is faster, more accurate diagnosis; with minimal human input, the technology also enables advanced medicine to reach underserved areas.
Some of these AI systems have proved viable: Last year the Food and Drug Administration approved the sale of one used to read head CTs and diagnose stroke, and in 2017 the agency approved an AI system for reading heart MRIs.
To his credit, Topol points out that most deep-learning successes have come in best-case settings. “The field is long on computer algorithmic validation and promises but very short on real-world, clinical proof of effectiveness,” he tells us. He also includes examples of where AI blew it. One of his own patients elected to undergo stenting of a coronary artery to treat severe fatigue, and it worked — even though an AI system ingesting all the medical knowledge in the world would have said don’t do it. He points to studies of algorithms that have found “far more false positives than a human would make” in detecting some types of cancer.
If you’re the type who remembers past techno-predictions and wonders, “Where’s my flying car?,” “Deep Medicine” may give you permanently raised eyebrows. Take the Apple Watch: The device can detect changes in heartbeat that might mean atrial fibrillation, but in at least one test, it had an accuracy rate of only 67 percent. More fundamentally, since much published research — by one account at least half of medical studies — is wrong, and one basis of deep learning is these studies, how will AI know which ones to ignore? And since almost every association between genetic variants and risk of disease comes from studies of white Europeans, how do we tell AI not to apply those findings to Asian or African patients? Aware of both the promise and the limitations of deep learning, federal regulators are rushing to establish oversight of this use of AI in medicine, with the FDA announcing Tuesday that it will begin formulating rules for the systems.
Topol has clearly thought deeply about all this, but “Deep Medicine” would have been even better if he’d explained how we’ll get from today’s highly imperfect AI systems to the brilliant ones he calls inevitable. He notes “how challenging it will be for AI to transform medicine,” but the reader is left with something like that old New Yorker cartoon: equations on the left, equations on the right, and in the middle, “Then a miracle occurs.”
Since making AI work is a scientific and data problem, that miracle would presumably involve scientific and data solutions. But there are other AI drawbacks that arise from more nefarious threats: hacking and data privacy, as well as “the potential to deliberately build [AI systems] that are unethical, such as basing prediction of patient care recommendations on insurance or income status,” Topol notes.
Too cynical, you scoff? I give you electronic health records. EHRs were supposed to reduce medical error, avert duplicated tests and prescriptions, and otherwise improve patient care, but they have not. Rather, they have enabled doctors and hospitals to wring the last possible penny out of patients and other payers, as Topol laments, such as by finding every single billable code (and suggesting the most expensive ones). As I write this, a study by researchers at MIT and Harvard has concluded that a deliberate, “small, carefully designed change in how inputs are presented to [an AI] system [can] completely alter its output, causing it to confidently arrive at manifestly wrong conclusions,” such as calling benign moles malignant. How’s that for a dermatologist income booster — and for instilling fear (and lack of trust) among patients?
Topol is a dreamer. “One can imagine that AI will rescue medicine from all that ails it, including diagnostic inaccuracy,” he writes. (There are roughly 12 million misdiagnoses of serious illness in the United States every year, and medical error kills a quarter-million Americans annually.) But even Topol admits that this hope is far from being actualized. Indeed.
Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again
By Eric Topol
378 pp. $32