Barzilay had spent years researching the AI specialty known as natural-language processing, which applies algorithms to textual data. Those skills, she realized, might be put to a different use: predicting cancer. She decided to shift her research.
That choice is now bearing fruit. Barzilay, 51, and a student protege have built an AI that seems able to predict with unprecedented accuracy whether a healthy person will get breast cancer, in an innovation that could seriously disrupt how we think about the disease.
She and her team laid out the findings in an article in the Journal of Clinical Oncology last month and explore them further in a piece set to be published in Nature Medicine. By analyzing a mammogram’s set of byzantine pixels and cross-referencing them with thousands of older mammograms, the AI, known as Mirai, can predict nearly half of all incidences of breast cancer up to five years before they happen.
It’s a marriage of tech and health care that could alter millions of lives without a single drop of medicine. “If the data is validated, I think this is very exciting,” said Janine T. Katzen, a radiologist at Weill Cornell Medicine who specializes in breast imaging.
Assuming that validation happens — trials are about to begin — Mirai could transform how mammograms are used, open up a whole new world of testing and prevention, allow patients to avoid aggressive treatments and even save the lives of countless people who get breast cancer. (Men and nonbinary individuals also are affected.) Mirai would spit out risk scores for patients’ next five years, giving them a chance to make health-care choices that earlier generations could only dream of.
The AI has an oracular quality: The designers themselves don’t understand how it works. They’re just certain that it does.
That fact raises broad social and moral questions. But there’s also a more practical matter: whether the medical establishment and insurance companies will embrace it at all.
Any family that has been affected by breast cancer knows the trajectory: A person is feeling perfectly fine when a mammogram or self-examination turns up a troubling sign, jolting everything to a stop. An MRI or biopsy then confirms the suspicion.
Suddenly rushing in are fears about the future, flurries of doctor appointments assessing the threat and many months of debilitating treatments and surgery. Even in cases with a “successful” outcome, physical and psychological aftereffects — along with paralyzing fears of recurrence — can last years.
Through it all, a question gnaws: How could a body betray us without offering up so much as a warning message?
Barzilay asked another question: What if it does and we just haven’t built the tools to hear it?
The system most often trying to listen has been Tyrer-Cuzick, a statistical model into which doctors input a list of basic variables such as a person’s age and family history. It usually predicts breast cancer in just 20 to 25 percent of people who go on to be diagnosed with it.
MIT researchers took a different tack. The team — Barzilay; the student, Adam Yala; and Connie Lehman, a Mass General doctor Barzilay met through her oncologist — gathered more than 200,000 Mass General mammograms of people who would and would not go on to develop cancer. They fed them into Mirai to train its algorithm. Mirai would scan mammograms and make a prediction, drawing from all it had analyzed.
Then it would be told the actual result and be “penalized” or “rewarded” (via the mathematical adjustment of the model) based on the deviation from the reality. It quickly learned what future breast cancer did and did not look like in the mammogram dots.
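The predict-then-adjust loop described above can be sketched in a few lines of Python. This is only a toy illustration, not Mirai's actual method: Mirai learns from mammogram images, while the sketch below fits a simple logistic model to a single made-up number per patient. The function names, the learning rate and the data are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Squash any real number into a risk between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=200, learning_rate=0.5):
    """Fit a one-feature logistic model by stochastic gradient descent.

    examples: list of (feature, developed_cancer) pairs, where
    developed_cancer is 1 if the person was later diagnosed, else 0.
    """
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for feature, developed_cancer in examples:
            # The model makes its prediction first.
            predicted_risk = sigmoid(weight * feature + bias)
            # Then it is told the real outcome; the gap is the deviation.
            deviation = predicted_risk - developed_cancer
            # The "penalty" is a nudge to the parameters, sized by the deviation.
            weight -= learning_rate * deviation * feature
            bias -= learning_rate * deviation
    return weight, bias

# Hypothetical toy data: one invented image-derived number per patient.
TOY_EXAMPLES = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]

weight, bias = train(TOY_EXAMPLES)
low = sigmoid(weight * 0.1 + bias)
high = sigmoid(weight * 0.9 + bias)
print(f"risk for feature 0.1: {low:.3f}, risk for feature 0.9: {high:.3f}")
```

After training, the toy model assigns a higher risk to the patients whose feature resembles those who were later diagnosed, which is the same idea at a vastly smaller scale.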
Once Mirai was trained, team members embarked on a study. They collected 129,000 mammograms taken from 2008 to 2016, spanning 62,000 patients in seven hospitals in five places — Sweden, Israel, Taiwan, Brazil and the United States — and asked Mirai to make its predictions. Anything above a cumulative five-year risk score of 2.5 percent was deemed high, and the AI would then automatically recommend further testing such as a biopsy or MRI. How well, the team wondered, could Mirai predict which mammogram belonged to a person who developed cancer over a five-year period?
The AI was correct in an average of about 76 out of 100 cases, an improvement of 22 percent over Tyrer-Cuzick. At real-world screening volumes, that gap would translate to millions of women.
Mirai’s “sensitivity,” the rate at which it correctly foretold cancer in all those who would go on to be diagnosed with it, was about 44 percent, nearly double Tyrer-Cuzick’s 20 to 25 percent. (The study did not distinguish between more and less aggressive forms of cancer.)
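To make the arithmetic behind the 2.5 percent cutoff and the sensitivity figure concrete, here is a minimal Python sketch. It assumes sensitivity is measured at that cutoff, and the eight risk scores and outcomes are invented for illustration; they are not data from the study.

```python
HIGH_RISK_THRESHOLD = 0.025  # the study's 2.5 percent five-year cutoff

def is_high_risk(five_year_risk):
    """Return True when a score is above the cutoff."""
    return five_year_risk > HIGH_RISK_THRESHOLD

def sensitivity(scores, developed_cancer):
    """Share of people later diagnosed whom the cutoff flagged in advance."""
    diagnosed = sum(1 for outcome in developed_cancer if outcome)
    if diagnosed == 0:
        raise ValueError("sensitivity is undefined with no diagnosed cases")
    flagged_and_diagnosed = sum(
        1
        for score, outcome in zip(scores, developed_cancer)
        if outcome and is_high_risk(score)
    )
    return flagged_and_diagnosed / diagnosed

# Hypothetical scores and five-year outcomes for eight patients.
scores = [0.010, 0.031, 0.020, 0.060, 0.015, 0.040, 0.022, 0.005]
developed_cancer = [False, True, True, True, False, False, True, False]

result = sensitivity(scores, developed_cancer)
print(result)  # 2 of the 4 diagnosed patients were flagged -> 0.5
```

In this made-up group, four people are later diagnosed and the cutoff flags two of them, for a sensitivity of 50 percent. One healthy person is also flagged, a reminder that sensitivity alone says nothing about false alarms.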
“This is the next, very positive step forward,” Dorraya El-Ashry, chief scientific officer for the Breast Cancer Research Foundation, said in an interview. “There is a lot of work to do. But it is very encouraging.”
“It was never a question,” Barzilay said. “This should be for everyone to build on.”
A different tack
The mammogram is a little bit like Winston Churchill’s democracy: It’s the worst screening method, except for all the others. The approach — which uses low-grade radiation to examine breast tissue from multiple viewing angles — has become the gold standard over the past several decades, and many medical professionals swear by it as an uncomfortable but important safeguard. It also has drawn its share of critics in the oncology and women’s health communities who say it has led to unnecessary radiation exposure, overtesting, false positives and all the stress that comes with them.
Barzilay and her team say that the problem lies not in the mammogram but in how it is being used. Right now, human radiologists — able to see only so much — focus on factors such as breast density, a notoriously unreliable marker because dense breasts are common in many healthy women, too.
The researchers say the machine can see a lot more. “The mammogram is such a rich source of information. I just don’t believe it’s been mined for all its potential,” Yala said, noting it could get even better with the advent of the burgeoning “3-D mammogram,” a process known as tomosynthesis.
Lehman says it is not the tool but the approach that has been the issue. “We don’t need to do age-based screenings — we can do risk-based screenings,” she said. The overall number of mammograms probably would be the same, but instead of all women over 40 getting them annually, some women under 40 who are at higher risk would get them, while low-risk people over 40 would get them less often.
The Mirai team also hopes the AI will better represent women of color.
“When you start seeing the data about racial bias in traditional risk models, it’s chilling,” Lehman said. “And the reason is because they mainly take into account European Caucasian women and not Hispanic, Asian and Black women. I’ve seen with my own eyes how racially biased traditional risk scores are.”
Many worry that AI might integrate similar biases, because, well, it’s being programmed by the same people who design the math models. But Yala said the results of the study did not show bias; the rates of cancer it found among its many subjects in Asia, South America and the Middle East, and in hospitals with a significant number of Black patients in the United States, were consistent with actual real-world rates.
The breast cancer statistics are alarming across the board. One in 8 American women will be stricken with the disease at some point during their lifetimes. While many cancers, such as lung cancer, have been declining in the United States, breast cancer rates have been going up — an annual average of half a percentage point between 2008 and 2017, according to the American Cancer Society.
When Barzilay was first starting her research, most hospitals turned her away, saying breast cancer had been treated for years without AI.
“I felt like I had something really important to give,” said Barzilay, who has what might be described as an affable indomitability, a boundary-pushing researcher crossed with Gal Gadot. “And they acted like I was trying to sell snow to an Eskimo.” So she enlisted Yala, at the time still an undergrad, who set out on the laborious door-to-door task of wheedling for access to anonymous mammograms.
A small breakthrough came when Barzilay was introduced to Lehman, allowing them to get hold of Mass General’s files. The two women soon became a well-known brand in cutting-edge breast cancer circles — “Regina and Connie,” a kind of single phrase indicating a complementary duo.
Barzilay and Yala are both from places far removed from Massachusetts’s medical community, which may be key to their disruptive mind-set. Barzilay was raised in Moldova and immigrated to Israel at 20 after the fall of the Iron Curtain — she was picking almonds on a kibbutz when most Americans her age were on spring break — before coming to the United States nearly a decade later. Even her first name — pronounced with a hard “g,” as in “regulate” — has a kind of edgy contrarianism.
Yala, meanwhile, was born in Algeria and arrived as a 10-year-old in the suburbs of Chicago after his parents fled political instability at home.
“I guess coming from somewhere far away made me not accept the status quo way of doing things,” the 26-year-old said. “And it definitely helped me go around the world begging hospitals for mammograms.”
The ethical implications for Mirai are significant.
Sarah Eskreis-Winkler, a radiologist at Memorial Sloan Kettering Cancer Center’s breast imaging center and head of that center’s artificial-intelligence division, said she is bullish about how it can transform preventive care.
But she also noted many tricky issues that have yet to be worked out. “Here’s the scenario I’m interested in,” she said. “If Tyrer-Cuzick says a person is high-risk and Mirai says they’re not, who should they listen to?” After all, if the AI is wrong, it could create the perception that a machine hurt a human.
Much like self-driving cars, Barzilay and Lehman say, the machine does not have to eliminate error in every single case. It simply has to be marginally better than humans in the total number of cases.
There is also the black-box question. Many scientific enterprises at least allow researchers to know, eventually, roughly how they work. But Mirai presents the possibility that millions of women will be told what to do about their health for reasons no one understands.
“We don’t really know exactly how aspirin works, yet we use it all the time,” Yala said. He noted that amorphous recommendation engines are in wide use for everything from shopping to streaming. “But when it comes to medicine, where we need it most, we insist on humans.”
Advocacy groups say they are unconcerned about the black-box issue. “Knowledge is power, wherever the knowledge is coming from,” said Elana Silber, executive director of Sharsheret, a group focusing on Jewish women affected by breast cancer. “If people can understand their risk better, they can take measures to protect their health and save lives.”
The research has also earned the cautious endorsement of large-scale medical groups. Robert Smith, senior vice president for cancer screening at the American Cancer Society, said that he sees Mirai as a “very good thing” and that it “does appear to offer advantages,” though he said that “we need to move forward carefully.”
Many radiologists in the field are enthusiastic too. Katerina Dodelzon, Katzen’s colleague at Weill Cornell, noted the technology’s ability to take radiology “from diagnostic to prognostic” functions.
The same optimism may not yet have taken hold with breast cancer surgeons or oncologists, who most directly advise patients on breast cancer risk. Requests for comment to such doctors at four high-level hospitals were declined, and one hospital staffer described an ambivalence among that group. Mathematical models are common in cancer treatment, for example in calculating chemotherapy dosages, but that use is more familiar to physicians than outsourcing a prognosis to a computer.
Even some radiologists are conflicted, fearing automation could take their jobs. Some more-traditional detection-related technologies — machines meant to identify cancers already present — are in various stages of research or deployment by Google, the Dutch start-up ScreenPoint and the British company Kheiron Medical. Those efforts have caused some consternation in the radiology community.
While emphasizing that these technologies are meant merely as a tool for the human reader, Tobias Rijken, chief technological officer and co-founder of Kheiron Medical, also pointed to a machine’s comparative advantage in the life-or-death effort of breast cancer imaging. “An AI works 24/7, it doesn’t get tired, and it doesn’t have personal problems at home,” he said.
Among the sites for the planned Mirai trials are the Mexican hospital network Grupo Angeles and Novant Health, the sprawling southeastern U.S. health-care system. Novant aims to roll out the trials in the coming months at its flagship hospital in Winston-Salem, N.C., where as many as 150,000 patients who come in for mammograms will be given risk scores produced by Mirai.
The hurdles will arrive with them.
In most cases, insurance companies pay for mammograms only for people over 40, and there has even been a push by some U.S. companies in recent years to raise the age to 45 or even 50. Upending the system to pay for mammograms for women in their 30s will not be easy. Many also will not pay for a breast MRI recommended by an AI.
“It’s the biggest challenge we have: Will insurance pay?” said Bipin Karunakaran, vice president of clinical insights and analytics at Novant. Grant money might help subsidize costs in the trial, but that isn’t a long-term solution.
Barzilay and Yala said that adding mammograms for some younger higher-risk people will actually lower costs for insurers by helping avert expensive treatments down the road. But they acknowledged that persuading them of this will take time.
Patient adoption is also an open question. Some see a generational split, with younger people embracing an algorithm while older ones resist. “One of the big questions we get from patients over a certain age, and I certainly understand it, is whether a machine can care about them in the same way,” Lehman said.
Of course, thanks precisely to a history of age-based guidelines, younger people tend to get fewer mammograms in the first place. “I don’t want to be pessimistic about this, because the idea that we can more accurately predict five years of risk is really promising, even revolutionary,” said Kate Lampen-Sachar, a radiologist at the Miami Cancer Institute’s Baptist Health Breast Center and an adviser to the Young Survival Coalition, an advocacy group for women diagnosed with breast cancer under age 40 — a group that has seen rates rise in recent years.
“But I think it remains to be seen how easily this could be implemented,” she said. “Because in the end, it still requires a mammogram be performed. And that isn’t simple for people under 40.”
There are also regulatory challenges. The Food and Drug Administration requires that any new tool in a hospital that has not been approved go through a strict internal review process, which means Mirai faces many on-the-ground battles to prove it can do more good than harm.
Barzilay said she has no choice but to press on.
“Not long after I turned 40 — about three years before I was diagnosed — I went for my first mammogram,” she said. “They told me everything was fine and there was nothing to worry about. Would Mirai have noticed whatever was happening inside me? Would it have sent me for more screening and told me to watch more closely? Would it have allowed me to catch the cancer much sooner and avoid all that treatment? There are women who will be diagnosed with breast cancer in three years. I feel a responsibility to give them Mirai now.”
Just out of a sense of dark curiosity, she recently fed that initial mammogram into Mirai. It told her she was high-risk.
Photo captions in an earlier version of this article misspelled Adam Yala's surname. They have been corrected.