The Washington Post

How a researcher used big data to beat her own ovarian cancer

A visitor views a digital representation of the human genome in 2001 at the American Museum of Natural History in New York. (Mario Tama/Getty Images)

Academic scientists devote their lives to research, often toiling away on problems that few people outside their discipline fully understand. Perhaps some are driven by pure curiosity or competition, while others have a personal interest in the topic at hand.

For Shirley Pepke, a genomics researcher based in Los Angeles, the urgency to find answers comes from her own instinct for survival. Since 2014, she has been working on a tool capable of tailoring ovarian cancer treatment to each patient using genomics data and a machine learning algorithm.

The first subject in this DIY precision medicine project was Pepke herself, who was diagnosed with stage IIIC ovarian cancer in September 2013.

“Some people get cancer and do fundraisers — I'm good at doing computational research on complex systems, so it seemed really natural for me to work on this,” she said. “Because I have really young children, I felt that I had to pursue every avenue to try and extend my life, and I owed it to them.”

She began her career as a physicist and data scientist, developing artificial intelligence software for NASA's launch vehicles and algorithms to analyze high-throughput genomics data at the California Institute of Technology. But her research focus abruptly pivoted after her diagnosis of ovarian cancer, which had already spread to nearby organs in her pelvis.

Since then, Pepke has used her computational know-how, experience with genomics, and broad network of research collaborators to fight the disease.

To start, a colleague at Caltech put her in touch with local researchers who had access to high-throughput genomic sequencing technology, a rapid and cheap method that can sequence multiple DNA or RNA molecules at once, which they used to profile her tumor.

The researchers analyzed the DNA sequence of Pepke's cancer for possible mutations that could steer her toward a personalized treatment option, but nothing too compelling turned up. They also gave her an enormous data set containing gene expression data, which can provide information about gene activity that the genome cannot.

“Gene expression methods measure how much of the protein is going to be made, how quickly the body is transcribing the DNA in that protein, and if there are epigenetic changes or chemical modifiers that affect the rate of transcription of DNA that gets made into protein,” said Pepke. “These fall into the category of mutations that are picked up in gene expression data, which can have an effect on cancer and its response to therapies.”

However, the gene expression data set was unwieldy and difficult to sift through, even for an experienced data scientist like Pepke.

At this point, a friend connected her with Greg Ver Steeg, an assistant professor at the University of Southern California who specialized in mining complex data. In 2014, Ver Steeg developed an advanced machine learning method called Correlation Explanation (CorEx) capable of teasing out hidden patterns in large, high-dimensional data sets.

“If you observe a bunch of things all related to each other, that relationship must come about due to some hidden factor you couldn't see,” said Ver Steeg. “In human biology, there are many hidden factors — for instance, gene expression can tell us about how disease is progressing and which treatments could work.”
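The idea Ver Steeg describes can be illustrated with a small numerical sketch. This is not CorEx itself or the researchers' code: it is a toy example on synthetic data, assuming Gaussian variables, a single hidden factor, and an illustrative helper named `gaussian_total_correlation`. It measures how strongly a set of simulated "genes" depend on one another (their total correlation), then shows that the dependence nearly vanishes once the hidden factor is accounted for.

```python
import numpy as np


def gaussian_total_correlation(data):
    """Total correlation (in nats) of the columns of `data`,
    under the assumption that they are jointly Gaussian."""
    corr = np.corrcoef(data, rowvar=False)
    _, logdet = np.linalg.slogdet(corr)
    return -0.5 * logdet


rng = np.random.default_rng(0)
n_samples, n_genes = 5000, 6

# Hypothetical hidden factor, e.g. an unobserved level of immune activation.
hidden = rng.normal(size=n_samples)

# Each synthetic "gene" is the hidden factor plus independent noise.
loadings = rng.uniform(0.6, 0.9, size=n_genes)
noise = rng.normal(size=(n_samples, n_genes))
genes = hidden[:, None] * loadings + noise * np.sqrt(1 - loadings**2)

# Dependence among the genes when the hidden factor is not observed.
tc_observed = gaussian_total_correlation(genes)

# Regress each gene on the hidden factor and keep the residuals;
# for Gaussian data this corresponds to conditioning on the factor.
slopes = genes.T @ hidden / (hidden @ hidden)
residuals = genes - hidden[:, None] * slopes
tc_given_hidden = gaussian_total_correlation(residuals)

print(f"total correlation among genes:       {tc_observed:.3f}")
print(f"total correlation given the factor:  {tc_given_hidden:.3f}")
```

In this toy setting the hidden factor is known in advance. The harder problem, and the one a method like CorEx is meant to address, is recovering such factors when only the observed measurements are available.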

Ver Steeg has applied his machine learning algorithm to problems in neuroscience, psychology and finance that are often crippled by overwhelming amounts of data. A study published last year used CorEx to examine more than 200 potential biomarkers in 566 older adults, identifying those that were most predictive of cognitive decline and brain atrophy. The online dating website eHarmony recently recruited Ver Steeg to improve its matchmaking process, in hopes that CorEx can target the hidden factors that contribute to happy relationships.

The two began to collaborate in 2014, first modifying CorEx to analyze the publicly available gene expression data from ovarian cancer patients in the Cancer Genome Atlas. Their goal was to unearth the hidden factors in the data set that correlated with patient survival. For instance, they found patients whose immune systems became activated in certain ways — seen in the data as a particular gene expression profile — had better long-term survival.

In the meantime, Pepke had gone through surgery and standard front-line chemotherapy shortly after her diagnosis, but her cancer recurred in January 2015. Her doctors offered a menu of second-line therapies, but given the limited knowledge available, none appeared any better or worse than the others. That was an issue CorEx and genomic data could help solve.

“When women recur, they have to ultimately make a choice of what their therapy will be, and it's really like throwing darts at a dart board,” said Pepke. “Women are just cycling through ineffective therapies, and they're incredibly toxic. It would be huge to get the best, most effective therapy off the bat.”

Based on the CorEx results and her own tumor's data, Pepke went against her oncologist's recommendation of standard chemotherapy after her recurrence. Instead, she chose an immunotherapy drug not yet approved for ovarian cancer.

“From Shirley's point of view, she thought it would be great to use our findings to choose her own treatment, since her doctors didn't seem to have a good reason to choose one treatment over another,” said Ver Steeg. “With our research in gene expression, perhaps we could make a more informed choice.”

After immunotherapy, she went through one more round of surgery and chemotherapy. Two months later, no signs of her tumor could be found, and her MRI was clear. As of now, Pepke's cancer has been in remission for a year.

“In the end, no one can say what happened or which treatment it responded to,” she said. “It's certainly a possibility that the immunotherapy had an impact, and the disease progression was very consistent with an immunotherapy-type response.”

Pepke and Ver Steeg recently published a preprint of their work with CorEx and are working to confirm the results in a larger population of patients. Ultimately, their goal is to have all women with ovarian cancer — not just those with scientific expertise and research connections — reap the benefits of precision medicine and tailored treatment options.

“I have a great deal of hope for the future, seeing as things in cancer research are changing so quickly, and the field is learning so much,” said Pepke. “While cancer may not be cured in five years, the landscape of treatment will be very different. I just hope to be around to see what happens.”
