The genetic sleuthing approach that broke open the Golden State Killer case could potentially be used to identify more than half of Americans of European descent from anonymous DNA samples, according to a provocative new study that highlights the unintended privacy consequences of consumer genetic testing for ancestry and health.
The idea that people who voluntarily spit into a tube and share their genetic data online to search for relatives could unwittingly aid law enforcement was thrust into the spotlight recently. This spring, genetic genealogy helped California police identify a suspected serial killer and rapist in a grisly, decades-old cold case. But the new study, published in the journal Science, drives home that the case was not an outlier; a majority of Americans of European descent could be matched to a third cousin or closer using an open-access genetic genealogy database.
“Each individual in the database is like a beacon of genetic information, and this beacon illuminates hundreds of individuals — distant relatives connected to this person via their family tree,” said Yaniv Erlich, the chief science officer of the direct-to-consumer genetics company MyHeritage, who led the study.
Erlich and colleagues then showed how a match, combined with basic information such as age and a reconstructed family tree, could be used to figure out the identity of an anonymous person who participated in a research project. A separate study found that even the minimal DNA kept in law enforcement databases could be cross-referenced with consumer genetic data to identify relatives.
“This really brings us to the crossroads of where science and technology and law and policy and ethics meet,” said Frederick Bieber, a medical geneticist at Brigham and Women’s Hospital who consults with crime labs and public defenders’ offices. “Both of these papers are very important because they ... raise the issue that we, collectively, are beginning to face head-on: Where do our privacy expectations interfere with the natural social instinct for public safety?”
The public is overwhelmingly supportive of police searches of genetic websites to solve violent crimes, according to a recent survey published in PLOS Biology. Leading consumer genetics companies signed on to guidelines to be transparent about how people’s data is used, and many have policies that do not allow law enforcement to search their databases without explicit approval. The website commonly used in law enforcement cases, GEDmatch, is an open-access genetic genealogy database, and people must voluntarily decide to upload their genetic profile.
“We at GEDmatch are very concerned about the proper use of genealogical information,” Curtis Rogers, co-administrator of GEDmatch, wrote in an email, adding that the new finding merits “serious consideration.”
Despite company efforts to reassure and educate consumers by explaining their policies, ethicists and some researchers still worry that people who receive a consumer genetic test as a casual holiday present may not be thinking about what the data could ultimately reveal — not only about themselves, but about distant family members.
Erlich said that early in his academic career, leaders in the field warned him that studying privacy and genetics was a “dangerous career path” and urged him to study something more conventional.
“Now it’s the opposite. Everybody understands genetic privacy is important. It’s not something we need to sweep under the rug — policymakers need empirical evidence about what can and cannot happen, ways to mitigate the risk, track how it changes,” Erlich said.
Erlich and colleagues found that a genetic database containing just 2 percent of a target population could lead to a third cousin or closer match for nearly anyone. They wrote that “the technique could implicate nearly any U.S. individual of European descent in the near future.” Most people whose DNA is in genealogy databases are of European ancestry.
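A rough back-of-the-envelope sketch shows why such a small slice of the population can go so far. It is not the model used in the study. It assumes, purely for illustration, that each relative is in the database independently with the same probability, and the relative counts below are hypothetical, chosen only because Erlich speaks of "hundreds" of distant relatives per person.

```python
def match_probability(coverage: float, relatives: int) -> float:
    """Chance that at least one of `relatives` appears in the database.

    Assumption (not from the study): each relative is included
    independently with probability `coverage`.
    """
    return 1.0 - (1.0 - coverage) ** relatives


if __name__ == "__main__":
    coverage = 0.02  # the 2 percent figure cited in the article
    # Hypothetical counts of third-cousin-or-closer relatives.
    for relatives in (50, 200, 800):
        chance = match_probability(coverage, relatives)
        print(f"{relatives} relatives: {chance:.1%} chance of a match")
```

Under these simplified assumptions, the chance of a match climbs quickly as the number of distant relatives grows, even when only a small share of the population has uploaded a profile.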
But CeCe Moore, an investigative genetic genealogist with Parabon NanoLabs, who works to solve real criminal cases, said that the study grossly oversimplifies the difficult work of using a match to find someone’s identity. While she agrees that a sizable proportion of the U.S. population can be matched to at least a second or third cousin in an open genealogy database, getting from a match to an identity is far from trivial, especially when real cases would often lack the critical demographic information — such as the age and family tree — that the academic researchers used in their sample case.
“They capture the power of genetic genealogy, but not really the complexities of doing the actual work,” Moore said. “They make a lot of assumptions that aren’t in line with reality; it seems they’re assuming some head-starts we don’t necessarily have in our work.”
The high-profile use of genetic genealogy to identify violent criminals this summer led to rampant speculation about how else it might be used: to invade people’s medical privacy, to track the identity of undercover agents, or to support searches by law enforcement or immigration officials in ways that some may find more morally ambiguous than finding a killer.
Ethicists said that greater awareness of the possibilities and limitations of the technology is necessary, given that many people don’t realize that a public DNA profile is not just information about one person but a family secret that connects to hundreds of other people. A sibling shares half of your genetic profile. A first cousin shares an eighth.
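The fractions above (half for a sibling, an eighth for a first cousin) follow a simple halving pattern that can be extended to more distant cousins. The sketch below gives expected averages for full relatives only. Actual sharing varies from pair to pair, and the function name is an illustrative choice, not a term from the study.

```python
from fractions import Fraction


def expected_shared_fraction(cousin_degree: int) -> Fraction:
    """Expected share of DNA in common between full relatives.

    cousin_degree 0 means full siblings, 1 means first cousins,
    2 means second cousins, and so on. Each added degree puts one
    more generation on both sides, which quarters the expected share.
    """
    if cousin_degree < 0:
        raise ValueError("cousin_degree must be 0 or greater")
    return Fraction(1, 2) ** (2 * cousin_degree + 1)


if __name__ == "__main__":
    labels = ["sibling", "first cousin", "second cousin", "third cousin"]
    for degree, label in enumerate(labels):
        print(f"{label}: {expected_shared_fraction(degree)}")
```

By this arithmetic a third cousin shares, on average, less than 1 percent of a person's DNA.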
“By making this real, and by making people understand just how interconnected we are by our genetics, and how skilled investigators could use these — with a fairly high success rate — to find second and third cousins or even closer relatives, underlines the power of this new technology and really brings home the reality of it,” said Benjamin Berkman, a bioethics researcher at the National Institutes of Health.