A new study argues that more than half of all Americans could be identified by name from nothing more than a sample of their DNA, their age and where they live. The researchers say that once 3 million Americans have uploaded their genomes to public genealogy websites, nearly everyone in the USA will be identifiable. More than 1 million Americans have already published their genetic information, and hundreds more do so every week.
This will force us all to rethink the meaning of privacy in the DNA age, say the researchers.
Rise of DNA testing
The startling claim is the result of two long-standing trends: the rise of direct-to-consumer genetic testing and the proliferation of publicly searchable genealogy databases. DNA testing companies can sequence a person’s DNA from just a sample of their saliva. It’s a simple process: samples can be provided at home and mailed back to the testing laboratory. A full genome can then be uploaded to a genealogy database, which uses powerful computers to look for stretches of matching DNA sequences to help build a family tree.
To test the hypothesis, researchers led by Columbia University computer scientist Yaniv Erlich set out to see whether they could find a person’s identity using only that person’s DNA and a small amount of biographical information. The DNA sequence they used came from a person whose genetic information had been published anonymously as part of an unrelated scientific study.
When the genetic code was run through a genealogy database, two relatives were found: one in North Dakota and one in Wyoming. The researchers could tell that all three individuals were related because they shared a number of single nucleotide polymorphisms (SNPs). The more SNPs people share, the more closely related they are. By comparing the DNA of all three relatives, the team was able to find a common ancestral couple who were the mystery person’s great-grandparents. A search of genealogical websites then turned up the couple’s 10 children and hundreds of grandchildren and great-grandchildren. After a long day of painstaking work, the researchers were able to correctly name the owner of the DNA sample.
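The idea that shared SNPs indicate relatedness can be illustrated with a toy sketch. It is not the researchers’ method or any genealogy site’s algorithm: the SNP names, genotype values and the shared_fraction function are all invented for illustration, and real services compare long shared stretches across hundreds of thousands of markers rather than counting a handful of matches.

```python
# Toy illustration only: every SNP ID and genotype below is made up.

def shared_fraction(profile_a, profile_b):
    """Return the fraction of SNPs, among those present in both
    profiles, at which the two people have the same genotype."""
    common = profile_a.keys() & profile_b.keys()
    if not common:
        return 0.0
    matches = sum(1 for snp in common if profile_a[snp] == profile_b[snp])
    return matches / len(common)

# Hypothetical profiles: SNP ID -> genotype call
mystery = {"rs1": "AG", "rs2": "CC", "rs3": "TT", "rs4": "AG", "rs5": "GG"}
relative = {"rs1": "AG", "rs2": "CC", "rs3": "CT", "rs4": "AG", "rs5": "GG"}
stranger = {"rs1": "AA", "rs2": "CT", "rs3": "TT", "rs4": "GG", "rs5": "AG"}

print(shared_fraction(mystery, relative))  # higher score: more shared SNPs
print(shared_fraction(mystery, stranger))  # lower score: fewer shared SNPs
```

In this made-up data the relative matches the mystery sample at more positions than the stranger does, which is the signal a database search would rank on.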
The same process would work for about 60% of Americans of European descent, who are the people most likely to use genealogical websites, said Erlich. Although the odds of success would be lower for people from other backgrounds, it would still be expected to work for more than half of all Americans. The findings have been published in the journal Science.
“If you can find a person’s third cousin in a genealogical database, then you should be able to identify the person with a reasonable amount of sleuthing”, said Erlich.
Law enforcement officials use publicly accessible DNA databases to help with their cases. The suspected Golden State Killer, Joseph James DeAngelo, is the most famous person to be identified this way. He has since been charged with 13 counts of murder and 13 counts of attempted kidnapping. This was only the second time in crime-solving history that the strategy was implemented successfully. Since then, at least 13 additional suspected criminals have been identified in the same way.
“When the police caught the Golden State Killer, that was a very good day for humanity”, said Erlich. “The problem is that the very same strategy can be misused.”
Foreign governments could use this technique to track down American citizens, he said, and protesters and activists could be pursued in the same way. To guard against such misuse, Erlich and his co-authors proposed a mitigation strategy that would make it harder to upload an unknown DNA sequence to a genealogical database and search for a match. They suggest that direct-to-consumer DNA testing companies put a special code on the raw data files they send to their customers. Genealogy sites could then agree to accept uploaded DNA sequences only if they carry a valid code. This would ensure that people could run searches only on their own DNA.
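The article does not say how such a code would work, so the following is only a minimal sketch under stated assumptions. It assumes the code is a keyed hash (HMAC) of the raw data file, computed with a secret shared between the testing company and the genealogy site. The key, the function names issue_code and accept_upload, and the file contents are all hypothetical. A real scheme would more likely use public-key signatures, so that genealogy sites could check a code without holding the secret needed to create one.

```python
import hashlib
import hmac

# Assumption: a secret shared by the testing company and the genealogy site.
SHARED_KEY = b"hypothetical-demo-key"

def issue_code(raw_data: bytes, key: bytes = SHARED_KEY) -> str:
    """Testing company side: derive a code tied to this exact raw data file."""
    return hmac.new(key, raw_data, hashlib.sha256).hexdigest()

def accept_upload(raw_data: bytes, code: str, key: bytes = SHARED_KEY) -> bool:
    """Genealogy site side: accept the file only if its code is valid."""
    expected = hmac.new(key, raw_data, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(expected, code)

# Hypothetical raw data file as sent to a customer
raw_file = b"rs1\tAG\nrs2\tCC\nrs3\tTT\n"
code = issue_code(raw_file)

print(accept_upload(raw_file, code))                  # genuine file and code
print(accept_upload(raw_file + b"rs4\tAG\n", code))   # altered file
print(accept_upload(b"rs1\tAA\n", "0" * 64))          # unknown file, made-up code
```

Under these assumptions, a DNA file that did not come from a participating testing company, or that was altered after it was issued, would fail the check and could not be used to search the database.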
Rest assured: AlphaBiolabs does not store personal data including our customers’ DNA profiles. We also don’t sell on any information.