By Brian Krebs
Washington Post Staff Writer
Monday, July 6, 2009 6:05 PM
Researchers have found that it is possible to guess many -- if not all -- of the nine digits in an individual's Social Security number using publicly available information, a finding they say compromises the security of one of the most widely used consumer identifiers in the United States.
Many numbers could be guessed at by simply knowing a person's birth data, the researchers from Carnegie Mellon University said.
The results come as concern grows over identity theft and lawmakers in Washington push legislation that would bar businesses from requiring people to supply their Social Security number when purchasing a good or service.
"Our work shows that Social Security numbers are compromised as authentication devices, because if they are predictable from public data, then they cannot be considered sensitive," said Alessandro Acquisti, assistant professor of information technology and public policy at Carnegie Mellon University, and a co-author of the study.
A Social Security Administration spokesman said the government has long cautioned the private sector against using a Social Security number as a personal identifier, even as it insists "there is no foolproof method for predicting a person's Social Security Number."
"For reasons unrelated to this report, the agency has been developing a system to randomly assign SSNs," which should make it more difficult to discover numbers in the future, Mark Lassiter, a spokesman for the Social Security Administration, said by e-mail.
Introduced in the 1930s as a way to track individuals for taxation purposes, Social Security numbers were never designed to be used for authentication. Over time, however, private and public institutions began keeping tabs on consumers using the numbers, requiring people to present them as proof of identity, such as when applying for loans, new employment, or health insurance.
Concern over the privacy of those numbers has grown in the wake of hundreds of data breaches reported by businesses, governments and educational institutions, breaches that have exposed millions of consumer records -- including SSNs.
In recent years, a number of states have passed legislation to redact or remove the numbers from public documents, such as divorce and property records, and bankruptcy filings. In addition, legislation introduced this year by Rep. Rodney Frelinghuysen (R-N.J.) and Sen. Dianne Feinstein (D-Calif.) would prohibit the display, sale, or purchase of Social Security numbers without consent, and would bar businesses from requiring people to provide their number.
The researchers at Carnegie Mellon set out to see if they could discover people's numbers by first exploiting what is publicly known about how the numbers are derived.
The Social Security number's first three digits -- called the "area number" -- are issued according to the ZIP code of the mailing address provided on the application form. The fourth and fifth digits -- known as the "group number" -- transition slowly, and often remain constant over several years for a given region. The last four digits are assigned sequentially.
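The three-part layout described above can be sketched in a few lines of Python. This is only an illustration: the helper name `split_ssn` and the `SSNParts` type are invented for this example, and the sample number is made up, not a real SSN.

```python
from typing import NamedTuple


class SSNParts(NamedTuple):
    area: str    # first three digits (the "area number")
    group: str   # fourth and fifth digits (the "group number")
    serial: str  # last four digits, assigned sequentially


def split_ssn(ssn: str) -> SSNParts:
    """Split a nine-digit SSN string into its three fields (hypothetical helper)."""
    digits = ssn.replace("-", "")
    if len(digits) != 9 or not digits.isdigit():
        raise ValueError("expected nine digits, optionally written as AAA-GG-SSSS")
    return SSNParts(area=digits[:3], group=digits[3:5], serial=digits[5:])


# Made-up sample value, for illustration only.
parts = split_ssn("123-45-6789")
print(parts.area, parts.group, parts.serial)
```

Only the last field changes from one applicant to the next in the short run, which is why the first five digits are the natural target for a guess.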
As a result, SSNs assigned in the same state to applicants born on consecutive days are likely to contain the same first four or five digits, particularly in states with smaller populations and lower birth rates.
As it happens, the researchers said, if you're trying to discover a living person's SSN, the best place to start is with a list of dead people -- particularly deceased people who were born around the time and place of your subject. The so-called "Death Master File" is a publicly available file that lists the SSNs, names, dates of birth and death, and states of all individuals who have applied for a number and whose deaths have been reported to the Social Security Administration.
CMU researchers Acquisti and Ph.D. student Ralph Gross theorized that they could use the Death Master File along with publicly available birth information to predict narrow ranges of values wherein individual SSNs were likely to fall. The two tested their hunch using the Death Master File of people who died between 1972 and 2003, and found that on the first try they could correctly guess the first five digits of the SSN for 44 percent of deceased people who were born after 1988, and for 7 percent of those born between 1973 and 1988.
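The general idea of using records from nearby birth dates to narrow a guess can be illustrated with a toy sketch. This is not the researchers' actual method, which the article does not spell out. The records, the placeholder state code "AA", the three-day window and the function name `likely_prefix` are all assumptions invented for the example, and the data is entirely synthetic.

```python
from collections import Counter
from datetime import date
from typing import List, Optional, Tuple

# Synthetic (state, date of birth, first five digits) records standing in for
# entries in a death-records file. None of these values are real.
Record = Tuple[str, date, str]

RECORDS: List[Record] = [
    ("AA", date(1990, 3, 1), "00011"),
    ("AA", date(1990, 3, 2), "00011"),
    ("AA", date(1990, 3, 4), "00011"),
    ("AA", date(1990, 3, 5), "00012"),
    ("BB", date(1990, 3, 2), "00055"),
]


def likely_prefix(state: str, born: date, records: List[Record],
                  window_days: int = 3) -> Optional[str]:
    """Return the most common five-digit prefix among records from the same
    state whose birth dates fall within window_days of `born` (toy heuristic)."""
    nearby = [
        prefix
        for rec_state, rec_born, prefix in records
        if rec_state == state and abs((rec_born - born).days) <= window_days
    ]
    if not nearby:
        return None  # no comparable records, so no guess
    return Counter(nearby).most_common(1)[0][0]


print(likely_prefix("AA", date(1990, 3, 3), RECORDS))
```

In this made-up data, three of the four nearby "AA" records share one prefix, so that prefix is the first guess. The fewer births a state records per day, the more often such neighbors agree.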
Acquisti and Gross found that it was far easier to predict SSNs for people born after 1988, when the Social Security Administration began an effort to ensure that U.S. newborns obtained their SSNs shortly after birth.
They were able to identify all nine digits for 8.5 percent of people born after 1988 in fewer than 1,000 attempts. For people born recently in smaller states, researchers sometimes needed just 10 or fewer attempts to predict all nine digits.
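A back-of-the-envelope calculation helps put those attempt counts in perspective. The sketch below is not drawn from the study. It simply counts digit strings, assumes every combination is possible (which overstates the real search space), and uses the invented helper name `fraction_covered`.

```python
TOTAL_DIGITS = 9
KNOWN_PREFIX_DIGITS = 5  # area number plus group number

# Number of possible digit strings with and without a known five-digit prefix.
full_space = 10 ** TOTAL_DIGITS
remaining_space = 10 ** (TOTAL_DIGITS - KNOWN_PREFIX_DIGITS)


def fraction_covered(attempts: int, space: int) -> float:
    """Share of a search space that a given number of distinct guesses covers."""
    return min(attempts, space) / space


print(f"Combinations with nothing known: {full_space:,}")
print(f"Combinations once five digits are known: {remaining_space:,}")
print(f"Share covered by 1,000 guesses, nothing known: "
      f"{fraction_covered(1000, full_space):.6%}")
print(f"Share covered by 1,000 guesses, five digits known: "
      f"{fraction_covered(1000, remaining_space):.0%}")
```

Knowing the first five digits shrinks the problem from a billion possibilities to ten thousand, and sequential assignment of the last four digits narrows it further for people born close together.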
Records of an individual's state and date of birth can be obtained from a variety of sources, including voter registration lists and commercial databases. What's more, many people now self-publish this information as part of their personal profiles on blogs and social networking sites. Indeed, the researchers tested their method using birthdays and hometowns that CMU students published on social networking sites, with similar results.
Privacy and security experts praised the Carnegie Mellon study, saying it should be a wake-up call to policy makers and industry leaders, many of whom have resisted switching to a more secure consumer authentication system due to the sheer cost of changing the current system.
"We can't pretend anymore that SSNs can be kept secret," said Peter Swire, a law professor at Ohio State University and chief counselor for privacy during the Clinton administration. "This report puts a nail in that coffin. We'll need new approaches, and it will cost money for the government and the private sector to build the new approaches."
Ross Anderson, a professor of security engineering at Cambridge University, said the findings suggest that businesses using SSNs as a password are being negligent, and should find other ways of verifying the claims to identity that are being made by their customers.
"Sure, the study says that if you were born in a big state on a busy day you're probably still safe," from having identity thieves guess your entire SSN, Anderson said. "Still, I think many people would find it unacceptable that a system continues in use which in effect exposes tens of millions of Americans to fraud and other kinds of harm."
Linda Foley, founder of the Identity Theft Resource Center, a San Diego-based nonprofit, cited another potential problem. She said many businesses mistakenly rely on the last four digits of a person's SSN, or have moved to redact all but those digits, the very ones that are most unique to an individual.
"Because of the way the SSN has been designed, asking for the last four numbers of the SSN puts people at risk because those are the only numbers that are unique to you and cannot be guessed easily by someone who might want to use your identity," Foley said.
The National Science Foundation, the U.S. Army Research Office, Carnegie Mellon CyLab, and the Berkman Faculty Development Fund provided support for the research. The study will be presented July 29 at the BlackHat 2009 security conference in Las Vegas.