The general’s mistress thought she was being clever by using anonymous e-mail accounts and sending messages over hotel WiFi networks. But metadata, in this case the Internet protocol addresses pointing to network locations, gave her away.
The IP addresses of the networks Paula Broadwell logged into this past fall to send threatening messages to a woman she perceived as a rival for the affection of Gen. David H. Petraeus traced back to the hotels. There, records corresponding to the dates the e-mails were sent revealed one common guest: Broadwell.
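The cross-referencing described above amounts to a set intersection: take the guest list for each hotel on each date a message was sent and keep only the names that appear every time. The following is a minimal sketch of that idea, not the investigators' actual procedure; the hotel names, dates, guest labels and the `common_guests` function are all hypothetical placeholders.

```python
# Hypothetical (hotel, date) pairs recovered from the IP addresses of the messages.
message_origins = [
    ("Hotel A", "day 1"),
    ("Hotel B", "day 2"),
    ("Hotel C", "day 3"),
]

# Hypothetical guest registries keyed by (hotel, date).
guest_registry = {
    ("Hotel A", "day 1"): {"guest_1", "guest_2", "guest_3"},
    ("Hotel B", "day 2"): {"guest_2", "guest_4"},
    ("Hotel C", "day 3"): {"guest_2", "guest_3", "guest_5"},
}


def common_guests(origins, registry):
    """Return the guests present at every (hotel, date) a message came from."""
    guest_sets = [registry.get(origin, set()) for origin in origins]
    if not guest_sets:
        return set()
    return set.intersection(*guest_sets)


print(common_guests(message_origins, guest_registry))
```

Each additional message narrows the candidate set, which is why a handful of data points can be enough to single out one person.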
Petraeus resigned as CIA director over the affair, and the episode has since receded from the public’s attention. But it is an instructive example of one simple but powerful way in which metadata, or data about communications, can reveal so much about who we are, where we go and whom we associate with.
Metadata is so rich with clues that entities from Google and eBay to the world’s largest spy agency, the National Security Agency, are collecting and mining this deceptively innocuous information: e-mail addresses to and from, times of e-mails, phone numbers dialed and received, lengths of calls, unique device serial numbers.
A week and a half ago, U.S. officials acknowledged for the first time that the NSA since 2006 has been amassing a database of metadata on the phone-call records of tens of millions of U.S. customers.
And, according to new documents obtained by The Washington Post, the NSA until 2011 gathered e-mail and other digital metadata from major Internet data links, presumably to detect and thwart terrorist plots.
But the government has resisted explaining its legal justification for gathering such massive amounts of data, which hold the potential to permit vast intrusions into the personal lives of Americans.
“When you can get it all in one place and analyze the patterns, you can learn an enormous amount about the behavior of people,” said Daniel J. Weitzner, a principal research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory.
Analysts can gain clues to sleep patterns (when people are asleep, they send no e-mails and make no calls), religion (based on locations of calls made or the absence of communications on the Sabbath) or even social position (based on how often people get calls and e-mails and how quickly they receive responses).
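The sleep-pattern inference can be illustrated with very little code: given only the hours of the day at which someone sent e-mails or made calls, look for the longest stretch with no activity at all. This is a rough sketch under simple assumptions (activity reduced to whole hours, one typical day); the function name and sample data are hypothetical.

```python
def longest_quiet_window(active_hours):
    """Return (start_hour, length) of the longest run of hours with no activity.

    Hours are integers 0-23. The day is treated as circular, so a quiet
    stretch can run past midnight.
    """
    active = {h % 24 for h in active_hours}
    if not active:
        return (0, 24)  # no activity at all: the whole day is quiet

    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    # Walk the clock twice so that runs wrapping past midnight are counted whole.
    for i in range(48):
        hour = i % 24
        if hour in active:
            run_len = 0
        else:
            if run_len == 0:
                run_start = hour
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
    return (best_start, best_len)


# Hypothetical hours at which one person sent messages or made calls.
print(longest_quiet_window([7, 8, 9, 12, 13, 18, 21, 22, 23]))
```

The same approach, applied to days of the week rather than hours, would surface a regular weekly silence of the kind the Sabbath example describes.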
In 2007, researchers at Columbia University were able to identify the senior-most company officers at the bankrupt Enron Corp. by studying individual e-mail volume and average response time in 620,000 company e-mails. The highest-ranking officers got the most e-mail and the quickest responses.
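A simplified version of that kind of ranking can be sketched as follows. This is not the Columbia researchers' method, whose details are not given here; it only illustrates the two signals the study is described as using, e-mail volume received and how quickly a person's messages get answered. The message tuples and names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical log entries: (sender, recipient, minutes until the recipient
# replied, or None if there was no reply).
messages = [
    ("exec", "analyst", 5),
    ("exec", "manager", 10),
    ("analyst", "exec", 120),
    ("manager", "exec", 90),
    ("analyst", "manager", 60),
    ("manager", "exec", None),
]


def rank_by_status(log):
    """Order people by e-mail received (most first), then by how fast they get replies."""
    received = defaultdict(int)
    reply_delays = defaultdict(list)
    people = set()
    for sender, recipient, delay in log:
        people.update((sender, recipient))
        received[recipient] += 1
        if delay is not None:
            reply_delays[sender].append(delay)

    def sort_key(person):
        delays = reply_delays.get(person, [])
        average_delay = mean(delays) if delays else float("inf")
        return (-received.get(person, 0), average_delay)

    return sorted(people, key=sort_key)


print(rank_by_status(messages))
```

Note that nothing in the sketch reads a subject line or message body; the ordering comes entirely from who wrote to whom and when.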
Similarly, federal agents use software and social-network analysis to map out terrorist cells and criminal groups. They look, for instance, at who calls whom most frequently, in a technique known as “link analysis.”
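At its simplest, the counting step of link analysis can be sketched as a tally of calls between each pair of numbers, with the busiest pairs treated as the strongest links. Real tools go well beyond this; the sketch below assumes a hypothetical list of (caller, callee) records and ignores call direction.

```python
from collections import Counter

# Hypothetical call records: (caller, callee).
calls = [
    ("A", "B"),
    ("B", "A"),
    ("A", "C"),
    ("B", "A"),
    ("C", "D"),
]


def strongest_links(records, top=3):
    """Count calls per pair of parties, ignoring direction, and return the busiest pairs."""
    pair_counts = Counter(tuple(sorted(pair)) for pair in records)
    return pair_counts.most_common(top)


print(strongest_links(calls, top=1))
```

Treating the resulting pairs as edges in a graph is what lets analysts go on to ask who sits at the center of a group.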
“It’s remarkable how just the phone-call data can give you at least a preliminary picture of how the organization operates and who its members are,” said Jason Weinstein, a former deputy assistant attorney general for the Justice Department’s criminal division. “It’s by no means the whole picture, but it’s a critical piece of the puzzle to solve the most serious crimes people can commit.”
Sometimes, metadata patterns can be tip-offs — a driver or courier in a terrorist cell or criminal group may be the one to receive short phone calls from several different operatives just before and after an operation.
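A tip-off of this kind can be expressed as a simple filter: within a window around a known event, find anyone who received short calls from several different callers. The sketch below is illustrative only; the thresholds, the record layout and the `likely_hubs` name are assumptions, not a description of any agency's tooling.

```python
from collections import defaultdict

# Hypothetical call records: (caller, callee, start time in minutes, duration in seconds).
call_log = [
    ("op1", "courier", 590, 20),
    ("op2", "courier", 595, 15),
    ("op3", "courier", 610, 30),
    ("op1", "friend", 300, 900),
    ("op2", "friend", 598, 400),
    ("op1", "op2", 605, 25),
]


def likely_hubs(records, event_time, window=30, max_duration=60, min_callers=3):
    """Return callees who got short calls from several distinct callers near the event."""
    callers_by_callee = defaultdict(set)
    for caller, callee, start, duration in records:
        near_event = abs(start - event_time) <= window
        if near_event and duration <= max_duration:
            callers_by_callee[callee].add(caller)
    return sorted(
        callee
        for callee, callers in callers_by_callee.items()
        if len(callers) >= min_callers
    )


print(likely_hubs(call_log, event_time=600))
```

A match is only a lead to investigate, since plenty of innocent patterns, such as a dispatcher or a busy receptionist, would look the same.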
Cellular-tower location data can help place criminals at the scene if they are using their phones just before they commit a robbery, murder or attack.
“Every day, law enforcement officers are using this data to place suspects at the scene of murders and other crimes,” said Weinstein, now a partner at Steptoe & Johnson.
Data about a communication may be just as revealing as the content itself, said Christopher Soghoian, principal technologist with the American Civil Liberties Union.
“If you call an abortion clinic and make an appointment, the fact that you’re making the appointment is far more sensitive than what time your appointment is,” he said. “If you’re calling Alcoholics Anonymous or a suicide counselor, what you’re saying will certainly be sensitive. But the fact that you’re calling Al Anon or a suicide counselor is extremely sensitive, too.”
Under U.S. law, it’s easier for the government to obtain metadata than content. Authorities generally need to show probable cause for a wiretap or intercept of communications.
Telephone records, but not e-mail metadata, can be obtained by law enforcement agencies without any kind of court order.
Weitzner said metadata is “arguably more revealing because it’s actually much easier to analyze the patterns in a large universe of metadata and correlate them with real-world events than it is to go through a semantic analysis of all of someone’s e-mail and all of someone’s telephone calls, if you could get that.
“Metadata is objective: I called you. You called me.”
Cellphone data helped Italian authorities identify the CIA agents who in 2003 abducted, in Milan, an Egyptian cleric suspected of terrorist involvement. The investigators pulled the records and identified the agents by their aliases, where they had stayed and whom they had called, including each other. Similarly, in 2011, Hezbollah identified a half-dozen CIA informants through analysis of their cellphone records and calling patterns.
Critical as metadata is, Weinstein said, it does not give you the subject’s words and thoughts. “Only the content,” he said, “will provide you with the evidence you need that the conversations are about terrorism or other crimes.”