The Washington Post
Democracy Dies in Darkness

Fact-checking Mark Zuckerberg’s testimony on Facebook and data collection

Facebook chief executive Mark Zuckerberg took nearly 600 questions on Capitol Hill, giving a rare window into his views on some of the thorniest issues online.

What kinds of data does Facebook collect about its users? Who owns that data? What does Facebook do with it? And how does Facebook keep it safe and private?

Testifying about these issues in the Senate on April 10 and the House on April 11, Zuckerberg chose his words carefully. He dodged some questions, referred others to his “team” and gave only partial answers to still others. The responses he did give were in some cases misleading because they omitted relevant information that could cast Facebook in an unflattering light.

We found some of the missing links. For this roundup, as is our custom, we won’t be awarding Pinocchios.

“We have a ‘download your information’ tool. We’ve had it for years. You can go to it in your settings and download all of the content that you have on Facebook.”

Lawmakers wanted to know how much data Facebook collects about its users, and whether users have a way to see all the data Facebook gathers about them.

We’re really talking about two kinds of data here. The first kind is “content” — the photos, videos, status updates, news articles and other baubles that Facebook users are posting for their friends, or the whole world, to see.

The second kind of data goes behind the curtain. It includes users’ location information, their Web browsing history, and the inferences that Facebook draws about them to tailor the kinds of ads they see. For example, Facebook might infer a user’s ethnicity and political affiliation and use those inferences to show more relevant ads.

Lawmakers often asked about this second kind of data, which Facebook keeps for ad-targeting and security purposes. Zuckerberg would often steer the conversation back to the first kind of data, user-generated content. He said users could download “all of the content” they had on Facebook with a special tool called “download your information.”

“Is there any other information that Facebook has obtained about me, whether Facebook collected it or obtained it from a third party, that would not be included in the download?” Rep. Jerry McNerney (D-Calif.) asked.

“Congressman, my understanding is that all of your information is included in your ‘download your information,’” Zuckerberg said.

One of our newsroom colleagues tried the “download your information” tool (DYI) and couldn’t find his detailed location data, Web-browsing history or the inferences Facebook drew about him to target ads. We asked Facebook about it and were told that the download tool currently “includes the personal data that people provide directly to Facebook” and that the company will be adding more items to the list.

“In the last year, we’ve started showing more information in DYI and regularly update the information it includes,” a Facebook representative said. “We also in late March announced new ways to view, access and manage information Facebook has about you. DYI will now include your comments, likes and reactions, as well as search history and location history. You can see these today in Activity Log, but they hadn’t been part of Download Your Information. We’ll add more information to this tool as we build new products and hear from people about what they’d find most useful.”

For the record, Facebook’s data policy says the company collects information about its users’ devices, such as “the operating system, hardware version, device settings, file and software names and types, battery and signal strength, and device identifiers.” None of that is mentioned on the list of data Facebook discloses to users.

Facebook also gathers data on “device locations, including specific geographic locations, such as through GPS, Bluetooth, or WiFi signals.” And it collects “connection information such as the name of your mobile operator or ISP, browser type, language and time zone, mobile phone number and IP address.” The list of data Facebook discloses to users mentions only a limited set of IP address information.

Facebook also gathers detailed data for advertisers, which isn’t mentioned on the list. “For example, we may … provide non-personally identifying demographic information (such as 25 year old female, in Madrid, who likes software engineering) to these partners to help them understand their audience or customers, but only after the advertiser has agreed to abide by our advertiser guidelines,” Facebook’s data policy says.

Describing what Facebook does with users’ browsing history, Zuckerberg said: “We only store them [Web logs] temporarily, and we convert the Web logs into a set of ad interests, that you might be interested in those ads, and we put that in the ‘download your information’ instead, and you have complete control over that. So I just wanted to clarify that one for the record.”

Translation: Users have a way to see and customize some of the products and interests that Facebook has associated with them for ad-targeting purposes. But if there’s a way to see the underlying data that allowed Facebook to form these inferences in the first place, we couldn’t find it, Zuckerberg did not say, and Facebook did not directly answer our question.

“The ‘download your data’ tool is not as comprehensive as Zuckerberg’s testimony implies,” said Gennie Gebhart, a researcher at the Electronic Frontier Foundation. She acknowledged that users can customize some of the ad-targeting assumptions Facebook keeps for them. But she added, “It is reasonable to expect that there are other inferences that Facebook might have that are not available to you.”

For example, for one week in January 2012, Facebook conducted an experiment on 689,003 randomly selected users that altered the quantity of positive or negative posts in their news feeds. Researchers then monitored those users’ Facebook activities to see how the change affected their moods. The only way they could have conducted this experiment was by collecting data on users’ moods, Gebhart said.

“All the data that you put in, all the content that you share on Facebook, is yours. You control how it’s used. You can remove it at any time. You can get rid of your account and get rid of all of it at once.”

Who owns all this data? Is it the user or Facebook, or both?

Zuckerberg was asked a few variations on this question and his answer was mostly consistent: the user. The user has all the control. Facebook is just along for the ride.

Some lawmakers weren’t buying this. “Your business model is to monetize user information to maximize profit over privacy,” Sen. Richard Blumenthal (D-Conn.) told Zuckerberg. (The Washington Post’s technology columnist, Geoffrey A. Fowler, wasn’t buying it, either.)

Others asked whether users should be able to reap the profit from their data instead of Facebook. “You talk about ‘the user owns the data,’ you know, there are a number — have been a number of proposals of having that data stay with the user and allow the user to monetize it themselves,” said Sen. Ron Johnson (R-Wis.).

This issue is wrapped in legalese. Users technically own the content they post online. “You own all of the content and information you post on Facebook,” the company’s user agreement says. However, this doesn’t appear to cover the separate set of ad-targeting data Facebook generates.

When it comes to user-generated content such as photos and videos, Facebook’s terms of service spell out conditions that effectively give the company joint ownership.

“For content that is covered by intellectual property rights, like photos and videos (IP content), you specifically give us the following permission, subject to your privacy and application settings: you grant us a non-exclusive, transferable, sub-licensable, royalty-free, worldwide license to use any IP content that you post on or in connection with Facebook (IP License). This IP License ends when you delete your IP content or your account unless your content has been shared with others, and they have not deleted it.”

What happens when a user deletes something? “When you delete IP content, it is deleted in a manner similar to emptying the recycle bin on a computer,” Facebook’s user agreement says. “However, you understand that removed content may persist in backup copies for a reasonable period of time (but will not be available to others).” A Facebook representative added that “it may take up to 90 days to delete all of your account information from our backup systems.”

Zuckerberg later expanded on his answer: “You own it in the sense that you chose to put it there, you could take it down anytime, and you completely control the terms under which it’s used. When you put it on Facebook, you are granting us a license to be able to show it to other people. I mean, that’s necessary in order for the service to operate.”

“Two weeks ago, we found out that a feature that lets you look someone up by their phone number and email was abused. This feature is useful in cases where people have the same name, but it was abused to link people’s public Facebook information to a phone number they already had. When we found out about the abuse, we shut this feature down.”

We’re quoting from Zuckerberg’s written testimony, although this issue also came up in the hearings. Facebook announced on April 4 that it was restricting access to some kinds of user data. This included shutting down a tool that could search for people based on their phone number or email address.

“This has been especially useful for finding your friends in languages which take more effort to type out a full name, or where many people have the same name,” Facebook chief technology officer Mike Schroepfer wrote in a blog post. “In Bangladesh, for example, this feature makes up 7 percent of all searches. However, malicious actors have also abused these features to scrape public profile information by submitting phone numbers or email addresses they already have through search and account recovery.”

Zuckerberg said Facebook learned that this tool had been abused only two weeks before his testimony. But there is ample evidence that users and researchers had been calling Facebook’s attention to the system vulnerability at the root of the issue since at least 2013.

Brandon Copley, the chief executive of Giftnix, said he was threatened with legal action when he used this Facebook search tool in 2013 to demonstrate how it could be used to scrape data, Wired reported. A group of researchers raised the issue again in 2015. Facebook reportedly told them it did not consider this a security vulnerability.

Rep. Ben Ray Luján (D-N.M.) asked Zuckerberg: “Yes or no: In 2013, Brandon Copley, the CEO of Giftnix, demonstrated that this feature could easily be used to gather information at scale. Well, the answer to that question is yes. Yes or no: This issue of scraping data was again raised in 2015 by a cybersecurity researcher, correct?”

Zuckerberg said, “Congressman, I’m not specifically familiar with that. The feature that we identified — I think it was a few weeks ago, or a couple weeks ago, at this point — was a search feature that allowed people to look up some information that people had publicly shared on their profiles.”

Asked about this issue, a Facebook representative said: “Mark was referring to specific incidents that we identified two weeks ago. In the past, we have been aware of scraping as an industry issue, and have dealt with specific bad actors in the past.”

“In the Kogan case, people signed into that app expecting to share the data with Kogan, and then he turned around and, in violation of our policies and in violation of people’s expectations, sold it to a third-party firm.”

A Cambridge University researcher named Aleksandr Kogan created a personality-quiz app for Facebook in 2014. He was able to collect data on up to 87 million Facebook users through that app, according to Facebook’s most recent estimate, and then sold the data to Cambridge Analytica, a political data firm hired by Donald Trump’s presidential campaign in 2016.

The vast majority of these 87 million users had not even downloaded Kogan’s app; their data was exposed merely because they were friends with the 270,000 or so Facebook users who did download it.

Zuckerberg said Kogan sold the data “in violation of our policies.” But it’s not as clear-cut as he says. Facebook had two conflicting sets of rules in play.

Facebook’s terms of service for developers prohibited what Kogan did: “Only use friend data (including friends list) in the person’s experience in your app. Don’t sell, license, or purchase any data obtained from us or our services. Don’t transfer any data that you receive from us (including anonymous, aggregate, or derived data) to any ad network, data broker or other advertising or monetization-related service.”

But Facebook’s terms of service for users said another thing, stipulating that when a user granted access to a third-party app, “your agreement with that application will control how the application can use, store and transfer that content and information.” In turn, the terms of service for Kogan’s personality-quiz app asked users for the right to sell, transfer, store and license their data in perpetuity and for any use.

“I want to show you the terms of service that Aleksandr Kogan provided to Facebook and note for you that, in fact, Facebook was on notice that he could sell that user information,” Blumenthal told Zuckerberg. The Facebook chief said “it certainly appears that we should have been aware that this app developer submitted a term that was in conflict with the rules of the platform” but added that no one was fired over this lapse.

A Facebook representative said third-party app developers could not “override, modify or supersede” Facebook’s policies through their own privacy disclosures.

“Anyone can turn off and opt out of any data collection for ads, whether they use our services or not.”

In some cases, Facebook collects data about its users even when they’re not on Facebook’s website or apps. For example, third-party websites that feature Facebook’s “Like” button send some user data back to Facebook. Although the process is slightly different, “Share” buttons on websites and other tools essentially do the same thing.

“We receive information about you and your activities on and off Facebook from third-party partners, such as information from a partner when we jointly offer services or from an advertiser about your experiences or interactions with them,” Facebook’s data policy says.

The upshot here is that Facebook hoovers up a bunch of data about people who aren’t Facebook users, creating what are known as “shadow profiles.”

Zuckerberg said these Facebook-abnegators may “turn off and opt out of any data collection for ads.” But how would that even work, since they’re not Facebook users? Do they send postcards?

One option: “You turn it off by making a Facebook account,” said Gebhart, of the Electronic Frontier Foundation. (This, of course, is counterintuitive and defeats the purpose for those who want to avoid Facebook.)

Facebook’s website gives an email address for non-users to request their data, and there’s also an online form and a separate page with instructions on how to use third-party tools to turn off online ads or tracking. This isn’t really “opting out” in the sense that Facebook provides the option itself, and it’s not clear how these “shadow profile” individuals could have Facebook delete their data without first creating an account.

An ACLU technologist, Daniel Kahn Gillmor, wrote that Facebook compiled a lot of data about him, including “which news articles I was reading, my dietary preferences, and my hobbies,” even though he never signed up for a Facebook account.


Share the Facts

Washington Post rating: Not the whole story
“We have a ‘download your information’ tool. We’ve had it for years. You can go to it in your settings and download all of the content that you have on Facebook.”
In a House hearing, Wednesday, April 11, 2018

Washington Post rating: Not the whole story
“All the data that you put in, all the content that you share on Facebook, is yours.”
In a House hearing, Wednesday, April 11, 2018

Washington Post rating: Not the whole story
“In the [Aleksandr] Kogan case, people signed into that app expecting to share the data with Kogan, and then he turned around and, in violation of our policies and in violation of people’s expectations, sold it to a third-party firm.”
In a House hearing, Wednesday, April 11, 2018