The Washington Post

Dirty data: Why the ‘4 million public comments’ on net neutrality might not be what they seem

President Obama shakes hands with Tom Wheeler, then his nominee to chair the Federal Communications Commission, at the White House last year. (AP Photo/Jacquelyn Martin, File)

When President Obama last month came out in favor of a far-reaching plan to ensure that all bits of content on the Internet are treated equally, he cited the unprecedented number of comments from the American public that had poured into the nation's main telecommunications agency in apparent support of that approach.

"I am asking the Federal Communications Commission to answer the call," Obama said in his debate-shifting statement, "of almost 4 million public comments, and implement the strongest possible rules to protect net neutrality."

More than a month later, a fight is roiling telecom policy circles this week over whether nearly 4 million comments were, in fact, sent to the FCC, and how many of them actually supported those "strongest possible" rules.

What's still reasonably certain is that President Obama indeed exists.

Sparking this dust-up was an analysis by the nonpartisan Sunlight Foundation of the comment cache released by the FCC. Sunlight concluded not only that the collection held several hundred thousand fewer comments than the FCC had announced, but also that most of the comments it has now analyzed from the second, "reply comment," round weighed in against an Obama-style plan.

Those lined up against net neutrality regulation, particularly a group called American Commitment, took a victory lap. "We were engaging in the same sort of 'clicktivism' that we saw from the liberal advocacy groups," said Phil Kerpen, president of American Commitment, "because the views of the American people weren't being reflected in the first round."

"It's clear that Phil and his organization weren't present in first round and showed up in the second round," acknowledged Tim Karr, senior director of strategy at Free Press, a group that has pushed aggressively for the sort of strict net neutrality rules advocated by Obama.

That said, some of the pro-regulation groups, including Karr's Free Press, Fight for the Future and Demand Progress, cried foul on the specifics of Sunlight's analysis. Using the same data download that Sunlight had used, those organizations searched for some of the comments they knew their supporters had submitted, and noticed that some were missing.

The Sunlight Foundation responded late Wednesday with a clarification of its analysis, calling on the FCC to explain why 1.1 million comments now seemed to be missing from the data download. The pro-net neutrality coalition responded by insisting that the Sunlight analysis had misinterpreted even the comments that actually were contained within the cache.

There are additional wrinkles to the story involving e-mail duplication, bounce-backs, signatures vs. comments, and data noise that will only drive you to search for a bottle of bourbon or a soothing bar of chocolate.

The bigger question is how, exactly, does such a data mess happen in the year 2014?

In short, by relying upon a system built for the 1990s. The Electronic Comment Filing System was launched in the Clinton administration as a way of using the Internet to open up the FCC's important decision-making process. But that aged apparatus couldn't keep up with the numbers game that the net neutrality debate turned into this summer and fall.

And so, during the final days of the net neutrality public comment phase, the FCC scrambled to offer alternatives for letting the public file comments -- through e-mail submissions that were turned into PDFs and via the direct uploading of spreadsheets full of comments. As far as we know, every comment was captured. But they were a mess of shapes and sizes and formats.

Imagine, if you will, trying to count a few million cows contained within one giant pen. If you're clever about this sort of thing, it might be possible to come up with a formula for calculating how many there are.

But now imagine that you have to come up with a way of counting a couple million cows, a million butterflies, several hundred thousand blades of grass, and a few hundred clumps of mud. That becomes a lot less automated and a lot more difficult.

When that count is there to serve as a rough estimate of public opinion, that's one thing. But it's another thing when it becomes the basis of public policy on an issue where there is strong disagreement.

That said, the comment debacle has managed to achieve consensus in a debate where there has been so little: Just about everyone involved -- the pro-neutrality folks, the anti-regulation people, even the Federal Communications Commission -- now agrees that as a mechanism for capturing popular sentiment on hugely important issues, this is a system in serious need of an upgrade.

The FCC has asked Congress for extra money to rebuild its public commenting system. So far, Congress has declined.
