The Washington Post

Uber’s data could be a treasure trove for cities. But they’re wasting the chance to get it.

(REUTERS/Kai Pfaffenbach)

The District of Columbia passed new legislation this week legalizing services like UberX and Lyft that allow non-professional drivers with their own personal cars to compete with traditional taxis. In a sign that Uber got pretty much what it wanted out of the city, the company then held a press call Wednesday afternoon to celebrate.

"I think you’re seeing some momentum here," said David Plouffe, Uber's brand new senior vice president of policy and strategy, citing the District's legislation as a model for the rest of the country. "Maybe even Uber-mentum, if you want to be cute."

The city's new law, opposed by the taxi industry, requires Uber and its competitors to register with the D.C. Taxicab Commission and provide $1 million primary insurance coverage to drivers from the moment they accept a ride to the time they drop off a passenger. Drivers will also have to pass criminal background checks, and their cars will have to undergo annual inspections.

Uber no longer disputes any of these requirements — insurance, background checks, vehicle safety — as anti-innovation or unnecessary. But the District did fail to get out of Uber one thing that the company is still reluctant to give: access to its data.

Such data could be tremendously valuable to local governments, but one city after the next has been leaving it on the table. Uber amasses vast amounts of information on when and where it collects passengers and where it takes them. Anonymized versions of this data — designed to protect the privacy of individual drivers and riders — would help cities verify that Uber drivers aren't discriminating against certain neighborhoods or disabled passengers, that Uber is actually weeding out drivers who do, that the company is truly serving the public in exchange for the public's confidence in it.

This is precisely the kind of data cities already demand of taxicabs, and if we had it for UberX and Lyft, too, it would be a lot easier to ensure consumer protection.

This stuff would also be a boon for transportation planners, who spend a lot of time (and money) trying to understand the travel patterns of residents that are already passively captured by transportation apps. Uber is building a sophisticated picture of how people move around many cities — where the demand is, where people want to go, when those trips take place down to the minute. This larger picture will ultimately help Uber build its new, more complex carpooling tool. But it could also help cities plan infrastructure, manage traffic flow, and understand commuters better.

Add to all of this some anonymized payment data, and the public would have a much better idea of what kind of jobs Uber is really creating, and how it's adjusting "surge" prices during events like emergencies.

Right now, the data we do get typically comes from Uber's own occasional in-house analyses. Consumers and public officials should be skeptical of these numbers, not because Uber is a particularly dishonest company, but because selective data-sharing can never be truly transparent.

David Alpert, editor of the blog Greater Greater Washington, made a great case for all of this in the Post last month, as Washington was still considering its regulation. When the United States deregulated the airline industry in the late 1970s, it required of private airlines something very similar to what cities should require of these ground-transportation companies today:

The federal government stopped prescribing airlines’ exact routes and fares but, in addition to continuing to ensure safety, it collected data from the airlines about their routes, schedules, fares, how full the planes are, on-time performance and much more. Government officials now crunch these numbers and, more important, so do travel journalists, bloggers, watchdogs and advocates. If an airline starts doing shady things, people will know.

Cities have one golden chance to ask for this data, to set up a permanent structure where companies like Uber would hand it over regularly. That's when local governments have the most leverage over Uber: when they're deciding whether and how to legalize it. The ask must be specific, and spelled out in the regulation itself: Perhaps cities require anonymized origin-destination and time data for every trip (taken or canceled), without information on the driver or passenger, down to some larger geography like Census blocks that would preserve consumer privacy. Aggregated data about neighborhoods instead of trips wouldn't be truly transparent. And vague statutory requirements for "data sharing" will only allow these companies to shrug off the specifics later.
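To make that ask concrete, here is a minimal sketch in Python of what a trip-level, anonymized record might look like. Everything in it is an assumption for illustration: the field names, the `RawTrip` and `PublicTrip` types, and the `zone_for` helper are hypothetical and do not describe Uber's systems or any city's actual requirement. In particular, `zone_for` snaps coordinates to a coarse grid cell as a stand-in for a real Census block lookup, and times are truncated to the minute as one possible choice of precision.

```python
import math
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class RawTrip:
    """Hypothetical internal record; field names are assumptions."""
    rider_id: str
    driver_id: str
    pickup_time: datetime
    dropoff_time: Optional[datetime]  # None if the trip was canceled
    pickup_lat: float
    pickup_lon: float
    dropoff_lat: float
    dropoff_lon: float
    canceled: bool


@dataclass(frozen=True)
class PublicTrip:
    """What a city might receive: no rider or driver identifiers."""
    pickup_time: datetime
    dropoff_time: Optional[datetime]
    pickup_zone: str
    dropoff_zone: str
    canceled: bool


def zone_for(lat: float, lon: float, cell_degrees: float = 0.01) -> str:
    """Stand-in for a Census block lookup: snap a point to a grid cell."""
    row = math.floor(lat / cell_degrees)
    col = math.floor(lon / cell_degrees)
    return f"{row}:{col}"


def to_minute(t: Optional[datetime]) -> Optional[datetime]:
    """Drop seconds and microseconds; pass None through unchanged."""
    if t is None:
        return None
    return t.replace(second=0, microsecond=0)


def anonymize(trip: RawTrip) -> PublicTrip:
    """Keep origin, destination and time; drop who rode and who drove."""
    return PublicTrip(
        pickup_time=to_minute(trip.pickup_time),
        dropoff_time=to_minute(trip.dropoff_time),
        pickup_zone=zone_for(trip.pickup_lat, trip.pickup_lon),
        dropoff_zone=zone_for(trip.dropoff_lat, trip.dropoff_lon),
        canceled=trip.canceled,
    )
```

The point of the sketch is the shape of the output, not the method: each trip stays a separate row, so the data is not aggregated by neighborhood, but the row carries no identifier for the driver or the passenger. Coarsening location and time this way does not by itself guarantee privacy; how coarse the geography needs to be is exactly the kind of specific a regulation would have to spell out.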

Uber will resist this idea, because its data is its most prized possession. As a quasi-transportation company, it doesn't actually own any cars, or infrastructure, or engineering plans, or vehicle technology. But what it does have that's made it a multi-billion-dollar company is this very fine-grained information about riders and drivers, and the systems to leverage it. Uber considers this data proprietary and private. And it's true that some of it is.

"I don’t think there are many people out there who want it spread over the Internet what time they were picked up or where," Plouffe says.

He cites the embarrassing scenario where New York City recently released taxicab usage data under a Freedom of Information request without properly anonymizing it first. As a result, you may have read about the private transportation habits of Bradley Cooper and Ashlee Simpson.

For all its sophistication, though, Uber ought to be able to figure out how to give cities the data they need while stripping out personally identifiable information. Plouffe recognizes the public value of Uber's data. The company worked with the city of New York last week to identify the Uber driver who unknowingly drove a passenger coming down with Ebola across town.

"If that person had been in a taxicab, and paid with cash, and had no receipt," Plouffe says, "there would have been a citywide manhunt for cab drivers described by physical characteristics of that driver."

Because Uber keeps a data trail of every trip, it could immediately identify the driver. It was an "interesting moment," Plouffe adds, when it became clear that Uber has this other value (the company did not give city officials the names of other passengers who'd been in the same Uber car, he said, since health officials said that wasn't necessary).

The greatest public value in the company's data, though, won't simply come from complying with authorities in an emergency, but from revealing how well it serves the public every day. If cities don't demand this now, they will eventually wind up with a growing transportation sector on which the public depends, but that operates entirely out of public view.