A driver rides hands-free in a Tesla Motors Inc. Model S vehicle in New York. (Christopher Goodney/Bloomberg)

Since a Tesla car on autopilot had a fatal crash in May, killing the driver, we’ve been hearing debates about whether the company’s “self-driving” cars are safe. On October 4, California’s regulators asked Tesla to stop advertising its cars using the word “autopilot” until and unless the car could actually drive itself, without human intervention. (Right now, its autopilot system requires drivers to stay awake and attentive.)

Tesla’s response to the California Department of Motor Vehicles was just a little bit testy. Here’s what the company said, according to the San Francisco Chronicle:

Tesla is reviewing the draft regulations and will provide input to the DMV as appropriate. Autopilot makes driving safer and less stressful, and we have always been clear that it does not make a car autonomous any more than its namesake makes an aircraft autonomous.

It’s just the latest verbal skirmish between the company and the world. In July, the company’s founder and CEO Elon Musk wrote an email to Fortune Magazine claiming that if the car’s autopilot system were more widely used, it would already be saving hundreds of thousands of lives:

Indeed, if anyone bothered to do the math (obviously, you did not) they would realize that of the over 1M auto deaths per year worldwide, approximately half a million people would have been saved if the Tesla autopilot was universally available. Please, take 5 mins and do the bloody math before you write an article that misleads the public.

Musk is not the only one making this argument. Numerous journalists for publications like Vanity Fair or Vox’s The Verge have argued that Tesla’s Autopilot is safer than traditional driving. Chris Ziegler wrote something at The Verge that is fairly representative:

Tesla says that Autopilot has driven 130 million miles in owners’ vehicles, now with one fatality; that compares to a U.S. average of one vehicular fatality every 94 million miles. So yes, it is statistically doing better than average, but there’s an expectation — however fair or unfair — that the computers can and should be perfect.

But is Tesla’s Autopilot “better than average?” Has it prevented more accidents than it caused? That’s definitely possible. Features like adaptive cruise control and lane departure warning will almost certainly make driving much safer at some point in the future.

Could it do so immediately? That question is far more complicated. While Musk’s conclusion may or may not be sound, the argument he and his supporters use to back it up clearly is not statistically sound.

First, consider sample size.

For rare events such as fatal car crashes, the amount of data required to come up with an accurate estimate can be enormous. A RAND Corporation paper issued in April says, “Autonomous vehicles would have to be driven hundreds of millions of miles and sometimes hundreds of billions of miles to demonstrate their reliability in terms of fatalities and injuries.”

In other words, those 130 million miles are a very small sample. Statistically speaking, you can no more use them to declare Autopilot is safer than traditional driving than you could use the results of one patient to declare a drug effective.
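To put a rough number on that uncertainty, here is a minimal sketch in Python. It assumes fatal crashes can be treated as a Poisson process (a common simplification, not something Tesla or RAND states here) and computes an exact 95 percent interval for the Autopilot fatality rate from the two figures quoted above: one death in 130 million miles, against a U.S. average of one per 94 million. The function names are illustrative only.

```python
import math

MILES = 130e6              # Autopilot miles cited in the article
FATALITIES = 1             # fatal crashes observed over those miles
US_MILES_PER_DEATH = 94e6  # U.S. average cited in the article


def poisson_cdf(k, mu):
    """Return P(X <= k) for a Poisson variable with mean mu."""
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def solve_decreasing(f, target, lo=0.0, hi=100.0, iters=200):
    """Bisection for f(mu) == target, where f decreases as mu grows."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # mu is still too small
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_interval(k, alpha=0.05):
    """Exact two-sided interval for a Poisson mean, given k observed events.

    Assumption: the search range (0 to 100) is only suitable for small k.
    """
    upper = solve_decreasing(lambda mu: poisson_cdf(k, mu), alpha / 2)
    if k == 0:
        return 0.0, upper
    lower = solve_decreasing(lambda mu: poisson_cdf(k - 1, mu), 1 - alpha / 2)
    return lower, upper


lower, upper = exact_poisson_interval(FATALITIES)

# Convert expected counts into fatalities per 100 million miles.
rate_low = lower / MILES * 1e8
rate_high = upper / MILES * 1e8
us_rate = 1e8 / US_MILES_PER_DEATH

print(f"Autopilot 95% interval: {rate_low:.2f} to {rate_high:.2f} per 100M miles")
print(f"U.S. average: {us_rate:.2f} per 100M miles")
```

Under these assumptions the interval runs from roughly 0.02 to 4.3 deaths per 100 million miles, which comfortably contains the U.S. average of about 1.06. A single fatality is consistent with Autopilot being far safer than human drivers, or several times more dangerous.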

Then consider whether the sample is representative. 

And size is only one consideration. Even with billions of observations, you can still get a biased estimate if you have an unrepresentative sample.

Statistics books are filled with examples of this kind of apples-to-oranges fallacy, such as various (usually tongue-in-cheek) attempts to argue that you are safer in a combat zone than in civilian life, based on comparisons of mortality rates of soldiers and civilians. The flaw in the argument is that, combat injuries aside, soldiers tend to be young and physically fit, with almost no serious pre-existing medical conditions.

Just as soldiers are an unrepresentative sample of the population’s health, those 130 million miles may be an unrepresentative sample of driving in general.

First, we have what statisticians call self-selection issues. The people who chose to own Teslas were not representative of all drivers. Those drivers chose to engage Autopilot at times that were not necessarily representative of typical driving conditions — particularly if drivers were following Tesla’s own warning that

Traffic-Aware Cruise Control is particularly unlikely to operate as intended [when] The road has sharp curves. Visibility is poor (due to heavy rain, snow, fog, etc.). Bright light (oncoming headlights or direct sunlight) is interfering with the camera’s view.

The MIT Technology Review’s Tom Simonite spoke with experts who strongly objected to generalizing from this sample, and wrote:

“It has no meaning,” says Alain Kornhauser, a Princeton professor and director of the university’s transportation program, of Tesla’s comparison of U.S.-wide statistics with data collected from its own cars. Autopilot is designed to be used only for highway driving, and may well make that safer, but standard traffic safety statistics include a much broader range of driving conditions, he says.

Tesla’s comparisons are also undermined by the fact that its expensive, relatively large vehicles are much safer in a crash than most vehicles on the road, says Bryant Walker Smith, an assistant professor at the University of South Carolina. He describes comparisons of the rate of accidents by Autopilot with population-wide statistics as “ludicrous on their face.”
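Kornhauser’s objection can be illustrated with a few lines of arithmetic. The sketch below uses invented fatality rates and invented mileage shares, chosen only to show the mechanism; none of these numbers comes from Tesla, the experts quoted, or any traffic safety agency. It gives the hypothetical system exactly the same per-mile risk as human drivers on each type of road, then compares headline rates.

```python
# HYPOTHETICAL fatality rates per 100 million miles, by road type.
# Invented for illustration; the same rates apply to humans and the system.
RATE_PER_100M = {"highway": 0.6, "other": 1.5}

# HYPOTHETICAL shares of total miles driven on each road type.
ALL_DRIVERS_MIX = {"highway": 0.3, "other": 0.7}
AUTOPILOT_MIX = {"highway": 1.0, "other": 0.0}  # engaged on highways only


def blended_rate(mile_shares):
    """Overall fatality rate for a given mix of road types."""
    return sum(RATE_PER_100M[road] * share for road, share in mile_shares.items())


population_rate = blended_rate(ALL_DRIVERS_MIX)
autopilot_rate = blended_rate(AUTOPILOT_MIX)

print(f"All drivers, all roads: {population_rate:.2f} per 100M miles")
print(f"System, highway only:   {autopilot_rate:.2f} per 100M miles")
```

In this made-up scenario the system appears to cut the fatality rate roughly in half, even though by construction it is no safer than a human on any given road. The whole gap comes from where the miles were driven.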

Many journalists are missing the statistical context

There is a much deeper issue here than just another case of bad statistics in a news story. Since well before the accident, transportation researchers, statisticians, and traffic safety experts have been pointing out that the mile-to-mile comparisons reported by companies like Tesla and Google are not valid and that the data collected did not support the conclusions the companies were drawing. Most journalists have failed to report this essential context.

Journalists have never been more eager to throw statistics at their readers. But data without proper statistical context can mislead as often as it informs. Along similar lines, stories on science and technology are fatally incomplete unless they put the new findings in the larger context of existing research and let the readers know the consensus opinions of scientists in that field.

There is more to doing the math than just reporting the numbers.

Mark Palko is a Los Angeles-based statistician and writer. Find him at West Coast Stat Views and the mathematics education site You Do the Math.