By Jordan Ellenberg
Saturday, May 1, 2010; A15
Starting today, thousands of census workers will scour the country, town by town and block by block, trying to identify which addresses have residents and how many they have. The workers' goal: to combine these numbers into a precise reckoning of the American population. As always, they will fail.
They'll get close, sure. But by the Census Bureau's best estimate, the 2000 Census counted more than 5 million people twice and millions more not at all. Such errors crop up in every census, demonstrated most vividly in 1940, when nearly a half-million more men registered for the draft than officially existed. And those errors aren't demographically uniform; the bureau estimated that the 2000 Census undercounted the black population by about 600,000 while bumping up the number of whites by more than 2 million.
How did it know? Because it used a method called statistical sampling to assess the accuracy of its findings, recounting a small selection of addresses after the fact and checking how well the two enumerations agreed. For decades, statisticians inside and outside the census have lobbied, with no success, to adjust official counts to reflect the information gleaned from statistical sampling. The bureau is stuck in the position of being required to check its work but forbidden to correct it.
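The logic of comparing two independent counts can be sketched in a few lines. This is only a toy illustration of the "dual-system" (capture-recapture) idea; the bureau's actual coverage-measurement methodology is far more elaborate, and the function name and numbers below are invented for the example.

```python
# A minimal sketch of dual-system estimation: count a population twice,
# independently, and use the overlap between the two lists to estimate
# how many people both counts missed. (Illustrative only -- not the
# Census Bureau's actual procedure.)

def dual_system_estimate(first_count: int, second_count: int, in_both: int) -> float:
    """Lincoln-Petersen estimator: if the two counts are independent,
    the true total is approximately (N1 * N2) / (number found in both)."""
    if in_both == 0:
        raise ValueError("no overlap between the two counts")
    return first_count * second_count / in_both

# Suppose the census finds 900 residents on a block, a follow-up recount
# finds 880, and 850 people appear on both lists:
estimate = dual_system_estimate(900, 880, 850)
print(round(estimate))  # 932 -- larger than either raw count alone
```

The point of the sketch is the one the column makes: the overlap between the two enumerations tells you not just that people were missed, but roughly how many.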
Census adjustment has long been a political flash point because the census has winners and losers; an undercount in New York and Chicago could mean fewer members of Congress from blue cities and more from red exurbs. But resistance to adjustment is only partially driven by political interest. It also represents a worrisome mathematical Luddism; adjustment opponents depict statistical estimates as hunches dressed up in fancy mathematical clothes or even plots designed to hijack the census for political ends.
Among the opponents is Sen. Judd Gregg (R-N.H.), who withdrew as the Obama administration's nominee to be commerce secretary in part because of disagreements over the census. Gregg's take on adjustment: "You take guesses based on what you think is the best political outcomes that you want, rather than counting people who actually exist."
But statistical adjustment is not a guess. It's a measurement of something that can't be directly observed, which deserves the same status as the data obtained from any other advanced scientific instrument. The images from an electron microscope may be blurry, and mediated through complex mathematics the layperson can't understand, but they're not guesses. They're facts.
The skepticism that people like Gregg apply to statistics, if applied to other sciences, would get them lumped with the anti-vaccinationists and the homeopaths. The difference? Everyone knows that physics, chemistry and biology have changed radically in the past hundred years; the tools available are fantastically more powerful and reliable than those of the past. Math, by contrast, is taught as if Isaac Newton supplied the final word on the subject.
But survey techniques of the kind the Census Bureau uses didn't exist before the 20th century, and only recently have they been refined enough to improve the accuracy of the census. Justice Clarence Thomas, in a 2002 opinion on sampling, asked: If the Founders had meant the Census Bureau to use statistical sampling technology to improve its count, wouldn't math whiz Thomas Jefferson have used it in the first census? The question is nonsensical. Jefferson could no more have done so than he could have surveyed the population from space.
Note, too, that resistance to statistics is selective. You seldom hear calls for the abolition of data on the gross national product or the unemployment rate, though both are derived from statistical samples. Many who balk at using adjustment to ensure equal representation in Congress are comfortable relying on statistical arguments about DNA when it comes to capital punishment.
Opponents say that statistical adjustment would violate the constitutional requirement of an "actual enumeration" of the population. Justice Antonin Scalia wrote in 1998 that the Constitution's language was "arguably incompatible . . . with gross statistical estimates." The sampling adjustment is indeed an estimate of the population -- but so is the unadjusted number, which estimates that the number of Americans missed is zero! To choose the raw count is to be wrong on purpose in order to avoid being wrong by accident.
In any event, the current system is anything but a plain count of heads. Since 1970, a mail-in survey has provided the majority of census data, so what we enumerate is not people but numbers written on a form, which are as likely to be fictional as any statistical estimate. Houses that don't return a form are visited by census workers, but when workers are unable to determine the number of occupants at an address, the census uses statistical properties of the surrounding area to make its best estimate of the ungatherable data. The Supreme Court explicitly upheld this process, called "imputation," against a 2002 legal challenge. (The constitutionality of adjustment by sampling has never been directly addressed by the court.)
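The imputation step described above can be illustrated with a deliberately simple sketch. The bureau's real imputation rules are more sophisticated than a block average; everything below, including the function name, is a hypothetical stand-in for the idea.

```python
# A toy illustration of count imputation: when the number of occupants at
# an address can't be determined, fill it in from the statistical
# properties of the surrounding block. (Illustrative only -- the Census
# Bureau's actual imputation procedure is more sophisticated.)

def impute_block(counts):
    """counts: occupant counts for a block, with None for unresolved
    addresses. Replaces each None with the block's typical known count."""
    known = [c for c in counts if c is not None]
    typical = round(sum(known) / len(known)) if known else 0
    return [typical if c is None else c for c in counts]

# Two addresses on this block never yielded a count:
block = [3, 2, None, 4, 1, None, 2]
print(impute_block(block))  # [3, 2, 2, 4, 1, 2, 2]
```

Even this crude version makes the column's point: the official count already rests on statistical inference, not on a literal tally of heads.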
There are some good reasons to be cautious about sampling adjustment. One could argue that the complex methodology would leave the census vulnerable to politically motivated tampering -- though it seems just as likely that, like any mechanism of double-check, adjustment would serve as a brake on tampering. But any such argument has to explain why the government isn't making its best attempt, using available technology, to meet the Constitution's plain demand that we count "the whole number" of Americans -- not just the ones who are easy to find. Instead, we've contented ourselves with an enumeration we know not to be actual. We shouldn't be asking the court whether it's constitutional to adjust the census; we should be asking whether it's constitutional not to.
The writer is an associate professor of mathematics at the University of Wisconsin -- Madison.