Ronald L. Wasserstein is executive director of the American Statistical Association.
In such a scenario, you might have more than a few questions for your doctor. Or you might put your favorite malpractice lawyer on alert.
The nation’s population is about to undergo its decennial checkup. This massive examination is one of our country’s most carefully structured operations. Why? Because the United States depends on the data the census generates to allocate billions of dollars of taxpayer money fairly and distribute the 435 legislative seats in the House of Representatives. For that reason, every aspect of the census is carefully checked and rechecked in advance so that the resulting data are accurate and reliable. Until now.
Accuracy and reliability appear to be of no consequence to Commerce Secretary Wilbur Ross, who ordered the Census Bureau to include an untested question in the census form that asks participants whether they are U.S. citizens.
In January, a federal judge ruled that Ross, who will appear before the House Oversight Committee on Thursday, “egregiously” violated federal law in adding the citizenship question. My organization, the American Statistical Association, the world’s leading professional community of statisticians, goes further: It is statistical malpractice.
Given sufficient time, a question such as this can be properly tested in accordance with scientific standards to be sure that it is not confusing, easily misinterpreted or taken out of context, and that it does not otherwise create problems for census respondents. What’s more, the Paperwork Reduction Act and the standards derived from it demand that such questions be well-tested in advance.
You don’t have to be a statistician to know that a poorly worded question leads to an unreliable answer. Just ask a child if they have cleaned their room recently. They often have different understandings of “cleaned” and “recently” than you. Now consider citizenship — fraught with emotion and complicated by language barriers — not to mention levels of legality and status. It is essential to get it right. That happens through testing, and testing takes time.
A second troubling issue is that of nonresponse. The Census Bureau must count everyone, so those who do not respond to the initial outreach by mail must be contacted personally. The cost of personal contact is extremely high. Driving up the cost of the decennial census to collect citizenship information, which is already collected annually through the American Community Survey, violates the law and its implementing guidelines.
Nonresponse further leads to undercount. Undercount means what it says — not counting every human being in the country, which is what the Constitution requires the census to do. Individuals may fear responding to the census because someone in their household or in their orbit is not a citizen, whether or not they are legally in the United States.
How many people will refuse to respond? The Census Bureau doesn’t know. Ross doesn’t know, either. That’s why there is a valid scientific process for developing and testing such a question.
Ross indeed argued this exact point, claiming in a 2018 memo on the citizenship question that no one has proved that response rates will be lowered. No one should have to. The government’s own policies, as embodied in the Information Quality Act, require that it act responsibly to ensure that data are accurate and reliable. Census data are far too vital to our democracy to be treated carelessly.
Every year, hundreds of millions of people receive physical examinations. Lives are saved because those examinations, which have been tested and validated before they are applied, are conducted in ways that allow the collection of accurate and relevant health information that can be compared over time.
Every decade, we count everyone living in our country. That count is increasingly complex, as our population grows, moves and changes. But the system for counting cannot be subject to an administration’s whims. Counting everyone is a science. Failing to pretest, and deliberately risking data quality, is statistical malpractice.