For all that President Obama and Education Secretary Arne Duncan talk about wanting to move beyond “bubble tests,” the high-stakes role of standardized testing in public education as a result of their policies can hardly be overstated. Here’s a new look at testing by Walt Gardner, who writes the Reality Check blog for Education Week. Gardner taught for 28 years in the Los Angeles Unified School District and was a lecturer in the UCLA Graduate School of Education. He uses his background to put educational issues in context for readers.
By Walt Gardner
When I was at UCLA working on my California teaching credential in the mid-1960s, I couldn’t understand the caveats about the use of standardized tests. It’s the remembrance of my naivety then that makes me tolerant about a similar misunderstanding on the part of most people today. In fact, if I hadn’t taught for 28 years in the Los Angeles Unified School District, I would in all likelihood share their view.
I thought of the change in my thinking because the Obama administration is urging states to evaluate teachers in part on their students’ standardized test scores. Consider the Chicago teachers strike. At its heart, it was about the weight given to standardized test scores in evaluating teachers.
As I read letters to the editor in numerous newspapers about the strike, it was apparent that the overwhelming majority saw no reason to doubt the ability of such tests to judge the effectiveness of teachers. In their minds, if teachers were doing their job well, their students’ standardized test scores would be high — period. (At least 30 states and the District of Columbia presently evaluate teachers in part this way.)
Trying to show readers the errors in their thinking is a Sisyphean task because it invariably involves getting into a technical discussion that is guaranteed to lose them. I’ll try to avoid doing so by asking a simple question: Why test at all? The answer is not as self-evident as it seems.
Testing is as much a check on teachers as it is on students. When teachers say they taught something well, they assume their students learned what they taught. But how do they know? Their instinct may be reliable, but then again it may not. The only way to confirm their instinct is to get feedback. It’s here that the real trouble begins.
Feedback can take many forms. But reformers today demand hard data. Enter standardized tests. They seem ideally suited for the purpose because they provide what Alfie Kohn calls “the simplicity of specificity.” But any test score has to be interpreted. Otherwise, it is virtually meaningless. No test has intrinsic validity. It’s only the inferences made that are valid or invalid. To be confident about inferences, therefore, it’s necessary to gather evidence to support the accuracy of the interpretation.
Unfortunately, ideology too often gets in the way. One of the ironies of relying on standardized tests to rate teachers is on display in Tennessee. It was at the University of Tennessee that William Sanders constructed the controversial value-added model, which has been in place in the state since 1992. Yet despite initial enthusiasm, Tennessee intends to make revisions to its evaluation system after data showed a “disconnect between test scores and teacher ratings.”
It’s not that standardized tests serve no purpose. On the contrary, they can be quite useful. For example, Finland uses the tests strictly for diagnostic purposes, and the results are not made public. I don’t attribute Finland’s reputation for having the world’s best schools exclusively to this practice, but it is one factor that cannot be dismissed out of hand.
In contrast, standardized tests in the United States are an obsession, as reflected in the move to administer them in kindergarten. So how should teachers be evaluated? I think the fairest way is to rely on multiple measures. These include ratings by principals, peers, parents and students. The weight given to each input can be negotiated. The emerging comprehensive picture would be far better than any standardized test currently in widespread use.
The only losers would be testing companies, which have had the field all to themselves for far too long. I doubt my recommendation will go further than this column because too much money is at stake. Nevertheless, it’s worth trying.