I have nothing but the highest respect for the authors of a remarkable new report, “Passing Muster: Evaluating Teacher Evaluation Systems,” just released by the Brown Center on Education Policy at the Brookings Institution.
They include Steven Glazerman of Mathematica Policy Research, Dan Goldhaber of the University of Washington, Susanna Loeb of Stanford University, Stephen Raudenbush of the University of Chicago, Douglas O. Staiger of Dartmouth College and Grover J. Whitehurst of Brookings.
They are among the smartest people in the country when it comes to figuring out how to measure what happens in the classroom and how to use those techniques to make our schools better. The member of the team I know best, Goldhaber, showed me personally how good he was at that when he was a member of the Alexandria, Va., school board, which I covered for The Post a decade ago.
But after reading their 36-page examination of how states and the federal government might best encourage sensible evaluation of our public school teachers, I am almost completely convinced that we are never going to get this right. They lay out the alternatives pretty clearly, but in the end what they recommend is more or less a muddle — hard to understand, impractical, impolitic and not in tune with how schools work.
I still think it is worth reading because it is such an intelligent — and at least at the beginning comprehensible — summary of where we are in research and policy on identifying the great teachers we might want to recognize and reward and the bad teachers we want to help or counsel on another line of work.
The authors say confidently that studies of value-added assessment of teachers — using student test score gains to measure teacher success — show that the data predict in a useful way which teachers will produce student test gains in the future and which will not. That is good to know, although I am willing to bet a significant sum that some experts will not accept that conclusion.
The authors of “Passing Muster” point out all the pitfalls of using value-added data, and suggest ways to combine it with other measures to create systems most could accept. They raise hopes for future research that will make principal evaluations of teachers or student surveys regular parts of the evaluator’s tool kit.
But eventually their many cautions and qualifications and definitions turn the piece into mush. Here is a sample:
“Specifically, the proportion of highly-effective teachers that will qualify for special treatment will depend on a) two critical values adopted by policymakers (what we call exceptionality and tolerance), b) three correlations calculated using teacher-level data for teachers within each district, and c) a count of the teachers in each district who are subject to the full evaluation system including value-added measures and a count of the number who are subject only to the non value-added components (e.g., because they teach in untested grades and subjects).”
I know what they are talking about. Some policymakers may even see a way to use this approach. But it will be lost on many parents, students and taxpayers. Many teachers will not like or trust this complicated way of distinguishing between good instructors and bad ones.
Please read the report and tell me where I am wrong. I like the people who are pushing for value-added assessment to be used in evaluating individual teachers. I wish them luck. But I don’t see how they are going to pull it off.
I prefer systems that evaluate schools, not individual teachers. Whole school assessments allow good principals to create a team atmosphere in which no one feels more favored than anyone else, and all are focused on helping kids.
I would rather depend on the evaluation decisions of a well-trained and carefully selected principal than rely on the complicated numbers that the authors of this report want to use in assessing teachers.
Maybe I just don’t get it. If so, please tell me why, and how this is going to work if we continue in the direction this report points us toward.