For three years, 50 percent of the evaluations of many D.C. public school teachers were based on students’ standardized test scores, a key part of the ground-breaking IMPACT assessment system introduced by Michelle Rhee.

Kaya Henderson, left, and Michelle Rhee (Ricky Carioti/WASHINGTON POST)

Sounds reasonable, right?

It isn’t.

When Rhee set the percentage at 50 percent back in 2009, after spending millions of dollars to create the assessment system, she based it on absolutely nothing grounded in research. The 35 percent that Henderson, her successor, is now lowering it to? Also based on no research.

In fact, there is no research to show that even 1 percent would be a valid way to assess teachers, because the standardized tests being used in a number of states, where officials followed Rhee’s example, are not designed to evaluate teachers. There are enough questions about how valid they are in evaluating students to make it unfair to use the scores in any high-stakes decisions for kids, much less teachers.

What will take the place of that lost 15 percent? Apparently other measures of student achievement, including student performance on final exams or early literacy tests. More evaluation based on testing.

Why did Henderson institute this and other changes in IMPACT? She said the revisions are in part a response to complaints that the system was too rigid and dependent on test scores. She told Brown: “I’m not stuck with what we thought was right in ’08, or too stubborn to ignore what we’ve learned over the last three years.”

There’s no reason to think Henderson doesn’t want to improve the evaluation system. She’s smart and certainly sees some of its flaws.

But it’s too bad she hasn’t read, or hasn’t taken to heart if she did read it, the comprehensive 2011 report by the National Research Council, the research arm of the National Academies (which include the National Academy of Sciences, the National Academy of Engineering and the Institute of Medicine), which says that standardized test scores are of limited value in determining causes of improvements in student performance.

“Looking at test scores should be only a first step – not an end point – in considering questions about student achievement, or even more broadly, about student learning,” it says.

There are other changes to IMPACT, too, in what is really the biggest revision since it was implemented. Though only some teachers were evaluated in part by test scores, all teachers have been evaluated in part by classroom observations — done by principals and master teachers.

At first, teachers were judged by five half-hour observations over the course of a year, and in each of those observations, they had to show 22 separate teaching competencies. These included tailoring instruction to at least three “learning styles,” and demonstrating that they were instilling student belief in success through “affirmation chants, poems and cheer.”

This was so off the wall that the 22 were whittled down to nine different teaching elements in 30 minutes. Now, under the new revisions, only four of the five observations will be included in the evaluation and the fifth will be strictly for feedback. Teachers with consistently high ratings will only get three formal observations.

Still, it is artificial to ask a teacher to prove nine different teaching competencies in the space of every 30 minutes; it’s not the way classrooms work. As a result, some D.C. teachers find out beforehand when they will be observed and keep a ready-made lesson on the shelf that will pass the IMPACT test.

Besides, it is bordering on ludicrous to judge a teacher based on two hours of teaching a year.

Ninety-eight teachers were just fired as a result of their IMPACT evaluations. Maybe all of them were ineffective teachers; maybe not. It's impossible to know with a flawed teacher evaluation system.

