By Bill Turque
Washington Post Staff Writer
Tuesday, April 7, 2009
While talks between D.C. Schools Chancellor Michelle A. Rhee and the Washington Teachers' Union remain stalemated over salary and job security issues, one critical question is not even on the bargaining table: how the District's educators will be evaluated.
For months, Rhee and her chief "human capital" assistant, Jason Kamras, have been working on an overhaul of the evaluation system that would expand the ways teachers are assessed. In addition to a system of classroom observations and conferences, it is likely to include methods to track how students' standardized test scores grow over time. Several major school systems, including those in Houston, Chicago and Milwaukee, have started limited use of this new "value-added" approach.
Rhee is under no obligation to bargain with the union on evaluations, though the union wants to see the issue on the table and said so in the contract proposal it delivered a few weeks ago. Congress gave the school system sole authority over the issue in the mid-1990s after the WTU refused to renegotiate the then-existing evaluation system with the District.
Instead, Rhee has invited teachers to a series of 20 focus groups over the next several weeks to ask for their input in shaping a new evaluation process. In the end, the union's only option is to file an unfair labor practice complaint with the D.C. Public Employee Relations Board if it believes that the evaluation plan will have a negative impact on teachers.
Neither side is happy with the current teacher evaluation system, which involves a series of classroom observations by principals, who often have neither the time nor the expertise in subject matter to render a fair judgment on a teacher's effectiveness. Nevertheless, the system has been used to put at least 150 teachers on the so-called 90-day plan, which places them on notice to improve their performance or face dismissal.
Rhee, who in her long-range plan for the school system says she intends to replace a significant number of the District's 3,500 teachers, referred questions on the issue to spokeswoman Dena Iverson, who said in a statement: "While the evaluation process is not subject to collective bargaining, we are committed to listening to our teachers and their ideas. The 20 focus groups over this next month are just the first of many opportunities for teachers to give us feedback as we begin to develop an evaluation tool that is fair to teachers and promotes student achievement."
Rhee has said that growth in test scores is not the only yardstick she intends to use in assessing teachers. Less than a third of the District's teachers are in grades or subject areas in which standardized tests are given. In a letter to teachers last month, Rhee promised a system with "multiple measures of performance," including the use of "impartial master teachers" -- an idea promoted by the parent union, the American Federation of Teachers -- to eliminate the possibility that personality conflicts or inexperience might taint an evaluation.
"You deserve to be evaluated fairly and responsibly," Rhee wrote.
Leaders of the WTU and the American Federation of Teachers are concerned that administrators will not convey the statistical complexities of the value-added model. Some experts say it poses technical risks that can make it an unreliable vehicle for judging a teacher's effectiveness.
WTU President George Parker said that he is open to some sort of value-added approach but that it is best developed in collaboration with union leadership.
"This is a very tricky business and requires a lot of input and expertise," Parker said. "It needs to be worked out with all of the available research the national union and experts across the country have."
Policymakers and educators have long debated how best to judge a teacher's contribution to student progress. Besides direct observation, some school districts look at portfolios of student work or rely on peer or parental assessments. The dramatic increase in annual standardized testing, triggered by passage of the federal No Child Left Behind law, has provided much more data on student achievement, but it has been largely used for static "snapshots" that show test scores relative to federal benchmarks.
Rhee says the current system does not adequately highlight students who may fall short of proficiency levels in reading or math but who still make significant strides over the course of a school year. She wants to use test data to render more sophisticated judgments about student growth and the effectiveness of individual teachers. To assist the District in developing the value-added model, Rhee has hired Thomas J. Kane, a professor at the Harvard Graduate School of Education and faculty director of its Project for Policy Innovation in Education, and she also has retained Mathematica, a research firm.
Value-added systems vary, but most use statistical modeling to project expected rates of test score growth in a given year for individual students, allowing for factors such as past performance and economic status.
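The general idea can be illustrated with a minimal sketch. It is not the District's model or any vendor's method; it assumes a plain linear regression of this year's score on last year's score and a low-income indicator, then averages each teacher's students' gaps between actual and projected scores. The function name `value_added` and all of its parameters are hypothetical.

```python
import numpy as np

def value_added(prior, current, low_income, teacher_ids):
    """Return {teacher: average gap between actual and projected scores}.

    Illustrative assumption: projected scores come from an ordinary
    least-squares fit on prior score and a 0/1 low-income flag.
    """
    prior = np.asarray(prior, dtype=float)
    current = np.asarray(current, dtype=float)
    low_income = np.asarray(low_income, dtype=float)
    teacher_ids = np.asarray(teacher_ids)

    # Design matrix: intercept, past performance, economic status.
    X = np.column_stack([np.ones_like(prior), prior, low_income])
    coef, *_ = np.linalg.lstsq(X, current, rcond=None)

    # Residual = actual score minus the score the model projected.
    residuals = current - X @ coef

    # A teacher's estimate is the mean residual of that teacher's students.
    return {
        t.item(): float(residuals[teacher_ids == t].mean())
        for t in np.unique(teacher_ids)
    }
```

A positive figure means a teacher's students scored above their projections on average, and a negative figure means below. Real systems use far more elaborate models and several years of data.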
But even experts who regard a value-added system as an improvement over current evaluation schemes caution that it comes with serious potential pitfalls. One is that the smaller the student sample, the more statistically unreliable the result. In other words, measuring test score growth across a school, or even a grade within a school, is more valid than looking at a single teacher and a class of 15 or 20 students.
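The sample-size concern follows from basic statistics: the uncertainty in an average shrinks with the square root of the number of students measured. A small sketch, assuming student growth scores vary with a standard deviation of 10 points (an invented figure used only for illustration; `standard_error` is a hypothetical helper):

```python
import math

def standard_error(score_sd, n_students):
    """Standard error of a mean growth score for n_students students."""
    return score_sd / math.sqrt(n_students)

ASSUMED_SD = 10.0  # illustrative assumption, not a figure from any district

one_class = standard_error(ASSUMED_SD, 20)    # a single class of 20
one_grade = standard_error(ASSUMED_SD, 80)    # a grade of four such classes

print(f"class of 20: +/- {one_class:.2f} points")
print(f"grade of 80: +/- {one_grade:.2f} points")
```

Under these assumptions, quadrupling the number of students cuts the margin of uncertainty in half, which is why a school-wide or grade-wide average is steadier than a single classroom's.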
"The more observations you have, the more confidence in the conclusions you draw," said Douglas N. Harris, an assistant professor of educational policy studies at the University of Wisconsin who has studied value-added models.
He said that it takes at least three years of data to make any "high-stakes decisions" about a teacher, such as termination, and that as far as he knows, none of the school systems that employ a value-added model use it as the sole basis for personnel decisions.
He also said that value-added systems can highlight very good teachers and poor ones, but have difficulty with "fine-grain distinctions" about those in the middle.
Overall, Harris said, "I think it's promising, but we don't really know how promising."