“Addressing Teacher Evaluation Properly” is the understated title of a new paper that may suggest that the authors have taken a delicate approach to their subject, which has been one of the key issues in the education reform wars for well over a decade. Actually, it would be hard to be much blunter in this statement of theirs:

We all believe that our students should be taught by effective teachers, and a focus on evaluation is one way to achieve this end. But unfortunately, most believe that teacher evaluation is ineffective because all teachers are rated the same (as good) and more and less effective teachers are not identified. Recent efforts to improve teacher evaluation have been costly … and remain ineffective as massive reforms intended to improve practice were unsuccessful. Further, research by educational psychologists, and other social scientists on these new reforms has identified many unintended and harmful effects upon teachers and the teaching profession.

The research brief was written for the American Psychological Association by scholars Alyson L. Lavigne, an assistant professor of instructional leadership at Utah State University, and Thomas L. Good, professor emeritus in the Department of Educational Psychology at the University of Arizona. It assesses what isn’t working now, explains why, and presents alternatives.

Good and Lavigne are the authors of the well-regarded book “Looking in Classrooms,” which uses educational, psychological, and social science theories and classroom-based research to teach about the complexities and demands of classroom instruction. The 11th edition of the book is being published in June.

How teachers get evaluated became one of the most contentious issues in education during the presidencies of George W. Bush and Barack Obama. The Obama administration, through its Race to the Top initiative, pushed states to evaluate teachers by student standardized test scores. States used a method known as VAM, or value-added method or modeling, that purports to be able to use student standardized test scores to determine the “value” of a teacher while factoring out every other influence on a student (including, for example, hunger, sickness, and stress). Evaluation experts warned policymakers that this method is not reliable for making high-stakes decisions.

Though a number of states have minimized or stopped using student test scores to evaluate teachers, some still do, and as Lavigne and Good explain, there are problems with other current teacher assessment practices.

Good told me that teacher evaluation practices today are varied across the nation in part because of large differences that schools have in resources for teacher development. For example, some districts spend $7,000 per teacher while others spend little, and there is varying availability of supervisors “who have the knowledge, skills, and time to do observations and to provide critical and supportive feedback.”

“Some districts have meaningful plans in place but many do not,” he said. “Principals or supervisors make three to four visits a year for untenured teachers and in many districts tenured teachers may only be observed every three to five years. These observations are completed by principals or supervisors who often have had limited training on observing reliably, and who have limited knowledge of how to provide feedback. Observational data are made with checklists and/or coding systems with unknown validity.”

Lavigne said that because value-added scores and commonly used observation instruments “can only do so much,” this is a time to improve assessment methods. “Given that the Every Student Succeeds Act allows for more flexibility in teacher evaluation, the likelihood of this is better now than it has been in the past decade,” she said, referring to the federal K-12 law that replaced No Child Left Behind.

For example, she wrote in an email:

Challenge: Commonly used observation instruments only provide information on how the class is doing on average. Such instruments will not indicate that 2, 3, or even 4 students in a class of 25 did not participate the entire lesson. If most of the class was highly engaged during the entire lesson, a teacher would likely score very high for “student engagement.”
Opportunity: Perhaps supplementing by using a seating chart to mark participation would illuminate and provide feedback to the teacher about those 2-4 students that did not participate, and even possibly as to why (e.g., was the teacher requiring all students to participate?). There also exists technology that can now record classroom lessons and provide feedback on how many voices were heard, how much time the teacher talked vs. students, again providing different, and sometimes better information to help a teacher improve practice.

There’s a lot more in the policy brief, which is itself brief.