As any even semi-regular reader of this blog knows, the practice of using student standardized test scores to evaluate teachers is riddled with problems. I’ve written before about some of the more ridiculous consequences, such as teachers being evaluated by students they don’t have and/or by subjects they don’t teach. (See here and here.) There are other consequences as well, some of them likely unintended. Here’s a post on the subject by Susan Moore Johnson, Jerome T. Murphy Research Professor in Education at the Harvard Graduate School of Education. Johnson directs the Project on the Next Generation of Teachers, which examines how best to recruit, develop, and retain a strong teaching force. This appeared on the Shanker Blog, the voice of the Albert Shanker Institute, a nonprofit organization established in 1998 to honor the life and legacy of the late president of the American Federation of Teachers.
By Susan Moore Johnson
Academic scholars are often dismayed when policymakers pass laws that disregard or misinterpret their research findings. The use of value-added methods (VAMS) in education policy is a case in point.
About a decade ago, some researchers reported that teachers are the most important school-level factor in students’ learning, and that their effectiveness varies widely within schools (McCaffrey, Koretz, Lockwood, & Hamilton 2004; Rivkin, Hanushek, & Kain 2005; Rockoff 2004). Many policymakers interpreted these findings to mean that teacher quality rests with the individual rather than the school and that, because some teachers are more effective than others, schools should concentrate on increasing their number of effective teachers.
Based on these assumptions, proponents of VAMS began to argue that schools could be improved substantially if they would only dismiss teachers with low VAMS ratings and replace them with teachers who have average or higher ratings (Hanushek 2009). Although panels of scholars warned against using VAMS to make high-stakes decisions because of their statistical limitations (American Statistical Association, 2014; National Research Council & National Academy of Education, 2010), policymakers in many states and districts moved quickly to do just that, requiring that VAMS scores be used as a substantial component in teacher evaluation.
While researchers continue to analyze and improve VAMS models, it is important to step back and consider a prior set of questions:
- Does the wide variation in teachers’ effectiveness within schools simply mean that some teachers are inherently better than others, or is there a more complex and promising explanation of this finding?
- Is the strategy of augmenting human capital one teacher at a time likely to pay off for students? Or will relying on VAMS for teacher evaluations have unintended consequences that interfere with a school’s collective efforts to improve?
In this column, I bring an organizational perspective to the prospect of using VAMS to improve teacher quality. I suggest why, in addition to VAMS’ methodological limitations, reformers should be very cautious about relying on VAMS to make decisions that have important consequences for both teachers and their students.
Why Is There Variation In Teacher Effectiveness Within Schools?
In his classic analysis, “Social Capital in the Creation of Human Capital,” James Coleman (1988) argues that individuals’ human capital is transformed for the benefit of the organization by social capital, which “inheres in the structure of relations between actors and among actors” (p. S98). In education, this suggests that whatever level of human capital schools acquire through hiring can subsequently be developed through activities such as grade-level or subject-based teams of teachers, faculty committees, professional development, coaching, evaluation, and informal interactions. As teachers join together to solve problems and learn from one another, the school’s instructional capacity becomes greater than the sum of its parts.
Unfortunately, U.S. schools were never designed to benefit from social capital. In fact, over 40 years ago, historian David Tyack (1974) and sociologist Dan Lortie (1975) depicted the school as an organizational “egg crate,” where teachers work in the isolation of their classroom. In egg-crate schools, teachers focus on their own students largely to the exclusion of others, and they interact minimally and intermittently with their colleagues. As a result, their expertise remains locked within their classroom (Darling-Hammond 2001; Hargreaves & Fullan 2012; Johnson 1990; Kardos & Johnson 2007; Little 1990). This egg-crate model was efficient for managing the “factory school,” but did not serve students well; nor does it support the instructional needs of today’s teachers.
Therefore, when teachers in the same school continue to work in isolation, they cannot benefit from the social capital that their school might provide. As a result, wide differences in teachers’ effectiveness persist over time.
The Evidence On School-Based Improvement Efforts
Studies have persuasively documented the benefits of systematic efforts to improve student learning through school-based improvement initiatives (Bryk, Sebring, Allensworth, Easton, & Luppescu 2010; McLaughlin & Talbert 2001; Rosenholtz 1989). Successful efforts increase norms of shared responsibility among teachers and create structures and opportunities for learning that promote interdependence—rather than independence—among them. That is social capital at work.
Many who dismiss the potential of social capital to improve schools doubt that teachers can improve significantly over time. However, a recent study by Kraft and Papay (2014) showed that teachers working in more favorable professional environments—as rated by a school’s staff—improved throughout the 10 years they analyzed, while those who worked in environments judged to be less supportive stagnated. This and other studies challenge the conventional view that teachers reach a “plateau” in their development relatively early in their career (Rivkin et al. 2005). Creating a school context that supports teachers’ work can have important, lasting benefits for students and faculty throughout the school, whereas simply swapping out low-scoring individuals for high-scoring ones without changing the context in which they work probably will not (Ladd & Sorenson 2014; Leana 2011; Lohr 2012).
Threats To School-Based Improvement Efforts
Not only are personnel policies based on VAMS scores likely to have, at best, modest effects on a school’s success, they may inadvertently undermine improvement efforts that are already underway. How so? Here, I suggest several possible unintended consequences of increasing reliance on VAMS (for a more detailed discussion see here).
1. Making It More Difficult to Fill High-Need Teaching Assignments
Teachers’ confidence in VAMS as an evaluation method ultimately depends on whether these measures adequately control for demographic differences among students. Many experts report that VAMS do not yet do so. Although teachers may not have read these scholarly critiques, they generally are not convinced that VAMS are evenhanded. Thus, heavy reliance on VAMS may lead effective teachers in high-need schools and subjects to seek safer assignments, where they won’t risk receiving low, unwarranted VAMS scores.
2. Discouraging Shared Responsibility for Students
Often teachers within a grade level capitalize on one another’s strengths by regrouping their students for better instruction in each subject. For example, an excellent math teacher will teach math to all students in the grade, while others specialize in their area of expertise. Using VAMS to determine a substantial part of teachers’ evaluations threatens to sidetrack such collaboration by providing a perverse incentive for the most effective teachers to concentrate solely on their assigned roster of students.
3. Undermining the Promise of Standards-Based Evaluation
Those who recommend using VAMS for personnel decisions often contend that this approach is superior to the “counterfactual”—evaluations conducted by administrators. Admittedly, those evaluations had a poor track record in the past. Recently, however, many districts have adopted sophisticated and informative standards-based assessments. Recent research demonstrates that teachers’ instruction improves in response to standards-based observations and high-quality feedback (e.g., Taylor & Tyler 2012). But how will administrators respond when discrepancies between VAMS and observations arise? If they are uncertain about judging instruction or think that VAMS are more precise than their own professional judgment, value-added scores may unduly influence how principals rate teachers’ instruction.
4. Generating Dissatisfaction and Turnover Among Teachers
Those who promote the use of VAMS to make decisions about rehiring, firing, or awarding tenure often suggest that the best teachers will be more satisfied and decide to remain in their school once ineffective teachers have been dismissed. However, if the dismissal process requires more testing or diverts teachers from collaborating, skilled teachers—who arguably have the most to offer the school—may lose confidence in administrators’ priorities and decide to go elsewhere, even if that takes them out of education.
There is reason, therefore, for policymakers and administrators to carefully weigh the potential costs and benefits of relying on VAMS for evaluating teachers. Some states now require that VAMS scores count for 30 percent to 50 percent of a teacher’s final evaluation, an approach that is unsupported by research. It may be that eventually such policies will have their intended effects—raising professional standards to make teaching more attractive and reducing the variability in teachers’ effectiveness through dismissals. However, it is also quite possible that relying on VAMS in evaluation will make it more difficult to staff high-need classes, promote and sustain collaborative work, and develop shared responsibility among teachers for supporting students’ learning and improving the school. In response to these effects, turnover rates may increase, even among the very teachers whose expertise and commitment could generate improvement among their colleagues.
I’m certainly not suggesting that schools continue to employ ineffective teachers. As I have argued elsewhere (Johnson 2012), “neither individual teachers nor the schools in which they work can be ignored if students are to have the instruction they deserve.” However, reformers should lead the way with efforts to improve the school throughout, making it an organization that supports effective teaching and rich learning in every classroom. (See here.)
Research thus far has focused almost exclusively on the technical side of VAMS, determining under what conditions these models can safely and sensibly be used. Although these efforts have been worthwhile, it is time for other researchers to focus on how using VAMS affects what teachers actually think and do. There may be no strong evidence that the intensified use of VAMS interferes with collaborative, reciprocal work among teachers, but we should not assume that such consequences do not exist.