To help schools meet the new requirement to evaluate teachers based on student achievement, Virginia officials created a method for calculating how much students learned in a year. By extension, they believe that the same method can show how well teachers are doing their jobs.
The Board of Education recommended that schools use these “student growth percentiles” — or measures of student progress based on standardized test scores — as one of several ways to rate teachers. But as the new evaluations have taken effect, most school districts have been ignoring the recommendation and developing their own measures, teacher by teacher.
Officials in Alexandria and in Arlington and Fairfax counties call the growth percentiles too limited to be useful, at least for now: They don’t show changes in achievement for very high-performing students, and they rely on consecutive years of test data not widely available in schools serving more-transient communities. All told, the calculation can be completed for less than 30 percent of teachers.
“There are a lot of caveats around them,” said Sue Sarber, professional development supervisor for Arlington Public Schools.
Such growth measures are being developed across the country as more than two dozen states have adopted tougher evaluations that tie teacher pay or tenure more closely to student performance. But the stalled implementation in Virginia underscores the practical challenges of using standardized test scores to rate teachers.
Although state tests are one of the most widely available sources of information about student achievement, they are offered only in certain subjects and grade levels and weren’t designed to evaluate individual teachers, so schools still need to come up with their own additional yardsticks.
Evaluations typically have consisted of principals’ visits to observe a teacher at work or a list of professional development activities, subjective measures that resulted in little distinction between great, mediocre and terrible teachers.
With an infusion of federal funds, dozens of states are working toward a more objective view of how students “grow” or learn over time — and holding teachers accountable for that growth.
Virginia agreed to make student academic progress count for 40 percent of its new teacher evaluations in order to receive a waiver from the federal No Child Left Behind law. The state developed its growth model as part of a $17.5 million federal stimulus grant awarded in 2009.
Student growth percentiles, first developed in Colorado, rank students on a scale of 1 through 99 according to how they scored on state reading or math tests relative to students who performed similarly on previous years’ tests.
A score of 90, for example, would mean that a student scored higher than 90 percent of students who had similar academic performance the previous year. A high score could indicate significant personal improvement and a year of great teaching. Or not.
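The ranking idea can be sketched in code. This is a deliberate simplification, not the state’s actual methodology: real growth-percentile models such as the one the Center for Assessment built for Colorado use quantile regression over a student’s full prior test history, whereas the sketch below simply ranks each student’s current score within the group of students who posted the same prior-year score. All student IDs and scores are invented for illustration.

```python
# Simplified illustration of a student growth percentile (SGP).
# Real SGP models use quantile regression over multiple prior years;
# here we group students by prior-year score ("academic peers") and
# take the percentile rank of each current score within its group.
from collections import defaultdict
from bisect import bisect_left

def growth_percentiles(records):
    """records: list of (student_id, prior_score, current_score).
    Returns {student_id: percentile on the 1-99 scale}, computed
    within groups of students sharing the same prior score."""
    groups = defaultdict(list)
    for _, prior, current in records:
        groups[prior].append(current)
    for scores in groups.values():
        scores.sort()
    result = {}
    for sid, prior, current in records:
        peers = groups[prior]
        below = bisect_left(peers, current)   # peers scoring strictly lower
        pct = round(100 * below / len(peers))
        result[sid] = min(max(pct, 1), 99)    # clamp to the 1-99 scale
    return result

# Invented demo data: six students, two prior-score peer groups.
demo = [
    ("a", 400, 450), ("b", 400, 430), ("c", 400, 470),
    ("d", 400, 440), ("e", 500, 520), ("f", 500, 510),
]
print(growth_percentiles(demo))
```

In the demo, student "c" outscored three of four academic peers and lands at the 75th percentile, while "b", at the bottom of the same peer group, is clamped to 1. Note the sketch also shows why the measure can mislead for high performers: a student already at the test’s ceiling has no room to rank highly on growth.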
Such measures can prompt questions about why students in some classes, grade levels or schools are improving more than others, said Damian Betebenner, a senior associate at the Center for Assessment in Dover, N.H., which developed the Colorado system on which Virginia’s is based.
But when applied to individual teachers, the measures are meant to serve as one piece of a broader body of evidence about a teacher’s impact on student growth, which is why Virginia officials intend for them to be used alongside other indicators.
Many educators say state tests alone are not the best tool for determining what students learn. Written to assess grade-level skills, such tests are particularly limited when evaluating the growth of high-performing students who score at or near the top on the tests every year.
Questions also swirl around how to interpret the available data. Experts say it is difficult to isolate the effectiveness of one teacher from the effectiveness of other teachers or tutors who work with students. It is also hard to know the role that out-of-classroom factors, such as poverty, language ability and home environment, play in student performance.
Value-added measures — an evaluation model embraced in many states and the District — use a complex algorithm that attempts to account for such variables to isolate the teacher’s influence, boiling it down to a single score. Reports have shown that value-added measurements can produce unreliable results when applied to individual teachers.
“It’s very difficult to get a clean, precise metric,” said Linda Darling-Hammond, an education professor at Stanford University who has studied such systems.
Without a widely available growth measure, districts in Northern Virginia are asking teachers to illustrate student growth primarily by setting goals. At the beginning of the year, teachers and administrators review students’ performance and set specific targets for progress.
State guidelines say teachers must use reliable measures, but that could mean reading fluency tests for a special education teacher, abdominal fitness tests for a physical-education teacher, and end-of-unit tests for a math teacher.
To determine student progress, Mike Bonfadini, a seventh-grade math teacher at Bull Run Middle School in Gainesville, used results from pre-tests, post-tests and retests for different chapters or units, as well as quarterly benchmark tests.
The school’s new teacher evaluations include at least two classroom observations by administrators and measure six other standards — professional knowledge, instructional planning, instructional delivery, assessment of learning, learning environment, and professionalism.
In the end, Bonfadini was rated in the highest of four categories. He said he was already monitoring student progress but that the new evaluation made him report it more carefully.
At the end of the day, his job is the same, he said: “You make a rapport with the students and help them learn to the best of your ability.”