For example, an art teacher in New York City explained in this post how he was evaluated on math standardized test scores, and saw his evaluation rating drop from “effective” to “developing.” High-stakes tests are only given in math and English language arts, so reformers have decided that all teachers (and sometimes principals) in a school should be evaluated by reading and math scores.
Sometimes, school test averages are factored into all teachers’ evaluations. Sometimes, a certain group of teachers is attached to either reading or math scores; social studies teachers, for example, are more often attached to English Language Arts scores, while science teachers are attached to math scores. (A love of test scores led Washington, D.C., school reformers under former chancellor Michelle Rhee to evaluate every adult in every public school building, custodians and lunchroom workers included, in part on the school’s average test scores, a practice stopped a few years ago.)
In some cases, teachers are being set up to fail with goals that are literally impossible to achieve. How? In Indian River County, Fla., an English Language Arts middle school teacher named Luke Flynt told the school board a tale about his own evaluation that is preposterous — yet true. Flynt’s highest-scoring students wound up hurting his evaluation. How did this happen?
School reformers, including Obama administration education officials, have gotten it into their heads, despite warnings from assessment experts, that linking student test scores to teacher evaluation is a good idea. The experts call it a bad practice because the methods by which the determinations are made are not reliable enough and not valid as a measure of achievement. Some economists came up with something called “value-added models” (VAMs) that purport to be able to tease out, by way of a mathematical formula using the test scores, how much “value” a teacher adds to a student’s academic progress. These formulas are said by their supporters to be able to factor out things such as a student’s intelligence and whether the student is hungry, sick or subject to violence at home. But critics say they can’t.
According to a report by the American Statistical Association warning against the high-stakes use of VAMs:
The measure of student achievement is typically a score on a standardized test, and VAMs are only as good as the data fed into them. Ideally, tests should fully measure student achievement with respect to the curriculum objectives and content standards adopted by the state, in both breadth and depth. In practice, no test meets this stringent standard, and it needs to be recognized that, at best, most VAMs predict only performance on the test and not necessarily long-range learning outcomes. Other student outcomes are predicted only to the extent that they are correlated with test scores. A teacher’s efforts to encourage students’ creativity or help colleagues improve their instruction, for example, are not explicitly recognized in VAMs.
Still, reformers insist on using various value-added models, of which there are many. In Florida, Flynt told the school board March 18 about the absurdities around his own evaluation and urged members to “pause all high-stakes consequences associated” with this year’s test scores.
Flynt explained that through VAM formulas, each student is assigned a “predicted” score — based on past performance by that student and other students — on the state-mandated test. If the student exceeds the predicted score, the teacher is credited with “adding value.” If the student does not do as well as the predicted score, the teacher is held responsible and that score counts negatively towards his/her evaluation.
Flynt said that he had four students whose predicted scores were “literally impossible” because their predicted scores were higher than the maximum number of points that can be earned on the exam. He said:
“One of my sixth-grade students had a predicted score of 286.34. However, the highest a sixth-grade student can earn is 283. The student did earn a 283, incidentally. Despite the fact that she earned a perfect score, she counted negatively toward my evaluation because she was 3 points below predicted.”
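To make the comparison concrete, here is a minimal sketch in Python of the rule as Flynt describes it: a student counts against the teacher whenever the actual score falls below the predicted one. This is not the state's actual value-added formula, which the testimony does not spell out; the function names are hypothetical, and only the numbers 286.34 and 283 come from Flynt's remarks.

```python
# Hypothetical sketch of the comparison Flynt describes.
# It is NOT the state's real VAM formula; names are illustrative only.

MAX_SCORE_GRADE_6 = 283  # highest score a sixth-grader can earn, per Flynt


def counts_against_teacher(predicted: float, actual: float) -> bool:
    """Return True when the student falls short of the predicted score."""
    return actual < predicted


def prediction_is_attainable(predicted: float,
                             max_score: float = MAX_SCORE_GRADE_6) -> bool:
    """Return True when the predicted score can actually be earned."""
    return predicted <= max_score


# The student from Flynt's testimony: a perfect score, yet below prediction.
predicted, actual = 286.34, 283
shortfall = round(predicted - actual, 2)

print(counts_against_teacher(predicted, actual))  # True
print(prediction_is_attainable(predicted))        # False
print(shortfall)                                  # 3.34
```

Under this reading, no possible score could have kept the student from counting against him, because the prediction sits 3.34 points above the test's ceiling.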
But there’s more. He continued:
In total, almost half of the students who counted toward my VAM, 50 of 102, fell short of their predicted score. That sounds bad. Really, really bad. But a closer look at the numbers is necessary to tell the complete story.
Of the 50 students who did not meet their predicted score, 10 percent missed zero or one question, 18 percent missed two or fewer questions, 36 percent missed three or fewer questions, 58 percent missed four or fewer questions.
Let me stop to explain the magnitude of missing four or fewer questions. Since the reading FCAT [the test that was given] contained 45 questions, a student who missed four or fewer would have answered at least 90 percent of the questions correctly. That means that 58 percent of the students whose performance negatively affected my evaluation earned at least 90 percent of the possible points on the FCAT.
Where is the value in the value-added model? How does all of this data and the enormous amount of time spent testing add value to me as a teacher, to students, to parents or to the community at large? It leads me to wonder what more can I possibly do, when the state issues predictions for my students that are impossible for them to meet, when I suffer financially because of my students’ test scores, what more can I do?
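Flynt's figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers he quoted (102 students, 50 below prediction, a 45-question test, 58 percent missing four or fewer questions); the variable names are illustrative, and rounding 58 percent of 50 to a whole number of students is an assumption.

```python
# Figures quoted in Flynt's testimony; variable names are illustrative only.
TOTAL_QUESTIONS = 45            # questions on the reading FCAT
students_counted = 102          # students who counted toward his VAM
students_below_prediction = 50  # students who fell short of their prediction

# Share of his students who fell short of their predicted score.
share_below = students_below_prediction / students_counted

# Lowest possible share of correct answers for a student who missed four.
min_correct_share = (TOTAL_QUESTIONS - 4) / TOTAL_QUESTIONS

# 58 percent of the 50 students missed four or fewer questions
# (rounded to a whole student count, which is an assumption).
students_at_90_plus = round(0.58 * students_below_prediction)

print(f"{share_below:.1%}")        # 49.0%
print(f"{min_correct_share:.1%}")  # 91.1%
print(students_at_90_plus)         # 29
```

So roughly 29 of the 50 students who lowered his rating answered at least 41 of 45 questions correctly, which matches his claim of 90 percent or better.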