
How can anyone take standardized test scores seriously when stuff like this happens?

It’s bad enough that students are required to take high-stakes standardized tests that are often poorly designed and administered, don’t assess what kids have learned, and have “cut scores” deliberately set high so few students can get top scores. What’s more, some of these K-12 “accountability” tests have no consequences for the kids but high stakes for their teachers and schools.

And now there is something new that can skew a classroom’s and school’s results: kids who deliberately do poorly either because they know they have no personal stake in it or because they are protesting the tests. If you don’t think this dynamic is real or important, take a look at what just happened at a leading high school in Washington.

My Post colleague Perry Stein wrote two stories (here and here) detailing a “precipitous” drop in standardized test scores at Wilson High School. The test in question is the Common Core test known as PARCC, short for the Partnership for the Assessment of Readiness for College and Career.

In the District, PARCC is used to judge how well schools are improving student achievement, and starting next year it will be used to evaluate teachers. Some states already use PARCC and similar tests to evaluate teachers; New Jersey, for example, just announced that PARCC scores will account for 30 percent of a teacher’s rating.

Evaluating teachers by test scores is a method pushed by the Obama administration through its multibillion-dollar Race to the Top initiative and waivers it gave to states from the most onerous parts of the flawed No Child Left Behind, the K-12 education law that was finally replaced by Congress in December, eight years late.

Assessment experts, including the American Statistical Association (the largest organization in the United States representing statisticians and related professionals), have warned against using these scores through what is called “value-added measurement,” or VAM, which purports to take student standardized test scores and measure the “value” a teacher adds to student learning through formulas that supposedly factor out all other influences. Because math and English are the only subjects tested annually, reformers found strange ways to implement their evaluation systems, such as assessing teachers on the test scores of students they don’t have or subjects they don’t teach. (Really.)

D.C. schools officials just released the results of the PARCC tests taken last spring, and the results were sobering for those who put great stock in them. About 25 percent of D.C. students in grades 3 and up were found to be “college and career ready” in math and English, and the long-standing achievement gap between white and minority students persisted. The school that saw the biggest change in scores was, surprisingly, Wilson, one of the district’s best-performing schools. On the English portion of PARCC, the share of Wilson students who met or exceeded expectations fell from 50 percent last year to 21 percent this year, a relative drop of nearly 58 percent. Meanwhile, Wilson saw a 10.5 percent increase in the number of students meeting or exceeding expectations in math.

What happened? Did the composition of Wilson’s student population change dramatically? Did students learn nothing in English Language Arts all year, or have mass amnesia?

As Stein discovered, a lot of Wilson students deliberately bombed the test. Why? One student Stein quoted said she had an important Advanced Placement test the same day and chose to focus her energy on that. Yes, the school scheduled PARCC testing on the same day as other big tests. But it wasn’t only competing exams that prompted some students to bomb the PARCC. Ellen Leander, the mother of a junior, told Stein that her son answered a few questions on the PARCC and then walked out because he wanted to attend a chemistry lab.

“He wasn’t thinking, ‘It’s going to be a win for Wilson if I win on this test,’ ” Leander said, adding that the resulting drop in the school’s PARCC score doesn’t mean that instruction is weak or the school isn’t performing. “I’m not concerned with the quality of education at Wilson. I think it’s a strong school with great teachers.”

It turned out, too, that there was confusion in administering the PARCC at Wilson: some students in the upper grades were given a PARCC geometry test they had already taken a few years earlier.

At Wilson, only 68 percent of the students who were supposed to take the test actually took it, school district data showed. An undetermined number of the rest were kids who opted out because they oppose high-stakes testing. Though D.C. Schools Chancellor Kaya Henderson refused about 100 opt-out requests, some students got permission from other administrators to sit out the PARCC, further skewing the school’s overall results.

The opt-out movement has been growing in recent years among parents, students and educators who have come to see high-stakes testing as destructive to public education: it leads schools to narrow the curriculum to the subjects tested, to spend an inordinate amount of time on test prep (including pep rallies to get kids ready for the exams), and to attach stakes to results that many see as invalid and unreliable. In New York state this past spring, 21 percent of all students opted out of their high-stakes Common Core tests, and significant numbers were reported in Colorado, Washington state and other places.

Wilson wasn’t the only high-performing D.C. school to see scores drop. School Without Walls High School had a 12.4 percent drop in students meeting or exceeding expectations on the English portion, and a 24 percent drop in math. What do those results really show about students at Walls, which requires students to apply for admission?

The overall takeaway: Stuff happens, and a single year’s standardized test scores are, at best, poor measures of how students are doing. This past spring, multiple states had testing interrupted or canceled altogether because of computer server problems.

There is often no way to know why scores go up or down in a particular year, and placing any real stakes on a single year’s test scores is simply unfair to the people to whom those stakes apply.

This is not to say that all standardized tests are useless. Supporters of the current test-based accountability systems often accuse critics of disliking all tests. That’s silly, at least for most opponents. Good tests, well designed and administered, can be valuable indicators of certain things. But there is plenty of evidence that the promise of the testing regime in place in U.S. public schools today is illusory: it pledges progress but delivers the opposite.