Alert: These data do not — and, as provided, cannot — provide an accurate picture of improvement. These PARCC data do not and cannot tell us whether one school’s students learned more than another’s — or whether they learned more this year than last (though at some schools, they surely did). They cannot tell us whether one school’s special-education or at-risk students were better served than those at another.
Before explaining why this is so, and how to fix it, I must say that these data do capture the city’s huge achievement gaps. For example, the proficiency rates of African American and white students differ by a staggering 57 percentage points.
Known as “status” or “snapshot” data, they are important because they show clearly where the greatest educational need exists.
But, for two big reasons, they don’t tell us whether, where or why improvement is happening. First: These data aren’t based on the actual progress of actual students. They compare last year’s students with this year’s. But the students are different. One grade has graduated and left the school, while another has entered; some students have moved away, while others have moved in. Comparing these different student groups is always problematic. Teachers may comment: “Wow, this year’s class learns so quickly!” (Or, the opposite.) Test scores go up or down depending on the composition of different student groups, despite no change in school (or teacher) quality.
Sometimes, changed student composition can have a particularly big effect on school scores. For example, a district could move a school’s special-education program — and all of its students — to another school. Let’s stipulate that school quality and achievement in both the initial and new school stay the same. Nonetheless, because, on average, students with disabilities score lower than their peers without disabilities, the overall scores of students in the initial school will likely be higher, and those in the new school lower — even though not a single student’s progress is different from the previous year.
We can celebrate the initial school’s higher scores (or fret about the new school’s lower ones) until the cows come home. But school quality hasn’t changed; only the students have. Likewise, when a school sees an influx of either high-achieving or low-achieving students, perhaps because of a new academic program or gentrification, scores might go up or down, but not because school quality has changed.
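The arithmetic behind the special-education example can be sketched in a few lines of Python. The scale scores and enrollment counts here are invented for illustration (they are not drawn from the PARCC results), and every student is assumed to score exactly the same in both years.

```python
from statistics import mean

# Hypothetical scale scores. By assumption, no student's score changes
# between year 1 and year 2.
GENERAL_ED_SCORE = 750
SPECIAL_ED_SCORE = 720


def school_mean(n_general, n_special):
    """Average score for a school with the given enrollment mix."""
    scores = [GENERAL_ED_SCORE] * n_general + [SPECIAL_ED_SCORE] * n_special
    return mean(scores)


# Year 1: the 20-student program is housed at the initial school.
initial_before = school_mean(n_general=80, n_special=20)
new_before = school_mean(n_general=100, n_special=0)

# Year 2: the program moves to the new school. Nothing else changes.
initial_after = school_mean(n_general=80, n_special=0)
new_after = school_mean(n_general=100, n_special=20)

initial_change = initial_after - initial_before
new_change = new_after - new_before

print(f"Initial school: {initial_before} -> {initial_after} ({initial_change:+})")
print(f"New school:     {new_before} -> {new_after} ({new_change:+})")
```

Under these assumed numbers, the initial school's average rises by 6 points and the new school's falls by 5, although no individual student scored any differently.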
Second: Because these PARCC data track proficiency, not test-score gains, they underreport the progress of schools with many low-achieving students who are far below proficiency and overreport it for schools with many high-achieving students close to proficiency. Imagine a school where 25 percent of students test at the proficient level and all others are far below it; every student progresses enormously, but because the students who were not proficient began so far behind, the school’s proficiency rate does not increase at all. In another school, 60 percent of students are already considered proficient and 10 percent are very close. Schoolwide progress is modest, but all students close to proficiency reach it, boosting school proficiency by 10 percentage points.
Based on these proficiency data, the sluggish school seems hugely better than the dynamo. But it’s not.
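The two imagined schools can be put into numbers. This is only a sketch: the cut score of 750, the starting scores, and the size of each school's gain are assumptions chosen to match the percentages in the example, not real results.

```python
CUT_SCORE = 750  # hypothetical proficiency threshold


def proficiency_rate(scores):
    """Percentage of students scoring at or above the cut score."""
    return 100 * sum(s >= CUT_SCORE for s in scores) / len(scores)


def summarize(last_year, gain):
    """Return (change in proficiency rate, average score gain) when
    every student in the school gains the same number of points."""
    this_year = [s + gain for s in last_year]
    rate_change = proficiency_rate(this_year) - proficiency_rate(last_year)
    avg_gain = sum(t - l for t, l in zip(this_year, last_year)) / len(last_year)
    return rate_change, avg_gain


# "Dynamo": 25 percent proficient, everyone else far below the cut score.
dynamo_start = [760] * 25 + [680] * 75
# "Sluggish": 60 percent proficient, 10 percent just below the cut score.
sluggish_start = [770] * 60 + [748] * 10 + [700] * 30

dynamo = summarize(dynamo_start, gain=40)     # large gain for every student
sluggish = summarize(sluggish_start, gain=3)  # small gain for every student

print("Dynamo:   rate change %+.0f points, average gain %.0f" % dynamo)
print("Sluggish: rate change %+.0f points, average gain %.0f" % sluggish)
```

With these assumed scores, the dynamo's students gain 40 points each and its proficiency rate does not move, while the sluggish school's students gain 3 points each and its rate climbs 10 points.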
A working paper by Steven M. Glazerman and Liz Potamites, published by the well-regarded think tank Mathematica, studied whether proficiency gains or losses between one year’s students and the next year’s students “were reasonable approximations of” students’ real progress. They were not: About a quarter of the time, the approximations were so bad that the schools or grades initially ranked as top performers in fact had below-average score gains.
When we identify success where it doesn’t exist and ignore it where it does, we do enormous harm to our educational enterprise and its decades-long effort to improve. We besmirch schools that deserve congratulations and lull others into believing their sluggish growth is okay. Because we don’t know where genuine improvement exists, we can’t effectively encourage its spread.
We should change how we report initial test data each August. The lead agency for these reports is the Office of the State Superintendent of Education (OSSE), which has been a leader in releasing many other kinds of data, including the school and subgroup information found in school report cards and equity reports. I encourage it to lead here, too. Next year, when it reports its annual PARCC snapshot data, instead of reporting changes in the proficiency rates of two different sets of students, I urge it to report average score gains for the same students, the method researchers have found to be much more accurate. In addition, OSSE should open its data (cleaned of identifying information) to independent researchers, who could bring their expertise to bear on them.
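To show the difference between the two kinds of reporting, here is a minimal sketch that computes both from the same records. The student identifiers and scores are made up, and the function names are hypothetical; this is not OSSE's method or data format, only one simple way to match students across years.

```python
from statistics import mean

# Hypothetical records: student ID -> scale score.
# s03 and s04 left the school after last year; s05 and s06 are new arrivals.
last_year = {"s01": 700, "s02": 710, "s03": 760, "s04": 780}
this_year = {"s01": 715, "s02": 725, "s05": 690, "s06": 695}


def snapshot_change(last_year, this_year):
    """Change in the school's average score, comparing whoever was
    enrolled each year (two different sets of students)."""
    return mean(this_year.values()) - mean(last_year.values())


def matched_gain(last_year, this_year):
    """Average score gain for students tested at the school in both years."""
    matched = sorted(last_year.keys() & this_year.keys())
    if not matched:
        raise ValueError("no students appear in both years")
    return mean(this_year[s] - last_year[s] for s in matched)


snapshot = snapshot_change(last_year, this_year)
gain = matched_gain(last_year, this_year)

print(f"Snapshot change in average score: {snapshot:+}")
print(f"Average gain for the same students: {gain:+}")
```

In this invented case the snapshot comparison shows the school's average falling by 31.25 points, while the students who were present both years each gained 15.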
With these changes, we could more confidently celebrate genuine improvement, not just statistical artifact.