The Washington Post

The scary way Common Core test ‘cut scores’ are selected

You may have given no thought to the “cut scores” that are set for various tests, but they make all the difference in who passes and who fails. What exactly are cut scores?  The Educational Testing Service describes them this way:

Cut scores are selected points on the score scale of a test. The points are used to determine whether a particular test score is sufficient for some purpose.

Notice the word “selected.” Cut scores are selected based on criteria that the selectors decide have some meaning. Unfortunately, it is often the case that the criteria have no real validity in revealing student achievement, which is the supposed mission of the test, and that means the scores have no meaning either. This post, by award-winning Principal Carol Burris of South Side High School in New York, explains all of this in chilling detail.

Burris has been doing a remarkable job of chronicling New York’s botched reform effort for some time on this blog. (You can read some of her work here, here, here, here, here, and here.) Her narrative is important beyond the boundaries of New York, because other states are also doing some of the same things in the name of school reform. Burris was named New York’s 2013 High School Principal of the Year by the School Administrators Association of New York State and the National Association of Secondary School Principals, and in 2010 was tapped as the New York State Outstanding Educator by the School Administrators Association of New York State. She is the co-author of the New York Principals letter of concern regarding the evaluation of teachers by student test scores. It has been signed by thousands of principals, teachers, parents, professors, administrators and citizens. You can read the letter by clicking here.

By Carol Burris

“A fool with a tool is still a fool. A fool with a powerful tool is a dangerous fool.” That is how school reform expert Michael Fullan describes the implementation of Common Core State Standards in the United States. Although the idea of common standards based on high learning expectations is an appealing concept, Fullan believes that leading with testing and sanctions will cause harm, enough to result in the ultimate demise of the Common Core.

This is more than an academic argument, because what is happening right now affects children. Whether it is the “hammer” of accountability or the “drill” associated with standardized testing, our children are not benefiting from the tools of Race to the Top reform.

Among the more powerful tools is the “buzzsaw” — the Common Core tests’ cut scores — which classify and sort students by test performance.  These cut scores, sometimes designed to produce high rates of failure, create an urgency that undermines local control and forcefully imposes unproven reforms across states and the nation.  Nowhere is this more apparent than in New York State.

Let’s review how New York’s Common Core test cut scores came to be.

State Education Commissioner John King asked the College Board to “replicate research” to determine what PSAT and SAT scores predict first-year success in four-year colleges. The College Board was asked to correlate SAT scores with college grades to create probabilities of college success. You can read the report here.

Keep in mind that research shows that the SAT’s predictive power is only 22 percent. High school grades are a far better predictor of college success. The lack of validity of scores, without the context of grades, was not taken into consideration.

The New York study chose the following “probabilities” as the definition of college success:

* English Language Arts:  a 75 percent probability of obtaining a B- or better in a first-year college English course in a four-year college.

* Math: a 60 percent  probability of obtaining a C+ or better in a first-year math course in a four-year college.

Below is the rationale for choosing different criteria for the two courses, as stated in the report:

“Generally speaking, it is much easier to obtain a higher grade in first-year credit-bearing ELA courses than in first-year credit-bearing math courses”

Both the conclusions and the rationale raise serious questions. First, for which college math courses are we predicting success? Some students begin with college algebra. Others begin with advanced calculus. Still others take courses in statistics. There are all kinds of college math courses of varying difficulty that serve as the first math course for students. Second, is the State Education Department implying that we should have lower math cut scores because college math professors are tougher graders? I think the 75 percent/B- probability was not used for math because, according to the College Board study, it is associated with a score of 710. That is a score that only 6 percent of all college-bound seniors obtain.

Why did they choose the higher standard in ELA? If they had chosen the scores associated with 60 percent/C+ in Reading and Writing on the SAT (380, 360), the scores would have included nearly all test takers. That is how irrational the results of the study were.

After coming up with three scores (540 in math, 560 in reading and 530 in writing), the College Board determined the percentage of New York students who achieved those SAT scores. Those percentages were used to “inform” the cut score setting committee. As the committee went through questions, according to member Dr. Baldassarre-Hopkins, the SED helpers said, “If you put your bookmark on page X for level 3 [passing], it would be aligned with these data [referring to the college readiness data],” thus nudging the cut score where they wanted it to be.

When the cut scores were set, the overall proficiency rate was 31 percent, close to the commissioner’s prediction. The proportion of test takers who score 1630 or higher on the SAT (the sum of the three benchmark scores) is 32 percent. Coincidence? Bet your sleeveless pineapple it’s not. Heck, the way I see it, the kids did not even need to show up for the test.
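To see why the result looks predetermined, here is a minimal Python sketch of percentile matching. Everything in it is an assumption made for illustration: the synthetic scores, the two cohorts and the helper functions `cut_score_for_target` and `pass_rate` are hypothetical, and this is not the procedure the State Education Department or the College Board actually ran. It only shows that if the cut score is placed to match a target share, the pass rate comes out at that share no matter how well the students performed.

```python
import random


def cut_score_for_target(scores, target_pass_rate):
    """Pick the cut score so that roughly `target_pass_rate` of students pass."""
    ranked = sorted(scores, reverse=True)
    n_pass = max(1, round(target_pass_rate * len(ranked)))
    # The lowest score that still counts as passing.
    return ranked[n_pass - 1]


def pass_rate(scores, cut):
    """Share of students scoring at or above the cut."""
    return sum(s >= cut for s in scores) / len(scores)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical scale scores; cohort_b does 10 points better across the board.
cohort_a = [rng.gauss(300, 30) for _ in range(10_000)]
cohort_b = [s + 10 for s in cohort_a]

target = 0.32  # share of SAT takers at the benchmark, per the article

results = {}
for name, cohort in (("cohort_a", cohort_a), ("cohort_b", cohort_b)):
    cut = cut_score_for_target(cohort, target)
    results[name] = (cut, pass_rate(cohort, cut))
    print(f"{name}: cut score {cut:.1f}, pass rate {results[name][1]:.3f}")
```

Both cohorts land at about a 32 percent pass rate even though the second one scored higher; the only thing that moves is the cut score.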

Meanwhile, Kentucky, the other state with Common Core tests, used the benchmark scores that indicate college readiness on the ACT: 20 in reading and 19 in math.

So how did their students do? All of the passing rates on Kentucky’s Common Core tests (Kentucky Performance Rating for Educational Progress, or K-PREP) are higher than those of New York. Are the kids in Kentucky smarter? Are their teachers better?

That is unlikely. The fourth- and eighth-grade 2013 National Assessment of Educational Progress scores of both states are very similar, both in average score and in the percent scoring at or above proficient. New York’s and Kentucky’s average scores on the 2013 NAEP math tests were 1 point apart in both fourth- and eighth-grade math, identical in fourth-grade reading and only 4 points apart in eighth-grade reading.

Yet nearly 50 percent of all of Kentucky’s students in grades 3-8 passed reading, while only 31 percent did in New York. And over 40 percent of Kentucky kids passed math, as compared with New York’s 31 percent. Pearson makes the tests for both states. Could it be that Kentucky schools are just better at teaching the Common Core?

Again, there is no evidence of that. The discrepancy can be explained by the cut score benchmarks. Kentucky benchmarked to lower scores. The ACT scores that Kentucky used are equivalent to a 500 on the Reading SAT (a score obtained by 51 percent of all test takers) and a 500 on the Math SAT (a score obtained by 55 percent of test takers). The respective percentages for the New York SAT benchmark scores are 29 percent and 42 percent.

By picking different SAT or ACT “benchmark” scores, states can raise and lower the percentage of kids they want to pass. Kentucky used the same “buzzsaw” as New York; they just put on a smaller blade.
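The arithmetic behind the two blades can be laid out in a few lines of Python. This is only a sketch: the percentages are the ones quoted above, while the dictionary layout and the helper `implied_pass_rates` are hypothetical names introduced for illustration.

```python
# Share of SAT takers at or above each benchmark, as quoted in this post.
share_at_or_above = {
    ("reading", 500): 0.51,  # SAT equivalent of Kentucky's ACT benchmark
    ("reading", 560): 0.29,  # New York's benchmark
    ("math", 500): 0.55,     # SAT equivalent of Kentucky's ACT benchmark
    ("math", 540): 0.42,     # New York's benchmark
}

benchmarks = {
    "Kentucky": {"reading": 500, "math": 500},
    "New York": {"reading": 560, "math": 540},
}


def implied_pass_rates(state):
    """Share of SAT takers who would clear the state's chosen benchmarks."""
    return {
        subject: share_at_or_above[(subject, score)]
        for subject, score in benchmarks[state].items()
    }


for state in benchmarks:
    print(state, implied_pass_rates(state))

# Gap that comes from the benchmark choice alone, before any child takes a test.
gap = {
    subject: round(
        implied_pass_rates("Kentucky")[subject]
        - implied_pass_rates("New York")[subject],
        2,
    )
    for subject in ("reading", "math")
}
print("gap:", gap)
```

The choice of benchmark alone opens a gap of 22 percentage points in reading and 13 in math.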

So, why didn’t New York use the score of 500 on SAT subtests? After all, the College Board has used 500 as its “college readiness predictor” for years. Here is my speculation. If New York had, it would not have seen the dramatic 30-point drops the commissioner predicted. Unlike Kentucky, New York had already adjusted the cut scores upward in 2010.

Here is the bottom line. There is no objective science by which we can predict future college readiness using grades 3-8 test scores. You can, at best, make assumptions based on correlations, with score thresholds that are capricious. To make college readiness predictions for 8-year-olds is absurd and unkind.

Later this week, New York children will face another round of testing, this time in mathematics.  Thousands will opt out, while the majority will try their best.  Their moms and dads will be disappointed by their low scores, which result from the machinations of the score setting process.

I suggest they do something worthwhile with the score reports when they get them in August — line the birdcage. Then parents should hug their kids and read them a great book, just for fun. It is time for the fools with the tools to move on.