This was written by Carol Corbett Burris, principal of South Side High School in New York. She was named the 2010 New York State Outstanding Educator by the School Administrators Association of New York State.
By Carol Corbett Burris
I went to see “Rise of the Planet of the Apes” the other night. After the first ten minutes, everyone in the theatre knew things would not end well — at least not for the humans. The well-meaning protagonist ignores the warning signs that something is just not right with his experimentation. The damage is compounded when the entrepreneur smells the profit. The doomed experimentation accelerates and chaos ensues.
The theme is a familiar one. It is found in Robert Louis Stevenson’s “Strange Case of Dr Jekyll and Mr Hyde” and Mary Shelley’s “Frankenstein.” The best of intentions falls prey to hubris or desire, blinding the man of science to the warning signs that things are not working as they should. The lessons to be learned are always the same: proceed with caution, analyze your effects, and beware of the unintended consequences.
Not bad advice, I think, for the crowd that shouts, “no excuses.” Before you go to scale, evaluate what you are doing to see if it is working. Yet the obsession of reform by test scores blindly marches on. The most recent example is the evaluation of teachers by test scores, with Pennsylvania being the most recent state to jump on board.
One might understand the rush to use test scores for evaluation if there were research that showed it improved student learning. However, there is no research that indicates that test-based evaluations will improve teaching or learning at all. As Kevin Welner and I point out in our letter to Education Secretary Arne Duncan, during the two years that IMPACT has been in place in Washington, D.C., elementary student scores have gone down, and the strong correlation that you would expect to see between observations of teaching and test scores is not there.
While the jury is still out, doesn’t it make sense to proceed with caution?
Although it does not have the same “take-no-prisoners” appeal as IMPACT, the jury is coming in for the teacher evaluation system of Cincinnati, which does not use test scores and focuses on teacher evaluation through observations. There is evidence that this method is linked to both increases in student achievement and better teaching.
Because test scores are not part of the evaluative model, the tension between the competing goals of drill for the test and good teaching does not pull teachers in two directions. In models that focus on the observation of teaching, student scores are used instead to examine the efficacy of teaching behaviors, which not only evaluates teachers but helps us all understand what teaching behaviors are most likely to result in even higher student achievement. From that data, we can refine our observation and evaluation systems to further increase student learning. A very similar model has been effectively used in Montgomery County, Maryland, for over a decade.
I find it astounding that taxpayers are continuing to spend millions of tax dollars on test development, implementation, test scoring and evaluation systems associated with testing at a time when our schools are going broke. We have experimented with test-based accountability for a decade and we have seen no growth in student achievement that justifies its continuation.
However, rather than pulling back, we are ramping up as more money is spent to develop new tests to measure college readiness. If your state is participating in the PARCC consortium (Partnership for Assessment of Readiness for College and Careers), for example, public school students will soon take yearly exams (several per year) in English language arts and mathematics to measure their progress towards college readiness, culminating with a college readiness score in grade 11.
Is there research to back up the college readiness score? There is not. California has used grade 11 college readiness testing (called the Early Assessment Program, or EAP) on a voluntary basis since 2004. However, according to Education Week, “Even some of the program’s admirers, though, are frustrated by the lack of key data about its impact and worry that execution of the EAP might not fulfill its promise.” Despite large numbers of participants and years of implementation, college remediation rates in California are not decreasing. It has not made a discernible difference.
That should not be a surprise. Parents who spend big bucks on SAT prep tutoring do not do so because they want to better prepare their children for college. They do it to pump the score so that their child gets into a first-choice school.
The research has been clear for years. What has the greatest impact on success in college is the rigor of high school courses taken by a student. In his study entitled “Answers in the Tool Box: Academic Intensity, Attendance Patterns and Bachelor’s Degree Attainment,” Cliff Adelman refers to the “academic resources” that strongly affect college completion. Of the three components Adelman included in academic resources (test scores, class rank/GPA, and curriculum), the curriculum students studied contributed more than either of the other two factors. Curriculum contributed 41% to college completion, as compared with 30% for test scores and 29% for class rank/GPA.
Want to make students college ready? Build and implement rigorous curriculum. In many ways, the Common Core State Standards are getting it right. How tragic it is that the Common Core initiative will likely be ruined as it becomes enmeshed with more testing tied to teacher evaluation systems and school sanctions.
I am not opposed to testing. It is critical to measure learning. Good tests, like the exams of the International Baccalaureate, can have a backwash effect that improves instruction and enriches student learning. As a principal I use test scores to modify curriculum, and to help teachers improve instruction. I never hesitate to include student test scores in conversations with faculty and I use them to find out which teachers might need more intensive support.
But I also know their limitations, and I have seen the harmful effects on kids caused by far too many poorly constructed Regents exams.
It is time to pull back from the testing mania and evaluate the effects of high-stakes testing before we increase its use. Why not start by estimating the tax dollars spent on testing and sanctions and comparing them to the learning gains attained? Taxpayers should know. I find it curious that market-based reformers have never asked that question.
The first ten minutes have become the first ten years, and educators know things will not end well — at least not for the students.