The Washington Post

School reform: What went wrong, what went right, and what we should do in the future

For years the United States has been engaged in an effort to reform its public education system, a civic institution. That effort has been based on market principles and the belief that standardized testing is the best way to assess students, teachers, principals, schools, districts and states. The results? Not exactly what market reformers had hoped.

A new book edited by William J. Mathis and Tina M. Trujillo takes a look at, as they write in the post below, “what went wrong, what went right, and what we should do in the future.” The book is titled “Learning from the Federal Market-Based Reforms: Lessons for the Every Student Succeeds Act,” and was published by the National Education Policy Center, a think tank at the University of Colorado at Boulder. Mathis and Trujillo asked a number of scholars to assess key aspects of the reform agenda and they assembled the work in a smart, wide-ranging book.

Mathis is the managing director of the NEPC and the former superintendent of schools for the Rutland Northeast Supervisory Union in Brandon, Vermont. He was a National Superintendent of the Year finalist and a Vermont Superintendent of the Year. He currently serves on the Vermont State Board of Education and chairs the legislative committee. He has published or presented research on finance, assessment, accountability, standards, cost-effectiveness, education reform, history, and constitutional issues. He also serves on various editorial boards and frequently publishes commentary on educational policy issues.

Trujillo is an associate professor at the University of California, Berkeley’s Graduate School of Education. She is a former urban public school teacher, school reform consultant, and educational evaluator. She uses tools from political science and critical policy studies to study the political dimensions of urban district reform, the instructional and democratic consequences of high-stakes testing and accountability policies for students of color and English Learners, and trends in urban educational leadership.

Here’s a piece by Mathis and Trujillo about the book and its findings.

By William J. Mathis and Tina M. Trujillo

Washington was euphoric. In a barren time for bipartisan cooperation late in 2015, both Democrats and Republicans were happy to get rid of No Child Left Behind (NCLB). The K-12 education law was almost universally excoriated as being a failure — particularly in that most important goal of closing the achievement gap. Long-term trends from the National Assessment of Educational Progress show gains in some areas, but the achievement gap was stuck. NCLB provided no upward blips on the charts.

Thus, it is stunning that the successor law, the Every Student Succeeds Act (ESSA), passed by Congress last December, is basically an extension of NCLB. Fundamentally, ESSA maintains the same philosophy and direction. It is still a system driven by standardized tests and punitive in nature.

The main difference is that states are now responsible for designing the enforcement systems — which must be approved by the federal government. But states are unlikely to make many fundamental changes. They have invested heavily in their systems, as have local schools and districts.

Test-based accountability has been the law of the land for the past 30 years — which means that it is the only system that many educators have experienced.  Furthermore, vendors, textbook manufacturers, testing companies, consultants and the like have a strong bias toward protecting their investment — even while acknowledging that it didn’t work.

If we were serious about improving education and truly guaranteeing that all children were successful, we would have to do things differently than we did under NCLB. To figure out how things should be changed, we called on a collection of the nation’s most eminent scholars to address what went wrong, what went right, and what we should do in the future. They provided us with a broad array of ideas and consistent findings:

First, children who are hungry, suffer from malnutrition and live in substandard conditions are highly unlikely to score well on tests. We will never close the achievement gap until we close the opportunity gap. This also involves compensatory services for our neediest children. While giving considerable lip service to the plight of poor children and children of color, we have not backed up our rhetoric with our actions.

Inner-city children are consistently provided fewer school services while, on average, facing considerably greater family and community challenges. The 1965 Elementary and Secondary Education Act (of which NCLB and ESSA are the latest versions) has always been intended to address these disparities, but it has never been adequately funded. Meanwhile, vital companion social, educational and health services have also suffered from inadequate resources and fragmented coordination.

Second, test-based accountability does not improve learning. Psychologist B. F. Skinner taught us more than 60 years ago that punishment has unpredictable and undesirable consequences. Yet, we embarked on a path of test and punishment whose inevitable outcome was sadly predictable.

This fact is evident in decades of test-based reforms which, at best, show very modest results. In examining the evidence, the National Academies comes to the plain conclusion that we really do not know how to use test results to improve education. With more than two-thirds of test score variance coming from outside the schools, it is not possible to eradicate the effects of poverty with a new phonics program, no matter how well it is delivered.

Third, the various punitive consequences prescribed by the federal government failed for a number of obvious reasons. School turnarounds, where large proportions of teachers and administrators are fired and replaced on the assumption that they are responsible for low test scores, resulted in chaotic buildings that lacked orderly cultures or strong ties to their community. Further, the reserve of “more” capable teachers and administrators simply didn’t exist.  Despite politicians parading before alleged “miracle schools” that supposedly overcame all manner of obstacles, there is scant evidence that the turnarounds actually managed to provide, much less sustain, the promised breakthroughs.

Fourth, the promise of “market-based” reforms just didn’t pan out. The invisible hand of the market was to be the solution, primarily through charter schools and privatization. Even if we gave full weight to the market-based claims, these efforts fell far below what we staked out as our goal. A growing body of literature shows that charter schools do not perform better than traditional public schools and that they segregate schools by race and by socio-economic status.

The Dilemmas — These four factors provide a compelling indictment of the market-based and test-based NCLB reforms. They are now to be carried forward under state flags.  This sets up three important questions.

Multiple Measures — There is common agreement on using more than standardized tests for accountability purposes. However, when voices as disparate as Linda Darling-Hammond and Paul Hill agree on the need for “multiple measures,” there is an obviously large amount of elasticity in the term. The problem is in defining what should be measured, how it should be tallied, and how multiple scores can be combined into one.

At this hour, there is considerable federal debate over developing a single measure. The challenge is that schools have many purposes and each would lend itself to a different way of measuring and weighing. While there are statistically sophisticated ways of attacking this problem, most of them are flawed; each method implies a different set and weighting of values.

The validity of the measures — The companion difficulty is trying to validly represent an important feature with an imperfect measure; for example, using a simple questionnaire to describe school climate. Under NCLB waivers, states used from four to twenty-two different measures. What is a valid combination and weighting of these measures? Or does one exist? Should the math scores count double the ELA scores? Should they be divided by the attendance rate? Such decisions are central but are not empirical. They are based on our underlying values.
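To make the weighting problem concrete, here is a minimal sketch in Python. The two schools, their scores and both sets of weights are invented for illustration and are not drawn from any state system or from the book. The sketch only shows that the same data can rank schools differently depending on which weights are chosen.

```python
# Hypothetical data: two made-up schools, each with three measures on a 0-100 scale.
schools = {
    "School A": {"math": 80, "ela": 60, "climate": 70},
    "School B": {"math": 60, "ela": 85, "climate": 70},
}

def composite(measures, weights):
    """Weighted average of the measures; the weights encode a value judgment."""
    total_weight = sum(weights.values())
    return sum(measures[name] * w for name, w in weights.items()) / total_weight

def ranking(weights):
    """School names ordered from highest to lowest composite score."""
    return sorted(schools, key=lambda s: composite(schools[s], weights), reverse=True)

# Two assumed weighting schemes: all measures equal, or math counted double.
equal_weights = {"math": 1, "ela": 1, "climate": 1}
math_doubled = {"math": 2, "ela": 1, "climate": 1}

print(ranking(equal_weights))  # School B ranks first (about 71.7 vs. 70.0)
print(ranking(math_doubled))   # School A ranks first (72.5 vs. 68.75)
```

Nothing in the data settles which ranking is the right one. That choice rests on the weights, and the weights rest on what we value.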

False scientism — We learned that evaluating teachers and teacher preparation programs creates a false scientism by placing too much trust in too weak a measure. We also know that not everything important can be measured and that any evaluation of a school must involve qualitative decisions. Many economists have developed regression equations in attempts to model schools. These formulas invariably collapse because they measure the wrong variables, and measure them incorrectly. The result is unimpressive statistical strength, which makes these data too weak to justify their use in high-stakes, consequential decisions.

Our scholars had a broad range of recommendations. Some first-order themes stood out:

  • The Opportunity Gap — The primary finding was that students must have opportunities, funding, and resources sufficient to meet what the state requires of them. There have been some 70 or so state adequacy studies and, with very few exceptions, they have indicated we are not meeting the needs of students. (This includes studies by experts who typically represent the defense.)
  • Privatization and market-model approaches have not proven to be effective. In cases where positive results have been found, the magnitude of the improvements has simply been too weak to provide the needed level of improvement.
  • Support rather than punish — State agencies had their most positive effects during the 1970s when they employed an assistance role in helping schools and districts. This capacity virtually disappeared as the reductionist mentality became ascendant.
  • Using Data Wisely — “Data-dashboards” have captured the fancy of many. Yet, as the National Academies noted, we do not yet know how to use test data wisely. We must exercise caution about making inferential causal leaps. We have a half-century of failed regression equations that should sober our thought.
  • Qualitative Evaluations of Schools — Review teams should be revisited and reconstructed. Statistical measures are necessary but the federal interpretation of ESSA goes too far in requiring only a single composite of evidence.
  • Excessive Testing — Fewer grades should be tested and the state assessment pilots in ESSA should be targeted toward alternative assessment structures. The weakness of value-added scores and the limitations of vertical scaling should give us pause about using them for high-stakes decisions.
  • Embrace positive findings — Our authors did find a number of positive policies and features. These include early education, child care, extending the school day and year, de-tracking, small class sizes and inter-agency cooperation.

This diverse nation and our common good require all students to be well educated. Yet, we have embarked on economic and educational paths that systematically privilege only a small percentage of the population. In education, we invest less in children of color and children from poor families. At the same time, we support a testing regime that measures wealth rather than providing a rich kaleidoscope of experience and knowledge to all.

And we do not hold ourselves responsible for the basic denial of equal opportunities.

[I]f schools are being held accountable for improving teaching and student learning, policymakers at all levels of the educational system, regional and state levels as well as the national level, should also be expected to support the capacity required to produce improved teaching and learning (p. 21).[i]

The greatest and most damaging conceptual mistake of test-based accountability systems has been the pretense that poorly supported schools could systemically overcome the effects of concentrated poverty and racial segregation through rigorous instruction and testing. This system has inadequately supported teachers and students, has imposed astronomically high goals, and has inflicted punishment on those of whom it has demanded impossible achievements.

Only with all-around accountability can public schools succeed in achieving their democratic purpose of educating all children. This means holding state and federal governments accountable for ensuring that children have legitimate, adequate and equitable opportunities to learn. Ultimately, a child denied opportunities will arrive at school with high needs, and a school without adequate capacity cannot effectively address those needs. No amount of testing and improvement plans can succeed absent a strong support system.

In a nation that prides itself on its achievements, the lack of opportunities provided to our neediest children is not morally justifiable. We must invest simultaneously in our economy, our society and our schools.

[i] Ryan, K.E., Gandha, T., & Ahn, J. (2013). School Self-evaluation and Inspection for Improving U.S. Schools? Boulder, CO: National Education Policy Center. Retrieved October 1, 2015 from