There are reports on Capitol Hill that some progress is being made on the rewrite of No Child Left Behind, at least between the Republican and Democratic leaders of the Senate education committee. Sen. Lamar Alexander of Tennessee, the Republican committee chairman, and Democratic Sen. Patty Murray of Washington recently issued a statement saying:
“During the last several weeks we have been working together to build the base for legislation to fix the problems with No Child Left Behind. We are making significant progress in our negotiations. We are aiming to consider and mark up legislation to fix the law during the week of April 13th.”
Is this good news? Arthur H. Camins, director of the Center for Innovation in Engineering and Science Education at the Stevens Institute of Technology in Hoboken, N.J., isn’t so sure, as he explains in the following post. Camins has taught and been an administrator in New York City, Massachusetts and Louisville, Kentucky. The ideas expressed in this article are his alone and do not represent Stevens Institute. His other writing can be found here.
By Arthur Camins
Bipartisan agreement makes for strange bedfellows as seeming opponents engage in an uncomfortable collective embrace of federal mandates of yearly, high-stakes assessment. In the absence of obvious political alternatives, some civil rights groups fear that without the harsh light of disaggregated data, poor performance will be ignored. Those whose ideology bends their policy choices toward privatization see inevitable failure in the face of unreasonable demands as a means to undermine faith in public education. Some are in thrall to the campaign contributions of testing companies that stand to gain or lose billions from publicly funded testing expenditures. Still others have an abiding faith in the power of rewards and punishments to compel behavior.
The continued focus on high-stakes assessment is the education equivalent of building inspectors requiring pipe wrenches to be used by all plumbers, framers, electricians, roofers and tile-setters, while bypassing the advice and needs of contractors and workers. For education, the sure losers are deep, sustainable learning and equity.
Like building a home, creating an education system is a complex endeavor. As anyone who has undertaken it knows, significant remodeling may be even more challenging. When building or remodeling a complex system, it’s best to have a large, varied set of tools. Choosing the right tool for the right purpose is an obvious but often ignored principle, not least in education assessment policy. Pipe wrenches are great for large plumbing valves, but wreak havoc on smaller nuts. They have nasty teeth that rip and apply too much torque. Selection from a full set of open-end wrenches would be a far better choice. Needle-nose pliers are just the right tool for bending wires for electrical connections, but far too imprecise for removing the accidental building-related splinter. So it is with large-scale standardized testing in education. The right tool can get the job done. The wrong tool fails and often causes damage.
In education, assessment is essential for real answers to former New York City Mayor Ed Koch’s question, “How’m I doin’?” While Koch’s question was a rhetorical crowd pleaser, in education we need honest, precise answers.
The key questions for selecting appropriate assessment tools are, “Assessment for what, when and whom?” The modern homebuilder has a set of design specifications with particular occupants in mind and a team of workers with diverse expertise in various aspects of home construction. Multiple tasks require different tools for different workers with different purposes and, of course, cooperation.
The same is true in education. When it comes to assessment, choosing the right tool depends on purpose, values and precision.
Let’s start with the big picture. Education has three equally important purposes: preparation of students for life, work and citizenship.
The values principle of equity implies that the design of our education system should accommodate and address the diverse needs of all students. To be clear, equity as used here has two meanings: opportunity equity and lived equity. The former refers to what is often called a fair shot to move up the socioeconomic ladder. The latter refers to a shorter ladder, in which position on the lower rungs does not preclude access to a decent, secure life, with adequate food, clothing, housing and health care, which is what we have come to expect of a middle-class life. The United States has neither kind of equity and needs both.
The precision principle suggests the need to develop and select a variety of tools to assess progress and success with respect to all of the purposes and components of an effective education system. To answer education’s “How are we doing?” questions, we need subsystem precision, lest we make the education-equivalent mistake of using meter sticks when micrometers are needed.
With these purposes, values and precision in mind, here are some important assessment considerations:
Do the people involved at every level of education, from students to state-level administrators, have the resources they need to get their jobs done well? At the very least, resource constraints should be a variable in an accurate measurement model. Assessment tools that only measure the final result (itself a complex question) are clearly too far removed to answer the resources question well. In a well-functioning accountability system, measures of funding adequacy and equity are essential. Ensuring funding equity should be a federal role, but that does not appear to be under serious consideration in ESEA reauthorization proposals. It is especially important to federally monitor and ensure adequacy of resources for historically underrepresented groups and students targeted by special education law. Past experience demonstrates that leaving those functions to states and localities supports inequity that is damaging to the nation.
Equitable resources are essential, but they do not ensure equitable outcomes. While constitutionally much of the decision-making authority over education in the U.S. is delegated to the states, the interconnectedness of the nation clearly indicates that local outcomes are a national concern. Ineffective or poorly funded education in one state impacts another. The periodic National Assessment of Educational Progress (NAEP) serves to monitor outcomes across the states. The NAEP is not given to every student at every grade in every year. Instead, it is administered at the end of grade bands and uses the well-known statistical strategy of sampling. Politicians know this technique well. They rely upon it extensively when they do polling to gauge potential policy positions, because querying every citizen is impractical and not needed to get the information they need. As a tool for fair, big-picture achievement monitoring at the state or large-city level, NAEP does the trick, but different, non-comparable state-designed tests do not.
There are several justifications for yearly, grade-level testing in reading and mathematics. One is to shine a light on the performance of historically ignored subgroups. Another is that reading and mathematics are gateway subjects, mastery of which is essential for all learning. These are serious, well-grounded concerns.
However, consequential assessment for these purposes has had predictably perverse consequences, without substantial benefit. For example, there has been a marked decline in time devoted to science, social studies and the arts, undermining student interest and motivation and instructional attention to the multiple purposes of education. It has also resulted in unethical behaviors such as special attention to children whose test scores are near performance-level thresholds and outright cheating. Most significant, it has deflected political pressure away from the strategy that would have the greatest impact on students’ opportunity to learn: substantively and directly mediating both opportunity and lived inequity.
The quality of teaching is a national concern. Two of the critical requirements for effective instruction are availability of daily, actionable information regarding students’ progress and gaps in their learning and the expertise to interpret and use that information to move learning forward. Large-scale, state or national assessments are too distant and imprecise and not timely enough for this purpose. The best source of this information is well-designed daily student work. A shift in funding to prioritize this kind of assessment, often called formative, would be a terrific, impactful investment. It would be more like using a 1/8-inch wrench for fine tuning rather than a pipe wrench: the right-size tool applied with just the right amount of torque.
Unfortunately, there is still one assessment tool under consideration that only has a damaging purpose: evaluation of individual teachers based on the results of students’ performance on high-stakes tests (often called value-added measurement or VAM). Statisticians and psychometricians have consistently debunked this assessment tool as unreliable, unstable and lacking in precision. In addition, VAM fails as an improvement tool because it only targets the lowest-performing minority of teachers. No country that has made significant education progress has done so using this sledgehammer-like tool. Its damage has been well documented: undermining professional collaboration among teachers, teaching to simplistic skills while ignoring other critical education dimensions, focusing on some children while ignoring others, and punishment of teachers based on faulty data. In some places it has led to the gross unfairness of evaluating teachers of non-tested subjects by the reading and math test scores of students whom they do not teach. Teacher evaluation mandates should be thrown out of the federal toolbox.
A far better teacher evaluation investment would be allocation of funds to school districts for professional development that enhances supervisors’ expertise in identifying known effective instructional behaviors and supporting all teachers in using them. This would have direct rather than distant impact.
The 2014 report of the National Research Council, Developing Assessments for the Next Generation Science Standards, provides suitable guidance for ESEA reauthorization:
We envision a range of assessment strategies that are designed to answer different kinds of questions with appropriate degrees of specificity and provide results that complement one another. Such a system needs to include three components:
1) assessments designed to support classroom instruction,
2) assessments designed to monitor science learning on a broader scale, and
3) a series of indicators to monitor that the students are provided with adequate opportunity to learn science in the ways laid out in the framework and the NGSS.
To be clear, the report does not address whether each of these assessment functions should be mandated and determined at the federal level. However, thinking about selecting the right tool for the right purpose does provide direction.
ESEA reauthorization should not:
- Mandate consequential state testing;
- Include requirements for student assessment-based teacher evaluation.
ESEA reauthorization should:
- Ensure funds to provide for and measure the attainment of equitable resources;
- Provide funds to locales to increase educator expertise in the use of formative assessment strategies to improve daily learning.
It is past time for all supporters of equitable education for life, work and citizenship to call out No Child Left Behind, with its high-stakes testing centerpiece, as a failed Faustian bargain. Choosing the right tools for the right purposes is a common-sense starting point.