The Washington Post

Confirmed: Standardized testing has taken over our schools. But who’s to blame?



A new two-year study on testing in U.S. big-city public schools reveals what many students, parents and teachers have been screaming about for years: Kids take too many mandated standardized tests. What’s more, there is no evidence that adding testing time improves student achievement, it says.

The average student in America’s big-city public schools takes some 112 mandatory standardized tests between pre-kindergarten and the end of 12th grade — an average of about eight a year, the study says. That eats up between 20 and 25 hours every school year, the study says. As for the results, they often overlap. On top of all that are teacher-written tests, sometimes taken by students along with standardized tests in the very same subject.


In 66 school systems studied by the Council of the Great City Schools, a nonprofit organization that represents the largest urban public school systems in the country, students in the 2014-15 school year sat for tests more than 6,500 times, taking tests with 401 different titles. (See all the major findings below.)

High-stakes standardized testing has been a hallmark of modern school reform for well over a dozen years, starting with the use of these exams in the 2002 No Child Left Behind law to hold schools “accountable.” The stakes for these exams were increased with President Obama’s $4.3 billion Race to the Top funding competition, in which states could win federal education funding by promising to undertake specific reforms, including evaluating teachers by test scores and adopting “common standards.”

In early 2012, Robert Scott, a Republican who was then the commissioner of education in Texas, rocked the education reform world when he declared that school accountability systems based on high-stakes standardized tests had led to a “perversion” of what a quality education should be. He called “the assessment and accountability regime” not only “a cottage industry but a military-industrial complex.”

The new study, which looks at the testing practices of the big-city school systems, was ordered by the council’s board of directors in 2013 amid protests about the exams, which led to a national “opt-out” movement in which students refused to take standardized tests. The Common Core State Standards initiative — with new federally funded tests — only fueled the protests, which included teachers refusing to administer the tests and superintendents calling for a change in their states’ testing requirements.

The anti-testing rebellion became so loud that even some of the strongest proponents of testing started to say it was time to scale back on the number of assessments students must take. In August 2014, Education Secretary Arne Duncan said finally that he “shared” teachers’ concerns about too much standardized testing and test prep, and that he believed “testing issues today are sucking the oxygen out of the room in a lot of schools.”

This past spring, 20 percent of students in New York state opted out of mandated standardized tests, the scores of which are used to evaluate teachers through highly controversial assessment methods.

Stacie Starr, a ninth-grade intervention specialist in Ohio who was selected as “Top Teacher” last year in a national search by the popular television show “Live with Kelly and Michael,” announced that she was quitting because teachers now have to spend too much time teaching kids to take and pass tests.

And this month, Mark Pafford, House Minority Leader in the Florida legislature, said he supports parents who are opting their children out of Florida’s accountability testing system, saying, “I applaud those parents who have the courage to do that. They’re saying something. They’re doing something.”


The study, interestingly, doesn’t mention the 20 percent rate in New York. It does, however, attempt to place blame for the testing mess, and it spreads it around liberally:

Much of this backlash has been aimed at local school systems, but evidence in this report indicates that culpability for our assessment system also rests at the doorsteps of Congress, the U.S. Department of Education, the states, and test publishers and vendors.

Michael Casserly, the executive director of the Council of the Great City Schools, is quoted in a press release saying:

“Everyone has some culpability in how much testing there is and how redundant and uncoordinated it is – Congress, the U.S. Department of Education, states, local school systems and even individual schools and teachers. Everyone must play a role in improving this situation.”

Everyone is to blame? Really? Individual schools and teachers forced kids to take standardized tests? They passed laws or set up funding contests that required or promoted the use of standardized test scores to evaluate teachers?

For years now, state and federal policymakers have known that kids are being saddled with too many mandated standardized tests. It wasn’t until teachers and parents and principals and superintendents began making strong waves that policymakers finally started to agree. Blaming individual schools and teachers seems way beside the point.

The new report included preliminary recommendations that call for retaining current annual tests in core subjects but eliminating redundant or low-quality tests. The full recommendations are below.

Here are the key findings taken from the report, which, again, will come as no surprise to many in the education community:

Based on the Council’s survey of member districts, its analysis of district testing calendars, interviews, and its review and analysis of federal, state, and locally mandated assessments, this study found:
* In the 2014-15 school year, 401 unique tests were administered across subjects in the 66 Great City School systems.
* Students in the 66 districts were required to take an average of 112.3 tests between pre-K and grade 12. (This number does not include optional tests, diagnostic tests for students with disabilities or English learners, school-developed or required tests, or teacher designed or developed tests.)
* The average student in these districts will typically take about eight standardized tests per year, e.g., two NCLB tests (reading and math), and three formative exams in two subjects per year.
* In the 2014-15 school year, students in the 66 urban school districts sat for tests more than 6,570 times. Some of these tests are administered to fulfill federal requirements under No Child Left Behind, NCLB waivers, or Race to the Top (RTT), while many others originate at the state and local levels. Others were optional.
* Testing pursuant to NCLB in grades three through eight and once in high school in reading and mathematics is universal across all cities. Science testing is also universal according to the grade bands specified in NCLB.
* Testing in grades PK-2 is less prevalent than in other grades, but survey results indicate that testing in these grades is common as well. These tests are required more by districts than by states, and they vary considerably across districts even within the same state.
* Middle school students are more likely than elementary school students to take tests in science, writing, technology, and end-of-course (EOC) exams.
* The average amount of testing time devoted to mandated tests among eighth-grade students in the 2014-15 school year was approximately 4.22 days or 2.34 percent of school time. (Eighth grade was the grade in which testing time was the highest.) (This only counted time spent on tests that were required for all students in the eighth grade and does not include time to administer or prepare for testing, nor does it include sample, optional, and special-population testing.)
* Testing time in districts is determined as much by the number of times assessments are given during the school year as it is by the number of assessments.
* There is no correlation between the amount of mandated testing time and the reading and math scores in grades four and eight on the National Assessment of Educational Progress (NAEP).
* Test burden is particularly high at the high-school level, although much of this testing is optional or is done only for students enrolled in special courses or programs. In addition to high school graduation assessments and optional college-entry exams, high school students take a number of other assessments that are often mandated by the state or required through NCLB waivers or Race to the Top provisions. For instance—
* In 71.2 percent of the 66 districts, students are required to take end-of-course (EOC) exams to fulfill NCLB requirements—sometimes in addition to their state-required summative test.
* Approximately half of the districts (46.8 percent) reported that EOC exams factor into their state accountability measures.
* In 47 percent of districts, students are required by their states to take career and technical education (CTE) exams if they are taking a CTE course or group of courses. This requirement can also be in addition to state summative exams and EOC tests.
* About 40 percent (37.9 percent) of districts report that students, both elementary and secondary, are required to take exams in non-NCLB-tested grades and subjects. These are sometimes known as Student Learning Objective (SLO) assessments or value-added measures.
* Urban school districts have more tests designed for diagnostic purposes than any other use, while having the fewest tests in place for purposes of international comparisons.
* The majority of city school districts administered either PARCC or SBAC during the past school year. Almost a quarter (22.7 percent) administered PARCC assessments and 25.8 percent administered SBAC assessments in spring 2015. Another 35 percent administered the same statewide assessments in reading and math as they did in 2013-2014 (e.g., Texas, Virginia). And 16.7 percent of districts administered a new state-developed college- and career-ready (CCR) assessment (e.g., Georgia, Florida). In other words, there were substantial variations in state assessments and results this past school year.
* Opt-out rates among the Great City Schools on which we have data were typically less than one percent, but there were noticeable exceptions.
* On top of state-required summative exams, EOCs, SLOs, graduation tests, and college-entry exams, many districts (59.1 percent) administered districtwide formative assessments during the school year. A number of districts (10.6 percent) administered formative assessments mandated by the state for some students in some grades and administered their own formative assessments for other students and grades. Almost half of the districts using formative assessments administered them three times during the school year.
* Some 39 percent of districts reported having to wait between two and four months before final state test results were available at the school level, thereby minimizing their utility for instructional purposes. In addition, most state tests are administered in the spring and results come back to the districts after the conclusion of the school year.
* The total costs of these assessments do not constitute a large share of an average urban school system’s total budget.
* There is sometimes redundancy in the exams districts give. For example, multiple exams are sometimes given in the same subjects and grades to the same students because not all results yield data by item, grade, subject, student, or school—thereby prompting districts to give another exam in order to get data at the desired level of granularity.
* In a number of instances, districts use standardized assessments for purposes other than those for which they were designed. Some of these applications are state-recommended or state-required policies, and some originate locally.
* The findings suggest that some tests are not well aligned to each other, are not specifically aligned with college- or career-ready standards, and often do not assess student mastery of any specific content.
* According to a poll of urban public school parents administered by the Council of the Great City Schools in the fall of 2014, respondents had very mixed reactions towards testing. For instance, a majority (78 percent) of responding parents agreed or strongly agreed that “accountability for how well my child is educated is important, and it begins with accurate measurement of what he/she is learning in school.” Yet this support drops significantly when the word “test” appears.
* Parents respond more favorably to the need for improving tests than to references to more rigorous or harder tests. Wording about “harder” tests or “more rigorous” tests does not resonate well with parents. Parents support replacing current tests with “better” tests.
* Finally, survey results indicate that parents want to know how their own child is doing in school, and how testing will help ensure equal access to a high quality education. The sentence “It is important to have an accurate measure of what my child knows” is supported or strongly supported by 82 percent of public school parents in our polling. Language about “testing” is not.
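
The eighth-grade testing-time share in the findings above can be reproduced from the figures the report gives. This is a minimal sketch, assuming the 180-day school year the report describes as typical. The helper name is hypothetical and the calculation is only a rough check, not the Council's method.

```python
def share_of_school_year(testing_days: float, school_days: int = 180) -> float:
    """Percentage of the school year spent taking mandated tests.

    Assumption: a 180-day school year, the length the report calls typical.
    """
    if school_days <= 0:
        raise ValueError("school_days must be positive")
    return 100 * testing_days / school_days


# 4.22 days is the report's average for eighth-graders in 2014-15.
pct = share_of_school_year(4.22)
print(f"{pct:.2f} percent of school time")
```

The result matches the report's 2.34 percent when rounded to two decimal places. As the report notes, this counts only time spent taking tests required of all eighth-graders, not time spent administering or preparing for them.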

Here are the full recommendations in the report:

First, the nation’s urban public schools administer a lot of tests. The average student takes roughly 112 tests between pre-K and grade 12. At this point, there is a test for almost everything. For instance, districts have multiple tests for predictions, promotions, diagnostics, accountability, course grades, and the like. The benefit of this is that assessments provide the nation’s schools with the tools by which to gather objective data, determine whether they are making progress, and diagnose student needs. Moreover, standardized testing has allowed the nation to shine a light on major inequities under which students of differing racial, language, and income groups struggle. The flip side of this coin is that tests are not always very good at doing what we need them to do, they don’t tell us everything that is important about a child, and they don’t tell us what to do when results are low. This occurs for a variety of reasons: Data come too late to inform immediate instructional needs; teachers aren’t provided the professional development they need on how to read, interpret, and make use of the results in their classrooms; teachers and administrators don’t trust the results, believe the tests are of low quality, or think the results are misaligned with the standards they are trying to teach; or the multiple tests provide results that are contradictory or yield too much data to make sense of. The result is that the data from all this testing aren’t always used to inform classroom practice. In addition, some students fail to see the multitude of tests as important or relevant, and they do not always put forward their best efforts to do well on them.
Second, students spend a fair amount of time taking tests, but the extent of it really depends on the state, the district, the student’s grade level, and their learning needs and aspirations. It was clear from our research that the time needed, on average, to take mandatory tests amounts to about 25 hours, or between four and five days per school year, which is about 2.34 percent of a typical 180-day school year. This is not a large portion of a school system’s total instructional time. However, in practice, testing time can be divided over more than four or five days, and additional instructional time may be lost in downtime (e.g., state NCLB exams may be given in sections with one subject taking multiple half-days). The total can eat into teachers’ and students’ time, particularly if one also takes into account the time necessary to administer the tests and prepare for them. Moreover, much of this testing stacks up in the second half of the school year in a way that makes the second semester seem like one long test.
Third, there is considerable redundancy in the tests that some school systems administer and that some states require. For instance, it was not unusual for school systems to administer multiple summative exams towards the end of the school year that assess student attainment in the same subject. We found this circumstance in districts that gave multiple formative exams to the same students in the same subjects over the course of the year. And we found districts that were giving both summative exams and EOC tests in the same subjects. There is little justification for this practice; it is a waste of time, money, and good will.
Fourth, the vast majority of tests are aligned neither with new college- and career-ready standards nor with each other. We have seen numerous examples where districts gave lots of tests, yielding lots of numbers, but found that they were not anchored to any clear understanding of what the nation, states, or school districts wanted students to know or be able to do in order to be “college- and career-ready.” The result is a national educational assessment system that is incoherent and lacks any overarching strategy. Moreover, we think it is worth noting that most tests that schools administer don’t actually assess students on any particular content knowledge.
Fifth, the technical quality of the student learning objectives (SLOs) is suspect. It was not within the scope of this study to review the technical quality of all tests that our school systems give, but it was clear to the study team that the SLOs often lacked the comparability, grade-to-grade articulation, and validity that one would want in these instruments. It was also clear that some districts like these assessments because they help build ownership among teachers in the testing process, but one should be clear that the quality of these tools is uneven at best.
Sixth, it is not clear that some of the tests that school districts administer were designed for the purposes for which they are used. The most controversial example is the use of state summative exams to evaluate school district staff when most of these tests were designed to track district and school progress, not individual staff-member proficiency. The Council would argue that test results should play a role in the evaluation of teachers and staff, but gains or losses on these instruments alone cannot be attributed solely to individual teachers or staff members. Still, the failure of these instruments to perform this evaluative role should not be reason not to hold people responsible for student outcomes.
Seventh, the fact that there is no correlation between testing time and student fourth and eighth grade results in reading and math on NAEP does not mean that testing is irrelevant, but it does throw into question the assumption that putting more tests into place will help boost overall student outcomes. In fact, there were notable examples where districts with relatively large amounts of testing time had very weak or stagnant student performance. To be sure, student scores on a high-level test like NAEP are affected by many more factors than the amount of time students devote to test taking. But the lack of any meaningful correlation should give administrators pause.
Eighth, the amount of money that school districts spend on testing is considerable in absolute dollar terms, but—like the amount of testing time—it constitutes a small portion of a school district’s overall budget. The districts on which we have data will typically spend only a small percentage of their district budget on testing, not counting staff time to administer, score, analyze, and report test results. But the more tests local school systems add to what the federal and state governments require, the more expensive it will be for the district.
Finally, parents clearly want to know how their children are progressing academically. They want to know how they compare with other children, and they want accurate measures of whether their children are on track to be successful in college or careers. Most parents probably have little sense of what the metrics of test results are or how to read them, but they do want to know how their children are doing. Our data indicate that parents believe strongly in the notions of accountability for results and equal access to high quality instruction and educational opportunities, but do not necessarily react positively to the language used to describe testing or changes in testing.
One of the other things that was clear from the analysis conducted by the Council of the Great City Schools is that many urban school systems have begun to rethink their assessment systems to make them more logical and coherent. They have also begun to curtail testing where it is not necessary or useful.
The Council is committed to two things: (1) It will continue to track what our member urban school systems are doing to improve and limit student testing, and (2) the organization is determined to articulate a more thoughtful approach to building assessment systems. Urban school districts generally believe that annual testing of students is a good idea, particularly in a setting where we are working hard to improve student achievement, but the current assessment regime needs to be revised.
The Council recommends the following preliminary steps—
For federal and state policymakers—
1) Retain Congressional requirements for states to test all students in reading and math annually on the same tests statewide in grades three through eight and once in high school. These annual tests provide a critical tool for gauging student achievement on a regular basis. But charge states with lowering the amount of time it takes to return assessment results to districts and schools.
2) Revisit or clarify the U.S. Department of Education’s policy on having student test scores for every teacher’s evaluation and the requirement for Student Learning Objectives in untested grades and subjects.
3) Expand the U.S. Department of Education’s regulations to include a one-year exemption for testing recently arrived English learners with beginning levels of English proficiency.
4) Charge the U.S. Department of Education and states with providing and more broadly circulating guidelines on accommodations for students with disabilities who are taking ELP assessments.
5) Establish consistency from year to year in the assessments that states develop and require, particularly those tests used for accountability purposes.
6) Refrain from applying caps on testing time without also considering issues of quality, redundancy, and testing purposes.
For district leaders–
7) Review the entire portfolio of tests that the district gives in order to identify areas where there are redundant assessments. Begin curtailing tests that yield similar results but require additional time.
8) Ascertain the technical quality and usage of the tests the district is administering. Begin scaling back on assessments that do not meet professional standards and are not being used for the purposes for which they were designed.
9) Review all tests to gauge whether they are aligned to state and district standards—and to each other. If they are not aligned to a standard or benchmark your district has embraced, make sure you understand what the tests are anchored to and what they are actually measuring.
10) Revisit assessments, including assessments used for the identification of students for gifted and talented programming, to ensure that they are not linguistically, culturally, or racially biased.
11) Determine whether or not your portfolio of district assessments is presenting leaders, staff, and teachers with a clear and coherent picture about how students in the district, including students with disabilities, ELLs, and ELLs with disabilities, are doing. Assessments that do not add sufficient detail to that picture might be phased out.
12) Pursue assessments strategically that can serve multiple purposes and could replace multiple tests that are currently being given.