Jessie B. Ramey is the parent of two children in Pittsburgh public schools and a historian of working families, gender, race and U.S. social policy who teaches women’s studies and history at the University of Pittsburgh. She has decided to opt her children out of upcoming state-mandated tests in Pennsylvania known as the PSSAs for reasons not often cited by parents who have made the opt-out decision for their own children.
The opt-out movement is spreading around the country, with tens of thousands if not hundreds of thousands of parents deciding that the state-mandated tests being given to students are not fair to either the students or the teachers who will be evaluated by the test scores. While the percentage of parents opting out is small relative to the number of parents allowing their children to take the test, the movement has forced a national debate on the value of the tests and forced administrators and policymakers to address it.
In this post, Ramey provides her religious reasons for opting her children out of the tests. This first appeared on her Yinzercation blog, and I am republishing it with her permission. This is a letter she sent to Linda Lane, superintendent of Pittsburgh Public Schools; Lisa Augustin, director of assessment; Jamie Kinzel-Nath, Pittsburgh Colfax K-8 principal; and all of her children’s teachers.
April 10, 2015
Dear Dr. Lane and Ms. Kinzel-Nath:
Pursuant to Pennsylvania Code Title 22 Chapter 4, section 4.4 (d)(5) I am hereby exercising my right as a parent to have my children, ____________, excused from PSSA testing on the grounds of my religious beliefs. Please allow ___________ to pursue alternate educational activities such as a research project or volunteering in younger classrooms during testing.
I could stop my letter right there, as that is all that is legally required by the state in order to excuse our children from testing. However, as this is our third year writing such letters, I would like to explain the religious grounds we have for refusing to allow our children to be tested. Even though, under law, no state or school official is permitted to ask us about our faith nor require “proof” of our beliefs, I would like to share these religious reasons with you.
We belong to First Unitarian Church of Pittsburgh, a member of the Pennsylvania Interfaith Impact Network (PIIN), which is active in education justice. Every Sunday, we recite seven principles that unite Unitarian Universalists. Most of these principles are basic moral and philosophical statements shared by all of the world’s major religions. They reflect the common values of most faiths, from “love one another” and “do unto others,” to respect for the spark of the divine in each of us, and the ethical-humanist imperative to leave this world a better place. Please allow me to explain how each of these seven principles has led us to refuse high-stakes-testing for our own children, and on behalf of all children.
Every child is valuable – priceless – and has the human right to a rich, full education. Respecting the inherent worth of every child also means treating each student as an individual, and not a widget being produced in a factory. Standardized testing, tied to an ever more standardized Common Core curriculum, sorts students into categories (“below basic,” “basic,” etc.). There are serious consequences to this sorting and labeling (see below), but the underlying premise of this standardized high-stakes-testing is to compare and rank students – not to support the individual learning of each student.
This is clearly evident when schools use standardized, normed tests, which force all students into a bell curve, guaranteeing that a large proportion of the children will fail. To get that nice bell shape of test results, with exactly half of the children falling on the “below average” side of the curve, the tests are carefully designed with purposefully misleading questions. For instance, test makers will use tricky sound-alike answers to intentionally trip up English language learners, or culturally specific clues most easily decoded only by students from wealthy families. Pittsburgh is subjecting students to the normed GRADE test not once, but three times a year (a result of accepting state money that came with testing strings attached). Teachers have been reporting the problematic GRADE test questions for years, but the test-maker has not changed them because this “assessment” requires a set failure rate. In what way does this kind of standardized testing respect the inherent worth of our students? When students’ test scores are then displayed for all to see on “data walls” (an increasingly common practice in our schools), how does this respect the dignity of each child?
While advocates claim that high-stakes-testing will hold teachers and schools accountable for student learning and therefore promote equity, it often does the exact opposite by reinforcing inequality. High-stakes-testing labels our schools as “failures,” but never results in additional resources to actually help kids. Instead, “failing” schools are often targeted for closure. When you look at the pattern of school closures across the country – including here in Pittsburgh – you can see that districts have closed schools in predominantly black and brown neighborhoods, displacing some students multiple times. Our communities of color have been harmed the most, with places like Oakland and Hazelwood turned into education deserts without a single neighborhood public school.
Schools labeled as “failing” on the basis of student test scores are often targeted with other “reforms” that rarely help children. Our own beloved Colfax provides an excellent example of the “disruptive innovation” imposed on supposedly failing schools. Nine years ago when our family first started at Colfax, its large achievement gap had recently earned it a designation as a “turnaround school.” The district fired every single teacher and the principal then handpicked an entirely new teaching staff. The idea, of course, was that we had to get rid of the “bad” teachers and hire only “great” teachers and that would solve the problem of low test scores. Fast forward almost a decade and you can see that this didn’t work: Colfax still has one of the largest achievement gaps in the city (which is really an opportunity gap made highly visible by the presence of families from some of Pittsburgh’s wealthiest and poorest communities together in the same school).
During this same decade, Colfax students also experienced a relentless series of “reforms,” all aimed at increasing test scores. When we started, Colfax was a Spanish language immersion school, then we lost the extra language instruction to become an “Accelerated Learning Academy” focused on reading and math. We got an America’s Choice curriculum that was supposed to solve everything and added extra periods of reading. We got a longer school day and a longer school year. We got a Parent Engagement Specialist. Then we lost the curriculum, lost the extra time and days, and lost the parent specialist. The district changed to a 6 day week, so we could cram in extra reading and math periods, since these are tested subjects, resulting in a net loss of music, art, language, and physical education. With state budget cuts we lost more music and athletic programs, and we even lost our after school tutoring program aimed at those very students whose test scores continue to cause so much alarm. And class sizes ballooned to 30, sometimes 35 and more students.
Imposing constant churn and disruption on our most vulnerable students in the pursuit of higher test scores is not education justice. Worse, the relentless high-stakes-testing has served to re-inscribe inequality. We recently heard from Jon Parker, a Pittsburgh high school teacher, who explained what high-stakes-testing is doing to students’ sense of self worth in his classroom. Every year, he asks his students to write him a letter introducing themselves. In his class of struggling readers this year, over half of the students included their most recent PSSA rating as part of their introduction. They literally said things like, “I’ll work hard but I’m below basic.”
Mr. Parker explains, “the tragic message from our high stakes test environment is ‘you are your score.’ And if we tell a student he’s below basic regularly from the time he’s in kindergarten, what else would we expect of him? One of the stated goals of No Child Left Behind was to combat the ‘soft racism of low expectations.’ But instead it has created a vicious cycle of self-fulfilling prophecies. ‘You have failed in the past; you will fail forever.’ I cannot imagine where I would be if I had that school experience, but I can guarantee you I wouldn’t be here.”
Mr. Parker also examined the ways in which high-stakes-tests are used to exclude students from high-quality courses and programs. He gave the example of a young woman of color in his class right now with a 4.0 GPA – “one of the most well-rounded and motivated students I’ve ever had” – who will be excluded from taking the advanced math and science courses she would like to take next year solely because of a test score.
What’s more, Mr. Parker argued that if high-stakes-tests are meant to indicate which students need support so teachers can help them, they are miserably failing this most basic task. Instead, administrators and teachers make lists of “bubble students” who are close to the passing mark and focus their energy on moving these students up to “proficient.” The students with the most needs, struggling at the very bottom, are passed over: “they are neglected, perpetuating what has probably been the whole of their educational experience. ‘You’re a failure; you’re not worth our time.’ Then we wonder why we have such disparity in opportunities, a lack of student or family buy-in, negative attention seeking behaviors (for which we then suspend students).”
So if our students who need the most help never get that help, where is the equity? If a young woman of color with a 4.0 GPA wants to take advanced math and science classes but can’t because of a single test score, where is the justice? If children now label themselves with their own test scores and literally believe themselves to be “below basic,” where is our compassion?
Part of accepting one another is recognizing that we each have unique gifts and strengths. We are not all the same. Some students excel in trombone or slam poetry, or are highly empathetic or fantastic story tellers: none of which gets measured by high stakes testing. I am concerned about the intellectual growth of our students as well as the nurturing of their individual spirits. I believe in real learning and more learning time for our children. I support quality assessments that help our children learn and provide meaningful information to teachers to help them meet the needs of individual students. I want tests, ideally designed by teachers, which align with the curriculum and give timely, informative results to parents and students.
As a scholar, I am committed to a free and responsible search for truth and I highly value data and evidence in that quest. We now have a mountain of evidence about the negative consequences of the high-stakes attached to testing, as well as the over-use and misuse of testing. To summarize, these are some of the high-stakes for students:
With all of that evidence that high-stakes testing is hurting students, changing their schools for the worse, and reducing real learning, why are we still giving so many standardized tests? Steve Singer, a teacher in the Steel Valley School District, points out that some tests can serve a political purpose. For instance, the DIBELS test, used to evaluate reading, is owned by Rupert Murdoch, and “cut scores are being artificially raised to make it look like more students are failing and thus our schools aren’t doing a good job.” Yet Mr. Singer explains that the DIBELS “doesn’t assess comprehension,” and “rewards someone who reads quickly but not someone who understands what she’s reading.” Also, he explains that, “focusing on pronunciation separate from comprehension narrows the curriculum and takes away time from proven strategies that actually would help [a student] become a better reader.”
My son’s experience with the DIBELS illustrates the way in which standardized tests can be used as gatekeepers, excluding even very high-achieving students from accessing appropriate programs. My son was a “late” reader (which is not really true: he learned to read when he was developmentally ready in the third grade, and became a voracious, wonderful reader). But when he was in second grade, we were told his DIBELS score was too low to allow him to take an accelerated math class. He had taught himself multiplication at the age of four and was bored out of his mind in class. But the teacher had her orders: students needed to be reading 100 words per minute or could not advance to anything else. During our conversation with her about this, she called our son over and said, “I notice that you spend a lot of time looking out the window, like you were just now. Why are you daydreaming?” To which he answered, “Well, I was thinking about how if you have a ball in your hand, and drop it, and it hits the floor but doesn’t come all the way back up, where did that energy go?” I kid you not. He was seven years old and this was his response. The teacher looked right at us and said, “But see? He’s not reading 100 words per minute.”
Ideally, teachers are able to use test scores as just one data point among many to determine what students need to support their learning. But the hyper-focus on testing – and accountability measures that hold teachers responsible for getting every student over developmentally-arbitrary thresholds – means that time and again students are not treated as whole, complex learners, but rather reduced to a single score.
Testing advocates tell us that we must test every child, every year in order to identify inequality and drive reform (something no other high-education-achieving nation in the world does). But we have ample evidence from education researchers that high-stakes-testing is not improving schools. Over 2,000 education researchers recently sent an open letter to the Obama administration and Congress: citing reams of data, the researchers wrote, “we strongly urge departing from test-focused reforms that not only have been discredited for high-stakes decisions, but also have shown to widen, not close, gaps and inequities.” The letter went on to quote evidence at length from a new policy memo from the National Education Policy Center, which effectively summarizes a “broad research consensus that standardized tests are ineffective and even counterproductive when used to drive educational reform.”
Evidence also shows serious problems with using high-stakes-testing to evaluate and rate schools. For example, a detailed analysis of the state’s new School Performance Profile (SPP) rating system found that – despite its claim to use “multiple measures” to evaluate schools and teachers – 90% of the calculation is based on high-stakes standardized tests. Yet “these measures are closely associated with student poverty rates and other out-of-school factors.” In other words, the tests are very good at measuring one thing: a family’s socio-economic status. Even the much-touted Pennsylvania Value-Added Assessment System (PVAAS) component of the score, which is supposed to calculate projected student growth while controlling for out-of-school factors, instead strongly correlates with poverty. The report raises “questions about whether the measures are a valid and reliable measure for purposes of school accountability.” In essence, schools are being held accountable, not for what students learn, but for the poverty level of the families they serve.
Similarly, teachers are being evaluated on the basis of the test scores of their students. This is an invalid use of data, violating a basic principle of assessment, since those tests were never designed to measure teacher effectiveness. You can’t take a test created to measure one thing and use it to measure another. Nevertheless, the entire teacher evaluation system is built on just this assumption. In fact, the Value Added Model (VAM) used to evaluate teacher “effectiveness,” assumes that student test scores are the result of a specific teacher, independent of all other factors. Yet the American Statistical Association (ASA) released a report last spring strongly warning about the limitations of VAM models, explaining, “Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores” and that “Ranking teachers by their VAM scores can have unintended consequences that reduce quality.” The statistical researchers concluded, “This is not saying that teachers have little effect on students, but that variation among teachers accounts for a small part of the variation in scores. The majority of the variation in test scores is attributable to factors outside of the teacher’s control such as student and family background, poverty, curriculum, and unmeasured influences.”
My son’s situation reveals how inappropriate the entire VAM system can be. He is now several years ahead in math and takes his class at the high school each morning, before returning to Colfax for the rest of the day. However, the state would require him to take a PSSA several grade levels below where he is currently working. In what way would this assess his actual learning this year? This test is clearly not about helping my son in any way: it’s about evaluating his teacher. But if he scores at the very top of the PSSA, as he is bound to do, he is simply demonstrating the ceiling effect – there is no way to “show growth” for this student. Yet his teachers are accountable for the “growth” in each student’s test score. Furthermore, which teacher should we hold accountable for his score – the math teacher at Colfax who does not even have him in school this year? His math teacher at the high school who is not teaching him the material covered on the PSSA?
The American Educational Research Association and the National Academy of Education released a report showing that VAM models are highly unstable: teachers rated highly effective one year are frequently rated ineffective the next. Their ratings also differed substantially between classes taught in a single year. The report also confirmed that teachers’ VAM ratings were significantly affected by the demographics of the students they taught: even when VAM calculations tried to account for this, teachers’ scores were negatively impacted by working with poor students, English language learners, and students with special education needs. Finally, this report demonstrated that VAM ratings “cannot disentangle the many influences on students progress” and stated “most researchers have concluded that VAM is not appropriate as a primary measure for evaluating individual teachers.”
Yet as we place more and more emphasis on holding teachers, principals, and entire schools accountable for student test scores, we have seen a plague of adult cheating scandals erupt across the country. We should not be surprised, since Campbell’s Law in social science states that the more a quantitative measurement is used to make decisions, the more subject it becomes to corruption and the more likely it is to corrupt the thing it was supposed to measure. This is exactly what has happened, with the conviction of 11 former teachers in Atlanta this week who are now facing 5-20 years in prison for changing answers on student tests to raise scores. The superintendent of El Paso, Texas, is currently in prison for taking low-performing students out of classes in order to increase the district’s test scores. In Ohio, several cities apparently listed low-performing students as “withdrawn” to remove their scores from school totals. Some charter schools are well known for the “charter dump,” pushing students out just before testing season in order to inflate their test scores (sending students back into traditional public schools, where their new teachers will be held accountable for their learning). In Washington, D.C., former superintendent Michelle Rhee – now the darling of the corporate reform movement who is famous for publicly firing a principal and massive school closures – oversaw her own “Erasure-gate” but was never held accountable. And right here in Pennsylvania our own former state Secretary of Education, Ron Tomalis, was caught both lying and cheating about student test scores (and then went on to occupy a ghost-job in the state capitol, making $140,000 a year but not showing up for work).
So why are we doing this? Why are we using our children’s test scores to feed a teacher evaluation system that not only doesn’t work, but actually harms teachers who work with our most vulnerable children? Finally, this Unitarian principle requires a commitment to a responsible search for truth, which means we have to be willing to examine the consequences of our own seeking. What if the collection and use of data on student achievement, as measured by test scores, is actually causing harm?
I am exercising my right of conscience by refusing to allow my children to take these tests. Our family cannot and will not be complicit in a system that we see harming others and damaging our common good.
High-stakes testing has also interfered with the democratic process. In many cities that lack democratically elected school boards, mayoral appointees have used high-stakes testing to label schools as failures and then moved to close them in unprecedented waves. Chicago is still reeling from the mass closure of 50 schools in 2013, almost entirely in communities of color. In cities like Philadelphia and New York, state or mayoral control has resulted in the privatization of public schools, handing over large numbers to private charter operators. Where is the democratic process when parents and communities no longer have a voice in public education and what is best for their children? When hedge fund managers are pouring enormous amounts of money into local school board races across the country to stack the deck in favor of privatization? When private charter operators are some of the biggest political donors in the state and refuse to comply with Pennsylvania’s sunshine open-records laws?
Pennsylvania’s new Keystone exams pose a particular concern for education justice, as they threaten to fail enormous numbers of poor students and students of color, preventing them from graduating (one of the highest stakes of all for students). The Pennsylvania NAACP has demanded the removal of the Keystones as graduation requirements, calling the use of these tests a “present day form of Eugenics.” With pass rates last year at some impoverished schools in the single digits, how will this form of high-stakes-testing create justice for all? And where there is no justice, there is no peace.
In a letter to the PA Department of Education, the NAACP wrote, “Attaching the Keystone Examinations to graduation is clearly based on the idea that it is possible to distinguish between superior and inferior elements of society through selective scores on a paper and pencil test. … Pushing masses of students out of high school without a diploma will create a subculture of poverty comprised of potentially 60 percent of our young citizens.” The letter uses strong language to object to the impact of high-stakes-testing on our most vulnerable children, including: “human rights violation…unspeakable horror…holocaust on our youth and society…life-long trauma… a system of entrapment for the youth of Pennsylvania…depraved indifference…deficient in a moral sense of concern…lacks regard for the lives of the children who will be harmed, and puts their lives and futures at risk…lynching of our own young.”
If we are serious about the goal of education justice, how can we ignore the impact these tests will have on an entire generation of children denied diplomas, with life-long consequences? Where is their liberty and their freedom?
To me, this principle evokes Martin Luther King’s famous quote, “Injustice anywhere is a threat to justice everywhere.” We are all connected – in an interdependent web of existence – and the oppression and harm caused to other people’s children causes harm to all of us. We are all harmed by allowing oppression and oppressive systems to continue.
It doesn’t have to be this way. This entire system is only about 15 years in the making. Other countries that we admire greatly for their highly effective education systems do not test like this. If researchers need data to compare we could test sample groups of students, rather than every child. We could test every few years, instead of every year. We could remove the high-stakes for kids and teachers, and go back to using assessments to measure student learning, with the goal of helping students. We could admit that our most vulnerable students – our students living in poverty, our English language learners, our students with special education needs – don’t need more testing, but rather smaller class sizes; a rich, engaging, culturally relevant curriculum; and well supported teachers with adequate resources.
Jessie B. Ramey, Ph.D.