The Washington Post

Why Americans should not panic about international test results

U.S. math literacy on the international PISA exam in 2015 was below the international average.

Here we go again. The Organization for Economic Cooperation and Development has just released the latest results from the test known as PISA, or Program for International Student Assessment, and guess what? American 15-year-olds who took the test in reading, math and science didn’t do well.

In fact, U.S. students have never done well — not in the history of international tests, including when the American public education system wasn’t under attack by reformers as it is now. That won’t stop people from saying the sky is falling over the results of a standardized test, especially one that many critics say is flawed.

In 2014, dozens of researchers and academics from around the world wrote an open letter to PISA director Andreas Schleicher, urging him to suspend administration of PISA until a new exam could be created. They cited a number of reasons, including that policy was being set based on PISA test scores, and that the PISA regime, “with its continuous cycle of global testing, harms our children and impoverishes our classrooms, as it inevitably involves more and longer batteries of multiple-choice testing, more scripted ‘vendor’-made lessons, and less autonomy for teachers.” (You can read about that here.)

Alas, the 2015 PISA test was given on schedule. And the latest PISA results from 73 countries, according to this story, show:

  • In math literacy, U.S. students ranked 40th in the world, and their score actually fell from the last administration of the test three years ago.
  • In science literacy, they ranked 25th.
  • In reading literacy, they ranked 24th.

The results come just after the release of another international exam, the Trends in International Mathematics and Science Study (TIMSS), which tests fourth- and eighth-grade students. As you might suspect, U.S. kids didn’t do so well.

With every release of these test scores, there are new expressions of doom. Marc Tucker, president of the Washington-based National Center on Education and the Economy, said Tuesday that the PISA scores indicate the situation is dire enough to be considered a “Sputnik moment” for U.S. leaders and educators. (The launch of the Sputnik satellite by the Soviet Union in 1957 prompted the United States to emphasize math and science education to ensure victory in the space race.)

Tucker said that he wants the United States to figure out how Chinese students score so well, though Yong Zhao, a professor in educational leadership and policy studies at the University of Kansas, said that is the last thing Americans should do.

Zhao joined the University of Kansas this year as part of the Foundation Distinguished Professor initiative, launched by the school and the state to attract 12 eminent scholars in specific fields. He formerly was the presidential chair and professor in the Department of Educational Measurement, Policy and Leadership at the University of Oregon, and is the author of several books, including the co-authored “Never Send a Human to Do a Machine’s Job: Correcting the Top 5 Edtech Mistakes.”

Here he explains why PISA’s rankings are flawed:

By Yong Zhao

The results of the Brexit referendum and the U.S. presidential election will surely be seen as the biggest surprises of 2016. The final results defied all predictions; the polls were wrong — or poorly interpreted — as were the pundits. Though they predicted that the majority of British voters would choose to remain in the European Union, more ended up voting to leave. Though they predicted a win for the Democrat, Hillary Clinton, it is Republican Donald Trump who will move into the White House this January.

There is plenty of head-scratching and hand-wringing over the fact that so many experts got it so wrong, but a generally agreed-upon conclusion is that the data these experts had absolute confidence in somehow fooled them or simply “died.”

“I’ve believed in data for 30 years in politics and data died tonight. I could not have been more wrong about this election,” said GOP strategist and NBC analyst Mike Murphy about the U.S. election on Twitter.

These two back-to-back spectacular failures of data-driven predictions remind us that data can deceive, mislead, and sometimes just quit working. Blind faith in data can have disastrous and long-lasting consequences; the failure of polls “serves as a devastating reminder of the uncertainty in polling, and a warning about being overconfident even when the weight of surveying evidence seems overwhelming,” wrote The Economist shortly after the U.S. presidential election.

Let’s carry this over into the world of education. We have just been presented with two huge sets of data about education in the world, as well as myriad interpretations. Data from the 2015 TIMSS came out at the end of November, and results from the influential international assessment program PISA were just announced.

And now the parsing of the data by pundits, journalists, and policymakers around the world will begin, with new recommendations for educational policy and practices. For example, the Alliance for Excellent Education, a D.C.-based national policy and advocacy organization, has already declared December 6th PISA Day and created a website covering the results. It has planned a Deep Dive event the following day to discuss “PISA and the Global Economy.” Bob Wise, alliance president and former governor of West Virginia, writes on the website:

PISA Day not only provides a look at student performance through an international lens, it focuses on what lessons can be learned from other high-performing nations to ensure U.S. students — especially those who are underserved — are prepared to compete in today’s global economy.

But if the PISA data, like most of the Brexit and presidential election data, is no good, would any conclusions drawn and recommendations made from this data be any good?

PISA is not a political poll, but it does attempt to predict the future. “PISA assesses the extent to which students near the end of compulsory education have acquired key knowledge and skills that are essential for full participation in modern societies,” according to its brochure. These students are 15 years old. In other words, PISA claims that its results predict the future success of individual students and, by association, their nations.

Unlike election forecasts, PISA predictions cannot be definitively proved wrong, since student success later in life cannot be conclusively reported the way final vote counts are. But if we think of a student’s success as winning the election, and the skills and knowledge PISA assesses as voters, what the polls missed during Brexit and the 2016 U.S. presidential election provides some interesting cautionary parallels.

The “Shy Tory” factor

British polls failed to predict the results of Britain’s 1992 general election by a large margin. One major factor was the “shy Tories,” Conservative supporters who refused to indicate their voting intentions. Although polling agencies tried to improve their methods after 1992, their efforts might not have been sufficient, as the same phenomenon was observed again in 2015, when opinion polls once again underestimated the Conservative vote. Similarly, “shy Trump voters” were a major contributor to the polling failures in the presidential election. In both cases, “shy voters” drastically compromised the quality of predictive polls.

PISA has a similar problem. Whatever it claims to measure, the range of skills and knowledge PISA can actually assess is very limited and homogeneous for all students in scores of economies. As a result, it easily misses skills and knowledge that ultimately matter more than what it measures, just as polls missed the opinions of “shy voters.” In other words, what eventually may matter for a student’s success does not show up in the data.

Thus, the first question to ask about PISA is: Does it accurately measure what matters?

“October surprise”

In U.S. presidential elections, pollsters are used to October surprises — unexpected events late in the campaign that could change the trajectory of the election. October surprises can also affect the accuracy of polls. Because they happen so close to Election Day, polls may not be able to fully capture their impact on voters and thereby predict accurately the final outcome.

PISA’s claim to assess skills and knowledge required for future success rests on the assumption that it confidently knows what is required for success in the future. While PISA and its expert sources may have done excellent work imagining the future, the impact of an “October surprise” in education would be much bigger, considering that election polls are only attempting to predict the results of a single event (the election), while PISA attempts to predict the future lives of 15-year-olds across the world. Given the rapidity and scale of changes we have seen in just the last few decades, no one can say how many “October surprises” these 15-year-olds will experience in their future.

Over the last few decades, automation has already altered the life trajectory of millions of manufacturing workers. E-commerce platforms such as Amazon and eBay have altered the fate of brick-and-mortar store owners and their workforces. Driverless cars are set to change the life of human drivers. And the Fourth Industrial Revolution will further change the nature of work and living in ways we do not and cannot know for certain.

Thus follows a second question to ask about PISA: How does it assess skills needed for an uncertain future with such certainty?

The likely voter problem

Finding the “likely voter,” or correctly identifying who will vote, has become an increasingly difficult task for poll operators, but it is one of the most important factors affecting polling accuracy. The majority of polls for Brexit and the U.S. presidential election failed to accurately forecast who actually came out to vote, either under- or overestimating the turnout of supporters for each side. Polls make predictions about likely voters using various models, but one thing they rely heavily upon is past elections, under the assumption that previous voting patterns will continue. This time, they failed: “pollsters may have incorrectly ruled out the prospect that people who didn’t vote in 2012 would nonetheless cast ballots in 2016,” according to a story in USA Today.

Sam Wang, the Princeton professor who ate a bug, as he had promised to do if his projection of a Clinton win proved wrong, said in an interview after the election: “[My] hypothesis was based on [elections from] 2004 to 2012 …. Now the hypothesis is wrong, and I have to go back and face up to the new data in an honest way.”

PISA’s confidence in the predictive power of its assessment also comes from the past. The subjects it chose to assess — science, reading, and math — have long been believed to be important for success in life all over the world. They have been the core subject matter schools teach worldwide, in the belief that they are essential for living in the modern age. But will these subjects turn out to help today’s 15-year-olds some 10, 20 or 30 years later? Are they the right candidates for all people in the future, or might different individuals need different sets of skills and knowledge?

This brings us to the third question to ask about PISA: Do its results in reading, math, and science accurately capture the domains of expertise each individual needs for successful participation in the future society anywhere in the world?

PISA not only tries to use its test results in these subjects to predict what skills and knowledge 15-year-olds will need to succeed in the future, it also disseminates, based on these scores, education policies and practices it believes will equip children with these skills and knowledge. It has the potential to affect the livelihood of hundreds of millions of children, hence the entire world.

The consequences are serious. The stakes are high. Therefore, we must question the quality of the PISA results before eagerly jumping to conclusions. Don’t read too much into them.

P.S. For more about PISA’s problems and dangers, read my five-part series, “How PISA puts the world at risk,” on my blog.