The problem with evidence-based education policy: the evidence


Education policymakers like to talk about “evidence-based” this and “evidence-based” that, but there are big questions about just how good the “evidence” actually is, and about whether the people making big decisions in education actually look at the research we do know is solid. Looking at this issue is Larry Cuban, who was a high school social studies teacher for 14 years and a district superintendent for seven years in Arlington, VA, and is now professor emeritus of education at Stanford University, where he has taught for more than 20 years. His latest book is “Inside the Black Box of Classroom Practice: Change without Reform in American Education.” This post appeared on his blog.

 

By Larry Cuban

The historical record is rich in evidence that research findings have played a subordinate role in making educational policy. Often, policy choices were (and are) political decisions. There was no research, for example, that found establishing tax-supported public schools in the early 19th century was better than educating youth through private academies. No studies persuaded late-19th century educators to import kindergarten into public schools. Ditto for bringing computers into schools a century later.

It is hardly surprising, then, that many people, including me, have been skeptical of the popular idea that evidence-based policy-making and evidence-based instruction can drive teaching practice. Those doubts grow larger when one notes what has occurred in clinical medicine, with its frequent U-turns in evidence-based “best practices.”

Consider, for example, how new studies have often reversed prior “evidence-based” medical procedures.

*Hormone therapy for post-menopausal women to reduce heart attacks was found to be more harmful than no intervention at all.

*For men over the age of 50, getting a PSA test to check the prostate gland for signs of cancer was “best practice” until 2012, when advisory panels of doctors recommended that no one under 55 be tested and that older men might be tested if they had family histories of prostate cancer.

And then there are new studies that recommend that women begin annual mammograms not at age 50, as recommended for decades, but at age 40. Or research syntheses (sometimes called “meta-analyses”) showing that anti-depressant pills worked no better than placebos.

These large studies, done as randomized clinical trials (the current gold standard for producing evidence-based medical practice), have, over time, produced reversals in practice. Such turnarounds, when popularized in the press (although media attention does not mean that practitioners actually change what they do with patients), have often diminished faith in medical research, leaving most of us, myself included, stuck as to which healthy practices we should continue and which we should drop.

Should I, for example, eat butter or margarine to prevent a heart attack? In the 1980s, the answer was: Don’t eat butter, cheese, beef, and similar high-saturated fat products. Yet a recent meta-analysis of those and subsequent studies reached an opposite conclusion.

Figuring out what to do is hard because I, as a researcher, teacher, and person who wants to maintain good health, have to sort out what studies say and how those studies were done from what the media report, and then work out how all of that applies to me. Should I take a PSA test? Should I switch from margarine to butter?

If research into clinical medicine produces doubt about evidence-based practice, consider the difficulties of educational research — already playing a secondary role in making policy and practice decisions — when findings from long-term studies of innovation conflict with current practices.

Look, for example, at computer use to transform teaching and improve student achievement.

Politically smart state and local policymakers believe that buying new tablets loaded with new software, deploying them to K-12 classrooms, and watching how the devices engage both teachers and students is a “best practice.” The theory is that student engagement through the device and software will dramatically alter classroom instruction and lead to improved achievement. The problem, of course (you no doubt have guessed where I am going with this), is that evidence of this electronic innovation transforming teaching and raising achievement is not only sparse but also unpersuasive, even when some studies show a small “effect size.”

Turn now to the work of John Hattie, a professor at the University of Auckland (NZ), who has synthesized the research on different factors that influence student achievement and measured their impact on learning. Over the last two decades, Hattie has examined over 180,000 studies, accumulating 200,000 “effect sizes” measuring the influence of teaching practices on student learning. Together, these studies represent over 50 million students.

He established how much each factor influenced student learning (its “effect size”) by ranking each on a scale from 0.1 (hardly any influence) to 1.0, a full standard deviation, or almost a year’s growth in student learning. He found that the “typical” effect size of an innovation was 0.4.
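For readers who want to see what an effect size measured in standard deviations looks like in practice, here is a minimal sketch. It assumes the common standardized-mean-difference definition (often called Cohen's d); the post does not say which formula Hattie used, and the test scores below are invented purely for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference: the gap between group means, in pooled SD units.

    Assumption: this is the textbook Cohen's d, not necessarily Hattie's exact method.
    """
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)  # sample standard deviations
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Hypothetical test scores (made up for this example, not from any study).
treatment_scores = [72, 75, 78, 81, 84]  # classes that tried the new practice
control_scores = [70, 73, 76, 79, 82]    # classes that did not

d = cohens_d(treatment_scores, control_scores)
print(f"effect size: {d:.2f}")
```

In this invented case a two-point gain, set against a spread of a little under five points, comes out at roughly 0.42, close to the “typical” effect size described above.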

To compare how different classroom approaches shaped student learning, Hattie used the “typical” effect size (0.4) as the threshold a practice had to reach to count as influencing student learning (p. 5). From his meta-analyses, he then found that class size had a .20 effect (slide 15) while direct instruction had a .59 effect (slide 21). Again and again, he found that teacher feedback had an effect size of .72 (slide 32). Moreover, teacher-directed strategies of increasing student verbalization (.67) and teaching meta-cognition strategies (.67) had substantial effects (slide 32).

What about student use of computers (p. 7)? Hattie included many “effect sizes” of computer use from distance education (.09), multimedia methods (.15), programmed instruction (.24), and computer-assisted instruction (.37). Except for “hypermedia instruction” (.41), all fell below the “typical” effect size (.40) of innovations improving student learning (slides 14-18). Across all studies of computers, then, Hattie found an overall effect size of .31 (p. 4).
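The figures cited above can be lined up against the 0.40 threshold in a few lines of Python. This is only a sketch of the comparison: the numbers are the ones quoted in this post, while the labels and the helper name `split_by_threshold` are my own shorthand, not Hattie's.

```python
THRESHOLD = 0.40  # the "typical" effect size quoted in this post

# Effect sizes as cited in this post; the keys are shorthand labels.
effect_sizes = {
    "teacher feedback": 0.72,
    "student verbalization": 0.67,
    "meta-cognition strategies": 0.67,
    "direct instruction": 0.59,
    "hypermedia instruction": 0.41,
    "computer-assisted instruction": 0.37,
    "computers overall": 0.31,
    "programmed instruction": 0.24,
    "class size": 0.20,
    "multimedia methods": 0.15,
    "distance education": 0.09,
}


def split_by_threshold(sizes, threshold=THRESHOLD):
    """Return (at_or_above, below) lists of labels, each sorted from largest effect to smallest."""
    ranked = sorted(sizes, key=sizes.get, reverse=True)
    at_or_above = [name for name in ranked if sizes[name] >= threshold]
    below = [name for name in ranked if sizes[name] < threshold]
    return at_or_above, below


at_or_above, below = split_by_threshold(effect_sizes)
print("At or above the threshold:", ", ".join(at_or_above))
print("Below the threshold:", ", ".join(below))
```

Laid out this way, the only computer-related entry that clears the threshold is hypermedia instruction, and only barely.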

According to Hattie’s meta-analyses, then, introducing computers to students will fall well below other instructional strategies that teachers can and do use in its effect on learning. Will Hattie’s findings convince educational policymakers to focus more on teaching? Not as long as political choices trump research findings.

Even if politics were removed from the decision-making equation, there would still remain the major limitation of most educational and medical research. Few studies answer the question: under what conditions and with which students and patients does a treatment work? That question seldom appears in randomized clinical trials. And that is regrettable.

 

Valerie Strauss covers education and runs The Answer Sheet blog.