By Andy Hargreaves and Henry Braun
One of the most unlikely bestselling books and blockbuster movies of the past decade is a story about how baseball statistics turned from being a nerdy preoccupation of obsessive fans into a powerful tool for dramatically improving team performance. Michael Lewis’s Moneyball showed how insightful use of a range of performance statistics to select and deploy players boosted the underfunded Oakland Athletics to World Series standard, where they faced and often defeated teams with triple their payroll. For the Oakland A’s, statistics often trumped coaches’ intuitions. The use of performance stats that Oakland pioneered in the United States is now commonplace throughout professional sports.
Performance metrics are also ubiquitous in business. Most companies now monitor a plethora of indicators, from product defect ratios to speed of output and from customer satisfaction to internet “stickiness,” to pinpoint performance problems and prompt real-time interventions. The brutal metrics of victories and defeats, and of profits and losses, hold sports teams and companies accountable to fans and shareholders alike. In business and sports, we can get data-driven improvement and accountability (or DDIA) at the same time. Why should public education be any different?
One of the biggest buzzwords in education today is accountability. Accountability is seen as a strategy for improvement that works by rewarding the successful and by eliminating, or intervening forcefully with, those who are not. It is mainly enforced through indicators of student achievement derived from standardized test scores. Test scores purport to reveal the successes or failures of students, teachers, schools and even entire educational systems.
As in business and sports, student test score data also have a second purpose: to focus everyone’s efforts on just-in-time improvement by providing ongoing information, through many tests, about students’ progress. This enables teachers to monitor individual students and to intervene immediately if they start to fall behind. It provides principals with data that indicate how their teachers are performing, so that they can take appropriate action. And it enables districts and state departments to know what is happening in every school, so that corrective action can be taken before it is too late.
So are the opponents of student testing just unrealistic romantics who are out of date and out of touch? Perhaps they are not just resisting the rightful insistence that they should be held accountable to the public. They may also be rejecting the data tools that would enable them to improve.
In our policy brief Data-driven Improvement and Accountability (DDIA), published today by the National Education Policy Center, we examine the linkage between improvement and accountability in education, especially in relation to their use of data. Our paper is based on the research we have done on the use of data in high-performing countries, in business and sports, as well as the advice we are now providing to state and provincial departments of education that are revamping their testing instruments and accountability policies.
What have we found? DDIA is, in itself, neither good nor bad. It all depends on how the data are defined and used. When DDIA is done thoughtfully, with due respect to the strengths and limitations of the data, it provides educators with valuable feedback on their students’ progress and difficulties that can inform decision-making and even lead to changes in practice. It can also give parents and the public meaningful information about student learning and the quality of the education that students are receiving.
In high-performing educational systems, businesses and sports teams, DDIA systems are based on data that are valid, balanced, usable, stable and shared. But in the United States, up until now, these have not been the typical characteristics of DDIA, with the result that DDIA has generally impeded improvement and undermined accountability.
The answer is not to avoid data or abandon all testing but, rather, to learn from high-performing systems and organizations elsewhere. Here are some of the key principles and practices identified in our report.
- Validity: Measure what is valued instead of valuing only what can easily be measured, so that the purpose of schooling is not distorted. Few of the individuals we have interviewed dispute the validity and utility of metrics such as customer satisfaction in business or blocked shots in hockey. But almost all educational professionals, and increasing numbers of the public, regard test scores as weak measures of what our children should be learning in the 21st century: the ability to think deeply about a topic or to apply knowledge to unfamiliar problems, the capacity for entrepreneurialism or innovation, or even the ability to read for pleasure as well as for proficiency. The result is that schools and educators are driven largely by what is easily measured, not what is highly valued.
- Balance: Create a balanced scorecard of metrics and indicators that captures the full range of what the school system values. Urban communities in high-performing Finland followed best business practice by having a “balanced scorecard” of different measures to judge their progress. High-performing businesses and sports organizations also use many metrics, defying the notion of a single ladder of success. But in U.S. education, we have largely relied on a very small number of metrics, such as test scores and attendance, to judge performance. This encourages schools to boost the numbers on a few metrics rather than attending to a wider range of indicators that offer the clues and keys to improvement.
- Insist on credible, high-quality data that are stable and accurate. High-performing systems do not have a “bullwhip” overreaction to short-term peaks or dips in a quarter or even a year that could be just an anomaly, but give equal attention to medium-term trends to guide decision making. Unfortunately, the data in public education systems and, especially, individual teachers’ classrooms are often highly unstable because of small numbers of students or high student mobility. Eventually this volatility undermines the credibility of the entire accountability system.
- Design and select data that are usable in real time. Useful video and numerical data are available to coaches within days so they can have timely conversations with their players about passes made or shots that have been blocked. Internet retailers get instant feedback on their website use so they can make continuous and incremental improvements to their digital platform. But achievement tests are usually taken and the results handed back at the end of the school year when it is too late to do anything for the students who took them. The emphasis should shift back to collecting real-time data for improvement during the course of the year.
- Develop shared decision-making and responsibility for data analysis and improvement. The best organizations set shared targets for improvement that are owned by everyone, not simply imposed from on high. We documented this in one of London’s most turned-around school districts – Tower Hamlets. In places like Ontario, Canada – one of the two highest-performing English-speaking educational systems in the world – we have seen how leaders enable all teachers to assume collective responsibility for all students’ achievement, across grades and between special education and regular classroom teachers. In these settings, data are collected and organized to stimulate purposeful collective discussions and to inform interventions for real children who are known personally by the teachers, not to game the system to avoid the consequences of failure.
- Be the drivers, not the driven, so that statistical and other kinds of formal evidence complement and inform educators’ knowledge and wisdom concerning their students and their own professional practice, rather than undermining or replacing them. Post-Moneyball, we are learning that coaches’ intuition and judgment still matter alongside the numerical data. This is surely true for teachers as well.
The current U.S. emphasis on narrowly defined, high-stakes measures based on student test scores creates perverse incentives for educators to narrow the curriculum, teach to the test and allocate their efforts disproportionately to those students who are likely to yield the quickest test score gains, rather than those who may have the greatest needs. It also contributes to an adversarial school climate, rather than one of collective responsibility and collective action. This is not only contrary to best practices in other countries and sectors, but also detrimental to improvement and accountability.
It is time for the United States to rethink its strategy. One place to start would be to create a set of guiding and binding national standards for DDIA that draw on the best practices of other systems. These should encompass content standards for accuracy, stability and validity of DDIA indicators; process standards for the leadership and conduct of professional communities and data teams that develop collective responsibility for all students’ success; and context standards regarding entitlements to adequate training, resources and time to participate in DDIA effectively.