The Arlington County public school district is inviting number crunchers from Sterling, Silicon Valley and even Singapore to help solve one of the most vexing problems in public education: how to keep children from dropping out of school.
Officials plan to select about 10 teams of data scientists to comb through a trove of student information, stripped of identifying details, to find previously unnoticed trends that, if addressed early, could improve student outcomes.
The most compelling results, as determined by a panel of educators and data analysts, will earn a $10,000 prize.
Getting more students to graduate has become a top priority for school systems around the country. Arlington County, a relatively high-performing school system, posted an enviable 6 percent dropout rate for the Class of 2013, down from 13 percent in 2008. Still, officials said they want to do better.
Arlington Schools Superintendent Patrick K. Murphy said he was inspired to turn to “big data” for solutions after reading about how President Obama’s reelection campaign relied on data from fundraisers, field workers, pollsters and other sources to predict outcomes in swing states and to target messages to specific audiences.
“How can we do this in public education?” Murphy wondered.
For decades, education data have been used mostly for accountability and compliance reporting. But increasingly they’re being relied on to refine day-to-day instruction and to give feedback to parents and the public about student and school performance. Schools also are using data to develop warning systems that identify the early experiences that are most likely to lead students to success in college or careers, as well as the experiences that lead students to drop out.
More than 30 states, up from 18 in 2011, have created systems that flag warning signs such as reading below grade level or repeating a grade, according to the Data Quality Campaign, an advocacy group that promotes the use of data to improve education.
A Montgomery County analysis released last year found that students could be on the path to dropping out if they had been suspended or were performing below grade level in reading or math as early as first grade.
This embrace of data-driven decision-making comes as many state and local education departments have new centralized data systems that track students from pre-kindergarten through college or beyond.
These giant systems combine information about bus routes, attendance, demographics, grades, course selection, test scores and college matriculation. Skilled trend-spotters can use the numbers to chart out promising or perilous pathways to success or failure.
But it’s not easy to wade through an ocean of cells and spreadsheets, and talented analysts are in high demand. So Arlington County schools partnered with Kaggle.com, a San Francisco start-up that offers an online platform for data-mining contests.
The organization represents an international network of more than 130,000 data scientists. The Web site has hosted more than 300 contests, many from private companies, with challenges to optimize flight routes based on current weather and traffic patterns or to predict which customers will leave an insurance company within 12 months. One NASA-sponsored challenge asked for new algorithms to measure the shapes of galaxies by accounting for the distortions in images caused by dark matter.
In the Arlington contest, teams will gain access to 12 years of student data, including assessment scores, schools attended, courses taken, grades, absences, demographic information and graduation status.
The data will not contain identifying information, said Rajesh Adusumilli, Arlington’s assistant superintendent for information services, and users will be required to keep the information secure.
The deadline for applying is Sunday at midnight. Contestants will be notified and given access to the data a week later. Awards are expected to be announced by the end of February. The $10,000 award is funded by the California-based CK-12 Foundation, which produces open-source online textbooks.
Releasing data to outside experts is a growing trend in the public sector. The Obama administration launched an open-government initiative that encourages the public to access federal data and use them to solve problems. Cities have begun releasing data sets about subway arrival times, restaurant inspections and traffic violations and asking for help in devising more efficient services.
Public education is beginning to join the craze. New York City public schools this fall gave half a dozen developers access to large amounts of school-level data as part of a “school choice design challenge” to develop apps to help families navigate the high school lottery.
“Increasingly, people are considering this [data] a public resource. At the end of the day, it was created with public dollars,” said Chris Kingsley, a policy analyst at the Data Quality Campaign. “If we can publish it and let other people come in and look at it, we can derive more value out of this data.”
The Virginia Department of Education co-sponsored a series of competitions in 2012, including four simultaneous “hackathons,” to develop software applications that use or analyze education data in new ways. The contest aimed to raise awareness of the state’s new longitudinal data system, which has more than 700 data categories.
A team in Norfolk created a system to help parents of children with special needs find schools in their area serving children with similar disabilities.
“We worked really, really hard to collect and support this data, and we want people to use it in ways to improve our education system,” said Bethann Canada, director of educational information management for the department.
Now, Arlington is looking for new insights, with help from the wider world of techies.
Canada said it’s especially helpful to have noneducators look at these numbers, because they are not coming in with their own theories about possible solutions.
Through a competition like this, she said, “you are harnessing the power of millions of minds out there.”