Correction to This Article
This column incorrectly identified the founding director of the U.S. Department of Education's Institute of Education Sciences. Russ Whitehurst, now at the Brookings Institution, was the institute's first director.
In studying behavior, scientific testing has advantages -- and limits

By Steven Pearlstein
Wednesday, August 4, 2010; A12

Students at charter schools tend to get higher scores on achievement tests than those who remain in public schools, but is it the schools that account for the difference or just the ambitions of the charter school parents?

Microfinance has proved to be an effective tool for getting capital to entrepreneurs in developing countries, but does it fundamentally improve living standards?

A restaurant chain knows it can increase business by modernizing its restaurants, but what level of investment will yield the biggest bang for the buck?

These are examples of the challenging questions that face policymakers and business executives every day. What makes these questions so hard to answer is that while past experience may be helpful, there's often no easy way to untangle a particular outcome from the various factors that may have caused it.

It was to solve this kind of problem that Galileo began dropping cannonballs of different weights from the tower at Pisa, and from this "experimental method" modern science developed. A big leap came in the 18th century when a Scottish surgeon, James Lind, divided a dozen sailors into six groups, each given a different "cure" for his scurvy. After six days, the two sailors who had been given two oranges and a lemon had recovered, and the random clinical trial was born.

It is only recently that randomized trials have made their way from the hard sciences to business and social sciences, where they're now attracting lots of money and attention.

Here in Washington, for example, Capital One grew from a start-up to one of the world's largest credit card issuers over the past 20 years largely through the aggressive use of the experimental method. Founders Rich Fairbank and Nigel Morris developed an elaborate system for constantly testing the success of new products and marketing pitches with customers in every region of the country.

Although direct-marketing companies such as Capital One were the pioneers, the Web has accelerated the spread of randomized testing to other industries. On any given day, Google, Amazon and eBay are running thousands of real-time trials, and companies such as Harrah's Entertainment and TD Bank insist that some form of trial be used in every major initiative.

Another Washington area firm, Applied Predictive Technologies, is a leading producer of software that enables businesses to run their own randomized tests. By mining the wealth of data already in a company's computer, APT makes it possible to quickly test the impact of a new product or marketing tactic by comparing the outcomes against those of a "placebo" control group.

Red Lobster, for example, used APT software to test nine remodeling schemes for its restaurants, mixing and matching low-, medium- and high-cost options for interior and exterior designs. While the chain's finance chief, Bill Lambert, won't say which combination won, he did report that the winner boosted sales by 8 percent -- a useful guide when you're about to invest $200 million.

Autumn McDonald, director of Kraft Foods' test-and-learn group, explains that randomized testing allows the company to profitably develop niche products that would never have seen the light of day back when every product had to pass muster in prototypical white-bread supermarkets. Kraft can now predict what products will do well in what markets among which consumers, broken down by the size of the store, the time of the year and the type of packaging and promotion.

Family Dollar Stores turned to APT software to test whether it was worth installing refrigeration units in its 6,800 outlets that, up to that point, had sold only dry goods. What it found, based on a test of only a few dozen stores, was that the impact was far greater than the sales gains in milk, eggs and frozen pizza. The bigger impact on profit, according to the chain's chief merchandising officer, Dorlisa Flur, came from increased volume in its traditional dry goods.

Business is not the only sector that has fallen for randomized testing. This year's John Bates Clark medal, awarded every two years to the most promising young economist, went to Esther Duflo, an MIT professor whose Poverty Action Lab uses randomized testing to evaluate the effectiveness of poverty-reduction initiatives. Duflo, 37, and her colleagues studied what happened in 52 neighborhoods in Hyderabad, India, when a microfinance company began offering loans, comparing the impact on household consumption with what happened in 52 other neighborhoods where no lending was introduced. Their paper, "The Miracle of Microfinance," concluded that there was no miracle at all, calling into question the latest fad in developmental economics.

Here at home, ideologues and special interest groups have been fighting bitterly over how to fix failing school systems. With so much conflicting data and opinion, the Department of Education in 2002 set up the Institute of Education Sciences to bring some scientific rigor to the subject. One of the institute's first declarations was that, in trying to determine what worked, it would only fund research that relied on randomized trials.

According to Russ Whitehurst, the institute's first director, who is now at the Brookings Institution, that research wound up discrediting a number of once-popular ideas, such as George W. Bush's Reading First initiative. On the other hand, a study that compared the test results of students chosen by lottery to attend charter schools with the results of the unlucky students who lost the lottery -- what analysts call a "natural experiment" -- provided clear-cut proof that charter schools were living up to their promise.

Jim Manzi, the peripatetic founder of APT, sees education as a growth market, but he also cautions against thinking that randomized testing will ever bring the kind of certainty to social policy that it has to physics or chemistry. In an article published this week in the conservative City Journal, Manzi reminds us that, unlike gravity or atoms, people in one region, or culture, or moment in time, don't predictably behave the same way as humans in other settings. Try as we might, try as we should, we may never achieve a fully scientific understanding of human behavior.

© 2010 The Washington Post Company