Research Basics: Accounting for Chance

Tuesday, March 14, 2006

Even if scientists design the perfect randomized trial, eliminate all measurement error and keep track of every participant, they need to account for chance. That's because a study can show only what happened to the participants in that particular study, at that particular time. If the study were repeated -- even with the same participants -- the results might well differ simply by chance.

Here's a way to think about why this can happen. Imagine you have two identical jars, each containing 1,000 marbles -- 500 of them gray and 500 white. Put on a blindfold and pick out 100 marbles at random from Jar A. Count the gray marbles. Now do the same thing with Jar B.

If you didn't get exactly the same number of gray marbles from each jar -- if you drew, say, 48 from Jar A and 52 from Jar B, or 45 vs. 55 -- would you be surprised? Probably not, because this much variation is just the luck of the draw. But you might be surprised if you got 30 gray marbles from Jar A and 70 from Jar B. Could that happen? Yes, but not very often.

Statisticians quantify this common-sense intuition using formulas to calculate the p-value -- where "p" stands for probability. The p-value is the probability that a difference at least as large as the one observed between two draws would occur just by chance. If you repeated the marble experiment thousands of times and graphed these differences, you would get a bell curve. (To see it, view the authors' Web site, at http://www.vaoutcomes.org/washpost.php .) The average difference in the number of gray marbles drawn will be zero. Most of the time, the difference will be less than 10.
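For readers who want to try the repeated marble experiment themselves, here is a minimal sketch in Python. It is an illustration, not the authors' own calculation: the function names, the 10,000 repetitions and the fixed random seed are all assumptions, and the marbles are drawn without replacement, as a blindfolded hand would draw them.

```python
import random

# Assumed setup from the marble example: 1,000 marbles per jar, 500 gray,
# 100 drawn per jar. 1 stands for a gray marble, 0 for a white one.
JAR = [1] * 500 + [0] * 500


def draw_gray(rng, sample_size=100):
    """Count the gray marbles in one blindfolded draw (no replacement)."""
    return sum(rng.sample(JAR, sample_size))


def simulate_differences(trials=10_000, seed=0):
    """Repeat the two-jar experiment and record Jar A minus Jar B each time."""
    rng = random.Random(seed)
    return [draw_gray(rng) - draw_gray(rng) for _ in range(trials)]


diffs = simulate_differences()
mean_diff = sum(diffs) / len(diffs)
share_under_10 = sum(abs(d) < 10 for d in diffs) / len(diffs)

print(f"average difference: {mean_diff:.2f}")
print(f"share of trials with a difference under 10: {share_under_10:.0%}")
```

Tallying the values in diffs and plotting the counts should trace out the bell shape described above, centered on zero.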

By convention, statisticians agree that when the p-value is less than 5 percent, it indicates a difference so surprising it is unlikely to have occurred by chance. In the marble example, you'd be in that territory once you get to differences greater than 13.
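The cutoff of 13 can be checked without any random draws. The sketch below is one way to do it, under stated assumptions: it treats each draw as sampling without replacement (a hypergeometric distribution), counts differences in either direction, and uses function names invented for this illustration. The article does not say which method the authors used.

```python
from math import comb


def gray_count_pmf(jar_size=1000, gray=500, sample_size=100):
    """Exact probability of drawing k gray marbles, for k = 0..sample_size."""
    total = comb(jar_size, sample_size)
    return [
        comb(gray, k) * comb(jar_size - gray, sample_size - k) / total
        for k in range(sample_size + 1)
    ]


def prob_difference_at_least(k, pmf):
    """Chance that the two jars differ by k or more gray marbles, either way."""
    n = len(pmf)
    return sum(
        pmf[a] * pmf[b]
        for a in range(n)
        for b in range(n)
        if abs(a - b) >= k
    )


def smallest_surprising_difference(pmf, alpha=0.05):
    """Smallest difference whose chance probability falls below alpha."""
    for k in range(len(pmf)):
        if prob_difference_at_least(k, pmf) < alpha:
            return k
    return None


pmf = gray_count_pmf()
threshold = smallest_surprising_difference(pmf)
print(f"first difference with p below 5 percent: {threshold}")
```

Under these assumptions a difference of 13 still falls above the 5 percent line, and a difference of 14 is the first to fall below it.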

To see how this applies in the diet-and-cancer trials, pretend the jars represent low-fat or regular diet, and the marbles are people; the 100 selected from each jar are the study participants, and the gray marbles are new cases of breast cancer.

Drawing 42 breast cancers from the low-fat diet jar and 57 from the regular diet jar has a p-value of 3 percent. By convention, this difference is surprising enough to be considered the result of diet -- not chance. Drawing 43 from one and 56 from the other has a p-value of 7 percent. This difference is unsurprising enough to be judged due to chance. Although these p-values are not dramatically different, by convention the conclusions are. The first is statistically significant, the other is not. The actual low-fat studies enrolled far more people and saw a lower rate of breast cancer than this example does, but the example provides a sense of how close the findings really were.
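The article does not say which formula produced these two p-values. One standard choice, a Pearson chi-square test on a two-by-two table without a continuity correction, gives figures that round to the same 3 and 7 percent. The sketch below uses that test as an assumption; the function name and the group size of 100 are illustrative.

```python
from math import erfc, sqrt


def chi_square_p_value(cases_a, cases_b, group_size=100):
    """Two-sided p-value for a 2x2 table (Pearson chi-square, 1 degree of freedom).

    Assumption: no continuity correction is applied.
    """
    a, b = cases_a, group_size - cases_a  # group A: cases, non-cases
    c, d = cases_b, group_size - cases_b  # group B: cases, non-cases
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))


p_42_vs_57 = chi_square_p_value(42, 57)
p_43_vs_56 = chi_square_p_value(43, 56)
print(f"42 vs. 57: p = {p_42_vs_57:.3f}")
print(f"43 vs. 56: p = {p_43_vs_56:.3f}")
```

Moving a single case in each group is enough to carry the result across the 5 percent line, which is the point of the comparison.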

-- Steven Woloshin, Lisa Schwartz, H. Gilbert Welch



© 2006 The Washington Post Company