What can we do to improve such predictions? Could we improve accuracy by bringing forecasters together, training them, taking advantage of the wisdom of crowds and applying other insights from the decision sciences? Working with an interdisciplinary group of scholars, we set out to improve geopolitical forecasting accuracy as part of a multi-year forecasting tournament funded by the Intelligence Advanced Research Projects Activity (IARPA). IARPA, the experimental research and development arm of the intelligence community, wanted to find new ways to generate accurate forecasts. It selected five university and industry programs to compete to find the best ways of identifying better forecasters, eliciting predictions and aggregating predictions across forecasters.
Our team, the Good Judgment Project, won that competition. In January, in an article for the Journal of Experimental Psychology: Applied, we explored the profile of the best of the hundreds of forecasters who made more than 150,000 predictions on roughly 200 events over a two-year period.
Forecasters were asked a multitude of questions, such as: Will the United Nations General Assembly recognize a Palestinian state by Sept. 30, 2011? Will Bashar al-Assad remain president of Syria through Jan. 31, 2012? Before March 1, 2014, will North Korea conduct another successful nuclear detonation? What will be the lowest end-of-day price of Brent Crude Oil between Oct. 16, 2011 and Feb. 1, 2012? What will be the number of registered Syrian refugees reported by the UNHCR (the U.N. refugee agency) as of April 2012?
All of the questions had clear-cut answers, unlike many of the fuzzier issues debated by political pundits. This allowed us to say who was right, who was wrong and by how much.
Forecasters were volunteers from around the world who had learned about the tournament from professional societies, blogs, word-of-mouth and university research centers. They had at least a bachelor’s degree, and more than two-thirds had additional education. Overall, they were a very talented group – well above average in both intelligence and political knowledge relative to the general population. Over the two-year period, some people did very well, while others might just as well have guessed randomly. Moreover, individual accuracy scores were surprisingly consistent over the study period.
We discovered three key factors that predicted geopolitical forecasting accuracy.
First, psychological factors, including inductive reasoning, pattern detection, open-mindedness and the tendency to look for information that goes against one’s favored views, especially when combined with political knowledge, helped forecasters make accurate predictions.
Second, forecasters benefited from conditions tested in controlled experiments to determine the best environments for making accurate forecasts, including training in probabilistic reasoning and participation in collaborative teams that shared information and discussed rationales.
Third, effort mattered. Forecasters who made predictions on more questions, updated their predictions more often and spent more time deliberating about their predictions had a decisive edge.
The best forecasters also believed they could learn to make better predictions – they viewed forecasting not as an innate ability, but rather as a skill that required deliberate practice, sustained effort and constant monitoring of current affairs.
Although we were initially unsure whether it was even possible to develop skill in geopolitical forecasting, our research shows that some people are exceptionally accurate over long periods of time. These people tended to share all of the qualities described above, took advantage of their training in probabilistic reasoning and benefited from working in collaborative teams.
Can this be learned? For any skill to develop, two conditions must be present: an environment stable enough to permit learning, and opportunities for practice. Skill development also requires that people care enough to engage in deliberate practice. Our forecasters received constant feedback, in the form of accuracy scores and leaderboard rankings, as each question closed. They also had many chances to learn: forecasters were given almost 200 questions over two years, and each participant made an average of 121 forecasts. These conditions enabled learning-by-doing and help to explain why some forecasters achieved far-better-than-chance accuracy.
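To make the idea of accuracy-score feedback concrete: one standard measure for probability forecasts of yes/no events is the Brier score, the squared difference between the stated probability and the eventual 0/1 outcome. The sketch below is an illustration of that general idea, not the project's actual scoring code:

```python
def brier_score(forecast: float, outcome: int) -> float:
    """Squared error between a probability forecast and a 0/1 outcome.

    Lower is better: 0.0 is a perfect score, 0.25 is what always
    answering "50%" earns, and 1.0 is maximally wrong.
    """
    return (forecast - outcome) ** 2


def mean_brier(forecasts: list[float], outcomes: list[int]) -> float:
    """A forecaster's average Brier score across many resolved questions."""
    return sum(brier_score(f, o) for f, o in zip(forecasts, outcomes)) / len(forecasts)


# A forecaster who said 80% on an event that happened scores ~0.04;
# saying 80% on an event that did not happen scores ~0.64.
confident_right = brier_score(0.8, 1)
confident_wrong = brier_score(0.8, 0)
```

Because every tournament question had a clear-cut resolution, each forecaster could be ranked by an average score like this as questions closed, which is the kind of precise, repeated feedback that makes learning possible.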
Our findings could yield important lessons for both the U.S. national security community and others in government and the private sector interested in improving forecasting accuracy.
In the real world, many analysts inside and outside the government make non-numerical forecasts that are vague and hard to assess for accuracy, so feedback is often absent. Feedback, though, is essential for learning. We must keep score, and there is no way to do that without precise forecasting and some sort of accountability. That’s harder than it sounds. Accountability can be like a Ping-Pong game in which analysts are incentivized to shift their predictions depending on the direction of the most recent error. They are likelier to say “signal” when recently accused of under-connecting the dots (i.e., 9/11) and to say “noise” when recently accused of over-connecting the dots (i.e., weapons of mass destruction in Iraq). With this process, improvement is impossible. By harnessing the wisdom of crowds with the tools the Good Judgment Project developed, we can build on what we know, keep improving our skills and become more accurate in our forecasting of geopolitical events.
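In its simplest form, harnessing the wisdom of crowds means pooling many forecasters' probability estimates into one crowd forecast — for instance, by taking the mean or the median. The toy sketch below illustrates that baseline idea only; it is not the project's actual aggregation algorithm:

```python
from statistics import mean, median


def crowd_forecast(forecasts: list[float]) -> float:
    """Combine individual probability forecasts into one crowd forecast.

    The unweighted mean is the classic wisdom-of-crowds baseline; the
    median is a common alternative that resists extreme outliers.
    """
    return mean(forecasts)


# Five (hypothetical) forecasters' probabilities for the same event:
estimates = [0.55, 0.60, 0.70, 0.65, 0.90]
crowd_mean = crowd_forecast(estimates)   # 0.68
crowd_median = median(estimates)         # 0.65
```

Even this simple averaging tends to beat most individual forecasters, because independent errors partially cancel out — one reason tournaments that elicit many precise, comparable forecasts are so useful.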
Barbara Mellers is the I. George Heyman University Professor of Psychology at the University of Pennsylvania. Michael C. Horowitz is an associate professor of political science at the University of Pennsylvania. You can follow him on Twitter: @mchorowitz. You can learn more about the Good Judgment Project at https://www.goodjudgmentproject.com/