(Scott Page — The Diversity Bonus)

Scott Page is a professor of political science at the University of Michigan, and the author of The Diversity Bonus, a new book based on his research on diversity and collective decision-making (some of which has been developed and presented at workshops organized by the MacArthur Network on Opening Governance). I asked him questions about the implications of his work.

HF: Former Supreme Court justice Antonin Scalia suggested that there was a trade-off between allowing more diversity (e.g. admitting more diverse students to a university) and higher quality. Your book argues that Scalia is dead wrong about many situations. Why?

SP: I argue that Scalia is right in some cases but wrong in the cases that were before the court. Scalia assumes that we have some straightforward method for evaluating people’s ability and that it’s one-dimensional. He then assumes that the overall ability of a group is no more and no less than the sum of the abilities of its members.

That calculation may be right for simple routine tasks like chopping wood. If I am able to chop down 10 trees an hour, and you can chop eight, together we can chop 18. But colleges, universities and big firms carry out more complicated tasks: designing aircraft, conducting neuroscience research and analyzing health-care policies. Here, unlike woodchopping, we cannot reduce ability to a single dimension (the number of ideas that an engineer has per hour about how to design aircraft wings is a silly measure of intelligence). And, even if you could measure ability, the team or group’s ability would not be the sum of its members’ abilities (by working together on designing an aircraft, we may have better ideas than either of us could ever arrive at on our own).

Instead, the ability of the team depends on the knowledge, skills and ways of thinking that its members possess. As I lay out in my book, the best team of people to carry out a complex task will not necessarily be the people who score best at any particular test. It should instead be a team of people with different cognitive toolboxes and bases of knowledge to draw upon.

The cases before the court were not about hiring woodchoppers. Scalia’s arguments are demonstrably a bad fit for research universities, where his logic was being applied and which confront complex problems where neither assumption holds. You can only make a university work properly if you draw on diverse people, with diverse points of view and understandings. His logic also fails in the tech industry, pharma, consulting, finance and just about anywhere in the knowledge economy.

HF: What is the “bonus” that diversity of viewpoints can produce in situations where people have to figure out the answers to complex problems?

SP: I’m a mathematician by training, so when I say that diversity of viewpoints can produce bonuses, what I mean is that we can measure team performance and demonstrate that the diversity of the team adds a measurable bonus.

Imagine that you ask five people to make predictions about the unemployment rate or the vote share of a presidential candidate in an election. Then take the average of their individual predictions, and call this the group’s prediction. The group’s prediction will have less error than the average error of its individual members. The amount by which the group is better than its average member — the “bonus” — corresponds to the diversity of their predictions. Similar arguments explain why problem solving, innovating and verifying all produce quantifiable diversity bonuses. The math’s all in the book.
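To make the arithmetic concrete, here is a minimal sketch of the prediction example. It assumes that error is measured as squared error, which the interview does not specify. Under that assumption, the group's squared error equals the average individual squared error minus the variance of the predictions around their mean. The five forecasts and the true value of 5.0 are hypothetical numbers chosen only for illustration.

```python
from statistics import mean


def diversity_bonus(predictions, truth):
    """Return (group_error, average_individual_error, diversity).

    All three quantities use squared error, which is an assumption of this sketch.
    """
    group_prediction = mean(predictions)
    group_error = (group_prediction - truth) ** 2
    average_individual_error = mean((p - truth) ** 2 for p in predictions)
    # Diversity: how far the individual predictions spread around the group prediction.
    diversity = mean((p - group_prediction) ** 2 for p in predictions)
    return group_error, average_individual_error, diversity


# Hypothetical forecasts of an unemployment rate, in percent.
predictions = [4.0, 4.5, 5.5, 6.0, 6.5]
group_error, avg_error, diversity = diversity_bonus(predictions, truth=5.0)

print(round(group_error, 2))  # 0.09
print(round(avg_error, 2))    # 0.95
print(round(diversity, 2))    # 0.86
```

In this toy case the group's error (0.09) is the average individual error (0.95) minus the diversity of the predictions (0.86), so the bonus is exactly as large as the spread of the forecasts.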

The diversity bonus logic differs from a portfolio logic in which people buy a portfolio of stocks to insulate themselves from risk. The return from a portfolio of stocks equals the average of the individual stocks’ returns. It does not produce any bonuses. When we look at how diversity works in cognitive tasks, it does more than spread risk. It produces bonuses. Thus, the math suggests your teams should be more diverse than your stock portfolio.

HF: While you certainly do not discount ethical or normative reasons to promote diversity (e.g. to help people who have been underrepresented in the past), you also suggest that diversity hiring based on normative principles alone will not usually produce a bonus. Why not?

SP: The logic and evidence demonstrate that cognitive diversity improves outcomes through bonuses. On many tasks, identity diversity (whether, for example, you have a mix of men and women, or a mixture of people from different racial or ethnic groups, or different economic classes) correlates with cognitive diversity. When it does, identity diversity can also produce bonuses.

Achieving those bonuses requires the right culture. As the work of Jeff Polzer and others shows, people must feel validated and trusted if they are to contribute fully the benefits of different ways of thinking. Bonuses become larger if people engage with and challenge one another’s ideas, a point driven home by Katherine Phillips in her response chapter at the end of my book.

So, let’s suppose that an organization relies on normative principles to hire/admit and promote. The organization builds teams of people with different identities based on this normative agenda. The organization pays no attention to cognitive diversity. It cares only that it has appropriate numbers of people from each identity group.

That approach buys into Scalia’s incorrect assumptions about ability and team performance. It acts as if there is a trade-off between diversity and performance. In such an environment, team members have no reason to think that diverse group members can make the team better. They have no reason to seek the bonuses that their diverse views, knowledge and understanding could help them capture. That’s unfortunate. David Thomas and Robin Ely have shown that a team that does not seek bonuses will probably not produce them.

The argument for bonus thinking is pragmatic: If you pursue diversity without considering diversity bonuses, you put together suboptimal teams and you do not create the culture for those teams to perform well. If, instead, you keep an eye on diversity bonuses, you build more powerful teams that seek better outcomes.

Consider the recent flurry around all-male panels (“manels”). If the motivation for diverse panels is based only on normative reasoning, conference organizers will look to find the “best woman” or the “best man” to create diversity. They adopt a trade-off, split-the-pie perspective. Instead, organizers should ask: What’s the purpose of the panel, and what different perspectives, knowledge bases and tools should we have on it? In addition, the organizers should be aware (see again Katherine Phillips’s chapter) that groups made up of people with diverse identities produce different and deeper explanations, and that groups with no women perform less well on collective IQ tests (here, see the research of Tom Malone and Anita Woolley). We want diverse panels, and fewer “manels,” because diverse panels will be better and more productive.

The same logic applies to writing papers. Analyses by a diverse group of scholars including Brian Uzzi, Daniel Romero, Lada Adamic, Richard Freeman, Melissa Schilling and Brian Jones show that the best research and patents are produced by cognitively diverse teams. Some of these studies cover upward of 20 million papers and 5 million patents. To choose homogeneity is to ask for a B+ rather than an A.

HF: You suggest that ideas from machine learning — such as “bagging” and “boosting” — could help improve the ability of diverse groups to figure out problems. What are they, and how can they be applied to real people?

SP: Machine learning is typically used to classify data — to figure out whether a photo has a cat or a dog, and classify it accordingly, or whether a potential customer is a good or bad credit risk. Bagging is a method where you give machine learning algorithms different data sets from which to learn. This builds in diversity. Boosting refers to giving extra weight to harder tasks of classification.
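The two mechanics described above can be sketched in a few lines. This is an illustration under stated assumptions, not a full classifier: the bagging part shows only bootstrap resampling and majority voting, and the boosting part uses the AdaBoost weight update as one common way to give extra weight to hard cases (the interview does not name a specific algorithm). The function names, the data and the votes are all hypothetical.

```python
import math
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Bagging: draw len(data) items with replacement so each learner sees different data."""
    return [rng.choice(data) for _ in data]


def majority_vote(labels):
    """Bagging: combine the learners' answers by taking the most common label."""
    return Counter(labels).most_common(1)[0][0]


def reweight(weights, is_wrong):
    """Boosting (AdaBoost-style): raise the weight of examples the learner got wrong.

    Assumes the weights sum to 1 and the weighted error is strictly between 0 and 1.
    """
    error = sum(w for w, wrong in zip(weights, is_wrong) if wrong)
    alpha = 0.5 * math.log((1 - error) / error)
    scaled = [w * math.exp(alpha if wrong else -alpha)
              for w, wrong in zip(weights, is_wrong)]
    total = sum(scaled)
    return [w / total for w in scaled]


# Bagging: three learners, each given its own resampled view of ten hypothetical examples.
rng = random.Random(0)
data = list(range(10))
samples = [bootstrap_sample(data, rng) for _ in range(3)]
print(majority_vote(["cat", "dog", "cat"]))  # cat

# Boosting: four equally weighted examples; the learner misclassified only the last one.
new_weights = reweight([0.25, 0.25, 0.25, 0.25], [False, False, False, True])
print([round(w, 3) for w in new_weights])  # [0.167, 0.167, 0.167, 0.5]
```

After one round, the single misclassified example carries half of the total weight, so the next learner is pushed to concentrate on it.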

These same ideas can be applied to groups of people. You want people with different experiences (different data — so bagging). And you want some people who focus on the hard problems (boosting).

Most universities do this in the tenure process. We ask for outside letters (bagging). We have a university oversight committee which has experience with the tough cases (boosting).

In the book, I frame the amazing success of bagging and boosting to create advances in artificial intelligence as strong evidence of diversity bonuses. Computer scientists do not add in algorithms with different experiences (data sets) to be culturally inclusive. They do so because they want more accuracy. They are thinking in terms of bonuses and achieving them.

HF: What is the meritocratic fallacy and how does it hamper good decision-making?

SP: The meritocratic fallacy is the idea that we can apply a test to people, rank them, and conclude that the best team consists of the best scorers on that test. That’s not true on complex tasks with many different dimensions on which we want to measure performance. Yet many organizations, and society writ large, often operate as if this fallacy holds.

Lu Hong and I identified this fallacy in an early paper. Leandro Marcolino, who studied how algorithms work for the game of Go, has reached similar conclusions. What we all find is that the best team of algorithms does not consist of the best algorithms. In other words, you often want to prioritize having diversity on your team over having the individually best algorithms (or people), if you want the best results. More recent work, including a paper by Jon Kleinberg and Maithra Raghu and a result by Lu Hong, shows that for many problems no test exists such that the best team consists of the best individual performers on that test.
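A toy example shows how the best team can differ from the best individual scorers. It is a deliberately simple sketch, not the model from the papers mentioned above: it assumes a team solves every problem that at least one member can solve, and that an individual's test score is the number of problems that person can solve. The names and skill sets are hypothetical.

```python
from itertools import combinations

# Hypothetical data: which of six test problems each candidate can solve.
skills = {
    "Ana": {1, 2, 3, 4},
    "Ben": {1, 2, 3, 5},
    "Cara": {1, 2, 3},
    "Dev": {5, 6},
}


def team_score(team):
    """Number of distinct problems solved by at least one team member."""
    return len(set().union(*(skills[name] for name in team)))


# "Meritocratic" pick: the two highest individual scorers.
top_two = sorted(skills, key=lambda name: len(skills[name]), reverse=True)[:2]

# Exhaustive pick: the pair with the highest team score.
best_pair = max(combinations(skills, 2), key=team_score)

print(top_two, team_score(top_two))      # ['Ana', 'Ben'] 5
print(best_pair, team_score(best_pair))  # ('Ana', 'Dev') 6
```

Dev has the lowest individual score, yet the best pair includes Dev, because Dev's skills overlap least with everyone else's.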

What this means is that meritocratic rules intended to choose the individually “best” people do not produce the best teams. If a meritocracy consists of the best teams, then we must choose diverse people, not the “best” people according to an arbitrary test.

This article is one in a series supported by the MacArthur Foundation Research Network on Opening Governance that seeks to work collaboratively to increase our understanding of how to design more effective and legitimate democratic institutions using new technologies and new methods. Neither the MacArthur Foundation nor the Network is responsible for the article’s specific content. Other posts in the series can be found here.