A 2013 survey of roughly 2,000 open-source developers found that just 11.2 percent were women. A 2012 study of Stack Overflow, a question-and-answer community for programmers, unearthed an “unhealthy” environment for female participants, who tend to disengage from conversations sooner than their male counterparts.
To better understand the persistent disparity, a group from California Polytechnic State University and North Carolina State University analyzed the gender dynamics of one of the world's largest open-source communities and discovered a puzzling trend: Programmers were more likely to accept women's code.
The women's acceptance rate dropped, however, if they didn't mask their gender.
In their research, published this month, the students explored data from GitHub, where an estimated 12 million users labor away at 31 million software projects. Every day, users propose new code tweaks or fixes and are met with acceptances or rejections.
The researchers mined public data about GitHub users and projects and used GitHub websites and social media profiles to break users into two categories: self-identified men and self-identified women. Then they compared how frequently programmers accepted the contributions of those in each group. (The work awaits peer review. Find more on their methodology here.)
For those of us unfamiliar with the coding world, proposing a code change to a project is called making a “pull request.”
The data revealed a curiosity: Women saw a higher acceptance rate (or “merge rate”) than men.
“What could explain this unexpected result?” the authors wrote. “Perhaps women’s high acceptance rate is because they are already well known in the projects they make pull requests in.”
That didn’t appear to be the case. After the researchers excluded so-called insiders from the analysis, the women’s acceptance rate (64.4 percent) remained higher than the men’s (62.7 percent).
The researchers dug further.
Did the women just offer more urgent, problem-solving solutions, boosting their numbers? (No, the men actually did.) Did they provide smaller, less time-consuming features at a faster pace? (No, women proposed more big changes.) Did they disproportionately excel at one particular programming language? (“We observe that women’s acceptance rates dominate over men’s for every programming language in the top ten, to various degrees.”)
The authors theorized that women's higher acceptance rate was the result of a phenomenon they called “survivorship bias.”
“As women continue their formal and informal education in computer science, the less competent ones may change fields or otherwise drop out,” they wrote. “Then, only more competent women remain by the time they begin to contribute to open source. In contrast, less competent men may continue.”
It was also possible, they noted, that women received special treatment for being, well, women.
Many users in open-source communities, however, operate under gender-neutral names and gender-neutral avatars. For example, a programmer could go by “danpaq” and display a photo of a dog.
The researchers filtered the data, comparing the acceptance rates of developers with gender-neutral GitHub profiles against those of developers with obviously gendered profiles. “Danielle” with a feminine face, then, would register as “woman.”
And so another bias appeared, according to the study. When a woman’s gender was identifiable, she was more often rejected.
The female developers' acceptance rate was 71.8 percent when they used gender-neutral profiles but dropped to 62.5 percent when their profiles identified them as women. Programmers who appeared to be men also saw an acceptance dip, the researchers noted, but it wasn't as strong as the drop for women.
It's worth noting the study received criticism from some tech bloggers, who think the researchers should wipe the data of identifying information and release it for public scrutiny.
Exposing uncomfortable patterns, meanwhile, doesn't prove anything. A programmer could ignore a pull request because of sexism or, perhaps, a busy afternoon.
Regardless of what's happening on GitHub, gender discrimination is worth investigating further, the authors argued. Women hold less than a quarter of software development jobs in the United States, federal data shows.
“As anecdotes about gender bias persist, it’s imperative that we use big data to better understand the interaction between genders,” they concluded. “The trends observed in this paper are troubling. The frequent refrain that open source is a pure meritocracy must be reexamined.”