Facebook is embarking on a major overhaul of its algorithms that detect hate speech, according to internal documents, reversing years of so-called “race-blind” practices.
The overhaul, which is known as the WoW Project and is in its early stages, involves re-engineering Facebook’s automated moderation systems to get better at detecting and automatically deleting hateful language that is considered “the worst of the worst,” according to internal documents describing the project obtained by The Washington Post. The “worst of the worst” includes slurs directed at Blacks, Muslims, people of more than one race, the LGBTQ community and Jews, according to the documents.
As one way to assess severity, Facebook assigned different types of attacks numerical scores weighted based on their perceived harm. For example, the company’s systems would now place a higher priority on automatically removing statements such as “Gay people are disgusting” than “Men are pigs.”
Facebook has long banned hate speech — defined as violent or dehumanizing speech — based on race, gender, sexuality and other protected characteristics. It owns Instagram and has the same hate speech policies there. But before the overhaul, the company’s algorithms and policies did not distinguish between groups that were more likely to be targets of hate speech and those that have not been historically marginalized. Comments like “White people are stupid” were treated the same as anti-Semitic or racist slurs.
In the first phase of the project, which was announced internally to a small group in October, engineers said they had changed the company’s systems to deprioritize policing contemptuous comments about “Whites,” “men” and “Americans.” Facebook still considers such attacks to be hate speech, and users can still report them to the company. However, the company’s technology now treats them as “low-sensitivity” — or less likely to be harmful — so that they are no longer automatically deleted by the company’s algorithms. That means roughly 10,000 fewer posts are now being deleted each day, according to the documents.
The shift is a response to a racial reckoning within the company as well as years of criticism from civil rights advocates that content from Black users is disproportionately removed, particularly when they use the platform to describe experiences of discrimination.
Some civil rights advocates said the change was overdue.
“To me this is confirmation of what we’ve been demanding for years, an enforcement regime that takes power and historical dynamics into account,” said Arisha Hatch, vice president at the civil rights group Color of Change, who reviewed the documents on behalf of The Post but said she did not know about the changes.
“We know that hate speech targeted towards underrepresented groups can be the most harmful, which is why we have focused our technology on finding the hate speech that users and experts tell us is the most serious,” said Facebook spokeswoman Sally Aldous. “Over the past year, we’ve also updated our policies to catch more implicit hate speech, such as content depicting Blackface, stereotypes about Jewish people controlling the world, and banned Holocaust denial.”
Because describing experiences of discrimination can involve critiquing White people, Facebook’s algorithms often automatically removed that content, demonstrating the ways in which even advanced artificial intelligence can be overzealous in tackling nuanced topics.
“We can’t combat systemic racism if we can’t talk about it, and challenging white supremacy and White men is an important part of having dialogue about racism,” said Danielle Citron, a law professor specializing in free speech at Boston University Law School, who also reviewed the documents. “But you can’t have the conversation if it is being filtered out, bizarrely, by overly blunt hate speech algorithms.”
In addition to deleting comments protesting racism, Facebook’s approach has at times resulted in a stark contrast between its automated takedowns and users’ actual reports about hate speech. At the height of the nationwide protests in June over the killing of George Floyd, an unarmed Black man, for example, the top three derogatory terms Facebook’s automated systems removed were “white trash,” a gay slur and “cracker,” according to an internal chart obtained by The Post and first reported by NBC News in July. During that time period, slurs targeted at people in marginalized groups, including Blacks, Jews and transgender people, were taken down less frequently.
Aldous said the chart, which employees posted internally in June as a critique of how Facebook excessively defends White people, helped inform the WoW Project.
“We run such qualitative checks frequently to investigate and fix any unintended outcomes of our enforcements,” she said.
As protests over Floyd’s death sparked national soul searching in June, Facebook employees raged against the company’s choices to leave up racially divisive comments by President Trump, who condemned protesters. They also debated the limits of personal expressions of solidarity, like allowing Black Lives Matter and Blue Lives Matter slogans as people’s internal profile pictures. Black employees met with senior executives to express frustration over the company’s policies.
In July, Facebook advertisers organized a high-profile boycott over civil rights issues, which put pressure on the company to improve its treatment of marginalized groups. The company was also bitterly criticized by its own independent auditors in a searing civil rights report, which found Facebook’s hate speech policies to be a “tremendous setback” when it came to protecting its users of color. More than a dozen employees have quit in protest over the company’s policies on hate speech. An African American manager filed a civil rights complaint against the company in July, alleging racial bias in recruiting and hiring.
Facebook said it takes all allegations of discrimination seriously and is investigating the claims. The company said it is addressing the demands of civil rights advocates by pledging to increase diversity in its hiring and leadership, making content policy changes such as banning white nationalism, Holocaust denialism, the QAnon conspiracy theory and blackface, and sharing more data about its hate speech detection systems.
Complaints by Black users continue, with some saying they are seeing posts removed with increased frequency even as the WoW Project gets underway.
In one instance in November, Facebook-owned Instagram removed a post from a man who asked his followers to “Thank a Black woman for saving our country,” according to screenshots posted by social media users at that time. The user received a notice that said “This Post Goes Against Our Community Guidelines” on hate speech, according to the screenshots.
“The rules [are] different for us, applied more harshly for us,” said Lace Watkins, who runs a 10,000-member Facebook page focused on anti-racism called Lace on Race. Watkins said she learned to avoid typing the word “white” when discussing racism because Facebook’s automatic detection systems would often take the word down. Instead Watkins says she and other Black users will type “wipipo” or write around it.
Tamela J. Gordon, a Miami-based writer who runs a closed group for about 115 Black women, said that after repeated flaggings from Facebook, she and her group members have also started to avoid the word “white,” as well as the word “Black” when capitalized, phrases that pair the words “hate” and “men” or “disgust” and “men,” and most recently the phrase “men are trash.”
“We are constantly getting reported, getting banned, for content shared within the group as well as content shared outside the group,” Gordon said. “In this group there are a lot of women in very desperate situations, dealing with things like domestic violence, extreme poverty. It’s hard knowing that we are being watched and that our expression is monitored and restricted.”
Facebook investigated complaints from Black users at least as far back as 2017, according to documents obtained by The Post.
Internal research at the time showed that Black and Hispanic users were the most engaged groups on Facebook, in terms of overall activity and the numbers of videos watched and uploaded. But Black users also had been raising public concerns about growing anti-Black hate speech on the platform, while their accounts were being suspended for discussing discrimination.
Because of these complaints, executives worried that frustrated users might decamp for a competing platform, such as Twitter or Snapchat, where they felt they could speak more freely, according to the documents and interviews with two Facebook employees who researched the issue.
Black employees advocated for Facebook to launch a study, dubbed Project Vibe, to deepen the company’s understanding of Black users and their experience on the platform. The project explored four hypotheses about Black users, including whether Black sentiment toward the company was at risk because of their perception that Facebook applies its hate speech policies unfairly. Project Vibe was sponsored by Chris Cox, Facebook’s chief product officer and a longtime confidant of CEO Mark Zuckerberg.
As part of the project, Facebook hired an external firm to interview Black users in their homes in New York City, Washington and Charlotte, according to the documents.
Mark S. Luckie, a former Facebook manager who also participated in the in-person interviews, said Black employees worked on Project Vibe part time in addition to their other duties. They wanted to show Facebook that attempting to treat all users equally was causing problems for its most active and vulnerable users. Their hope was that the findings could be shared with senior executives, as well as relevant product teams, who could build features, reach out to users or establish partnerships that would improve Facebook’s reputation within the Black community.
After Project Vibe was completed in early 2018, however, employees who participated in it grew frustrated with the lack of action.
Luckie, who worked in media partnerships, used data from Project Vibe to propose partnering with leading Black influencers, offering novel and rich Black content, and increasing the number of Black Facebook staffers. He said his manager was excited about the proposal, but when the idea was escalated to the next level, Luckie was told by three people that his initiative would not move forward.
“The results [from Project Vibe] reflected so negatively on Facebook that they didn’t want the study to be circulated around the company,” Luckie said.
Facebook said the analysis from Project Vibe was shared more broadly in 2020 and that the company eventually followed up on many of the proposals from the initiative under a new moniker, Project Blacklight, which kicked off in February 2019.
“The work was shared internally and helped to inform strategies and projects across Facebook such as increased investments in Black SMBs [small and midsized businesses], Black media partnerships and Black original content,” spokeswoman Bertie Thomson said in a statement. “It also led to a project to investigate hate speech false positives that helped to inform several hate speech pilots which have proven successful.”
Color of Change’s Hatch said that in 2018 — at the same time that Facebook’s own Project Vibe had confirmed frustrations from Black users — company officials dismissed her concerns about Black voices being shut down, saying the issues were not systemic and were one-off mistakes by moderators.
A year after Project Vibe, the company commissioned an independent civil rights audit. The audit was part of an effort to better understand and address broad problems, such as the spread of white nationalism and white supremacy on the platform, as well as the company’s standing with its most marginalized users.
The first section of the civil rights audit, published in summer 2019 in consultation with more than 90 civil rights groups, found that Facebook was “overly enforcing” its content policies by removing content where people spoke out about discrimination.
The auditors found fault with Facebook’s policies, but also with its enforcement from content moderators and technology.
Facebook’s content moderators were not specifically trained to deal with hate speech, which is generally more complex to police than other types of banned content, such as nudity or violence. (The company is now piloting hate speech specialization.) And moderators tasked with reviewing hate speech are not allowed to see key context around a post, such as comments, accompanying photos or a profile picture — information that would help a reviewer understand the intention of the comment. The company excludes the context to protect user privacy, hindering moderators’ ability to enforce its policies.
In mid-2019, Facebook began allowing algorithms to take down hate speech content automatically, without first sending it to a human reviewer. At the time, such software was able to proactively detect only 65 percent of the comments that the company determined were hate speech. Today the company says the number is 95 percent and that about 1 in every 1,000 comments seen on the platform is hate speech.
But Facebook does not share data on the accuracy of its hate speech algorithms or collect information on whether certain groups have their content removed more than others.
Facebook’s Aldous said the WoW Project reflected an acknowledgment that underrepresented groups required more protection. She added that the company was taking the steps after extensive internal and external research, including focus groups toward the end of 2019 on what people perceive to be the most harmful hate speech and content.
These steps may be too late for some Black users, a number of whom said the problems seem to be getting worse this year, not better. Watkins said she was fed up with all the infractions and that she plans to move her Facebook group to her own website soon.
“Talking about White privilege, White supremacy and problematic aspects of Whiteness is part of the work of race theory,” said Frederick Joseph, an entrepreneur and author. He said his posts on his recently published book “The Black Friend: On Being a Better White Person” seem to get less promotion from Instagram than other posts.
“To not be able to speak about these things, as a non-White person, inherently stifles us and the world around us that we are trying to make better,” he added.