Two months ago, Google promoted Timnit Gebru, co-lead of a group focused on ethical artificial intelligence, after she earned a high score on her annual employee appraisal. Gebru is one of the most high-profile Black women in her field and a powerful voice in the new field of ethical AI, which seeks to identify issues around bias, fairness, and responsibility.

In his peer review of Gebru, Jeff Dean, the head of Google Artificial Intelligence, left only one comment when asked what she could do to have a greater impact, according to documents viewed by The Washington Post: Ensure that her team helps make a promising new software tool for processing human language “consistent with our AI Principles.”

In an email thanking Dean for his review, Gebru let him know that her team was already working on a paper about the ethical risks around the same language models, which are essential to understanding the complexity of language in search queries. On Oct. 20, Dean wrote that he wanted to see a draft, adding, “definitely not my area of expertise, but would definitely learn from reading it.”

Six weeks later, Google fired Gebru while she was on vacation.

“I can’t imagine anybody else who would be safer than me,” Gebru, 37, said. “I was super visible. I’m well known in the research community, but also the regulatory space. I have a lot of grass-roots support — and this is what happened.”

In an internal memo that he later posted online explaining Gebru’s departure, Dean told employees that the paper “didn’t meet our bar for publication” and “ignored too much relevant research” on recent positive improvements to the technology. Gebru’s superiors had insisted that she and the other Google co-authors either retract the paper or remove their names. Employees in Google Research, the department that houses the ethical AI team, say authors who make claims about the benefits of large language models have not received the same scrutiny during the approval process as those who highlight the shortcomings.

Her abrupt firing shows that Google is pushing back on the kind of scrutiny that it claims to welcome, according to interviews with Gebru, current Google employees, and emails and documents viewed by The Post.

It raises doubts about Silicon Valley’s ability to self-police, especially when it comes to advanced technology that is largely unregulated and being deployed in the real world despite demonstrable bias toward marginalized groups. Already, AI systems shape decision-making in law enforcement, employment opportunity and access to health care worldwide.

That made Gebru’s perspective essential in a field that is predominantly White, Asian and male. Women made up only 15 percent of the AI research staff at Facebook and 10 percent at Google, according to a 2018 report in Wired magazine. At Google, Black women make up 1.6 percent of the workforce.

Although Google publicly celebrated Gebru’s work identifying problems with AI, it disenfranchised the work internally by keeping it hierarchically distinct from other AI initiatives, not heeding the group’s advice, and not creating an incentive structure to put in practice the ethical findings, Gebru and other employees said.

Google declined to comment, but noted that in addition to the dozen or so staff members on Gebru’s team, 200 employees are focused on responsible AI.

Google has said that it did not fire Gebru, but accepted her “resignation,” citing her request to explain who at Google demanded that the paper be retracted, according to Dean’s memo. The company also blamed an email Gebru wrote to an employee resource group for women and allies at Google working in AI as inappropriate for a manager. The message warned the group that pushing for diversity was no use until Google leadership took accountability.

Rumman Chowdhury, a former global lead for responsible AI at Accenture and chief executive of Parity, a start-up that helps companies figure out how to audit algorithms, said there is a fundamental lack of respect within the industry for work on AI ethics compared with equivalent roles in other industries, such as model risk managers in quantitative hedge funds or threat analysts in cybersecurity.

“It’s being framed as the AI optimists and the people really building the stuff [versus] the rest of us negative Nellies, raining on their parade,” Chowdhury said. “You can’t help but notice, it’s like the boys will make the toys and then the girls will have to clean up.”

Google, which for decades evangelized an office culture that embraced employee dissent, has fired outspoken workers in recent years and shut down forums for exchange and questioning.

Nearly 3,000 Google employees and more than 4,000 academics, engineers and industry colleagues have signed a petition calling Gebru’s termination an act of retaliation by Google. Last week, nine Democratic lawmakers, including Sens. Elizabeth Warren (Mass.) and Cory Booker (N.J.) and Rep. Yvette D. Clarke (N.Y.), sponsor of the Algorithmic Accountability Act, a bill that would require companies to audit and correct race and gender bias in its algorithms, sent a letter to Google chief executive Sundar Pichai asking the company to affirm its commitment to research freedom and diversity.

Like any good researcher, Gebru is comfortable in the gray areas. And she has been using her ouster as an opportunity to shed light on the black box of algorithmic accountability inside Google — annotating the company’s claims with contradictory data, drawing connections to larger systemic issues, and illuminating the way internal AI ethics efforts can break down without oversight or a change in incentives to corporate practices and power structures.

Big Tech dominates AI research around advancements in machine learning, image recognition, language translation — poaching talent from top universities, sponsoring conferences and publishing influential papers. In response to concerns about the way those technologies could be abused or compound bias, the industry ramped up funding and promotion of AI ethics initiatives, beginning around 2016.

Tech giants have made similar investments in shaping policy debate around antitrust and online privacy, as a way to ward off lawmakers. Pichai invoked Google’s AI principles in an interview in 2018 with The Post, arguing for self-regulation around AI.

Google created its Ethical AI group in 2018 as an outgrowth of an employee-led push to prioritize fairness in the company’s machine learning applications. Margaret Mitchell, Gebru’s co-lead, pitched the idea of a team of researchers investigating the long-term effects of AI and translating those findings into action to mitigate harm and risk.

The same year, Pichai released a broadly worded set of principles governing Google’s AI work after thousands of employees protested the company’s contract with the Pentagon to analyze surveillance imagery from drones. But Google, which requires privacy and security tests before any product launch, has not mandated an equivalent process for vetting AI ethics, employees say.

Gebru, whose family’s ethnic origins are in Eritrea, was born and raised in Ethiopia and came to Massachusetts as 16-year-old after receiving political asylum from the war between the two African countries. She began her career as an electrical engineer at Apple and received her PhD from the Stanford Artificial Intelligence Lab, studying computer vision under renowned computer scientist Fei-Fei Li, a former Google executive and now co-director of Stanford’s Human-Centered AI Institute, which receives funding from Google.

Gebru did her postdoctoral research at Microsoft Research as part of a group focused on accountability and ethics in AI. There, she and Joy Buolamwini, then a masters student at MIT Media Lab, co-wrote a groundbreaking 2018 study that found that commercial facial recognition tools sold by companies such as IBM and Microsoft were 99 percent accurate at identifying White males, but only 35 percent effective with Black women.

In June, IBM, Microsoft and Amazon announced that they would stop selling the software to law enforcement, which Dean credited to Gebru’s work. She also co-founded Black in AI, a nonprofit organization that increased the number Black attendees at the largest annual AI conference.

Compare Google search engine results over nearly two decades and a trend emerges: Results are filled with advertising and non-Google results are lower down. (The Washington Post)

Gebru said that in 2018, Google recruited her with the promise of total academic freedom. She was unconvinced, but the company was opening its first artificial intelligence lab on the African continent in Accra, the capital of Ghana, and she wanted to be involved. When she joined Google, Gebru said, she was the first Black female researcher in the company. (When she left, there were still only a handful of Black women working in research, out of hundreds.)

Gebru said she was also drawn to working with Mitchell. Both women prioritized foresight and building practical solutions to prevent AI risk, whereas the operating mind-set in tech is biased toward benefits and “rapid hindsight,” in response to harm, Mitchell said.

Gebru’s approach to ethical AI was shaped by her experiences. Hardware, for instance, came with datasheets that documented whether components were safe to use in certain situations. “When you look at this field as a whole, that doesn’t exist,” said Gebru, an electrical engineer. “It’s just super behind in terms of documentation and standards of safety.”

She also leaned on her industry experience when collaborating with other teams. Engineers live on a product cycle, consumed with putting out fires and fixing bugs. A vague requirement to “make things fair” would only cause more work and frustration, she thought. So she tried to build institutional structures and documentation tools for “when people want to do the right thing.”

Despite their expertise, the Ethical AI group fought to be taken seriously and included in Google’s other AI efforts, employees said.

Within the company, Gebru and her former colleagues said, there is little transparency or accountability regarding how decisions around AI ethics or diversity initiatives get made. Work on AI principles, for instance, falls under Kent Walker, the senior vice president of global affairs, whose vast purview includes lobbying, public policy and legal work. Walker also runs an internal ethics board of top executives, including Dean, called the Advanced Technology Review Council, which is responsible for yes or no decisions when issues escalate, Gebru said. The Ethical AI team had to fight to be consulted on Walker’s initiatives, she said.

“Here’s the guy tasked with covering Google’s a--, lobbying and also … working on AI principles,” Gebru said. “Shouldn’t you have a different entity that pushes back a little bit internally — some sort of push and pull?” What’s more, members of Walker’s council are predominantly vice presidents or higher, constricting diversity, Gebru said.

In her conversations with product teams, such as a group working on fairness in Machine Learning Infrastructure, Gebru said she kept getting questions about what tools and features they could build to protect against the ethical risks involved with large language models. Google had credited it with the biggest breakthrough in improving search results in the past five years. The models can process words in relation to the other words that come before and after them, which is useful for understanding the intent behind conversational search queries.

But despite the increasing use of these models, there was limited research investigating groups that might be negatively impacted. Gebru says she wanted to help develop those safeguards, one of the reasons she agreed to collaborate with the research paper proposed by Emily M. Bender, a linguist at the University of Washington.

Mitchell, who developed the idea of model cards, like nutrition labels for machine learning models, described the paper as “due diligence.” Her model card idea is being adopted more widely across the industry, and engineers needed to know how to fill out the section on harm.

Gebru said her biggest contribution to both her team and the paper has been identifying researchers who study directly-affected communities.

That diversity was reflected in the authors of the paper, including Mark Diaz, a Black and Latino Google researcher whose previous work looked at how platforms leave out the elderly, who talk about ageism in blog posts, but don’t share as much on sites such as Twitter. For the paper, he identified the possibility that large data sets from the Internet, particularly if they are from a single moment in time, will not reflect cultural shifts from social movements, such as the #MeToo movement or Black Lives Matter, which seek to shift power through changes in language.

The paper identified four overarching categories of harm, according to a recent draft viewed by The Post. It delved into the environmental effect of the computing power, the inscrutability of massive data sets used to train these models, the opportunity costs of the “hype” around claims that these models can understand language, as opposed to identifying patterns, and the danger that the real-sounding text generated by such models could be used to spread misinformation.

Because Google depends on large language models, Gebru and Mitchell expected that the company might push back against certain sections or attempt to water down their findings. So they looped in PR & Policy representatives in mid-September, with plenty of time before the deadline for changes at the end of January 2021.

Before making a pre-publication draft available online, Gebru first wanted to vet the paper with a variety of experts, including those who have built large language models. She asked for feedback from two top people at OpenAI, an AI research lab co-founded by Elon Musk, in addition to her manager at Google, and about 30 others. They suggested additions or revisions, Gebru said. “I really wanted to send it to people who would disagree with our view and be defensive,” Gebru said.

Given all their upfront effort, Gebru was baffled when she received a notification for a meeting with Google Research Vice President Megan Kacholia at 4.30 p.m. on the Thursday before Thanksgiving.

At the meeting, Kacholia informed Gebru and her co-authors that Google wanted the paper retracted.

On Thanksgiving, a week after the meeting, Gebru composed a six-page email to Kacholia and Dean outlining how disrespectful and oddly secretive the process had been.

“Specific individuals should not be empowered to unilaterally shut down work in such a disrespectful manner,” she wrote, adding that researchers from underrepresented groups were mistreated.

Mitchell, who is White, said she shared Gebru’s concerns but did not receive the same treatment from the company. “Google is very hierarchical, and it’s been a battle to have any sort of recognition,” she said. “We tried to explain that Timnit, and to a lesser extent me, are respected voices publicly, but we could not communicate upwards.”

“How can you still ask why there aren’t Black women in this industry?” Gebru said.

Gebru said she found out this week that the paper was accepted to the Conference on Fairness, Accountability and Transparency, as part of its anonymous review process. “It’s sad, the scientific community respects us a lot more than anybody inside Google,” she said.