1. What’s an algorithm?
That’s simple: It’s a formula for processing information or performing a task. Arranging names in alphabetical order is a kind of algorithm; so is a recipe for making chocolate chip cookies. But the algorithms behind modern software are usually far more complex: companies like Facebook Inc. and Alphabet Inc.’s Google have spent billions developing the ones they use to sort through oceans of information, and they zealously guard the secrets of their software.
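To make the alphabetizing example concrete, here is a minimal sketch in Python of one way to put names in order, one step at a time. The function name `alphabetize` and the sample names are illustrative assumptions, not anything used by the companies mentioned above.

```python
def alphabetize(names):
    """Return the names in alphabetical order, ignoring upper/lower case."""
    ordered = []
    for name in names:
        # Walk past every name that should come before (or tie with) this one.
        position = 0
        while position < len(ordered) and ordered[position].lower() <= name.lower():
            position += 1
        # Slot the new name into its place.
        ordered.insert(position, name)
    return ordered


if __name__ == "__main__":
    # Hypothetical sample names.
    print(alphabetize(["Maria", "alex", "Chen"]))
```

Each pass takes one name and finds its place among the names already sorted. That is all an algorithm is: a fixed series of steps that produces the same result every time.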
2. How can algorithms be biased?
Software engineers may not anticipate when their programs inadvertently discriminate against people. Facebook, for instance, was embarrassed in 2015 when some Native Americans were blocked from signing up for accounts because the software thought their names -- including Lance Browneyes and Dana Lone Hill -- were fake. Amazon.com Inc. ran into problems in 2015 when an artificial intelligence system it was testing to screen job applicants “taught” itself to weed out women by looking for certain keywords on resumes.
3. Where do algorithms get their data?
Every time you log on to an app, buy something online or view an ad on your phone, you leave behind a trail of information about your activities and interests. That data is gobbled up by companies everywhere -- and the more you use the web and social networks, the more Google, Facebook and other internet companies know about you. Then, of course, there are the reams of data collected via more conventional means -- voter rolls, driver’s licenses, magazine subscriptions, credit card purchases -- that can be cross-linked with online information to build a detailed profile of individuals.
4. How can that result in bias?
Data itself isn’t inherently discriminatory. The problem arises in how it’s used and interpreted -- especially when algorithms characterize people via correlations or “proxy” data. For instance, it’s illegal in the U.S. to make hiring or lending decisions based on race, gender, age or sexuality, but there are proxies for these attributes in big data sets. The music you stream on YouTube can suggest when you grew up, while membership in a sorority gives away your gender. Living in certain census tracts may hint at your racial or ethnic heritage. A study published in 2017 found that Facebook had classified some users as gay based on which posts they “liked,” even if the people hadn’t openly identified themselves as such.
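A rough way to see how a proxy works is to measure how well a protected attribute can be guessed from a seemingly neutral field. The sketch below uses made-up data: the field names `tract` and `group`, the values and the counts are all hypothetical and are not drawn from any study cited here.

```python
from collections import Counter, defaultdict


def proxy_accuracy(rows, proxy, protected):
    """Share of rows whose protected attribute is guessed correctly
    from the proxy alone, using a majority vote within each proxy value."""
    by_proxy = defaultdict(Counter)
    for row in rows:
        by_proxy[row[proxy]][row[protected]] += 1
    # For each proxy value, the best guess is the most common protected value.
    correct = sum(counts.most_common(1)[0][1] for counts in by_proxy.values())
    return correct / len(rows)


# Hypothetical toy data: 20 people in two census tracts.
rows = (
    [{"tract": "A", "group": "X"}] * 9
    + [{"tract": "A", "group": "Y"}] * 1
    + [{"tract": "B", "group": "Y"}] * 8
    + [{"tract": "B", "group": "X"}] * 2
)

if __name__ == "__main__":
    print(proxy_accuracy(rows, "tract", "group"))
```

In this toy data set, knowing the tract alone identifies the group for 17 of 20 people. Deleting the `group` column from the data would not stop a system from effectively using it.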
5. What’s the problem with proxies?
Consider online job-finding services. Researchers have documented that they’re less likely to refer opportunities for high-paying positions to women and people of color because those job-seekers don’t match the typical profile of people in those jobs -- mostly white men. Systems like these use a technique called “predictive modeling” that makes inferences from historical patterns in data. They can go astray when the data is misused or doesn’t accurately represent the community in question. A study from the University of California, Berkeley found that algorithmic lending systems were 40% less discriminatory than face-to-face interactions but still tended to charge higher interest rates to Latino and African-American borrowers. One reason: their profiles suggested they didn’t shop as much as other people.
6. How does bias get amplified in algorithms?
When data is misused, software can compound stereotypes or arrive at false conclusions. The city of Chicago announced plans in 2017 to employ “predictive policing” software to assign additional officers to areas more likely to experience violent crime. The problem was that the model directed resources to neighborhoods that already had the largest police presence -- in effect, reinforcing the existing human biases of the cops. Similar problems have surfaced with programs that evaluate people who have been arrested. Police in Durham, England, used data from credit-scoring agency Experian, including income levels and purchasing patterns, to predict recidivism rates for people who had been arrested. The results suggested, inaccurately, that people from socio-economically disadvantaged backgrounds were more likely to commit further crimes.
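The feedback loop in the policing example can be sketched with a toy simulation. Everything here is an assumption made for illustration: two areas with identical underlying crime, recorded crime that rises with the number of officers present to observe it, and a rule that shifts a few officers each round toward whichever area has more recorded crime. It is not a description of the software used in Chicago or anywhere else.

```python
def simulate(patrols, true_incidents, rounds, shift=5):
    """Toy model: each area has the same true number of incidents, but
    only the share observed by patrols gets recorded. Each round, `shift`
    officers move from the area with the least recorded crime to the
    area with the most."""
    patrols = list(patrols)
    history = []
    for _ in range(rounds):
        total = sum(patrols)
        # Recorded crime depends on police presence, not on true crime.
        recorded = [true_incidents * p / total for p in patrols]
        history.append((tuple(patrols), tuple(recorded)))
        hot = recorded.index(max(recorded))
        cold = recorded.index(min(recorded))
        if hot != cold:
            moved = min(shift, patrols[cold])
            patrols[cold] -= moved
            patrols[hot] += moved
    return history


if __name__ == "__main__":
    # Hypothetical starting point: a small initial imbalance of 55 vs. 45 officers.
    for patrols, recorded in simulate((55, 45), true_incidents=100, rounds=4):
        print(patrols, recorded)
```

Under these assumptions, a 55-to-45 split becomes 70-to-30 after four rounds even though both areas have exactly the same underlying crime. The data appears to justify each reallocation because the extra officers generate the extra records.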
7. What about facial recognition?
Facial recognition systems, which use digital cameras and databases of stored photos to identify people, have been plagued by accusations of bias. The most common complaint is that they do a poor job of correctly identifying people with darker skin, usually because they’ve been “trained” using stock images of predominantly white people. An MIT study found that inadequate sample diversity undermined recognition systems from IBM, Microsoft and Face Plus Plus. Darker-skinned women were the most misclassified group, with error rates around 35%, compared with a maximum error rate for lighter-skinned men of less than 1%.
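Gaps like these only show up when errors are counted separately for each group. The sketch below shows that bookkeeping on invented records. The group labels and counts are hypothetical, chosen only so the toy numbers are easy to follow, and are not the MIT study’s data.

```python
from collections import defaultdict


def error_rate_by_group(records):
    """records: iterable of (group, predicted_label, actual_label).
    Returns a dict mapping each group to its share of wrong predictions."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Hypothetical toy results: 20 test photos per group.
records = (
    [("group_1", "match", "match")] * 19
    + [("group_1", "no_match", "match")] * 1
    + [("group_2", "match", "match")] * 13
    + [("group_2", "no_match", "match")] * 7
)

if __name__ == "__main__":
    print(error_rate_by_group(records))
    overall = sum(1 for _, p, a in records if p != a) / len(records)
    print(overall)
```

In this toy example the combined error rate is 20%, which hides the fact that one group sees errors on 5% of photos and the other on 35%.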
8. What’s being done to regulate this?
Efforts are underway all over the world. U.S. House and Senate committees are currently reviewing a bill called the Algorithmic Accountability Act of 2019 that would require companies to test algorithms for bias. The U.K.’s Centre for Data Ethics and Innovation, a government-commissioned group made up of technology experts, policy makers and lawyers, is working on a report due next March that’s expected to call for stronger regulation and a universal code of ethics for algorithmic integrity. The EU’s General Data Protection Regulation, which went into effect in May 2018, gives citizens the right to choose what data they provide and the means to obtain explanations for algorithmic decisions.
To contact the reporter on this story: Ali Ingersoll in London at firstname.lastname@example.org
To contact the editors responsible for this story: Giles Turner at email@example.com, Andy Reinhardt, Molly Schuetz
©2019 Bloomberg L.P.