Like it or not, much of what we encounter online is mediated by computer-run algorithms — complex formulas that help determine our Facebook feeds, Netflix recommendations, Spotify playlists or Google ads.
But algorithms, like humans, can make mistakes. Last month, users found that the photo-sharing site Flickr's new image-recognition technology was labeling dark-skinned people as "apes" and auto-tagging photos of Nazi concentration camps as "jungle gym" and "sport."
How does this happen? Zeynep Tufekci, an assistant professor at the University of North Carolina at Chapel Hill's School of Information and Library Science, tells NPR's Arun Rath that biases can enter algorithms in various ways — not just intentionally.
"More often," she says, "they come through the complexity of the program and the limits of the data they have. And if there are some imperfections in your data — and there always [are] — that's going to be reflected as a bias in your system."
Interview Highlights
On bias in the Facebook "environment"
These systems have very limited input capacity. So for example, on Facebook, which is most people's experience with an algorithm, the only thing you can do to signal to the algorithm that you care about something is to either click on "Like" or to comment on it. By forcing me to only "Like" something, the algorithm creates an environment — to be honest, my Facebook is full of babies and engagements and happy vacations, which I don't mind. I mean, I like that. When I see it, I click on "Like" — and then Facebook shows me more babies.
And it doesn't show me the desperate, sad news that I also care about a lot, that might be coming from a friend who doesn't have "likable" news.
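The feedback loop Tufekci describes can be sketched in a few lines of Python. This is a toy illustration, not Facebook's actual ranking system: the post fields, the topics, and the ranking rule are all invented for the example.

```python
# Toy sketch of a "Like"-driven feedback loop (illustrative only; not
# Facebook's real code). Posts are ranked by how often the user has
# liked that topic before, and each click feeds back into the ranking.

from collections import Counter

def rank_feed(posts, like_history):
    """Order posts by how often the user has liked that topic before."""
    topic_likes = Counter(like_history)            # e.g. {"babies": 2, "vacations": 1}
    return sorted(posts, key=lambda p: topic_likes[p["topic"]], reverse=True)

like_history = ["babies", "babies", "vacations", "engagements"]
posts = [
    {"topic": "babies",    "text": "New baby photos!"},
    {"topic": "sad_news",  "text": "A friend is going through a hard time."},
    {"topic": "vacations", "text": "Beach trip album"},
]

# Simulate a few feed refreshes: the user "Likes" whatever shows up first,
# which pushes that topic even higher next time, while the post with no
# "likable" signal stays at the bottom even though the user cares about it.
for _ in range(3):
    feed = rank_feed(posts, like_history)
    print([p["topic"] for p in feed])
    like_history.append(feed[0]["topic"])          # clicking "Like" on the top post
```

Nothing here requires malicious intent; the crowding-out of the "sad news" post falls straight out of a ranking rule that only listens to Likes.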
How biases creep into computer code
One, they can be programmed in directly, but I think that's rare. I don't think programmers sit around thinking, you know, "Let us make life hard for a certain group" or not. More often, they come through the complexity of the program and the limits of the data they have. And if there are some imperfections in your data — and there always [are] — that's going to be reflected as a bias in your system.
Sometimes [biases] can come in through the confusing complexity. A modern program can be so multi-branch that no one person has all the scenarios in their head.
For example, increasingly, hiring is being done by algorithms. And an algorithm that looks at your social media output can figure out fairly reliably if you are likely to have a depressive episode in the next six months — before you've exhibited any clinical signs. So it's completely possible for a hiring algorithm to discriminate and not hire people who might be in that category.
It's also possible that the programmers and the hiring committee [have] no idea that's what's going on. All they know is, well, maybe we'll have lower turnover. They can test that. So there are these subtle but crucial biases that can creep into these systems that we need to talk about.
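Here is a minimal sketch of how that can happen, with entirely made-up candidates, features, and weights. Suppose a screening model was fit offline against past turnover, and the fit latched onto a feature (say, late-night posting frequency) that also happens to correlate with an upcoming depressive episode. No one programs the exclusion in; it falls out of the data.

```python
# Hypothetical illustration of proxy discrimination in automated screening.
# The feature names, weights, and data are invented; the point is only that
# a model optimized for "low turnover" can quietly penalize a health-related
# trait that nobody ever asked it to consider.

candidates = [
    # (name, late_night_posting_rate, years_experience, at_risk_of_depression)
    ("A", 0.1, 5, False),
    ("B", 0.8, 6, True),
    ("C", 0.2, 4, False),
    ("D", 0.9, 7, True),
]

# Weights "learned" offline from past turnover data (assumed, not real):
# the fit found that heavy late-night posters left the company more often.
W_LATE_NIGHT = -3.0
W_EXPERIENCE = 0.3

def screening_score(late_night_rate, years_experience):
    return W_LATE_NIGHT * late_night_rate + W_EXPERIENCE * years_experience

for name, late, exp, at_risk in candidates:
    score = screening_score(late, exp)
    hired = score > 0
    print(f"{name}: score={score:+.2f} hired={hired} at_risk={at_risk}")

# Everyone flagged "at risk" ends up screened out, even though
# "at risk of depression" never appears anywhere in the model.
```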
How to limit human bias in computer programs
We can test it under many different scenarios. We can look at the results and see if there are discrimination patterns. In the same way that we try to judge decision-making in many fields when the decision-making is done by humans, we should apply a similar critical lens — but with a computational bent to it, too.
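One concrete form that "looking at the results" can take is comparing selection rates across groups in the algorithm's output. The sketch below uses made-up decisions and the common four-fifths heuristic as a review threshold; a real audit would go much further, but this is the basic shape of the check.

```python
# Sketch of an outcome audit: compare per-group selection rates in an
# algorithm's decisions. Data is invented; the 0.8 cutoff is the common
# "four-fifths" heuristic, used here only as an illustrative flag.

from collections import defaultdict

def selection_rates(decisions):
    """decisions: list of (group, was_selected) pairs -> selection rate per group."""
    totals, selected = defaultdict(int), defaultdict(int)
    for group, was_selected in decisions:
        totals[group] += 1
        selected[group] += int(was_selected)
    return {g: selected[g] / totals[g] for g in totals}

decisions = [("group_a", True), ("group_a", True), ("group_a", False),
             ("group_b", False), ("group_b", False), ("group_b", True)]

rates = selection_rates(decisions)
baseline = max(rates.values())
for group, rate in rates.items():
    ratio = rate / baseline
    flag = "REVIEW" if ratio < 0.8 else "ok"
    print(f"{group}: selection rate {rate:.2f} ({ratio:.2f} of highest) -> {flag}")
```

A flagged disparity doesn't prove discrimination by itself, but it is the kind of signal Tufekci argues we should be actively looking for rather than assuming the output is neutral.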
The fear I have is that every time this is talked about, people talk about it as if it's math or physics, therefore some natural, neutral world. And they're programs! They're complex programs. They're not like laws of physics or laws of nature. They're created by us. We should look into what they do and not let them do everything. We should make those decisions explicitly.