“I had always thought that Reddit’s front pages operated as some kind of direct democracy,” Schneider wrote on his blog. “I was surprised to learn that’s not actually the case.”
It’s not, Reddit admin Chad Birch confirmed on the site itself. Far from relying on raw user votes, the site actually uses a multipart normalization algorithm to get a good mix of content on the front page. As a consequence, of course, a great deal of content that’s very popular with the community never makes it to the front page at all.
To be clear at the top, this isn’t a scandal or cover-up or huge reveal or anything: Reddit’s code is open source, so anyone with a GitHub account and a little technological know-how could have sussed out the fact that a normalization algorithm is at play. (A little critical thinking should have turned that up, too: If there weren’t an algorithm guiding the front page, only stuff from the most popular subreddits would ever appear there.) But that said, Schneider’s discovery definitely defies lay assumptions about how Reddit works, and on what principles.
In late summer, after one of his animations hit Reddit’s front page, Schneider devised a bit of code to help figure out exactly how that happened. For six weeks, his script logged the top 100 posts on Reddit every five minutes, dumping the data into a database. When the experiment concluded, Schneider was able to chart the lifecycle of each popular Reddit post, almost like Billboard would chart a song.
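A logger of the kind described above could be sketched roughly like this. It is not Schneider’s script: the table layout, the field names and the `fetch_top_posts` callable (a stand-in for whatever actually retrieves the current top 100) are all assumptions made for illustration.

```python
import sqlite3
import time

# Hypothetical post shape: (post_id, subreddit, score).

def init_db(conn):
    """Create the snapshot table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS snapshots (
               taken_at  INTEGER NOT NULL,  -- Unix time of the snapshot
               pos       INTEGER NOT NULL,  -- 1 = top of the list
               post_id   TEXT    NOT NULL,
               subreddit TEXT    NOT NULL,
               score     INTEGER NOT NULL
           )"""
    )

def record_snapshot(conn, posts, taken_at):
    """Store one ordered list of posts; returns the number of rows written."""
    rows = [
        (taken_at, pos, post_id, subreddit, score)
        for pos, (post_id, subreddit, score) in enumerate(posts, start=1)
    ]
    conn.executemany("INSERT INTO snapshots VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)

def best_position(conn, post_id):
    """Highest chart position a post ever reached, or None if never seen."""
    row = conn.execute(
        "SELECT MIN(pos) FROM snapshots WHERE post_id = ?", (post_id,)
    ).fetchone()
    return row[0]

def run(conn, fetch_top_posts, interval_seconds=300, rounds=1):
    """Take `rounds` snapshots, sleeping `interval_seconds` between them."""
    for i in range(rounds):
        record_snapshot(conn, fetch_top_posts(), int(time.time()))
        if i < rounds - 1:
            time.sleep(interval_seconds)
```

With every snapshot stored alongside its timestamp and position, a query such as `best_position` is enough to trace how high a given post climbed, which is the Billboard-style charting the experiment relied on.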
He found that not all posts charted equally. Instead, Reddit appeared to suppress the success of content from its largest, most popular subreddits, and elevate content from smaller ones. In fact, Schneider found, each post’s chance of hitting the front page — and with that, going viral on the wider Internet — varied hugely by which forum it started from and where it fell in relation to other posts.
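The article does not spell out how the normalization works, so the following is only a toy illustration of the general idea and not Reddit’s actual code. It assumes one simple scheme: divide each post’s score by the top score in its own subreddit, then rank all posts by that ratio. The function name and the sample data are hypothetical.

```python
from collections import defaultdict

def normalized_ranking(posts):
    """Rank posts by score relative to the best post in the same subreddit.

    posts: list of (post_id, subreddit, raw_score) tuples.
    Returns a list of (post_id, subreddit, normalized_score), best first.
    """
    # Find the highest raw score seen in each subreddit.
    top = defaultdict(float)
    for _, subreddit, score in posts:
        top[subreddit] = max(top[subreddit], score)

    # Express every post as a fraction of its subreddit's top score.
    scored = [
        (post_id, subreddit, score / top[subreddit] if top[subreddit] > 0 else 0.0)
        for post_id, subreddit, score in posts
    ]
    # Python's sort is stable, so ties keep their input order.
    return sorted(scored, key=lambda p: p[2], reverse=True)

# Hypothetical data: a huge subreddit and a much smaller one.
sample = [
    ("p1", "pics", 9000),
    ("p2", "pics", 6000),
    ("s1", "science", 300),
    ("s2", "science", 150),
]
ranking = normalized_ranking(sample)
```

On raw votes, both `pics` posts would outrank everything from `science`. Under this assumed scheme, the top `science` post ties with the top `pics` post and lands ahead of the second one, which is the kind of reshuffling Schneider observed.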
“I’d be curious to see what would happen if all subreddits were treated equally,” Schneider wrote. “My guess is that the reddit default top 100 would contain an even higher rate of funny pictures, but who knows, maybe there’d be some unintended side effects that would lead people to upvote more varied content.”
We don’t know, of course, but recent history would suggest that people tend to fall for the lowest common denominator when given the option. In August, Facebook made changes to its own algorithm to block the very universe of things that Schneider saw screened out of the Reddit homepage. At the time, Facebook explained the changes as an attempt to improve the News Feed. And by all accounts, that’s what Reddit’s doing, too — stepping on the content scales ever so slightly, just to save us from our GIF-loving, Earth-porning, clickbaiting selves.
Algorithms are necessary and important, in many ways: Algorithms order things. Without them we’d be inundated with such a senseless, noisy flood of information that we’d sooner drown than rise up through it; imagine Google searches, except without the relevant results posted first, or a Facebook timeline that showed every single last status update that all 512 of your friends/relatives/exes/long-forgotten classmates make.
You get the idea: That would be terrible! Both sites would be ghost towns.
And yet, Reddit’s algorithms surprise, even discomfit, for a couple of reasons. Many users aren’t aware the algorithms are even in operation. (Schneider says none of the users he asked knew about it.) And the front page quota system, together with the well-publicized political wrangling of Reddit’s volunteer moderators, makes for pretty convincing evidence that “the crowd” has markedly less power than it thinks.
On top of that, there’s a growing sense of anxiety around algorithms on social networks in general, no matter how practical they may be. It’s in our best interest, of course, to see the Yelp reviews by the most reliable writers, and the Amazon products we’re most likely to buy. But increasingly, algorithms filter (and in a substantial way, control) just about everything we see online. There are precious few platforms where that is not the case. (Even Twitter, one of the last mainstream holdouts, recently indicated it was considering a filtered feed.) And that is insidious, really, because you don’t know what you’re not seeing, and in many cases, you don’t realize you’re missing anything at all.
On Reddit, one user seemed to circle that very question, asking Birch what he thought about the patterns that Schneider’s research turned up. “Do they jive with the vision of Reddit you have?” he asked.
“I think it’s hard to say,” Birch wrote back. “It’s not a simple problem to solve, and it really depends how you want things to behave.”