Wael Ghonim is an Internet activist and a Fellow at Harvard’s Shorenstein Center. Jake Rashbass is a Frank Knox Fellow at the Harvard Kennedy School.
Each time you log in to a social media platform, its algorithms (sophisticated mathematical models designed by a few thousand engineers in Northern California) decide what information you should consume. These decisions have deep repercussions for us as individuals and as a society, whether by enabling revolutions that topple dictators, by feeding us fake news and micro-targeted advertisements to manipulate election results or by preying on our insecurities to fuel mental health crises.
Yet, as recent social media scandals have shown, we are continually discovering the consequences of these decisions only once the damage is done.
There is a reason for this. The algorithms aspire to give each user a personalized experience, presenting them with whatever content they are most likely to like, share or comment on. This hyper-personalization allows platforms to maximize user engagement and generate profits through targeted advertising, but it also means we have little idea of what other users are consuming. Because we don’t know how users are collectively experiencing these platforms and what information the sites are circulating, we aren’t able to hold accountable those who spread fake news and misinformation until it’s too late.
The time has come to end the opacity and secrecy surrounding social media. If social media platforms are truly committed to being the productive, responsible and ethical force in society they have the potential to be, there are crucial steps they must take. Before anything else, we need far more transparency about the outputs these algorithms produce so we can create an effective accountability mechanism. The data that social media companies currently share with researchers and other interested parties is inadequate, inconsistent and entirely at each company’s discretion. The recent steps by Twitter and Facebook to improve transparency certainly move in the right direction, but they do not go far enough. We are in urgent need of a broader strategy to address these challenges.
Here’s our suggestion. We believe that all platforms using algorithms to distribute content must develop a standardized public interest API (a standard interface for sharing and accessing data) that provides a detailed overview of the information distributed on their networks, while respecting concerns for user privacy, trade secrets and intellectual property. There are three categories of data that need to be shared:
First, public posts. Malign actors spread misinformation and sensationalism on platforms on the assumption that they can engage large audiences without being held accountable. To counter this, platforms should make available data for all public posts, whether created by an individual user, group or page. This data needs to include reach and engagement figures for each post and a demographic breakdown of its audience.
The API should also disclose the top trending stories in different geographic and demographic groupings, and identify the influencers who publicly enabled a post to achieve viral status. This will allow third parties, like journalists and researchers, to identify new trends surfacing across the platform, and to hold platforms accountable for addressing harmful ones in real time.
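To make the idea concrete, here is a minimal Python sketch of what a public-post record and a trending query might look like. No platform offers this interface today; the class name, the field names and the `top_trending` function are all hypothetical, chosen only to mirror the reach, engagement and demographic data described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class PublicPostRecord:
    """Hypothetical shape of one public post in a public interest API."""
    post_id: str
    author_type: str   # "user", "group" or "page"
    region: str        # geographic grouping, e.g. a country code
    reach: int         # number of accounts that saw the post
    engagements: int   # likes + shares + comments
    # Share of the audience in each age band, e.g. {"18-24": 0.4}
    audience_by_age: Dict[str, float] = field(default_factory=dict)


def top_trending(records: List[PublicPostRecord], region: str,
                 n: int = 3) -> List[PublicPostRecord]:
    """Return the n posts with the largest reach in one region."""
    in_region = [r for r in records if r.region == region]
    return sorted(in_region, key=lambda r: r.reach, reverse=True)[:n]
```

A journalist could call `top_trending(records, "EG", n=10)` to see which public posts are reaching the most people in one country. A real interface would also need to aggregate or suppress small audience groups to protect user privacy.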
Second, information on ads. Long gone are the days of uniform print and broadcast media, when every advertisement was in plain view for anyone to see. With the advent of micro-targeting and “dark ads” on social media, we no longer know who is propagating what information to whom. This has huge implications for our ability to detect false advertising, political smear campaigns and election manipulation. Through the public interest API, platforms need to reveal who is purchasing ads, which groups they are targeting and what those ads contain.
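As an illustration only, an ad disclosure could carry the three facts named above: the purchaser, the targeted group and the ad content. The Python sketch below assumes a hypothetical `AdDisclosure` record and two helper functions; none of these names come from an existing platform API.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class AdDisclosure:
    """Hypothetical shape of one ad in a public interest API."""
    ad_id: str
    purchaser: str             # who paid for the ad
    content: str               # the text shown to users
    targeting: Dict[str, str]  # e.g. {"region": "Ohio", "age": "18-24"}


def ads_targeting(disclosures: List[AdDisclosure], attribute: str,
                  value: str) -> List[AdDisclosure]:
    """Return every ad aimed at a given group, e.g. region == "Ohio"."""
    return [ad for ad in disclosures if ad.targeting.get(attribute) == value]


def purchasers_by_target(disclosures: List[AdDisclosure],
                         attribute: str) -> Dict[str, Set[str]]:
    """Map each targeted value to the set of purchasers aiming at it."""
    result: Dict[str, Set[str]] = {}
    for ad in disclosures:
        if attribute in ad.targeting:
            result.setdefault(ad.targeting[attribute], set()).add(ad.purchaser)
    return result
```

With records like these, a researcher could ask which purchasers are targeting a given region before an election, a question that "dark ads" currently make impossible to answer from the outside.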
And third, censored content. All social media platforms have policies for censoring content that violates their terms of usage. Even so, their algorithms distribute prohibited content for an unknown period of time before it is discovered and removed. It is in the public interest to know what content is being deleted and what reach it had prior to deletion.
Platforms can use the public interest API to reveal the contents and origins of deleted materials, the amount of time the content was distributed on the platform prior to deletion and what reach and engagement that material achieved during that time. Public access to this data will pressure social media companies to move more quickly to delete content that violates their policies and to preserve it for future research. It will also enable journalists and researchers to understand what sorts of content are most frequently censored and verify that platforms aren’t censoring content beyond the remit of their policies.
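A record for removed content might look like the Python sketch below, which also shows one accountability measure such data would allow: the median time prohibited material stayed online before deletion. The `RemovedContentRecord` class, its fields and both functions are hypothetical names used for illustration, not part of any existing API.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import List


@dataclass(frozen=True)
class RemovedContentRecord:
    """Hypothetical shape of one deleted item in a public interest API."""
    content_id: str
    origin: str            # account, group or page that posted it
    policy_violated: str   # which rule the platform cited
    posted_at: datetime
    removed_at: datetime
    reach: int             # accounts that saw it before deletion
    engagements: int       # likes + shares + comments before deletion


def hours_live(record: RemovedContentRecord) -> float:
    """Hours the content was distributed before it was removed."""
    return (record.removed_at - record.posted_at).total_seconds() / 3600


def median_hours_to_removal(records: List[RemovedContentRecord]) -> float:
    """Median time to removal across a non-empty set of deleted items."""
    return median(hours_live(r) for r in records)
```

Tracking this median over time, broken down by `policy_violated`, would show whether a platform is in fact removing prohibited content faster, and which kinds of content it removes most often.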
We hope the model of a public interest API we present here will further the conversation that addresses the realities of social media today so these platforms can be the positive force in the world they aspire to be. Although transparency isn’t the ultimate answer to these problems, its absence is a huge obstacle to any sustainable solution. The leaders of Facebook, YouTube and Twitter have all proclaimed the value of transparency. It is now time for these companies to match their words with actions that acknowledge their true impact on society. If not, it’s only a matter of time before governments intervene with regulation.