In the past month, we learned that Russian operatives spent at least $100,000 on Facebook advertisements. We also learned that Russian actors used Facebook to organize offline anti-immigration protests. Just last week, a study estimated that political posts by a Russian troll factory (the Internet Research Agency) had an “organic reach” of between 340 million and several billion views. This study has received enormous attention, including an article at The Washington Post’s Switch blog.
These are big numbers, but they are easy to misinterpret. The $100,000 in advertisements was a drop in the bucket compared to the $70 million directly spent by the Trump campaign. Barely anyone actually showed up to the anti-immigration protests. And the widely cited metrics for organic reach, it turns out, have a botnet problem.
Studying Facebook interactions is hard
The study arguing that Russian posts got billions of views comes from Jonathan Albright, research director at Columbia University’s Tow Center for Digital Journalism. Albright relied on CrowdTangle, a popular social media analytics tool for monitoring Facebook interactions and surfacing viral content. This is a novel and creative application of the CrowdTangle toolset, but it is not the purpose for which CrowdTangle was designed. Unfortunately, that may have led to problems in the analysis.
I studied CrowdTangle in my 2016 book, “Analytic Activism.” Here’s how the tool works:
“The logic of CrowdTangle’s model is relatively simple (even if the underlying math and software code gets complicated). CrowdTangle tracks clusters of Facebook pages and specific keywords. It gathers historical data on how stories, posts and images tend to perform on these sites, and then highlights the stories, posts and images that are doing best against their own expected baseline performance rate. [The] company then packages this information into a daily email, alerting [its] clients to the content which is likely to perform best on a day-to-day basis.”
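The baseline logic described in that passage can be illustrated with a minimal Python sketch. This is not CrowdTangle’s actual formula, which is not public in the text above. The function name `overperformance`, the page names and the interaction counts are all hypothetical, and the "expected baseline" is assumed here to be a simple average of a page’s past posts.

```python
from statistics import mean

def overperformance(history, current):
    """Ratio of a post's interactions to the page's own historical average.

    Assumption: the baseline is a plain mean of past interaction counts.
    `history` must be non-empty.
    """
    baseline = mean(history)
    return current / baseline if baseline else float("inf")

# Hypothetical pages: a small page with a breakout post, and a large page
# with a post that is only slightly above its usual performance.
pages = {
    "page_small": {"history": [100, 120, 80], "current": 400},
    "page_large": {"history": [10000, 12000, 8000], "current": 11000},
}

# Rank by performance relative to each page's own baseline, not raw counts.
ranked = sorted(pages, key=lambda name: overperformance(**pages[name]), reverse=True)
print(ranked)
```

Under these assumptions the small page ranks first even though its raw interaction count is far lower. A tool built this way answers "what is doing unusually well for this page?" and has no built-in way to ask whether the interactions come from real people.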
CrowdTangle plays a crucial behind-the-scenes role in the social sharing optimization strategies of digital media producers like Upworthy, Vox and BuzzFeed. CrowdTangle was not designed to combat, weed out or even study botnets. It was designed to identify stories and content that perform better than average within a company’s peer network.
Albright’s study focuses on a pair of metrics that CrowdTangle generates on the basis of the data it gathers: “interactions” and “organic reach.” The trouble with digital indicators like these is that they are easy to inflate. We have seen this on Twitter, where nearly half of President Trump’s Twitter followers are fake accounts and bots. This may be a particular problem when studying Russian influence activities. Adrian Chen’s reporting has documented that Russia’s Internet Research Agency (IRA) specializes in creating fake social media accounts to magnify the impact of its activities. Such fake accounts can warp simple metrics of online opinion like these.
The ‘Blacktivist’ page shows how this can work
The “Blacktivist” Facebook page is an instructive example. This page was created by the IRA. Donie O’Sullivan and Dylan Byers have previously reported that the Blacktivist page “had 360,000 likes, more than the verified Black Lives Matter account on Facebook, which currently has just over 301,000.” We cannot tell from public data how many of these likes came from IRA-created Facebook accounts. But it is highly likely that the IRA’s page had more likes than the actual Black Lives Matter account because the IRA also fabricated several thousand Facebook profiles, then used those accounts to give its content a veneer of legitimacy.
The Albright study highlights the 6.18 million “interactions” (reactions, comments, and shares) the Blacktivist page received across 500 posts. The single most-shared post from the Blacktivist page received 344,209 interactions, which is less than the page’s total number of likes.
This could indicate massive viral spread among socially conscious Facebook users. Or it could just be repeat sharing echoing across a botnet. The study also estimates the “organic reach” by adding up the followers of every Facebook page that shared a Blacktivist post. That sum is rife with overcounts, though: if two fake Facebook profiles each have the same 5,000 fake friends, and both share the fake Blacktivist post, CrowdTangle’s “organic reach” will record it as visible to 10,000 people.
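The overcount in that two-profile example can be reproduced in a few lines of Python. This is a sketch under stated assumptions, not CrowdTangle’s code: the profile and friend names are invented, and `summed_reach` and `unique_reach` are hypothetical helpers that contrast adding up audience sizes with counting distinct people.

```python
# Hypothetical data: two fake profiles that share the same 5,000 fake friends.
shared_friends = {f"friend_{i}" for i in range(5000)}
sharers = {
    "fake_profile_a": shared_friends,
    "fake_profile_b": shared_friends,
}

def summed_reach(sharers):
    """Add up each sharer's audience size, as the reach estimate does."""
    return sum(len(audience) for audience in sharers.values())

def unique_reach(sharers):
    """Count distinct accounts across all sharers' audiences."""
    return len(set().union(*sharers.values()))

print(summed_reach(sharers))  # 10000
print(unique_reach(sharers))  # 5000
```

Deduplicating this way requires knowing who the individual followers are, which public data does not reveal. And even the deduplicated figure says nothing about whether those 5,000 accounts belong to real people.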
The headline from Albright’s study is that Blacktivist posts had a total “organic reach” of 103.8 million. Combined with five other IRA-created pages that have been made public, Albright counts 340 million. Since Facebook has deleted 470 pages, he reasonably concludes that the total “organic reach” of all these pages is likely in the billions. That math is correct but misleading. We have no way of knowing what portion of these views are attributable to actual human beings living in the United States.
The larger difficulty here is that Facebook has quasi-monopolistic power in the social sharing economy. We encounter news content through the black box of Facebook’s newsfeed algorithms, and no one besides Facebook’s own engineers can say precisely how these algorithms operate. Facebook’s internal data might be able to sort through these questions, but even that is uncertain. Facebook’s publicly available data is extremely limited. This makes it hard to gauge the scale of Russian digital propaganda activity in the 2016 election.
We know that foreign actors expended substantial time and money in an attempt to inject digital propaganda into the 2016 election. We know that they did this in an attempt to undermine trust in democracy and buttress the efforts of the Trump campaign.
Gathering clear data on the scope of these activities is both phenomenally important and phenomenally difficult. Albright’s effort to shed light on this activity is to be applauded. Still, readers should be cautious about the topline findings, which may inadvertently be highly misleading.
This article is one in a series supported by the MacArthur Foundation Research Network on Opening Governance that seeks to work collaboratively to increase our understanding of how to design more effective and legitimate democratic institutions using new technologies and new methods. Neither the MacArthur Foundation nor the Network is responsible for the article’s specific content. Other posts in the series can be found here.
Dave Karpf is an associate professor in the School of Media and Public Affairs at George Washington University.