Data analyzed for the ProPublica-Washington Post examination of Facebook posts was collected from over 100,000 public Facebook groups tracked between January 2020 and June 2021 by CounterAction, a firm that studies online disinformation.
Many of these groups disappeared from public view during the period of our analysis. To determine when groups focused on U.S. politics within our data set went offline, we analyzed the more than 5,000 groups that had meaningful activity (more than 10 posts tracked) but were no longer online as of Aug. 30, 2021. We hand-labeled each group as political if its name and description showed that it was created to represent or support a U.S. political interest or group, to be a forum for U.S. political speech, or to represent or discuss a social or cultural movement with a strong connection to U.S. politics (national or local). We ultimately found more than 2,500 such groups, including those for and against various parties, candidates and issues across the political spectrum; groups for various kinds of political memes and discussions; and groups for movements such as the QAnon conspiracy theory, militia groups and Stop the Steal.
We then estimated the time of disappearance for each of these 2,500-plus U.S. political groups by taking the latest date seen on their posts and other group activity. Based on our reporting and the timing of spikes in group disappearances, which often coincided with Facebook’s announcements of group suspensions, we believe the majority of them were removed by Facebook. However, some may have been deleted or removed from public view by their own administrators. We shared the list of more than 2,500 groups with Facebook and asked it to clarify whether they were removed by the company or taken offline by their own administrators. Facebook did not respond to our questions about these groups or any of our other quantitative findings.
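The "latest date seen" step can be illustrated with a short sketch. This is not the code used for the analysis; the record layout (a group identifier paired with an activity date) and the sample values are assumptions made up for illustration.

```python
from datetime import date

def estimate_disappearance(activity):
    """Map each group id to the latest date seen among its tracked activity."""
    latest = {}
    for group_id, seen_on in activity:
        if group_id not in latest or seen_on > latest[group_id]:
            latest[group_id] = seen_on
    return latest

# Hypothetical records: (group id, date of a post or other group activity).
sample_activity = [
    ("group_a", date(2020, 11, 3)),
    ("group_a", date(2020, 11, 5)),
    ("group_b", date(2021, 1, 4)),
    ("group_b", date(2020, 12, 30)),
]
estimated = estimate_disappearance(sample_activity)
```

The latest date is only a proxy: a group that simply went quiet before going offline would appear to have disappeared earlier than it actually did.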
We used these labeled offline groups to predict which of the still-online groups within our sample were also U.S. political groups. We used posts from the offline groups to train a text classification model to predict whether a post was from a U.S. political group and ran it against all the posts from each group in our data set. We labeled a group as a likely U.S. political Facebook group when the mean prediction for its posts was over 0.5 (1.0 indicates that the model predicts with maximum probability that the post is from a U.S. political group). We used this labeling method to identify over 27,000 likely U.S. political groups with posts between Election Day and Jan. 6. We hand-checked a sample of the groups to estimate the proportion that were actually U.S. political groups, and got a precision rate of about 79 percent.
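The group-level rule (mean post prediction over 0.5) can be sketched as follows. The classifier itself is left out; the sketch assumes each post already has a score between 0 and 1, and the function name, data layout and sample scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean

THRESHOLD = 0.5  # cutoff on the mean prediction, as described in the text

def label_groups(post_scores, threshold=THRESHOLD):
    """Return {group id: True/False}, True when the mean post score exceeds the threshold."""
    by_group = defaultdict(list)
    for group_id, score in post_scores:
        by_group[group_id].append(score)
    return {group_id: mean(scores) > threshold for group_id, scores in by_group.items()}

# Hypothetical per-post scores: (group id, predicted probability the post
# comes from a U.S. political group).
sample_scores = [
    ("g1", 0.9), ("g1", 0.4),   # mean 0.65, labeled likely political
    ("g2", 0.2), ("g2", 0.6),   # mean 0.40, not labeled
]
labels = label_groups(sample_scores)
```

Averaging over all of a group's posts means a single strongly political post does not by itself tip a group over the threshold.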
To count the number of posts that specifically sought to delegitimize the election results, we examined 18.7 million posts from Election Day through Jan. 6 within the likely U.S. political Facebook groups. We separated out posts from groups with “Stop the Steal” in their name and calculated which keywords and phrases were disproportionately common in posts from those groups using a text-analysis technique called TF-IDF. Then, we handpicked the terms and keywords that were meaningfully linked to election delegitimization theories (e.g., “stop the steal,” “steal the election,” “every legal vote”). We had about 60 terms that indicated delegitimization on their own, plus 86 more in two buckets that, if terms from both buckets were present, indicated delegitimization (e.g., a reference to “absentee ballots” on its own did not indicate delegitimization, but a reference to “absentee ballots” and “fraud” did). We identified about 1.03 million posts that likely referenced delegitimization. Finally, we hand-checked a sample of these posts to estimate the proportion that actually sought to delegitimize the election, and got a precision rate of about 64 percent. (False positives included mainstream news articles, debunks of fraud claims, and references to other countries’ elections.) We arrived at our final estimate of delegitimizing posts by multiplying the number of flagged posts by the precision rate, to get an estimate of a bit more than 655,000.
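The matching rule and the final adjustment can be sketched as below. The term lists contain only the examples quoted in the text, not the full lists of about 60 and 86 terms, and plain substring matching is an assumption; the actual matching logic is not described in detail.

```python
# Illustrative subsets only: the examples quoted in the text.
STANDALONE_TERMS = ["stop the steal", "steal the election", "every legal vote"]
BUCKET_A = ["absentee ballots"]  # voting-process terms (assumed grouping)
BUCKET_B = ["fraud"]             # accusation terms (assumed grouping)

def is_flagged(text):
    """Flag a post if it has a standalone term, or at least one term from each bucket."""
    lowered = text.lower()
    if any(term in lowered for term in STANDALONE_TERMS):
        return True
    in_a = any(term in lowered for term in BUCKET_A)
    in_b = any(term in lowered for term in BUCKET_B)
    return in_a and in_b

def estimated_true_count(flagged_count, precision):
    """Scale the number of flagged posts by the hand-checked precision rate."""
    return flagged_count * precision

# Using the rounded figures quoted in the text.
rounded_estimate = estimated_true_count(1_030_000, 0.64)
```

With the rounded inputs quoted above, the product comes to roughly 659,000. The published figure of a bit more than 655,000 presumably reflects unrounded values for the flagged-post count and the precision rate.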
Because of CounterAction’s sampling method, the groups we analyzed are likely to contain a greater proportion of right-wing groups than the platform as a whole. The activity of the right-wing groups we analyzed matches the findings of our reporting, and group activity in our sample coincided with Facebook’s public announcements about group removals. However, we would need additional outside data to analyze whether groups in our sample are representative of the broader platform. We sampled and checked precision rates in our analysis based on a 5 percent margin of error and 95 percent confidence level.
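For a sense of what a 5 percent margin of error at a 95 percent confidence level implies for hand-checking, the sketch below uses a common textbook sample-size formula for a proportion, with an optional finite population correction. The text does not say which formula was used, so the formula, the z-value of 1.96 and the worst-case proportion of 0.5 are all assumptions.

```python
import math

def sample_size(margin=0.05, z=1.96, p=0.5, population=None):
    """Number of items to hand-check for a given margin of error.

    z=1.96 corresponds to a 95 percent confidence level; p=0.5 is the most
    conservative guess for the unknown proportion. If a population size is
    given, the finite population correction is applied.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

# Large-population case, and a case sized to the roughly 27,000 likely
# political groups mentioned in the text.
n_large = sample_size()
n_groups = sample_size(population=27_000)
```

Under these assumptions, a sample of a few hundred items is enough regardless of how large the full set is, which is why hand-checking stays practical even against millions of posts.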