The Washington Post

How China censors 100 million tweets per day

People use computers at an Internet cafe in central China's Anhui province. (AP Photo)

A new study by researchers at Rice, Bowdoin and the University of New Mexico sheds unprecedented light on how China censors the 300 million users and 100 million daily messages of its Twitter-like platform, Weibo.

Previously, we knew which terms censors were most likely to snag and which provinces were affected most. But the new study pinpoints how long it takes censors to take down a post (generally, 5 to 10 minutes) and makes some compelling guesses about how they do it. The researchers gathered their data from more than 3,500 frequently censored Weibo users over a period of a month.

They found that censors took down 12 percent of total posts.

Censoring Weibo is, obviously, a profound logistical undertaking -- the site sees nearly five times as many posts in the average minute as Twitter did during the State of the Union. The researchers have a few hypotheses for how that happens:

  1. Explicit filtering: a banned keyword triggers an automated system, which stops the message from posting and warns the user he has violated policy.
  2. Implicit filtering: a banned keyword triggers an automated system, which delays the message until a censor can see it and tells the user there's a server error in the meantime.
  3. Camouflaged posts: a banned keyword triggers an automated system, which keeps the message from displaying publicly but shows the user it has posted.
  4. Backwards repost search: either a human censor or an automated system discovers a problematic post and deletes all versions of it (re-posts, etc.) across the network.
  5. Backwards keyword search: a censor notices a problematic keyword and deletes a number of its instances across the network.
  6. User monitoring: certain users who are censored frequently are flagged for closer scrutiny.
  7. Account closures: censors shut down problematic accounts entirely. The study counted 300 such closures among the 3,500 accounts it tracked in a one-month period.
  8. Search filtering: a regularly updated list of terms cannot be searched.
  9. Public timeline filtering: sensitive topics are edited out of the general Weibo "fire hose."
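
To make the first four hypotheses concrete, here is a minimal Python sketch of how keyword-triggered routing and a backwards repost search might fit together. It is an illustration only, not Weibo's actual system: the function names, the `Action` labels, and the pairing of each keyword with a particular action are all assumptions made for the example. Only the three keywords themselves come from the study as quoted in this article.

```python
from enum import Enum


class Action(Enum):
    PUBLISH = "publish normally"
    REJECT = "block and warn the user"            # hypothesis 1: explicit filtering
    HOLD = "delay for review, show server error"  # hypothesis 2: implicit filtering
    HIDE = "show to the author only"              # hypothesis 3: camouflaged post


# Hypothetical rule table. Which keyword leads to which action is an
# assumption for illustration; the study does not report this mapping.
RULES = {
    "support syrian rebels": Action.REJECT,
    "lying of government": Action.HOLD,
    "beijing rainstorms": Action.HIDE,
}


def route_post(text, rules=RULES):
    """Return the action for the first banned keyword found in the text."""
    lowered = text.lower()
    for keyword, action in rules.items():
        if keyword in lowered:
            return action
    return Action.PUBLISH


def backwards_repost_search(parents, flagged_id):
    """Hypothesis 4: collect a flagged post and every repost descended from it.

    `parents` maps each post id to the id of the post it reposts,
    or to None if the post is an original.
    """
    to_delete = {flagged_id}
    changed = True
    while changed:  # repeat until no new reposts are found
        changed = False
        for post_id, parent_id in parents.items():
            if parent_id in to_delete and post_id not in to_delete:
                to_delete.add(post_id)
                changed = True
    return to_delete
```

For example, `route_post("Heavy Beijing rainstorms tonight")` returns `Action.HIDE` under these assumed rules, and with `parents = {1: None, 2: 1, 3: 2, 4: None}`, flagging post 1 also sweeps up reposts 2 and 3 while leaving the unrelated post 4 alone.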

Among the keywords that could trigger a deletion? "Support Syrian rebels," "lying of government" and "Beijing rainstorms," the study reports. (The full list does not look thematically different from a list of terms used to filter the Chinese Internet overall, obtained by the Post in 2006.) The rainstorms caused widespread destruction and anti-government outrage in July 2012, and China officially supports the Syrian regime.

A few other factors make a user more likely to get censored, as well: the time of day, for instance, or whether the user regularly posts inflammatory things. Interestingly, the researchers found that censorship times drag at night and around 7 p.m., when a national news program airs.

As for how many censors are employed to watch Weibo, that is hard to say. The paper calculates that, if real people do indeed check every post, there would need to be roughly 4,200 of them.
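
A back-of-the-envelope version of that kind of estimate can be sketched in a few lines of Python. The 100 million daily messages come from this article. The reading speed of 50 posts per minute per censor and the three eight-hour shifts are illustrative assumptions that do not appear in the article, and the function name is invented for the example.

```python
import math

POSTS_PER_DAY = 100_000_000        # from the article
POSTS_PER_CENSOR_PER_MINUTE = 50   # assumption: how fast one person can scan posts
SHIFTS_PER_DAY = 3                 # assumption: round-the-clock coverage in 8-hour shifts


def censors_needed(posts_per_day, posts_per_censor_per_minute, shifts_per_day):
    """Estimate total staff if a human looks at every single post."""
    posts_per_minute = posts_per_day / (24 * 60)
    # Censors who must be on duty at any one moment, rounded up.
    on_duty = math.ceil(posts_per_minute / posts_per_censor_per_minute)
    return on_duty * shifts_per_day


total = censors_needed(POSTS_PER_DAY, POSTS_PER_CENSOR_PER_MINUTE, SHIFTS_PER_DAY)
print(total)  # 4167
```

Under these assumptions the total comes to 4,167, in the neighborhood of the roughly 4,200 the paper reports. Halving the assumed reading speed would double the headcount, so the figure is best read as an order of magnitude.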