Screenshots and data used in Citizen Lab report: Tracing the Path of a Censored Weibo Post and Compiling Keywords that Trigger Automatic Review
#Data
- Part 1: Weibo search data of CDT keywords, tested May 2014 and Nov 2014 (CSV)
- Part 2 & 3: Weibo censorship testing data of probable automatic review keywords, tested Nov 8, 2014 and Nov 10, 2014
- 66 keywords which cannot be posted to Weibo according to our preliminary test (explicit filtering)
- 14 keywords which return a data error (implicit filtering)
- 133 keywords which cause posts to be invisible and camouflaged
#Screenshots
- Explicit filtering message (box 1 in Figure 1): Chinese | English
- Implicit filtering message (box 3 in Figure 1): Chinese | English
- Comparison of missing messages in timeline due to “camouflaged”/invisible posts (box 2A1 in Figure 1): own timeline vs when viewed by another user
- Message when visiting deleted Weibo post (box 4a in Figure 1): screenshot
- Post deletion message in inbox (box 4a in Figure 1): screenshot
- Account warning, 48 hour ban (box 4C in Figure 1): screenshot
- Account abnormal notice (box 4C in Figure 1): screenshot
Fig 1. Possible pathways for a Weibo post
#Summary of Report
- Part 1: Weibo has removed its conventional censorship notice from searches on the site. This may be a bellwether for enhanced censorship during a particularly sensitive period in Chinese politics, it may mark a shift toward more obscured censorship, or it may simply be Weibo testing out new tactics.
- Part 2: Automatic review of content refers to moderating messages before they circulate widely. A number of studies have described these mechanisms, which include keywords that cause a post to be hidden or prevent it from being posted in the first place. We sought to outline more clearly the pathways a Weibo post might take, from submission to potential censorship.
- Part 3: We accomplished the above by posting sensitive keywords and tracking whether and how they were censored, based on a number of factors. This is only a preliminary set of tests, and we hope it will serve as a basic methodology for others interested in more rigorous testing.
- Data and screenshots: We’ve posted to GitHub the data used for this test, lists of suspected keywords that trigger automatic review, and screenshots of the various censorship messages.
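
For anyone extending these tests, the outcome categories used in this README (explicit filtering, implicit filtering, camouflaged posts) could be recorded along the following lines. This is only a rough sketch, not the procedure used in the report: the `Observation` fields, the `classify` rules, and the placeholder keywords are all hypothetical, and it does not cover later events such as post deletion or account warnings.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    """Outcome labels taken from this README; the mapping rules below are assumptions."""
    EXPLICIT_FILTERING = "explicit filtering"  # post refused with a notice
    IMPLICIT_FILTERING = "implicit filtering"  # post fails with a data error
    CAMOUFLAGED = "camouflaged"                # post shown to author, hidden from others
    PUBLISHED = "published"                    # post visible to other users


@dataclass
class Observation:
    """One manual test of one keyword (hypothetical record layout)."""
    keyword: str
    rejection_notice: bool    # an explicit filtering message was shown
    data_error: bool          # a data error was returned instead of posting
    visible_to_others: bool   # a second account could see the post


def classify(obs: Observation) -> Outcome:
    """Map a single observation to an outcome category."""
    if obs.rejection_notice:
        return Outcome.EXPLICIT_FILTERING
    if obs.data_error:
        return Outcome.IMPLICIT_FILTERING
    if not obs.visible_to_others:
        return Outcome.CAMOUFLAGED
    return Outcome.PUBLISHED


def tally(observations):
    """Count how many observations fall into each outcome category."""
    return Counter(classify(obs) for obs in observations)


# Placeholder observations for illustration only; these are not real test results.
observations = [
    Observation("placeholder_a", True, False, False),
    Observation("placeholder_b", False, True, False),
    Observation("placeholder_c", False, False, False),
    Observation("placeholder_d", False, False, True),
]

counts = tally(observations)
for outcome in Outcome:
    print(f"{outcome.value}: {counts[outcome]}")
```

Checking visibility from a second account is what separates a camouflaged post from a published one, which is why `visible_to_others` is recorded separately from the posting result.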