A protester chants during a demonstration against police violence and the shooting of Michael Brown, inside the Rotunda at the State Capitol Building in Jefferson City, Mo.,  Dec. 5, 2014. A grand jury declined to indict white policeman Darren Wilson for the killing of unarmed black teenager Brown in Ferguson, Mo., spurring rioting in the St. Louis suburb. REUTERS/Jim Young 

The following is a guest post by Northeastern University political scientist Nick Beauchamp.


How do people interpret the facts so differently, even when they observe the same event? With the recent grand jury decision not to indict New York City police officer Daniel Pantaleo in the death of Eric Garner, national attention has returned to the ambiguities of the witness statements that play such a large role in grand jury decisions. The New York case is especially frustrating because the events were so well documented by video, yet interpretations remain highly polarized, just as they were in the aftermath of Ferguson.

Named after a 1950 Kurosawa film, the “Rashomon effect” is a well-known problem in criminal justice, where self-serving biases of participants and witnesses color their testimony, even if unintentionally. This phenomenon has played out in full force in the Ferguson case, where, aside from the ill-recorded physical evidence, no record of what happened exists beyond the witnesses’ reports.

Using a mathematical analysis of the witness statements, we can see how the variations in what the witnesses saw or recollected are shaped by underlying, and perhaps unconscious, politically polarized psychology. This psychology is indicative of the way in which individuals under stress process information, and how biases color their recounting of events.

Shortly after the Ferguson decision was announced, the prosecutor released much of the documentation generated during the case, including a large number of witness statements.  PBS.org collated many of those statements, noting for dozens of different witnesses which of them affirmed, denied, or did not mention eight different key disputed events in the shooting of Michael Brown by officer Darren Wilson.  The table below shows an excerpt of that collation:

Figure 1: Witness interviews from the Ferguson shooting collated by PBS.org
Data, Table: PBS.org

These people all (purportedly) saw the same events with their own eyes, and they mainly derive from a shared community with more homogeneous political views than the polarized public at large. Yet they clearly disagree on many of the objective events that occurred. What’s going on? Are these variations due to random omissions and the inherently noisy process of recollection? Or are there more systematic patterns discernible in who saw, affirmed, or denied what?

To discern whether there are any underlying patterns in these yes/no/NA data, we can use a technique called Principal Component Analysis (PCA). This approach takes a table of 1’s (yes), 0’s (no), and non-answers (coded here as “0.5”, halfway between 0 and 1) and finds the underlying dimensions that best “explain” – in a statistical sense – the variation in answers across witnesses and claims. We can then plot the claims and/or the witnesses according to these dimensions, and interpret what, if anything, these revealed latent dimensions mean.
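As a rough illustration of the procedure just described, here is a minimal sketch in Python with NumPy. The table is an invented toy example, not the PBS.org data, and the function name `pca_claim_loadings` and the choice to compute PCA through a singular value decomposition of the column-centered table are assumptions; the post does not say which software or preprocessing was actually used.

```python
import numpy as np

# Hypothetical toy table (NOT the PBS.org data).
# Rows are witnesses, columns are claims.
# 1 = affirmed, 0 = denied, 0.5 = not mentioned.
X = np.array([
    [1.0, 1.0, 0.0, 0.5],
    [1.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.5, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.5, 0.0],
])

def pca_claim_loadings(table, n_components=2):
    """Return each claim's position on the leading principal components.

    Assumed recipe: center every column, then take the SVD.
    The rows of Vt are the principal directions in claim space.
    """
    centered = table - table.mean(axis=0)
    _, singular_values, Vt = np.linalg.svd(centered, full_matrices=False)
    loadings = Vt[:n_components].T  # shape: (n_claims, n_components)
    # Share of total variance captured by each retained component.
    explained = singular_values**2 / np.sum(singular_values**2)
    return loadings, explained[:n_components]

loadings, explained = pca_claim_loadings(X)
print(loadings)   # column 0 would be a claim's x position, column 1 its y position
print(explained)
```

The sign of each component is arbitrary, so a plot built this way may come out mirrored left to right or top to bottom; only the relative positions of the claims carry meaning.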

The figure below shows a network plot of the eight key disputed factual claims about the Ferguson shooting, arranged so that their horizontal (x) position matches their first Principal Component dimension, and their vertical (y) position matches the second Principal Component dimension.  Again, these are derived directly from the yes/no data in Figure 1 by a machine that knows nothing of what they or the witnesses mean.  The claims are drawn with lines linking them, where green lines link those claims that are positively correlated (i.e., tended to be affirmed in tandem by the same witnesses) and red lines are those that are negatively correlated.
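The green and red links can be sketched in the same spirit. The snippet below uses an invented toy table and placeholder claim names (neither comes from the PBS.org data), and the `min_abs_corr` cutoff is a hypothetical parameter: the post does not say whether weak correlations were dropped from the figure.

```python
import numpy as np

# Hypothetical toy table (NOT the PBS.org data); placeholder claim names.
claims = ["claim_a", "claim_b", "claim_c", "claim_d"]
X = np.array([
    [1.0, 1.0, 0.0, 0.5],
    [1.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.5, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.5, 0.0],
])

def correlation_edges(table, names, min_abs_corr=0.3):
    """List claim pairs with a color given by the sign of their correlation."""
    # rowvar=False treats each column (claim) as a variable.
    corr = np.corrcoef(table, rowvar=False)
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = corr[i, j]
            # A claim with no variation yields NaN, which fails this test
            # and is therefore skipped.
            if abs(r) >= min_abs_corr:
                color = "green" if r > 0 else "red"
                edges.append((names[i], names[j], color, float(r)))
    return edges

for a, b, color, r in correlation_edges(X, claims):
    print(f"{a} -- {b}: {color} ({r:+.2f})")
```

In this toy table the first two claims tend to be affirmed together, as do the last two, so each of those pairs gets a green link, while every pair that crosses between the two groups gets a red one.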

Figure 2: Plot of disputed events in the Ferguson shooting based on PCA analysis of the witness statements
Data: PBS.org, Figure: Nick Beauchamp

What pattern can we see in this graph? Remarkably, the variation in what witnesses affirmed or denied maps almost perfectly onto two of the main issues that have polarized the wider population in discussing Ferguson. Bringing our human (non-machine) knowledge to bear now, we can color code the eight claims according to which ones support Brown’s side of the dispute, and which ones support Wilson’s side. Those in blue are pro-Brown (e.g., his hands were up or he was shot at while running away), while those in red are pro-Wilson (e.g., Brown charged at Wilson or had his hand at his waist indicating a possible gun).

In addition, looking at the green lines, we can see two clusters of positively correlated claims: those that strongly support Brown (on the right of Figure 2) and those that strongly support Wilson (on the left). We would certainly not expect to see these two clusters if the variation in who saw what was merely random; clearly, whether they intend it or not, the witnesses divide into two camps, albeit on a more continuous spectrum than the nation at large.

So what’s going on in the second latent dimension, the vertical y axis?  I have drawn boxes around those claims that were mentioned (affirmed or denied) by more than half of the witnesses.  Those claims also turn out to be the claims that were affirmed by the highest percentage of those witnesses who mentioned them at all in their statements.  Returning to the fallible matter of human interpretation, one might argue that these boxed claims are the ones that, based only on the statement data, are most likely to have been affirmed without solicitation, and may be more likely to be true.  Note in the figure that the claims most affirmed (toward the top) tend to be those which are not in the (green-line) pro-Brown or pro-Wilson clusters, perhaps implying that the denser positive (green) connections between those claims (hand_at_waist/charged_at_DW on one side, shot_while_down/shot_kneeling on the other) may reflect the greater degree of confabulation in those especially partisan claims.
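The two quantities behind the boxes, how often a claim was mentioned and how often it was affirmed among those who mentioned it, can be computed directly from the coded table. The sketch below assumes the same 1 / 0 / 0.5 coding and uses an invented three-claim toy table rather than the PBS.org data; the function name `mention_and_affirm_rates` is hypothetical.

```python
import numpy as np

# Hypothetical toy table (NOT the PBS.org data).
# Rows are witnesses, columns are claims.
# 1 = affirmed, 0 = denied, 0.5 = not mentioned.
X = np.array([
    [1.0, 1.0, 0.5],
    [1.0, 0.0, 0.5],
    [1.0, 0.5, 0.5],
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.5],
])

def mention_and_affirm_rates(table, na_code=0.5):
    """Per claim: share of witnesses who mentioned it, and share of
    those mentioning it who affirmed it."""
    mentioned = table != na_code
    n_mentioned = mentioned.sum(axis=0)
    mention_rate = mentioned.mean(axis=0)
    n_affirmed = (table == 1.0).sum(axis=0)
    # Avoid dividing by zero for a claim nobody mentioned; report NaN instead.
    affirm_rate = np.where(
        n_mentioned > 0, n_affirmed / np.maximum(n_mentioned, 1), np.nan
    )
    return mention_rate, affirm_rate

mention_rate, affirm_rate = mention_and_affirm_rates(X)
boxed = mention_rate > 0.5  # claims mentioned by more than half of the witnesses
print(mention_rate, affirm_rate, boxed)
```

The third toy claim shows why the two rates are worth separating: the single witness who mentioned it affirmed it, giving an affirmation rate of 1.0, yet it falls well short of the more-than-half threshold for mentions.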

But again, that is merely interpretation; all the data themselves show is that the first dimension (x) tends to reflect the pro-Brown/pro-Wilson spectrum, and the second (y) tends to reflect how broadly mentioned and affirmed a claim was. What that means about the witnesses or the psychological process of observation or recollection is up to us.

Finally, we can do the exact same PCA procedure, but looking at the variation in the witnesses rather than in their claims.  The two latent dimensions are mathematically the same, but this time we plot the witnesses, according to where they lie on the pro-Brown/pro-Wilson spectrum (x) and how widely affirmed by other witnesses their claims were (y).
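One way to place witnesses in the same space is to project each witness’s row of answers onto the components fitted from the table. The sketch below is an assumption about how this could be done, not a record of the author’s code; the table is an invented toy example rather than the PBS.org data, and `fit_components` and `project` are hypothetical helper names.

```python
import numpy as np

# Hypothetical toy table (NOT the PBS.org data).
# Rows are witnesses, columns are claims.
# 1 = affirmed, 0 = denied, 0.5 = not mentioned.
X = np.array([
    [1.0, 1.0, 0.0, 0.5],
    [1.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.5, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.5, 0.0],
])

def fit_components(table, n_components=2):
    """Column means and the leading principal directions of the table."""
    col_means = table.mean(axis=0)
    _, _, Vt = np.linalg.svd(table - col_means, full_matrices=False)
    return col_means, Vt[:n_components]

def project(rows, col_means, components):
    """Coordinates of one or more witnesses on the fitted components."""
    return (np.atleast_2d(rows) - col_means) @ components.T

col_means, components = fit_components(X)
scores = project(X, col_means, components)
print(scores)  # column 0 would be a witness's x position, column 1 the y position
```

Because the witness coordinates and the claim positions come from the same decomposition, witnesses who affirmed opposing groups of claims land on opposite sides of the first axis, here the first and third toy witnesses.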

Figure 3: Plot of witnesses in the Ferguson shooting in the same space as Figure 2
Data: PBS.org, Figure: Nick Beauchamp

Figure 3 shows the result of this plot. Note the location of Wilson (DW): far to the left (the pro-DW side, as in Figure 2), and somewhat to the bottom (the less-well-affirmed events). Again, the implication of that y position in particular is up to us. But based on the placements of the other witnesses, for future research it might be worth more closely examining witnesses 10 and 34 (whose stories most resemble DW’s), 45 and 46 (whose stories most support Brown), and particularly witnesses 12 and 14, whose stories most precisely contain only those details most mentioned and affirmed by the other witnesses. In this way we could perhaps verify (a) whether the claims in the upper direction are more likely to actually be true, and (b) what the psychological mechanisms are that shape this politicized Rashomon-like variation in the perceptions or recall of factual events.