The following is a guest post from political scientist Nick Beauchamp of Northeastern University. He is a member of the interdisciplinary NULab for Text, Maps and Networks and an affiliate of the NYU Social Media and Political Participation (SMaPP) laboratory.
To illustrate this temporal structure and provide insight into its rhetorical development, we can algorithmically create something like a word cloud, but one that also captures, in the positioning of the words, the trajectory of the speech through its ideas over time. Figure 1 is what I call a “plot map” of last night’s SOTU address. The speech itself is the blue line wending through space, beginning at 1 and ending at 5. The words are the 20 most frequent words in the speech, not including boring words like “the” and “of.” Words are sized by frequency and colored according to clusters of co-occurring words.
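For readers who want to see the word-selection step concretely, here is a minimal sketch in Python of picking the most frequent words while skipping the boring ones. It is not the code behind the plot map: the function name `top_words` and the tiny `STOPWORDS` set are assumptions made up for illustration, and a real analysis would use a much fuller stopword list.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "of", "and", "a", "to", "in", "is", "that", "it", "for"}

def top_words(text, n=20, stopwords=STOPWORDS):
    """Return the n most frequent non-stopword tokens as (word, count) pairs."""
    # Lowercase and split into runs of letters/apostrophes; no stemming is applied,
    # so "america", "american" and "americans" stay distinct.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in stopwords)
    return counts.most_common(n)

sample = "The economy and the workers of America. America and the economy of America."
print(top_words(sample, n=2))
```

The counts returned here are what would drive the word sizes; the coloring by clusters of co-occurring words is a separate step not shown.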
The positioning of the words is not simply aesthetic, as it is in a word cloud. Here the words and the trajectory are plotted in a shared 2D space using a variant of principal component analysis (PCA). The positions of the words reflect the fact that different words can occur together or be opposed to each other: when some are mentioned, others tend not to be, and vice versa. The PCA procedure automatically infers the dominant axes of opposition, and these oppositions often have easily interpretable meanings. In this case, we see along the horizontal dimension an opposition between broad rhetoric about America and its people on the right vs. more policy-oriented discussion of the economy or world events on the left, while the domestic and international are opposed along the vertical dimension.
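To make the idea of a shared space concrete, here is a rough sketch using plain PCA computed via a singular value decomposition in NumPy. The plot map uses a variant of PCA whose details are not given here, so treat this as an illustration of the general idea only. The function name `pca_word_map`, the segments-by-words count matrix, and the toy numbers are all assumptions.

```python
import numpy as np

def pca_word_map(counts, k=2):
    """Plain PCA on a (segments x words) count matrix.

    Returns (segment_coords, word_coords), each with k columns, so that
    speech segments and words can be drawn on the same axes.
    """
    X = np.asarray(counts, dtype=float)
    X = X - X.mean(axis=0)                  # center each word's counts
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    segment_coords = U[:, :k] * s[:k]       # one point per segment: the trajectory
    word_coords = Vt[:k].T                  # one point per word: the loadings
    return segment_coords, word_coords

# Toy data (invented): rows are consecutive speech segments, columns are words.
# Word 0 fades out as word 1 comes in; word 2 is mentioned evenly throughout.
toy = [[4, 0, 1],
       [3, 1, 1],
       [1, 3, 1],
       [0, 4, 1]]
segments, words = pca_word_map(toy)
print(words[:, 0])     # words 0 and 1 get opposite signs on the first axis
print(segments[:, 0])  # the segments move from one end of that axis to the other
```

Words that tend not to appear together end up at opposite ends of an axis, and connecting the segment points in order gives a line through the word map.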
Within that 2D word map, we can see how the speech itself moves through these topics. It begins with broad discussions of Americans and the specific example of Rebekah; it then segues into a discussion of the economy and the plight of American workers; from there it moves into a discussion of world events; and finally it returns to the grand themes of America and its people. While this return to its beginnings might seem like a natural trajectory for most speeches, it is actually somewhat rare for a speech to manage such a neat circle: doing so requires a carefully constructed structure that revisits its beginnings on a number of different dimensions. As listeners to the speech might have noticed, a relatively large amount of time was spent at the end revisiting the subjects of America and cooperation, rather than the briefer, more perfunctory return that tends to be more common.
Perhaps most interesting is how different variants of America appear in different places. It is common in text analysis to collapse all variants of a word into a single stem, but this speech illustrates how that can hide important distinctions. “Americans” is who we are as members of this country; “American” is the designator of this country in the world; and “America” is something more than the country, an idea as well as a people, which Obama drove home in his finish. The speech is replete with groups – Congress, families, businesses, workers, people, the world, Americans and Rebekah – but the grand trajectory of the speech shows how all of these are in effect waypoints building to the final, concluding notion of America.
(You can try the technique out on a favorite speech or your own work at: http://nickbeauchamp.com/projects/plotmapper.php )