Joshua Tucker (JT): What exactly is the field of “text analysis,” and how is it different from simply reading existing texts?
Margaret Roberts (MR): “Text analysis” is revolutionizing social-science research. Automated text analysis allows social scientists to use computational methods to quickly study the content of huge numbers of political documents. Humans write billions of words every day about their social lives and about politics. People share their political opinions in social-media posts, governments record the minutes of meetings and the text of legislation, and newspapers recount political events in daily publications.
We are so prolific that social scientists could never read every document containing information about their topic of study; doing so would take lifetimes spent on nothing but reading.
Text analysis can also be used to assist reading: It can flag a selection of documents that should be read in more detail, focusing social scientists on important, representative or influential texts.
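To make the idea of flagging documents for closer reading concrete, here is a minimal sketch in Python. It is not a method from the interview or the special issue: it assumes that "representative" can be approximated as "closest to the average TF-IDF profile of the collection", and the function names (`tokenize`, `tfidf_vectors`, `flag_representative`) are hypothetical helpers written for this illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} TF-IDF dictionary per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_representative(docs, k=2):
    """Return indices of the k documents closest to the corpus average.

    Assumption for this sketch: similarity to the mean TF-IDF vector
    is used as a stand-in for "representative".
    """
    vectors = tfidf_vectors(docs)
    centroid = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            centroid[term] += weight / len(vectors)
    scores = [cosine(vec, centroid) for vec in vectors]
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


docs = [
    "the senate passed the budget bill",
    "the senate debated the budget bill today",
    "senate vote on the budget bill delayed",
    "my cat enjoys sleeping in the sun",
]
flagged = flag_representative(docs, k=2)
```

In this toy collection the off-topic fourth document is not among the flagged ones, so a reader would be pointed at the budget-related texts first. Real research pipelines would likely use richer models and far larger corpora.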
JT: How are political scientists using text analysis?
MR: Automated text analysis has allowed researchers to analyze political phenomena at a previously impossible scale.
Some of these political data are brand new and some of them have existed for decades, but we are only now unlocking their potential to obtain better descriptions of important phenomena from politics to religion to political opinion.
JT: What are the major challenges facing more widespread adoption of text analysis in research?
MR: Currently, the challenge for text analysis isn’t getting data. Large amounts of text data are being produced and documented online at a rate faster than social scientists can use them.
The most important challenge is estimating concepts of interest to social scientists directly from the texts. “Big” data, and text data in particular, are only as useful as the methods we have for using them to answer questions. Social scientists would like not only to automatically extract measures of topics and sentiment from texts, but also to uncover more complex social phenomena such as persuasion, humor, sarcasm, innovation and influence.
Researchers are making progress developing statistical methods that can summarize the complex social processes that are richly reflected in text, and the authors in our special issue make significant strides toward this goal with the methods that they develop for automated text analysis. Statisticians, social scientists, companies and computer scientists are making these methods available through statistical software so others can use them off the shelf (see some recent software by political scientists).
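As a rough illustration of the gap between simple measures such as sentiment and subtler phenomena such as sarcasm, the sketch below scores text with a word-list approach. This is an assumption chosen for brevity, not a method from the special issue: the `POSITIVE` and `NEGATIVE` word lists are made up for this example, and `sentiment_score` is a hypothetical helper.

```python
import re

# Hypothetical toy lexicons, invented for this illustration only.
POSITIVE = {"good", "great", "support", "win", "hope"}
NEGATIVE = {"bad", "terrible", "oppose", "fail", "fear"}


def sentiment_score(text):
    """Return (positive words - negative words) / total words.

    The score is 0.0 for empty text, above 0 when positive words
    outnumber negative ones, and below 0 in the opposite case.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    positive = sum(token in POSITIVE for token in tokens)
    negative = sum(token in NEGATIVE for token in tokens)
    return (positive - negative) / len(tokens)


sincere = sentiment_score("I support this great bill")
critical = sentiment_score("A terrible plan that will fail")
sarcastic = sentiment_score("Oh great, another win for transparency")
```

The sincere sentence scores above zero and the critical one below zero, but the sarcastic sentence also scores above zero because the word list sees only "great" and "win". That is one small example of why measuring sarcasm, humor or persuasion calls for more than counting words.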
JT: What was the most interesting thing you learned when putting together this special issue? What are the most exciting questions that the articles in the special issue could be used to answer in the future?
MR: It used to be that we could only study people by meeting them in person, such as through surveys or interviews that required painstaking travel, sometimes to remote locations. This severely constrained the types of people we could reach and study.
JT: How long will these articles be available for public access?
MR: The virtual issue will be online and freely available until September 2016.