We are grateful to Eugene Volokh for the invitation to discuss corpus linguistics generally and our forthcoming article, “Judging Ordinary Meaning,” in particular.

Corpus linguistics is an empirical approach to the study of language that involves large, electronic collections of texts known as corpora (the plural of corpus). Corpus linguists draw inferences about language from data gleaned from real-world language in its natural habitat: books, magazines, newspapers and even transcripts of spoken language. Corpus analysis lets us test hypotheses about language through rigorous experimentation with observable, quantifiable data and arrive at results that are replicable and falsifiable.

In our article we rely on the (comparatively) new tool of corpus linguistics to examine a very old problem — how to discover the “ordinary meaning” of a legal text.

When called upon to interpret legal texts, judges often invoke the so-called ordinary meaning rule, a familiar canon of interpretation that states that if the language of the text is clear and unambiguous, courts cannot consider any extrinsic evidence to decide what the text means. There are some very good reasons why courts follow the ordinary meaning rule (many of which are outlined in the opening chapter of William Eskridge Jr.’s recent book, “Interpreting Law”). These include a respect for the rule of law that enshrines the predictable, neutral and objective application of the laws enacted by Congress and a respect for the constitutional processes by which those laws are enacted.

But the ordinary meaning rule also has a number of problems, both in the way that ordinary meaning is theorized and the way that it is operationalized (or measured).

With respect to theory, it is ironic that judges have no agreed-upon notion of what the phrase “ordinary meaning” actually means. The case law embraces a startlingly broad range of senses of ordinary meaning (which we examine in depth in the article). Judges sometimes speak of ordinary meaning in terms of what senses of a word are possible in a given context, while at other times they speak in terms of which of two competing senses is the most common. Courts often fail to take into account a variety of factors that bear on the meaning of an utterance, including its context, historical usage and the speech community in which it was uttered. Even if passing reference is made to any of these considerations in a given case, courts lack a systematic framework for addressing them.

Turning to operational problems, the approach to ordinary meaning taken by many courts relies very heavily on intuition and dictionaries. But human linguistic intuition is, at best, a problematic guide to the predictable and objective resolution of questions of ordinary meaning. And general-use dictionaries of the kind most often cited by courts typically set forth a range of possible meanings of a given word; they cannot be relied upon to show the ordinary meaning of that word in the particular context of a statute.

We argue that a complete theory of ordinary meaning requires us to take into account not only the comparative frequency of different senses, but also the context of an utterance, its historical usage and the speech community in which it was uttered. Context necessarily includes both the formal aspects of an utterance (its syntactic structure and semantic features) and its pragmatic aspects, including the physical, spatial and social environment in which it occurs. Ordinary meaning should also take into account historical usage, acknowledging the simple fact that language is in a constant state of change (though it does not change at a predictable rate). And it should take into account variations in meaning across different speech communities and different linguistic registers.

If we are going to incorporate questions of comparative frequency; syntactic, semantic and pragmatic context; historical usage; and speech community into our analysis of ordinary meaning, then we need some way to sample and measure these things. That is where linguistic corpora come in.

Using the tools of a linguistic corpus, we can measure the comparative frequency of a given sense of a given word in a given context. We can design a corpus search that takes into account the syntactic and semantic context of the word or phrase in question. We can search for sample sentences that share similar pragmatic contexts with the text under examination. We can create linguistic corpora to model the speech or writing of a wide variety of speech communities and registers, and we can build corpora from the surviving texts from any period in history.
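
For readers who want a concrete picture of what measuring comparative frequency might look like, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method used in our article or in any particular corpus tool: the function names `concordance` and `sense_frequencies` are hypothetical, the four sentences are invented rather than drawn from a real corpus, and the sense labels stand in for judgments that a human coder would make after reading each hit in context.

```python
import re
from collections import Counter


def concordance(texts, keyword, window=3):
    """Return (left context, hit, right context) for each occurrence of keyword."""
    lines = []
    for text in texts:
        # Crude tokenizer: runs of letters and apostrophes only.
        tokens = re.findall(r"[A-Za-z']+", text)
        for i, tok in enumerate(tokens):
            if tok.lower() == keyword.lower():
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                lines.append((left, tok, right))
    return lines


def sense_frequencies(coded_senses):
    """Relative frequency of each sense label among the coded hits."""
    counts = Counter(coded_senses)
    total = sum(counts.values())
    return {sense: n / total for sense, n in counts.items()}


# Invented toy sentences (an assumption for illustration, not real corpus data).
toy_corpus = [
    "She deposited the check at the bank on Monday.",
    "We fished from the bank of the river.",
    "The bank approved the loan.",
    "The storm flooded the town.",  # no hit: the keyword does not occur
]

hits = concordance(toy_corpus, "bank", window=3)

# Hypothetical labels a human coder might assign, one per hit, in order.
coded = ["financial", "river", "financial"]
freqs = sense_frequencies(coded)

for (left, hit, right), sense in zip(hits, coded):
    print(f"{left:>20} [{hit}] {right:<20} -> {sense}")
print(freqs)
```

The step that matters here is the one the code cannot perform: deciding which sense each hit reflects. The software only retrieves the examples and tallies the labels, which is why the context window, the coding criteria and the corpus itself all need to be disclosed for the result to be replicable.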

By incorporating corpus methods into the search for ordinary meaning, we can turn a largely intuitive and opaque inquiry into an empirical and transparent one.

Writing in 2011, language commentator Ben Zimmer stated (with some qualification) that “the corpus revolution promises to put judicial inquiries into language patterns on a firmer, more systematic footing.” We agree. And over the course of the coming week, we look forward to outlining the promise of corpus linguistics for questions of legal interpretation, as well as some important limitations.