Libby’s quandary will come as no surprise to anyone who has tried to use a computer to translate things. For decades, machine translation was mostly useful if you were trying to be funny. But in the last few years, as anyone using Google Translate, Babel Fish or many other translation Web sites can tell you, things have changed dramatically. And all because of an effort begun in the 1980s to remove humans from the equation.
As the late Frederick Jelinek, who pioneered work on speech recognition at IBM in the 1970s, is widely quoted as saying: “Every time I fire a linguist, my translation improves.” (He later denied putting it so harshly.)
Up to that point, researchers working on machine translation had relied on linguistic models. The thinking was that if a computer could understand how a sentence worked grammatically in one language, it could create a sentence meaning the same thing in another. But because grammatical rules differ so much from one language to the next, this proved difficult.
Jelinek and his group at IBM argued that by using statistics and probability theory, instead of language rules, a computer could do a better job of converting one language into another. Translation, in their view, was as much a mathematical problem as a linguistic one.
The computer wouldn’t understand the meaning of what it was translating. But given a huge database of words and sentences in different languages, it could be programmed to find the most common sentence constructions and word alignments, and to work out how these were likely to correspond between languages. (Warren Weaver, a mathematician at the Rockefeller Foundation, had first raised the idea of a statistical model for translation in a 1947 letter in which he wrote: “When I look at an article in Russian, I say: ‘This is really written in English, but it has been coded in some strange symbols.’ ”)
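To make the statistical idea concrete, here is a minimal sketch of how word correspondences can be estimated from nothing but paired sentences. It uses a standard textbook technique (expectation-maximization over word co-occurrences); the article does not say this is the exact procedure the IBM group followed, and the function names and the three-sentence corpus are invented purely for illustration.

```python
from collections import defaultdict


def train_word_table(pairs, iterations=20):
    """Estimate prob[e][f]: how likely source word e corresponds to target word f.

    `pairs` is a list of (source_sentence, target_sentence) strings.
    Hypothetical helper for illustration; no grammar rules are used anywhere.
    """
    src_vocab = {w for s, _ in pairs for w in s.split()}
    tgt_vocab = {w for _, t in pairs for w in t.split()}
    # Start with no knowledge: every target word is equally likely.
    prob = {e: {f: 1.0 / len(tgt_vocab) for f in tgt_vocab} for e in src_vocab}

    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        total = defaultdict(float)
        for s, t in pairs:
            e_words, f_words = s.split(), t.split()
            for f in f_words:
                # Split the "credit" for f among the source words in this pair,
                # in proportion to the current estimates.
                norm = sum(prob[e][f] for e in e_words)
                for e in e_words:
                    share = prob[e][f] / norm
                    count[e][f] += share
                    total[e] += share
        # Re-estimate the table from the accumulated fractional counts.
        for e in src_vocab:
            for f in tgt_vocab:
                prob[e][f] = count[e][f] / total[e]
    return prob


def best_match(prob, word):
    """Return the target word with the highest estimated probability."""
    return max(prob[word], key=prob[word].get)


# Toy corpus (an assumption, not data from the article).
corpus = [
    ("the house", "la maison"),
    ("the flower", "la fleur"),
    ("a flower", "une fleur"),
]
table = train_word_table(corpus)
```

Calling `best_match(table, "house")` on this toy corpus picks `"maison"`, even though the program was never told what either word means. It gets there only because "la" is better explained by "the", which leaves "maison" for "house".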
The IBM effort began with proceedings from the Canadian parliament, which were published in English and French. “A couple guys drove to Canada and left with two suitcases full of tapes that contained the proceedings,” says Daniel Marcu, co-founder of Language Weaver, which in 2002 became the first start-up to use the new statistical techniques.
Jelinek’s group began by using a computer to automatically align sentences in the French and English versions of the parliamentary documents. It did this by pairing sentences from the same point in the proceedings that were of roughly equal lengths. If an opening sentence in English was 20 words long but the French opening was two sentences of about 10 words, the computer would pair the English sentence with the two French ones. The IBM researchers then used statistical methods and deductions to identify sentence structures and groups of words that were most common in the paired sentences.
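The length-based pairing described above can be sketched in a few lines. This is a simplified illustration, not the IBM group's actual program: it assumes sentences are compared by word count, that one sentence may be paired with either one or two sentences on the other side, and that the best overall pairing is the one with the smallest total difference in length. The function name `align_by_length` is hypothetical.

```python
def align_by_length(src_sentences, tgt_sentences):
    """Pair up sentences from two versions of a text using only their lengths.

    Returns a list of (source_indices, target_indices) groups.
    Illustrative sketch: allows 1-to-1, 1-to-2 and 2-to-1 pairings only.
    """
    s = [len(x.split()) for x in src_sentences]  # word counts, source side
    t = [len(x.split()) for x in tgt_sentences]  # word counts, target side
    n, m = len(s), len(t)
    inf = float("inf")

    # cost[i][j]: smallest total length mismatch when the first i source
    # sentences have been paired with the first j target sentences.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0

    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in ((1, 1), (1, 2), (2, 1)):
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                mismatch = abs(sum(s[i:ni]) - sum(t[j:nj]))
                if cost[i][j] + mismatch < cost[ni][nj]:
                    cost[ni][nj] = cost[i][j] + mismatch
                    back[ni][nj] = (i, j)

    if cost[n][m] == inf:
        raise ValueError("no pairing possible with the allowed groupings")

    # Walk backwards from the end to recover the chosen pairing.
    groups = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        groups.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    groups.reverse()
    return groups


# Placeholder text: one 20-word English sentence followed by a 5-word one,
# against two 10-word French sentences followed by a 5-word one.
english = [" ".join(["word"] * 20), " ".join(["word"] * 5)]
french = [" ".join(["mot"] * 10), " ".join(["mot"] * 10), " ".join(["mot"] * 5)]
groups = align_by_length(english, french)
```

On this input the 20-word English sentence is grouped with the two 10-word French sentences, mirroring the example in the text, and the short closing sentences are paired with each other.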