Translation is relatively straightforward when it comes to turning a French novel into an English one. But the arrival of social media, the ability to collect far more data and a focus on less familiar languages have dramatically changed the government’s translation needs.
Take Arabizi, a way of writing Arabic in Latin letters and numerals, developed for communicating over the Internet or by text when Arabic characters aren’t available. The format can be found on Twitter and discussion boards.
For governments and others trying to figure out exactly what users are saying, whether from intercepted messages or public forums, the new and inventive uses of language can flummox traditional translation models.
That has created an opportunity for businesses such as Cambridge, Mass.-based Basis Technology, a text analysis software company with operations in the Washington area.
Joel Ross left the CIA and came to Basis in 2002 to start the company’s federal group. He had spent two decades at the agency, working at one time as a Middle East analyst.
Ross started the unit in his basement, working there for about three years while he waited for business to blossom. In 2005, the unit made its first hire — in addition to Ross — and gained funding from In-Q-Tel, an investment firm established by the CIA to identify innovative technology that might be useful to the intelligence community.
It was not long before the company began to pick up steam, developing products it thought might be useful to the intelligence community — mostly based on Ross’s knowledge of the CIA and its needs.
“There’s always been foreign language tools for English and French and Spanish, but it seemed like the ones that the government and military were really interested [in] were these languages that most people have never heard of, like Urdu and Pashto and Dari,” Ross said. “There were no commercial products because there is no commercial business for that.”
Today, government revenue makes up about 50 percent of Basis’s sales, and the company has a Herndon office. Its flagship product, called Rosette, is a language analysis suite that can essentially search millions of documents for a particular language and particular words.
Even as the government moves out of Iraq and Afghanistan, the company is expecting the need for language analysis to grow.
Other companies are also banking on this expansion. McLean-based Science Applications International Corp., for instance, is growing its linguistic services work. In 2010, the company acquired assets and intellectual property from three language technology firms: AppTek Partners, Applications Technology and MediaMind.
Jeffrey Heisman, chief operating officer of SAIC’s human language technology and mission support group, said the technology can boost productivity for both government and commercial organizations.
“It’s not people versus technology,” he said. “It’s really kind of bringing them together to provide value-added services.”
Ross, too, said Basis has no intention of replacing the linguists and translators the government employs; its pitch, rather, is that technology can make those people more productive.
“We need to get more out of the people we have,” he said.