Computer translations help doctors overcome Afghan language barrier

ARL used statistical machine translation to collect more than 6,000 medical phrases in English and Dari, which eases the burden of relying on a limited number of human translators.

The English/Dari critical care manuals help with treatments at outposts such as Forward Operating Base Lightning near Gardez, Afghanistan.


Military medical personnel have teamed with computer scientists to use statistical machine translation to get around language barriers in Afghanistan and elsewhere, and to improve care.

Doctors in foreign settings can rely on human translators, but bilingual people with medical knowledge can be scarce, particularly in places where the native language isn’t common elsewhere.

Navy Cmdr. Kurt Henry ran into just that problem seven years ago in Kabul, Afghanistan, where he was leading a medical training team. He saw cases of treatable intestinal tuberculosis but hit an information wall of sorts: the regional hospital lacked medical manuals for newly assigned doctors, and the information he could find on the Internet was in English, while his team all spoke Dari, a native Afghan language.

Now, thanks to leaps in computer translation technology and efforts by the Army Research Laboratory, doctors have something to work with: bilingual manuals that reference more than 6,000 Dari medical phrases and explain what can be complex information, according to an Army release. (Statistical machine translation is a method that learns how to translate from a large body of existing human translations and is usually based on phrases, as opposed to rule-based machine translation, which relies on hand-written linguistic rules and word-level dictionaries.)

The initial project delivered 500 printed English/Dari special-edition manuals for trainers to hospitals and clinics in Afghanistan by 2012. Since then, other manuals have been produced, and more are on the way, including one that performs what the Army is calling a “priority translation.” Significantly, ARL has followed up with other computer-based products, including an Android "Army Phrase Book" app, to make access to the phrases easier.

The app not only makes the medical information more widely available in a language people can understand, it also makes that information easier to retrieve, even for human translators. Before, translators might have had to speak into a recorder for an hour to get a little bit of information, said Steve LaRocca, computer scientist and team chief at ARL. "The challenge was working with a limited pool of potential translators who were familiar with Dari, a less commonly taught language, and who also understood medical jargon," LaRocca said.

"Computers could never replace the human translator," said Melissa Holland, chief for ARL's multilingual computing research program, "but we look for ways to relieve some of the burden, especially in less commonly used languages, like Dari, Pashto and Serbian."

Another advantage of the Dari and other data sets being collected is that they create an accessible, searchable trove of information that wasn’t available before. "We've had people translating every day in Korea since about 1951, but we didn't save the data sets over those decades. The knowledge generated by all those people over all those years is gone," said LaRocca, a former language professor at West Point specializing in speech recognition technology and the founding director of the Center for Technology Enhanced Language Learning. "If we had the presence of mind to curate that data or prepare it for the eventual use of technology, we would be so much better off in that language and many others."

Going forward, that kind of medical information will be available in a growing number of languages, since the lab is curating data electronically as it collects it. "Although the [printed] manual may be worn in 10 years, the data sets captured from the translations will live on and be valuable for decades to come," LaRocca said.