The U.S. government requires language technology to support over 30,000 language professionals at various levels, handle extensive foreign language materials, enable collaboration, and assist in disaster relief efforts. This article provides a list of languages being reviewed, along with the need for various language technology tools and support.
U.S. Government Language Requirements
7 September 2000
Everette Jordan
Department of Defense
AJjord@fggm.osis.gov
(301) 688-7198
Need for Language Technology
• Over 30,000 language professionals in the U.S. government
• Many more at state and local levels
• Extensive material in foreign languages, often in legacy formats/encodings
• More than 50 percent of the Library of Congress is non-English
• Many kinds of applications
• Assimilation and dissemination
• Extensive collaboration
• Need for a large number of languages
• Disaster relief (e.g., Haiti) unpredictable
• Extensive multinational efforts
• Language list now being reviewed
• Short list as follows:
U.S. Government Languages (List being updated)
Afrikaans, Albanian, Amharic, Arabic, Armenian, Aymara, Azerbaijani, Bangla, Basque, Belarusian, Bengali, Bosnian, Bulgarian, Burmese, Cantonese, Catalan, Chinese, Croatian, Czech, Danish, Dari, Dutch, English, Estonian, Farsi, Finnish, French, Georgian, German, Greek, Guarani, Haitian Creole, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese
Languages (Continued)
Kazakh, Khmer/Cambodian, Kinyarwanda, Kirundi, Korean, Kurdish, Lao, Latvian, Lithuanian, Macedonian, Mongolian, Nepali, Norwegian, Pashto, Polish, Portuguese, Romanian, Russian, Serbian, Sinhalese, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Thai, Tibetan, Tigrigna, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese
Analyst Today Translators and limited Machine Translation
Integrated Collaborative Translation Space with Shared Tools; Archives; Expert; Identify language, content, and importance; Fast Routing; Translated and/or Tagged
Other Types of Technology Needed
• Browsers
• Text Processors
• Web Page Tools
• OCR
• MT
• Search Engines
• Translation Managers
• Language Learning
• Dictionaries
• Thesauri
• Developers Kits
• Info Extraction & Summarization
• Knowledge Management
• Visualization
Special Requirements
• Unicode UTF-8
• Other major code sets
• Code set conversions (extensible)
• Language and encoding ID
• Mixed languages
• Per page
• Per database
• English interfaces
• English sys admin
• Work with Microsoft and/or Sun (non-localized)
• Work well with other COTS applications
• Easy training
• Good U.S. support
• Comply with W3C guidelines for accessibility
• Enable easy extensibility by government
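The code set conversion and encoding ID items under Special Requirements can be illustrated with a minimal Python sketch. It is not taken from the slides: the function names `to_utf8` and `candidate_encodings` are hypothetical, and the encoding check is only a crude filter (it reports which code sets can decode the bytes without error), not a real statistical language or encoding identifier.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Convert bytes in a legacy code set to UTF-8.

    source_encoding is any codec name Python knows (e.g. "koi8_r",
    "cp1251"), so new code sets can be supported without changing
    this function.
    """
    text = raw.decode(source_encoding)  # legacy bytes -> Unicode text
    return text.encode("utf-8")         # Unicode text -> UTF-8 bytes


def candidate_encodings(raw: bytes, encodings: list[str]) -> list[str]:
    """Return the encodings under which raw decodes without error.

    Assumption: this only rules out impossible code sets. Several
    single-byte code sets accept almost any byte sequence, so a real
    system would also need statistical language/encoding detection.
    """
    plausible = []
    for name in encodings:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        plausible.append(name)
    return plausible


if __name__ == "__main__":
    legacy = "Привет".encode("koi8_r")  # sample text in a legacy code set
    print(to_utf8(legacy, "koi8_r").decode("utf-8"))
    print(candidate_encodings(legacy, ["ascii", "utf-8", "koi8_r"]))
```

Keeping the source encoding as a parameter, rather than hard-coding a fixed set, is one simple way to approach the "extensible" part of the conversion requirement.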