High-quality Speech Translation for Language Learning
Chao Wang and Stephanie Seneff
June 24, 2004
Spoken Language Systems Group, MIT Computer Science and Artificial Intelligence Lab
Outline
• Motivation and introduction
• Component technologies
  • Language understanding
  • Language generation
• Translation by generation
• Translation by example
• Evaluation
• Summary and future work
Background
• Language teachers have limited time to interact with students in dialogue exchanges
• Computers can provide a non-threatening environment in which to practice communicating
• Our group has been developing multilingual spoken conversational systems since 1990
  • Concentrating on domains related to travel
  • Can easily be adapted for language learning applications
• A translation capability from the native language (L1) to the target language (L2) can greatly improve the usability of these systems for language learning
Introduction
• Goal: provide translation aids for language learning
  • Must be high quality
  • Must be robust to speech recognition errors
• Strategies for achieving high quality and robustness
  • Interlingua-based translation using formal generation rules
  • Restricted conversational domains (lesson plans)
  • Emphasis on mechanisms to enable rapid porting to new domains and languages
  • Use parsability to assess quality of translation outputs
  • Back off to example-based method when parse fails
Language Understanding: TINA
Approach: context-free rules + constraints + probabilities
Rules:
• Define permissible linguistic patterns in the language and domain
• Encode both syntactic and semantic information
Constraints:
• Eliminate patterns that violate known syntactic/semantic restrictions (e.g., number agreement)
• Account for movement of constituents in surface realization
Probabilities:
• Support prediction of next word given preceding context
TINA has been used in many systems over the last 10 years:
• Domains: weather, air travel, restaurant guide, hotel reservations, urban navigation, ...
• Languages: English, Mandarin, Japanese, Spanish, French, ...
Process to Automate Grammar Development
• Merge several grammars into shared rules, predominantly syntax-based
• Once generic grammar is available, creating derivative domain-dependent grammars is straightforward
[Diagram labels: Pegasus, Orion, Jupiter, Mercury, Voyager; Merged “Seed” Grammar; “Scrubbed” sentences, e.g. “Are there any <noun> from <proper_name> to <proper_name>”; Generic Grammar; Domain dependent semantics; Grammar for New Domain]
Example Parse Tree
• Utilizes pre-existing sub-grammars for time and location
• Selected parse categories contribute to a hierarchical semantic frame (interlingua)
[Figure: parse tree for “will it rain in boston this weekend”, with nodes sentence, question, will, subject, predicate, intr_verb_phrase, intr_verb, intr_verb_args, locative (in, a_city, city_name), and temporal (day_list)]
Semantic Frame for Example
Will it rain in boston this weekend?
{c verify
   :aux “will”
   :subject “it”
   :pred {p rain
      :pred {p locative
         :prep “in”
         :topic {q city :name “boston” } }
      :pred {p temporal
         :topic {q weekday :quantifier “this” :name “weekend” } } } }
• Semantic frame encodes syntactic structure and features in addition to semantic information
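To make the nesting of the frame concrete, here is a minimal sketch that renders it as plain Python data and walks it to collect the topic entries. The dictionary layout, the "kind" and "label" keys, and the `topics` helper are illustrative assumptions, not the TINA or GENESIS frame format; the c/p/q tags are read here as clause, predicate, and topic.

```python
# Hypothetical Python rendering of the slide's semantic frame.
# "kind" holds the c/p/q tag and "label" the frame name; both keys are
# assumptions made for this sketch, not part of the original notation.
frame = {
    "kind": "c", "label": "verify",
    ":aux": "will",
    ":subject": "it",
    ":pred": [{
        "kind": "p", "label": "rain",
        ":pred": [
            {"kind": "p", "label": "locative", ":prep": "in",
             ":topic": {"kind": "q", "label": "city", ":name": "boston"}},
            {"kind": "p", "label": "temporal",
             ":topic": {"kind": "q", "label": "weekday",
                        ":quantifier": "this", ":name": "weekend"}},
        ],
    }],
}


def topics(node):
    """Yield (label, name) for every topic (q) node, depth first."""
    if isinstance(node, dict):
        if node.get("kind") == "q":
            yield node["label"], node.get(":name")
        for value in node.values():
            yield from topics(value)
    elif isinstance(node, list):
        for item in node:
            yield from topics(item)
```

Calling `list(topics(frame))` collects the city and weekday entries in the order they appear in the frame, which is the kind of lean information the later key-value slides rely on.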
Language Generation: GENESIS
• Generates a surface string from the semantic frame
• Accomplishes many tasks in dialogue system development
  • In the same language (paraphrasing & response generation)
  • In a different language (translation)
  • Other formal languages (key-value pairs, SQL queries, etc.)
• Utilizes recursive formal rules along with a lexicon encoding appropriate surface form realizations in context
Challenges in Cross-language Generation for Translation
• Some expressions have very different syntactic structures in different languages
  What is your name? 你(you) 叫(call) 什么(what) 名字(name)?
  I like her. Ella me gusta.
• Syntactic features are expressed in many different ways
  • Determiners (English but not Chinese)
    Where is a bank nearby? 附近(vicinity) 哪儿(where) 有(have) 银行(bank)?
  • Particles (Chinese but not English)
    that hotel: 那(that) 家(<particle>) 旅馆(hotel)
    I lost my key. 我(I) 丢(lose) 了(<past tense>) 我的(my) 钥匙(key).
  • Gender (extensive in Spanish)
Generation Procedures
• Constituent order specified in recursive rules
• “Pull” and “Push” mechanisms support major structural reorganization
• Lexical selection controlled by feature propagation
• Inflectional forms based on syntactic features
• Lexical realization (word sense) influenced by surrounding semantic context
• Infers missing features
• Can generate multiple surface strings for the same semantic frame
A Generation Example
{c verify
   :aux “will”
   :subject “it”
   :pred {p rain
      :pred {p locative
         :prep “in”
         :topic {q city :name “boston” } }
      :pred {p temporal
         :topic {q weekday :quantifier “this” :name “weekend” } } } }
• “will” conditioned by “verify” pulled to the front
• Two surface strings generated from the same frame:
  bo1 shi4 dun4 zhe4 zhou4 mo4 hui4 bu2 hui4 xia4 yu3? (Boston this weekend will-not-will rain?)
  zhe4 zhou4 mo4 bo1 shi4 dun4 hui4 xia4 yu3 ma5? (this weekend Boston will rain <question-particle>?)
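The two outputs above differ only in constituent order and in how the question is marked. The following toy sketch reproduces them from a flattened version of the frame. It is not GENESIS rule syntax: the `LEXICON` table, the flat frame keys, and the `a_not_a` switch are all assumptions made for illustration.

```python
# Toy lexicon mapping frame values to pinyin surface forms (from the slide).
LEXICON = {
    "boston": "bo1 shi4 dun4",
    "this weekend": "zhe4 zhou4 mo4",
    "will": "hui4",
    "rain": "xia4 yu3",
}


def generate(frame, a_not_a=True):
    """Produce one of two Mandarin question forms for a flattened frame.

    frame is a hypothetical flat dict with keys "city", "time", "aux",
    and "pred"; a real generator would walk the nested semantic frame.
    """
    city = LEXICON[frame["city"]]
    time = LEXICON[frame["time"]]
    aux = LEXICON[frame["aux"]]
    verb = LEXICON[frame["pred"]]
    if a_not_a:
        # "will-not-will" question: place, time, aux bu2 aux, verb
        return f"{city} {time} {aux} bu2 {aux} {verb}?"
    # Particle question: time, place, aux, verb, then the particle ma5
    return f"{time} {city} {aux} {verb} ma5?"


example = {"city": "boston", "time": "this weekend", "aux": "will", "pred": "rain"}
```

With `example`, `generate(example)` gives the will-not-will form and `generate(example, a_not_a=False)` gives the form ending in the question particle.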
Generation-based Translation
• Semantic frame serves as interlingua
• Translation achieved by parsing and generation
• Use Chinese grammar to detect potential problems
• Rejected sentences routed to example-based translation for a second chance
[Diagram: English Input → Parse (English Grammar) → Semantic Frame → Generate (Chinese Rules) → Chinese Sentence → Parse? (Chinese Grammar); accepted → Chinese Output; rejected → Example-based Translation]
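The control flow on this slide can be sketched as a small function that tries generation first and backs off to the example-based method when the Chinese grammar rejects the output. All five callables are placeholders for the real components, and their names and signatures are assumptions for this sketch. Returning `None` corresponds to the NULL output the evaluation describes when both methods fail.

```python
from typing import Callable, Optional

Frame = dict  # stand-in type for the semantic frame (assumption)


def translate(
    english: str,
    parse_english: Callable[[str], Optional[Frame]],
    generate_chinese: Callable[[Frame], Optional[str]],
    chinese_parses: Callable[[str], bool],
    example_based: Callable[[Frame], Optional[str]],
) -> Optional[str]:
    """Generation-based translation with example-based backoff."""
    frame = parse_english(english)
    if frame is None:
        return None  # input not covered by the English grammar
    candidate = generate_chinese(frame)
    # Parsability under the Chinese grammar acts as the quality check.
    if candidate is not None and chinese_parses(candidate):
        return candidate
    # Second chance: retrieve a stored translation instead.
    return example_based(frame)
```

Passing the components in as arguments keeps the sketch independent of any particular parser or generator implementation.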
Example-based Translation
• Requires translation pairs and a retrieval mechanism
• Corpus automatically obtained via the generation-based approach
• Retrieval based on lean semantic information
  • Encoded as key-value pairs
  • Obtained from semantic frame via simple generation rules
  • Generalizes words to classes (e.g., city name, weekday, etc.) to overcome data sparseness
Example-based Translation Procedure
• Key-value string serves as interlingua
• Translation achieved by parsing and table lookup
• City name masked during retrieval and recovered in final surface string
[Diagram: English Input “Is there any chance of rain in San Francisco?” → Parser (English Grammar) → Semantic Frame → Generator (Key-value Rules) → KV String “WEATHER: rain CITY: San Francisco”, with { <CITY> : San Francisco } → KV-Chinese Table → “<CITY> hui4 bu2 hui4 xia4 yu3?”, with { <CITY> : jiu4 jin1 shan1 } → Chinese Output]
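A minimal sketch of the retrieval step follows, assuming the key-value pairs arrive as a Python dict and that city translations live in a separate lookup table. `KV_TABLE`, `CITY_ZH`, and `lookup` are hypothetical names; the entries are taken from the slide's example.

```python
from typing import Dict, Optional

# Masked key-value string -> Chinese template (entry from the slide).
KV_TABLE: Dict[str, str] = {
    "WEATHER: rain CITY: <CITY>": "<CITY> hui4 bu2 hui4 xia4 yu3?",
}
# City name translations used to restore the masked slot.
CITY_ZH: Dict[str, str] = {"San Francisco": "jiu4 jin1 shan1"}


def lookup(kv: Dict[str, str]) -> Optional[str]:
    """Mask the city, retrieve a template, then fill the city back in."""
    masked = " ".join(
        f"{key}: {'<CITY>' if key == 'CITY' else value}"
        for key, value in kv.items()
    )
    template = KV_TABLE.get(masked)
    if template is None:
        return None  # no stored example for this key-value string
    city = kv.get("CITY")
    if city is None:
        return template
    city_zh = CITY_ZH.get(city)
    if city_zh is None:
        return None  # cannot restore an unknown city name
    return template.replace("<CITY>", city_zh)
```

Because the city is replaced by its class before lookup, one stored template serves every city in the table, which is how word classes reduce data sparseness.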
Complete Translation Procedure
• Only parsed sentences go into key-value database
• Indexed by semantic information encoded as key-value string
• Unparsed translations replaced by key-value option
• Use word classes to overcome data sparseness
[Diagram, creation: English Input “will it rain in Boston tomorrow?” → Parse (English Grammar) → Semantic Frame → Generate (Chinese Rules) → Chinese Sentence “bo1 shi4 dun4 ming2 tian1 hui4 xia4 yu3 ma5?” → Parses? (Chinese Grammar); yes → indexing with <CITY> under “WEATHER: rain CITY: boston” (Key-value Rules) into the Key-value Index Database]
[Diagram, retrieval: Parses? yes → Chinese Sentence is the translation; no → translation retrieved from the Key-value Index Database]
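The creation side of the procedure can be sketched as follows: each generated sentence is checked against the Chinese grammar, and only accepted ones are stored under their class-masked key-value string. The tuple layout, the `mask` helper, and the `chinese_parses` callable are assumptions for this sketch; a real system would take the city slot from the semantic frame instead of string replacement.

```python
from typing import Callable, Dict, Iterable, Tuple

# (key-value string, generated Chinese, English city, Chinese city)
Example = Tuple[str, str, str, str]


def mask(text: str, surface: str, tag: str = "<CITY>") -> str:
    """Replace a concrete city name with its class tag."""
    return text.replace(surface, tag)


def build_index(
    examples: Iterable[Example],
    chinese_parses: Callable[[str], bool],
) -> Dict[str, str]:
    """Index only translations that the Chinese grammar accepts."""
    index: Dict[str, str] = {}
    for kv_string, chinese, city_en, city_zh in examples:
        if not chinese_parses(chinese):
            continue  # rejected output never enters the database
        key = mask(kv_string, city_en)
        # Keep the first accepted translation for each masked key.
        index.setdefault(key, mask(chinese, city_zh))
    return index
```

Filtering at creation time is what guarantees that every retrievable translation is covered by the Chinese grammar.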
Evaluation: English to Mandarin, Weather Domain
• Evaluation data
  • Drawn from the publicly available Jupiter weather system
  • Telephone recordings; conversational speech
  • Utterances that the English grammar could not parse were excluded
  • Total of 695 utterances, averaging 6.5 words per utterance
• System configuration
  • Text input or speech input
  • Recognizer achieved 6.9% word error rate and 19.0% sentence error rate
  • Generation-based method preferred over example-based method
  • NULL output if both failed
• Evaluation criteria
  • Yield of each translation method
  • Human judgment of translation quality
Evaluation Results (I)
• Majority of the utterances are successfully translated using formal generation rules, which are likely to achieve high fidelity and quality
• A greater percentage of the utterances fail in the speech mode, due to recognition errors
• System will apologize for not understanding the utterance and invite the user to try again
Evaluation Results (II)
• Human judgment of translation quality based on grammaticality and fidelity
• Three categories: perfect, acceptable, or wrong
• Fewer than 2% of the utterances produce incorrect translation outputs
• A concurrent English paraphrase provides context for the Chinese translation
Summary and Future Work
• We have demonstrated a capability to produce high-quality spoken-language translations from English to Mandarin
  • Evaluation restricted to weather domain
  • Fewer than 2% of the translations were incorrect
Future Plans:
• Integrate into spoken dialogue systems
• Incorporate framework into classroom environment
• Assess effectiveness in second-language acquisition
• Port to other domains and languages
• Develop tools to enable rapid porting
Translation Corpus
• Guaranteed coverage by the Chinese grammar
• Indexed by semantic information encoded as key-value string
• Use word classes to overcome data sparseness
[Diagram: English Input “will it rain in Boston tomorrow?” → Parser (English Grammar) → Semantic Frame → Generator (Chinese Rules) → Chinese Sentence “bo1 shi4 dun4 ming2 tian1 hui4 xia4 yu3 ma5?” → Parser (Chinese Grammar) → accepted → Chinese Output; Semantic Frame → Key-value Rules → KV String “WEATHER: rain CITY: boston” → indexing with <CITY> → KV-Chinese Table]
Interlingua-based Speech Translation
• Common meaning representation: semantic frame
[Diagram labels: Recognition (SUMMIT), NLU (TINA), NLG (GENESIS), Synthesis (ENVOICE); Models, Parsing Rules, Generation Rules, Speech Corpora; Interlingua at the center; English and Chinese on both the input and output sides]
Understanding and Generation: Procedural Strategy
• Develop end-to-end English system
  • Solicit example utterances from SLS members
• Create generation rules for Chinese paraphrase
  • Generated sentences become initial Chinese corpus
• Develop understanding component for Chinese input
  • Map to identical semantic frame as much as possible
• Adjust English generation for Chinese inputs
  • Deal with missing function words, etc.
• Translation loop now possible: English → Chinese → English
  • Evaluation based on English-to-translated-English
• Similar strategy for other languages
Strategies for Translation
• Grammar design strategies
  • Preserve as much information as necessary for accurate translation
    • Semantic frames are much more detailed than those in human-computer interaction applications
  • Maintain consistency of semantic frame representation across different languages whenever possible
    • Seed grammar rules for each new language on English grammar rules
    • Mapping from parse tree to semantic frame preserved
• Remaining language dependent aspects in semantic frame are addressed by generation rules
An Example: English/Chinese
How long does it take to take a taxi there
• Function words disappear in Chinese
  How long does it take to take a taxi there → How long take take taxi there
• Two instances of “take” have different translations
  How long need take taxi there
• Verb “go” omitted in English
  How long need take taxi go there
• Sentence structure is very different
  ( take taxi go there need how long )
  坐 出租车 去 那里 要 多久