
Word Sense Disambiguation for Machine Translation

Word Sense Disambiguation for Machine Translation. Han-Bin Chen 2010.11.24. Reference Paper. Cabezas and Resnik . 2005. Using WSD Techniques for Lexical Selection . (Technical report) Carpuat and Wu. 2005. Word Sense Disambiguation vs. Statistical Machine Translation . (ACL 2005)



Presentation Transcript


  1. Word Sense Disambiguation for Machine Translation Han-Bin Chen 2010.11.24

  2. Reference Papers • Cabezas and Resnik. 2005. Using WSD Techniques for Lexical Selection. (Technical report) • Carpuat and Wu. 2005. Word Sense Disambiguation vs. Statistical Machine Translation. (ACL 2005) • Carpuat and Wu. 2007. Improving Statistical Machine Translation using Word Sense Disambiguation. (EMNLP 2007) • Chan et al. 2007. Word Sense Disambiguation Improves Statistical Machine Translation. (ACL 2007) • Apidianaki. 2009. Data-driven semantic analysis for multilingual WSD. (EACL 2009)

  3. SMT Workflow Bilingual Corpus Monolingual Corpus Translation model Reordering model Language model Decoder Input: source language Output: target language

  4. MT Research Areas Bilingual Corpus Monolingual Corpus Word Alignment Translation model Reordering model Language model Decoder Input: source language Output: target language Evaluation Metric

  5. Translation Model (TM) • Research in TM • Phrase extraction • Phrase filtering • Phrase augmentation • Word Sense Disambiguation (WSD)

  6. Traditional WSD • Target word is a single content word • Nouns, verbs, adjectives • Classification task with predefined senses • WordNet, HowNet • Modern WSD system • Not limited to local context • Linguistic information • Position-sensitive • Syntactic • Collocation • An intuitive application of WSD is SMT

  7. WSD in MT • Wrong translations from Google Translate • what is today's special? • 什麼是今天的特色? • I would like to reserve a table for three • 我想保留一表三 • the plane will briefly stop over in the airport • 這架飛機將簡要地停留在機場

  8. WSD in MT: Early Stage • Whether a WSD model can help SMT • Energetically debated question over the past years • Implicit WSD in SMT • Local context: phrase table & language model • Dedicated WSD system • Wider variety of context features • Position, sentence-level, document-level features • WSD should play a role in MT • Publicly available SMT system • Pharaoh by Philipp Koehn (2003~2004)

  9. Small Scale Experiment (1) • Marine Carpuat and Dekai Wu, 2005 • Chinese-to-English translation task • Chinese lexical sample task includes 20 target words • Trained with state-of-the-art WSD • 37 training instances per target word (manual annotation)

  10. Small Scale Experiment (2) • Hard decision • Force the decoder to choose translations from glosses • Decided by language model • Surprising and frustrating result • Small data, out-of-domain material, hard decision • Language model effect

  11. Translation Disambiguation (1) • Clara Cabezas and Philip Resnik, 2005 • Address 3 problems of the previous work • Use aligned target word directly as "sense" • 4 senses for "briefly": {短暫地, 短時間地, 簡潔地, 簡要地} • Trained with state-of-the-art WSD • Handle "small data" and "out-of-domain" problems • Soft decision • Pharaoh XML markup • Choose specified translations and translation model together • Handle "hard decision" problem
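The idea on slide 11 of using aligned target words directly as "senses" can be illustrated with a minimal sketch. The function name `collect_translation_senses` and the toy aligned pairs are hypothetical and only serve the illustration; a real system would read the pairs from a word-aligned bilingual corpus.

```python
from collections import Counter, defaultdict

def collect_translation_senses(aligned_pairs):
    """Map each source word to a Counter of the target words it was aligned to.

    Each distinct aligned target word is treated as one "sense" label,
    so no hand-built sense inventory is needed.
    """
    senses = defaultdict(Counter)
    for source_word, target_word in aligned_pairs:
        senses[source_word][target_word] += 1
    return senses

# Hypothetical (source word, aligned target word) pairs; toy data only.
aligned_pairs = [
    ("briefly", "短暫地"),
    ("briefly", "簡要地"),
    ("briefly", "短暫地"),
    ("briefly", "短時間地"),
    ("briefly", "簡潔地"),
]

senses = collect_translation_senses(aligned_pairs)
sense_inventory = sorted(senses["briefly"])
print(len(sense_inventory), "senses for 'briefly':", sense_inventory)
```

Because the labels come from the parallel corpus itself, every aligned occurrence also becomes a training instance for the WSD classifier, which is how this setup can sidestep the "small data" and "out-of-domain" problems.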

  12. Translation Disambiguation (2) • Pharaoh XML markup • Experiment & Result • Spanish-to-English test set from Europarl • WSD: 0.2382, Baseline: 0.2356 • Not statistically significant • But at least it is not a decrease

  13. Toward Better Integration into SMT • How to better integrate WSD into SMT? • Phrase-based sense disambiguation (PSD) • Key points • Phrase, not word • Integration into log-linear model: weight tuning

  14. Successful Integration (1) • Chan et al., 2007 • Chinese-to-English translation • Sense disambiguation on Chinese phrase • 1 or 2 consecutive Chinese words • Extract training examples from word-aligned corpus • Add WSD features • Contextual probability of WSD • Reward probability of WSD
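Adding a WSD score as one more feature in a log-linear model can be sketched as follows. This is a generic illustration, not the feature set of Chan et al.: the feature names (`tm`, `lm`, `wsd`), the probabilities, and the weights are all made up, and in a real system the weights would be tuned on held-out data rather than set by hand.

```python
import math

def log_linear_score(features, weights):
    """Log-linear model score: sum over features of weight * log(feature value)."""
    return sum(weights[name] * math.log(value) for name, value in features.items())

# Hypothetical feature values for two candidate translations (toy numbers).
candidates = {
    "will be more aid": {"tm": 0.30, "lm": 0.20, "wsd": 0.05},
    "will be unable to obtain more aid": {"tm": 0.25, "lm": 0.15, "wsd": 0.60},
}

# A weight of 0.0 switches the WSD feature off, giving the baseline model.
baseline_weights = {"tm": 1.0, "lm": 1.0, "wsd": 0.0}
wsd_weights = {"tm": 1.0, "lm": 1.0, "wsd": 0.5}

def best(weights):
    """Return the candidate with the highest log-linear score."""
    return max(candidates, key=lambda c: log_linear_score(candidates[c], weights))

print("baseline:", best(baseline_weights))
print("with WSD:", best(wsd_weights))
```

With these toy numbers the baseline prefers the shorter candidate, while a non-zero WSD weight lets the context-aware score change the decoder's choice. The WSD model acts as a soft vote alongside the other models, not as a hard constraint.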

  15. Successful Integration (2) • Statistically significant improvement • 將 無法 取得 更 多 援助 或 其他 讓步 • Hiero: will be more aid and other concessions • Hiero+WSD: will be unable to obtain more aid and other concessions

  16. PSD System (1) • Marine Carpuat and Dekai Wu, 2007 • WSD model for every phrase • Extract training data from phrase extraction • WSD probability as new feature • Comments • Not every phrase needs WSD • Technical problem (Pharaoh)

  17. PSD System (2) • Result: better translation on all test sets IWSLT 2006 dataset NIST 2004 test set

  18. PSD System (3)

  19. Recent Issue • Different translations may have the same sense • 2 senses for "briefly", rather than 4 • Sense 1: {短暫地, 短時間地} • Sense 2: {簡潔地, 簡要地} • Automatic sense clustering

  20. Sense Clustering (1) • Marianna Apidianaki, 2009 • Two translations are semantically related • If they occur in similar context • Translation unit (TU) as context • Bilingual sentence pair • Source word "briefly" • Translations • {短暫地, 短時間地, 簡潔地, 簡要地} • {t1, t2, t3, t4}

  21. Sense Clustering (2) • "briefly-t1" occurs in context {TU1, TU4, TU25, TU88…} • "briefly-t2" occurs in context {TU5, TU18, TU92, TU126…} • Clustering based on pairwise context similarity • Apidianaki, 2008
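Clustering translations by pairwise context similarity can be sketched as below. This is a simplified stand-in, not Apidianaki's actual procedure: it assumes each translation is represented by a bag of source-side context words gathered from the TUs in which it occurs, uses cosine similarity, and links translations with a hand-picked threshold. The context counts are invented for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster_translations(contexts, threshold):
    """Single-link grouping: translations join a cluster if similarity >= threshold."""
    clusters = []
    for name in contexts:
        # Existing clusters that contain at least one sufficiently similar member.
        linked = [c for c in clusters
                  if any(cosine(contexts[name], contexts[m]) >= threshold for m in c)]
        merged = {name}
        for c in linked:
            merged |= c
            clusters.remove(c)
        clusters.append(merged)
    return clusters

# Hypothetical source-side context words seen with each translation of "briefly".
contexts = {
    "短暫地": Counter({"stop": 2, "stay": 1, "plane": 1}),
    "短時間地": Counter({"stop": 1, "stay": 2, "visit": 1}),
    "簡潔地": Counter({"explain": 2, "summarize": 1}),
    "簡要地": Counter({"explain": 1, "summarize": 2, "describe": 1}),
}

clusters = cluster_translations(contexts, threshold=0.5)
print(clusters)
```

On this toy data the four translations fall into two groups, matching the two senses of "briefly" listed on slide 19.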

  22. Sense Clustering (3) • Experiment • English-Greek translation • 150 ambiguous English nouns • Evaluation of lexical selection • Strict precision (Exact match with answer word) • Enriched precision (Match with the cluster of answer word) • Result
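The difference between the two evaluation measures on slide 22 can be made concrete with a small sketch. The function names, the predictions, and the reference answers are hypothetical, and the example reuses the "briefly" translations from the earlier slides instead of the English-Greek data used in the experiment.

```python
def strict_precision(predictions, references):
    """Fraction of predictions that exactly match the reference word."""
    correct = sum(1 for p, r in zip(predictions, references) if p == r)
    return correct / len(predictions)

def enriched_precision(predictions, references, clusters):
    """Fraction of predictions that fall in the same sense cluster as the reference."""
    def cluster_of(word):
        for c in clusters:
            if word in c:
                return c
        return {word}  # an unclustered word forms its own singleton cluster

    correct = sum(1 for p, r in zip(predictions, references) if p in cluster_of(r))
    return correct / len(predictions)

# Hypothetical sense clusters, system outputs, and reference answers.
clusters = [{"短暫地", "短時間地"}, {"簡潔地", "簡要地"}]
predictions = ["短暫地", "簡要地", "簡潔地", "短時間地"]
references = ["短時間地", "簡要地", "短暫地", "短時間地"]

print("strict:", strict_precision(predictions, references))
print("enriched:", enriched_precision(predictions, references, clusters))
```

Enriched precision is never lower than strict precision, since an exact match is always inside the reference word's own cluster. The gap between the two shows how often the system picks a different word with the same sense.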

  23. Conclusion • From WSD to PSD • However, semantics is also important • Future work • Semantic PSD
