
Stochastic Inversion Transduction Grammars Dekai Wu

Stochastic Inversion Transduction Grammars Dekai Wu. 11-734 Advanced Machine Translation Seminar Presented by: Sanjika Hewavitharana 04/13/2006. Overview. Simple Transduction Grammars Inversion Transduction Grammars (ITGs) Stochastic ITGs Parsing with SITGs Applications of SITGs





Presentation Transcript


  1. Stochastic Inversion Transduction Grammars Dekai Wu 11-734 Advanced Machine Translation Seminar Presented by: Sanjika Hewavitharana 04/13/2006

  2. Overview • Simple Transduction Grammars • Inversion Transduction Grammars (ITGs) • Stochastic ITGs • Parsing with SITGs • Applications of SITGs • Main Reading: Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora (1997)

  3. Introduction • Mathematical models of translation • IBM Models (Brown et al.): string generates string • Syntax-based (Yamada & Knight): tree generates string • ITG (Wu): two trees are generated simultaneously • ITGs • A formalism for modeling bilingual sentence pairs • Not intended for use as full translation models, but for parallel corpus analysis • Extract useful structures from input data • Generative view rather than translation view • Two output trees are generated simultaneously, one for each language

  4. Transduction Grammars • A simple transduction grammar is a CFG whose terminals are pairs of symbols (or singletons) • Can be used to model the generation of bilingual sentence pairs • E: The Financial Secretary and I will be accountable. • C: [Chinese sentence not preserved in the transcript]

  5. Transduction Grammar Rules E.g. • Simple Rules: • Inversion Rule:

  6. Transduction Grammars • A simple transduction grammar is a CFG whose terminals are pairs of symbols (or singletons) • Can be used to model the generation of bilingual sentence pairs E C

  7. Transduction Grammars • In general, simple transduction grammars are not very useful • The two languages must share exactly the same grammatical structure • So some sentence pairs cannot be generated • ITG removes the rigid parallel-ordering constraint • Constituent order in one language may be the inverse of the order in the other language • Order is the same for both (square brackets): A → [B C] • Order is inverted for one (angle brackets): A → ⟨B C⟩

  8. ITGs • e.g. • With an ITG we can parse the previous sentence pair • Inversion rule: VP → ⟨VV PP⟩
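The effect of the two orientations can be sketched in a few lines of Python. This is a minimal illustration rather than code from the paper: the `Leaf` and `Node` classes and the `yield_pair` function are hypothetical names, and the language-2 tokens `W` and `A` are placeholders standing in for the Chinese words.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    src: str  # language-1 word ("" if this is a language-2 singleton)
    tgt: str  # language-2 word ("" if this is a language-1 singleton)


@dataclass
class Node:
    left: "Tree"
    right: "Tree"
    inverted: bool  # False = straight [ ], True = inverted < >


Tree = Union[Leaf, Node]


def yield_pair(tree: Tree) -> Tuple[List[str], List[str]]:
    """Read off both output strings from a single ITG derivation tree."""
    if isinstance(tree, Leaf):
        return ([tree.src] if tree.src else [], [tree.tgt] if tree.tgt else [])
    l1, l2 = yield_pair(tree.left)
    r1, r2 = yield_pair(tree.right)
    src = l1 + r1                                # language 1 always reads left to right
    tgt = r2 + l2 if tree.inverted else l2 + r2  # language 2 swaps under inversion
    return src, tgt


# VP -> <VV PP>: the two children appear in opposite order in language 2.
vp = Node(Leaf("will be", "W"), Leaf("accountable", "A"), inverted=True)
print(yield_pair(vp))
```

A straight node would leave both languages in the same order; only the `inverted` flag differs between the two rule types.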

  9. ITG Parse Tree

  10. Expressiveness of ITGs

  11. Expressiveness of ITGs • Not all matchings are possible with an ITG • e.g. ‘inside-out’ matchings are not allowed • This reduces the combinatorial growth of matchings with the number of tokens • The number of matchings eliminated increases rapidly as the number of tokens increases • The author claims this is a benefit
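To make the restriction concrete, the sketch below counts how many one-to-one matchings of n tokens an ITG can produce. It assumes every token is matched (no singletons), so a matching is just a permutation, and it treats a permutation as ITG-compatible when it can be split recursively into two contiguous blocks in straight or inverted order. The function name `is_itg_permutation` is hypothetical.

```python
from itertools import permutations


def is_itg_permutation(perm):
    """True if perm can be built by recursive straight/inverted binary splits."""
    n = len(perm)
    if n <= 1:
        return True
    for k in range(1, n):
        left, right = perm[:k], perm[k:]
        # straight: left block maps entirely below the right block
        # inverted: left block maps entirely above the right block
        if max(left) < min(right) or min(left) > max(right):
            return is_itg_permutation(left) and is_itg_permutation(right)
    return False  # no valid split point exists


for n in range(1, 6):
    allowed = sum(is_itg_permutation(p) for p in permutations(range(n)))
    total = sum(1 for _ in permutations(range(n)))
    print(n, allowed, total)

# The two 'inside-out' matchings of four tokens are rejected.
print(is_itg_permutation((1, 3, 0, 2)), is_itg_permutation((2, 0, 3, 1)))
```

For four tokens, 22 of the 24 permutations survive; the gap widens quickly as n grows.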

  12. Expressiveness of ITGs

  13. Normal Form of ITG • For any ITG there exists an equivalent grammar in normal form • The right-hand side of every rule is one of: • A terminal couple • A terminal singleton • A pair of non-terminals with straight orientation • A pair of non-terminals with inverted orientation

  14. Stochastic ITGs • A probability can be assigned to each rewrite rule • The probabilities of all the rules with a given left hand side must sum to 1. • An SITG will give the most probable matching (ML) parse for a sentence pair. • Similar to Viterbi or CYK (Chart) parsing
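For a grammar in normal form, the sum-to-one condition can be written out as follows. The symbol names are an assumed notation for illustration: a denotes the probability of a straight or inverted nonterminal rule, and b_i(x/y) the probability that nonterminal i rewrites to the terminal pair x/y, where either x or y may be empty (a singleton).

```latex
\sum_{j,k} \left( a_{i \to [j\,k]} + a_{i \to \langle j\,k \rangle} \right)
  \;+\; \sum_{x,y} b_i(x/y) \;=\; 1
  \qquad \text{for every nonterminal } i
```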

  15. Parsing with SITGs • Every node (q) in the parse tree has 5 elements: • Begin & end indices for language-1 string (s,t) • Begin & end indices for language-2 string (u,v) • Non-terminal category (i) • Each cell (in the chart) stores the probability of the most likely parse covering the appropriate substrings, rooted in the appropriate category

  16. Parsing with SITGs - Algorithm • Initialize the cells corresponding to terminals using a translation lexicon • For the other cells, recursively find the most probable way of obtaining that nonterminal category • Compute the probability by multiplying the probability of the rule by the probabilities of its two constituents • Store that probability plus the orientation of the rule • Complexity: O(n^3 m^3) for sentence lengths n and m
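The steps above can be sketched as a small dynamic program. This is a simplified illustration under stated assumptions, not the paper's implementation: it uses a single nonterminal (so the category index i is dropped), spans are half-open, and the names `viterbi_itg`, `alignment`, `lex`, `p_straight`, and `p_inverted` are hypothetical. Singletons are keyed in `lex` with `None` on the empty side.

```python
def viterbi_itg(e, c, lex, p_straight, p_inverted):
    """Fill delta[(s, t, u, v)] = best probability of deriving e[s:t] / c[u:v]."""
    T, V = len(e), len(c)
    delta = {}
    # Initialization: couples and singletons come from the translation lexicon.
    for s in range(T + 1):
        for u in range(V + 1):
            if s < T and u < V and (e[s], c[u]) in lex:
                delta[(s, s + 1, u, u + 1)] = lex[(e[s], c[u])]
            if s < T and (e[s], None) in lex:
                delta[(s, s + 1, u, u)] = lex[(e[s], None)]
            if u < V and (None, c[u]) in lex:
                delta[(s, s, u, u + 1)] = lex[(None, c[u])]
    back = {q: ("lex",) for q in delta}

    # Recursion: visit cells from small to large so children are final.
    spans = [(s, t, u, v)
             for s in range(T + 1) for t in range(s, T + 1)
             for u in range(V + 1) for v in range(u, V + 1)]
    spans.sort(key=lambda q: (q[1] - q[0]) + (q[3] - q[2]))
    for s, t, u, v in spans:
        if (t - s) + (v - u) < 2:
            continue
        best, arg = delta.get((s, t, u, v), 0.0), back.get((s, t, u, v))
        for S in range(s, t + 1):
            for U in range(u, v + 1):
                # Empty children are never in delta, so degenerate splits score 0.
                straight = (p_straight * delta.get((s, S, u, U), 0.0)
                            * delta.get((S, t, U, v), 0.0))
                inverted = (p_inverted * delta.get((s, S, U, v), 0.0)
                            * delta.get((S, t, u, U), 0.0))
                if straight > best:
                    best, arg = straight, ("[]", S, U)
                if inverted > best:
                    best, arg = inverted, ("<>", S, U)
        if best > 0.0:
            delta[(s, t, u, v)], back[(s, t, u, v)] = best, arg
    return delta, back


def alignment(back, q):
    """Follow stored orientations to list aligned (e_index, c_index) pairs."""
    s, t, u, v = q
    entry = back[q]
    if entry[0] == "lex":
        return [(s, u)] if (t - s == 1 and v - u == 1) else []
    orient, S, U = entry
    if orient == "[]":
        return alignment(back, (s, S, u, U)) + alignment(back, (S, t, U, v))
    return alignment(back, (s, S, U, v)) + alignment(back, (S, t, u, U))


# Toy data: placeholder tokens, not drawn from any corpus.
e, c = ["a", "b", "c"], ["C", "A", "B"]
lex = {("a", "A"): 0.3, ("b", "B"): 0.3, ("c", "C"): 0.3}
delta, back = viterbi_itg(e, c, lex, p_straight=0.05, p_inverted=0.05)
print(sorted(alignment(back, (0, 3, 0, 3))))
```

There are O(n^2 m^2) cells and O(nm) split points per cell, which is where the O(n^3 m^3) bound comes from. A full SITG parser would add the nonterminal category to each cell and loop over rules for that category.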

  17. Applications of SITGs • Segmentation • Bracketing • Alignment • Bilingual Constraint Transfer • Mining parallel sentences from comparable corpora [Wu & Fung 2005]

  18. Applications of SITGs - Segmentation • Word boundaries are not marked in Chinese text • No word chunks available for matching • One option: do word segmentation as preprocessing • Might produce chunks that do not agree bilingually • Solution: extend the algorithm to accommodate segmentation • Allow the initialization step to find strings of any length in the translation lexicon • The recursive step stores the most probable way of creating a constituent, whether it came from the lexicon or from rules
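The extended initialization step might look like the sketch below. It is an assumption-laden illustration: only the language-2 side is treated as unsegmented (a plain character string), `max_len` is an arbitrary cap added to bound the search, and `init_cells` and the placeholder characters `x`, `y`, `z` are hypothetical.

```python
def init_cells(e, c, lex, max_len=4):
    """Seed chart cells with lexicon entries matching any substring of c."""
    cells = {}
    for s, word in enumerate(e):
        for u in range(len(c)):
            for v in range(u + 1, min(len(c), u + max_len) + 1):
                p = lex.get((word, c[u:v]), 0.0)
                if p > 0.0:
                    # Cell covers one language-1 word and characters u..v-1.
                    cells[(s, s + 1, u, v)] = p
    return cells


# Placeholder data: "xy" plays the role of a two-character word.
lex = {("secretary", "xy"): 0.5, ("and", "z"): 0.4}
print(init_cells(["secretary", "and"], "xyz", lex))
```

Because multi-character cells are seeded directly, the later recursion can weigh a lexicon entry for the whole span against a rule-built combination of smaller pieces.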

  19. Applications of SITGs – Bracketing • How to assign structure to a sentence with no grammar available? • Especially problematic for minority languages • A solution using ITGs: • Get a parallel corpus pairing the language with some other language • Get a reasonable translation dictionary • Parse the corpus with a bracketing transduction grammar

  20. Bracketing Transduction Grammar • A minimal ITG • Only one nonterminal: A • Production rules: • Lexical translation probabilities have prominence • Small prob. values for the two singleton production rules • Also, a very small value for
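For reference, the production rules of a bracketing transduction grammar as formulated in Wu (1997) can be written as below. The subscripted probability names are an assumed labeling; u_i and v_j stand for language-1 and language-2 vocabulary items, and ε for the empty string.

```latex
\begin{align*}
A &\to [A\;A]               && \text{straight, probability } a_{[\,]} \\
A &\to \langle A\;A \rangle && \text{inverted, probability } a_{\langle\rangle} \\
A &\to u_i / v_j            && \text{couple, probability } b_{ij} \\
A &\to u_i / \epsilon       && \text{language-1 singleton, probability } b_{i\epsilon} \\
A &\to \epsilon / v_j       && \text{language-2 singleton, probability } b_{\epsilon j}
\end{align*}
```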

  21. Bracketing with Singletons • Singletons cause bracketing errors • Some refinements: • Depending on the language, bias singleton attachment to either the left or the right of a constituent • Apply a series of transformations that push singletons as close as possible to couples, e.g. [x ⟨A B⟩] ⇌ ⟨x ⟨A B⟩⟩ ⇌ ⟨⟨x A⟩ B⟩ ⇌ ⟨[x A] B⟩ • Before: • After:

  22. Bracketing Experiments • Used 2000 Chinese-English sentence pairs from the HKUST corpus • Some filtering: • Remove sentence pairs that were not adequately covered by the lexicon (>1 unknown word) • Remove sentence pairs with too many unmatched words (>2) • Bracketing precision: • 80% for English • 78% for Chinese • Errors mainly due to lexical imperfections • A statistical lexicon (~6.5k English, ~5.5k Chinese words) • Can be improved with extra information • e.g. POS, grammar-based bracketer

  23. Applications of SITGs - Alignment • Alignments (phrasal or word) are a natural byproduct of bilingual parsing • Unlike ‘parse-parse-match’ methods, this • Doesn’t require a robust grammar for both languages • Guarantees compatibility between parses • Has a principled way of choosing between possible alignments • Provides a more reasonable ‘distortion penalty’ • Recent empirical studies show ITGs produce better alignments in various applications [Wu & Fung 2005]

  24. Bilingual Constraint Transfer • A high-quality parse for one language can be leveraged to get structure for the other • Alter the parsing algorithm: • only allow constituents that match the parse that already exists for the well-studied language • This works for any sort of constraint supplied for the well-studied language
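One way to express the altered parsing step is as a filter on candidate spans. The sketch below makes an interpretive assumption: a constituent "matches" the existing parse if its span does not cross any bracket of that parse. Spans are half-open (start, end) pairs, and the names `crosses` and `allowed` are hypothetical.

```python
def crosses(span, bracket):
    """True if the two spans overlap without one containing the other."""
    (s, t), (a, b) = span, bracket
    return s < a < t < b or a < s < b < t


def allowed(span, brackets):
    """A candidate constituent is kept only if it crosses no known bracket."""
    return not any(crosses(span, br) for br in brackets)


# Brackets from an existing parse of a 5-word sentence: [[0 1] [2 3 4]]
brackets = [(0, 2), (2, 5), (0, 5)]
print(allowed((2, 4), brackets))  # nested inside (2, 5)
print(allowed((1, 3), brackets))  # straddles the (0, 2) boundary
```

In a chart parser, a check like this would be applied to the well-studied language's span before a cell is filled, so incompatible constituents never enter the chart.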

  25. References: • Dekai Wu (1997), Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora, Computational Linguistics, Vol. 23, no. 3, pp. 377-403. • Dekai Wu (1995), Grammarless Extraction of Phrasal Translation Examples from Parallel Texts, 6th Intl. Conf. on Theoretical and Methodological Issues in Machine Translation, Vol. 2, pp. 354-372. Leuven, Belgium. • Dekai Wu and Pascale Fung (2005), Inversion Transduction Grammar Constraints for Mining Parallel Sentences from Quasi-Comparable Corpora, 2nd Intl. Joint Conf. on Natural Language Processing (IJCNLP-2005), Jeju, Korea, October.
