
Statistical Phrase Alignment Model Using Dependency Relation Probability



Presentation Transcript


  1. Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University

  2. Outline • Background • Tree-based Statistical Phrase Alignment Model • Model Training • Experiments • Conclusions

  3. Conventional Word Sequence Alignment [Figure: word-level alignment between the Japanese sentence 受光素子にはフォトゲートを用いた, glossed word by word as 受 (accept), 光 (light), 素子 (device), に (ni), は (ha), フォト (photo), ゲート (gate), を (wo), 用いた (used), and the English sentence "A photogate is used for the photodetector"]

  4. grow-diag-final-and [Figure: the word alignment obtained with the grow-diag-final-and symmetrization heuristic for the example sentence pair]

  5. Conventional Word Sequence Alignment vs. Proposed Model [Figure: the same example sentence pair aligned by the conventional word sequence model (left) and by the proposed model over dependency trees (right)]

  6. Proposed Model • Dependency trees • Phrase alignment • Bi-directional agreement [Figure: the example sentence pair phrase-aligned over its dependency trees]

  7. grow-diag-final-and vs. Proposed Model [Figure: side-by-side comparison of the grow-diag-final-and alignment and the proposed model's alignment on the example sentence pair]

  8. Related Work • Using tree structures • [Cherry and Lin, 2003], [Quirk et al., 2005], [Galley et al., 2006], ITG, … • Considering phrase alignment • [Zhang and Vogel, 2005], [Ion et al., 2006], … • Using two directed models simultaneously • [Liang et al., 2006], [Graca et al., 2008], …

  9. Tree-based Statistical Phrase Alignment Model

  10. Dependency Analysis of Sentences • Source (Japanese): 受光素子にはフォトゲートを用いた • Target (English): A photogate is used for the photodetector [Figure: both sentences parsed into dependency trees that preserve the word order; each tree is rooted at its head node]

  11. Overview of the Proposed Model (in comparison to the IBM models) • IBM models find the best alignment A of a source sentence F to a target sentence E by maximizing P(F, A | E), which factors into a lexical probability (word translation) and an alignment probability (word reordering) • The proposed model keeps the same search but replaces the two factors with a phrase translation probability and a dependency relation probability
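  As a concrete (if simplified) illustration of the contrast, here is a minimal sketch in Python; the probability-table keys and the log-space combination are illustrative assumptions, not the paper's exact model:

```python
import math

def ibm_log_score(F, E, A, lex, dist):
    """IBM-style score: lexical prob. (word translation) times
    alignment prob. (word reordering), in log space."""
    return sum(math.log(lex[(f, E[A[j]])]) + math.log(dist[(j, A[j])])
               for j, f in enumerate(F))

def proposed_log_score(phrase_pairs, relations, phrase_prob, rel_prob):
    """Proposed score: phrase translation prob. times dependency
    relation prob., in log space."""
    return (sum(math.log(phrase_prob[p]) for p in phrase_pairs) +
            sum(math.log(rel_prob[r]) for r in relations))
```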

  12. Phrase Translation Probability

  13. Phrase Translation Probability (cf. IBM Models) • Note that the sentences are not segmented into phrases beforehand [Figure: a five-word source sentence f1..f5 grouped into source phrases F1..F3 with s(1)=1, s(2)=2, s(3)=2, s(4)=3, s(5)=1, where s(j) is the phrase of word fj; the source phrases align to target phrases E1..E3 with A1=2, A2=3, A3=0 (0 = NULL)]
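  The slide's example mappings can be written down directly; a tiny sketch whose variable names mirror the slide's notation:

```python
# Five source words f1..f5 grouped into three source phrases F1..F3
# (note F1 = {f1, f5} is non-contiguous), aligned to target phrases E1..E3.
s = {1: 1, 2: 2, 3: 2, 4: 3, 5: 1}   # s(j): source word j -> source phrase
A = {1: 2, 2: 3, 3: 0}               # A_k: source phrase k -> target phrase (0 = NULL)

def target_phrase_of(j):
    """Follow a source word through its phrase to the aligned target phrase."""
    return A[s[j]]

assert target_phrase_of(3) == 3   # f3 is in F2, which aligns to E3
assert target_phrase_of(4) == 0   # f4 is in F3, which is NULL-aligned
```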

  14. Dependency Relation Probability

  15. Dependency Relations [Figure: possible target-side relations between the phrases aligned to a source child fc and its source parent fp] • rel(fc, fp) = c : parent-child • rel(fc, fp) = p : inverted parent-child • rel(fc, fp) = c;c : grandparent-child • rel(fc, fp) = NULL_p : the parent fp is NULL-aligned

  16. Dependency Relation Probability • Ds-pc is the set of parent-child word pairs in the source sentence • The source-side dependency relation probability is defined in the same manner
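  A sketch of how rel(fc, fp) might be classified and folded into the dependency relation probability; representing the tree as a child-to-parent map and the alignment as a source-to-target map are illustrative assumptions, not the paper's data structures:

```python
import math

def rel(c_tgt, p_tgt, parent_of):
    """Relation on the target side between the phrases aligned to a source
    child fc (c_tgt) and its source parent fp (p_tgt); cf. slide 15."""
    if p_tgt is None:
        return "NULL_p"                                  # parent is NULL-aligned
    if parent_of.get(c_tgt) == p_tgt:
        return "c"                                       # parent-child
    if parent_of.get(p_tgt) == c_tgt:
        return "p"                                       # inverted parent-child
    if parent_of.get(parent_of.get(c_tgt)) == p_tgt:
        return "c;c"                                     # grandparent-child
    return "other"

def dep_rel_log_prob(Ds_pc, align, parent_of, rel_prob):
    """Product over all source parent-child pairs (fc, fp) in Ds-pc of the
    probability of their target-side relation, in log space."""
    return sum(math.log(rel_prob[rel(align.get(fc), align.get(fp), parent_of)])
               for fc, fp in Ds_pc)
```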

  17. Model Training

  18. Model Training • Step 1 (word base): estimate word translation prob. with IBM Model 1 and initialize dependency relation prob. (e.g. p(コロラド|Colorado)=0.7, p(大学|university)=0.6, …) • Step 2 (tree base): estimate phrase translation prob. and dependency relation prob. • E-step: create an initial alignment, modify it by hill-climbing, and generate possible phrases • M-step: parameter estimation (e.g. p(c)=0.4, p(c;c)=0.3, p(p)=0.2, …; p(コロラド 大学|university of Colorado)=0.9, …)
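  Step 1 can be illustrated with a self-contained IBM Model 1 EM loop; this is a minimal sketch of that standard algorithm, not the authors' implementation (Step 2's tree-based EM follows the same E-step/M-step pattern over phrases and relations):

```python
from collections import defaultdict

def ibm_model1(corpus, iterations=5):
    """Step 1 in miniature: estimate word translation probabilities
    p(f|e) with IBM Model 1 EM over sentence pairs."""
    t = defaultdict(lambda: 1.0)                 # implicitly uniform start
    for _ in range(iterations):
        count = defaultdict(float)               # expected counts (E-step)
        total = defaultdict(float)
        for f_sent, e_sent in corpus:
            for f in f_sent:
                z = sum(t[(f, e)] for e in e_sent)
                for e in e_sent:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        for (f, e), c in count.items():          # normalize (M-step)
            t[(f, e)] = c / total[e]
    return t

# e.g. t = ibm_model1([(["コロラド", "大学"], ["university", "of", "Colorado"])])
```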

  19. Step 2 (E-step): Example of Hill-climbing • The initial alignment is created greedily • The initial alignment is then modified with four operations: Swap, Reject, Add, and Extend [Figure: the running example sentence pair, showing the initial alignment and one application of each operation]
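  The modification loop itself is plain greedy search; a minimal sketch, assuming a `score` function (the model probability) and a `neighbors` generator that enumerates every alignment reachable by a single Swap, Reject, Add, or Extend:

```python
def hill_climb(alignment, score, neighbors):
    """Greedy hill-climbing: starting from the initial alignment, move to
    the best-improving neighbor until no operation improves the score."""
    best = alignment
    best_score = score(best)
    improved = True
    while improved:
        improved = False
        for cand in neighbors(best):      # all one-operation modifications
            s = score(cand)
            if s > best_score:
                best, best_score = cand, s
                improved = True
    return best
```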

  20. Generate Possible Phrases • Generate new possible phrases by merging NULL-aligned nodes into their parent or child non-NULL-aligned nodes • The new possible phrases are taken into consideration from the next iteration [Figure: the example dependency trees with a NULL-aligned node merged into a neighboring aligned node]
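  A sketch of the merging step under assumed data structures (the tree as a child-to-parent dict, `aligned` as the set of non-NULL-aligned nodes); the paper's actual representation may differ:

```python
def generate_possible_phrases(nodes, parent, aligned):
    """Propose new phrases by merging each NULL-aligned node with its
    parent, or with any of its children, that carries an alignment."""
    proposals = []
    for n in nodes:
        if n in aligned:
            continue                               # only NULL-aligned nodes merge
        p = parent.get(n)
        if p is not None and p in aligned:
            proposals.append(frozenset({n, p}))    # merge upward into parent
        for c in nodes:
            if parent.get(c) == n and c in aligned:
                proposals.append(frozenset({n, c}))  # merge downward into child
    return proposals                               # considered from next iteration
```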

  21. Model Training (shown again after the E-step details) • Step 1 (word base): word translation prob. via IBM Model 1, dependency relation prob. initialized • Step 2 (tree base): phrase translation prob. and dependency relation prob. re-estimated by alternating the E-step (initial alignment, hill-climbing, possible phrase generation) and the M-step (parameter estimation)

  22. Experiments

  23. Alignment Experiments • Training: JST Ja-En paper abstract corpus (1M sentences, Ja: 36.4M words, En: 83.6M words) • Test: 475 sentences with gold-standard alignments annotated by hand • Parsers: KNP for Japanese, MSTParser for English • Evaluation criteria: precision, recall, F1 • For the proposed model, we ran 5 iterations in each step

  24. Experimental Results [Table: alignment scores of the baseline word alignment models and the proposed model; the proposed model improves on the baseline by +1.7 points]

  25. Effectiveness of Phrase and Tree • Ablation: positional relations are used instead of dependency relations [Figure: in the positional variant the relation labels become relative word offsets, e.g. c ↔ -1 and p ↔ +1]

  26. Discussions • Parsing errors: parsing accuracy is basically good, but the parsers still sometimes produce incorrect parses → incorporate parsing probability into the model • Search errors: hill-climbing sometimes gets stuck in local optima → random restarts • Function words: they behave quite differently across languages (e.g. case markers in Japanese, articles in English) → post-processing

  27. Post-processing for Function Words • Reject correspondences between Japanese particles and English "be" or "have" • Reject correspondences involving English articles • Merge Japanese "する" and "れる", or English "be" and "have", into their parent verb or adjective if they are NULL-aligned [Figure: resulting gains of +6.2 and +0.3 points]
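  A sketch of the two rejection rules; the particle list and the inflected be/have forms are assumptions for illustration, and the NULL-merging rule for する/れる is omitted:

```python
JA_PARTICLES = {"は", "が", "を", "に", "で"}            # assumed particle list
EN_ARTICLES = {"a", "an", "the"}
EN_BE_HAVE = {"be", "is", "are", "was", "were", "been",
              "have", "has", "had"}

def post_process(links, ja_words, en_words):
    """Drop links pairing a Japanese particle with English be/have,
    and drop links involving an English article."""
    kept = []
    for j, i in links:
        ja, en = ja_words[j], en_words[i].lower()
        if ja in JA_PARTICLES and en in EN_BE_HAVE:
            continue
        if en in EN_ARTICLES:
            continue
        kept.append((j, i))
    return kept
```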

  28. Conclusion and Future Work • Linguistically motivated phrase alignment: dependency trees, phrase alignment, bi-directional agreement • Significantly better results than conventional word alignment models • Future work: apply the proposed model to other language pairs (Japanese-Chinese and so on), incorporate parsing probability into the model, and investigate the contribution of our alignment results to translation quality
