1 / 75

Mixed syntax translation

Hieu Hoang MT Marathon 2011 Trento. Mixed syntax translation. Contents. What is a syntactic model? What’s wrong with Syntax? Which syntax model to use? Why use syntactic models? Mixed-Syntax Model Extraction Decoding Results Future Work. What is a syntactic model?.

hester
Download Presentation

Mixed syntax translation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Hieu Hoang MT Marathon 2011 Trento Mixed syntax translation

  2. Contents • What is a syntactic model? • What’s wrong with Syntax? • Which syntax model to use? • Why use syntactic models? • Mixed-Syntax Model • Extraction • Decoding • Results • Future Work

  3. What is a syntactic model? • Hierarchical Phrase-Based Model • String-to-string • Non-terminals are unlabelled • Tree-to-string Model • Source non-terminals are labelled • match input parse tree X habe X1gegessen # have eaten X1 ShabeNP1gegessen # have eaten NP1

  4. What is a syntactic model? • Hierarchical Phrase-Based Model • String-to-string • Non-terminals are unlabelled • Tree-to-string Model • Source non-terminals are labelled • match input parse tree X habe X1gegessen# have eaten X1 ShabeNP1gegessen # have eaten NP1

  5. What is a syntactic model? • Hierarchical Phrase-Based Model • String-to-string • Non-terminals are unlabelled • Tree-to-string Model • Source non-terminals are labelled • match input parse tree X habe X1gegessen # have eaten X1 ShabeNP1gegessen # have eaten NP1

  6. What is a syntactic model? • Hierarchical Phrase-Based Model • String-to-string • Non-terminals are unlabelled • Tree-to-string Model • Source non-terminals are labelled • match input parse tree XhabeX1gegessen # have eaten X1 ShabeNP1gegessen # have eaten NP1

  7. What is a syntactic model? • Hierarchical Phrase-Based Model • String-to-string • Non-terminals are unlabelled • Tree-to-string Model • Source non-terminals are labelled • match input parse tree X habe X1gegessen # have eaten X1 ShabeNP1gegessen # have eaten NP1

  8. Contents • What is a syntactic model? • What’s Wrong with Syntax? • Which syntax model to use? • Why use syntactic models? • Mixed-Syntax Model • Extraction • Decoding • Results • Future Work

  9. What’s Wrong with Syntax? Evaluation of French-English MT System (Ambati and Lavie, 2009)

  10. Hierarchical Model according to JánosVeres , this would be in the first quarter of 2008 possible .

  11. Hierarchical Model according to JánosVeres , this would be in the first quarter of 2008 possible .

  12. Hierarchical Model according to JánosVeres , this would be in the first quarter of 2008 possible .

  13. Tree-to-String Model NE NE VAFIN PDS APPRART ADJA NN CARD ADJA APPR PUNC PN PP PP S TOP

  14. Tree-to-String Model János Veres would be this in the first quarter of 2008 possible according to . NE NE VAFIN PDS APPRART ADJA NN CARD ADJA APPR PUNC PN PP PP S TOP

  15. Tree-to-String Model János Veres would be this in the first quarter of 2008 possible according to . NE NE VAFIN PDS APPRART ADJA NN CARD ADJA APPR PUNC PN PP PP [X,1] [X,2] [X,1] [X,2] [X,1] [X,2] [X,1] [X,2] [X,1] [X,2] [X,1] [X,2] [X,1] [X,2] [X,1] [X,2] [X,1] [X,2] S TOP

  16. Contents • What’s Wrong with Syntax? • Which syntax model to use? • Why use syntactic models? • Mixed-Syntax Model • Extraction • Decoding • Results • Future Work

  17. Other Syntactic Models • Syntax-Augmented MT (SAMT) • Not constrained only to parse tree • (Zollmann and Venugopal, 2006) • Binarization • Restructure and relabel parse parse tree • (Wang et al, 2010) • Forest-based translation • Recover from parse errors • (Mi et al, 2008) • Soft constraint • Reward/Penalize derivations which follows parse structure • (Chiang 2010)

  18. Other Syntactic Models • Syntax-Augmented MT (SAMT) • Not constrained only to parse tree • (Zollmann and Venugopal, 2006) • Binarization • Restructure and relabel parse parse tree • (Wang et al, 2010) • Forest-based translation • Recover from parse errors • (Mi et al, 2008) • Soft constraint • Reward/Penalize derivations which follows parse structure • (Chiang 2010)

  19. Other Syntactic Models • Syntax-Augmented MT (SAMT) • Not constrained only to parse tree • (Zollmann and Venugopal, 2006) • Binarization • Restructure and relabel parse parse tree • (Wang et al, 2010) • Forest-based translation • Recover from parse errors • (Mi et al, 2008) • Soft constraint • Reward/Penalize derivations which follows parse structure • (Chiang 2010)

  20. Other Syntactic Models • Syntax-Augmented MT (SAMT) • Not constrained only to parse tree • (Zollmann and Venugopal, 2006) • Binarization • Restructure and relabel parse parse tree • (Wang et al, 2010) • Forest-based translation • Recover from parse errors • (Mi et al, 2008) • Soft constraint • Reward/Penalize derivations which follows parse structure • (Chiang 2010)

  21. Other Syntactic Models • Syntax-Augmented MT (SAMT) • Not constrained only to parse tree • (Zollmann and Venugopal, 2006) • Binarization • Restructure and relabel parse parse tree • (Wang et al, 2010) • Forest-based translation • Recover from parse errors • (Mi et al, 2008) • Soft constraint • Reward/Penalize derivations which follows parse structure • (Chiang 2010)

  22. Contents • What’s Wrong with Syntax? • Which syntax model to use? • Why use syntactic models? • Mixed-Syntax Model • Extraction • Decoding • Results • Future Work

  23. Why Use Syntactic Models? • Decrease decoding time • Derivation constrained by source parse tree

  24. Why Use Syntactic Models? • Decrease decoding time • Derivation constrained by source parse tree • Long-range reordering during decoding • rules covering more words than max-span limit

  25. Why Use Syntactic Models? • Decrease decoding time • Derivation constrained by source parse tree • Long-range reordering during decoding • rules covering more words than max-span limit • Other rule-forms • 3+ non-terminals • consecutive non-terminals • non-lexicalized rules

  26. Why Use Syntactic Models? • Decrease decoding time • Derivation constrained by source parse tree • Long-range reordering during decoding • rules covering more words than max-span limit • Other rule-forms • 3+ non-terminals • consecutive non-terminals • non-lexicalized rules X  S1O2V3# S1V3O2 X  PRO1 PRO2 aime bien # PRO1 like PRO2

  27. Contents • What’s Wrong with Syntax? • Which syntax model to use? • Why use syntactic models? • Mixed-Syntax Model • Extraction • Decoding • Results • Future Work

  28. Other Syntactic Models • Syntax-Augmented MT (SAMT) • Not constrained only to parse tree • (Zollmann and Venugopal, 2006) • Binarization • Restructure and relabel parse parse tree • (Wang et al, 2010) • Forest-based translation • Recover from parse errors • (Mi et al, 2008) • Soft constraint • Reward/Penalize derivations which follows parse structure • (Chiang 2010)

  29. Other Syntactic Models • Syntax-Augmented MT (SAMT) • Not constrained only to parse tree • (Zollmann and Venugopal, 2006) • Binarization • Restructure and relabel parse parse tree • (Wang et al, 2010) • Forest-based translation • Recover from parse errors • (Mi et al, 2008) • Soft constraint • Reward/Penalize derivations which follows parse structure • (Chiang 2010) • Ignore Syntax (occasionally)

  30. Mixed-Syntax Model • Tree-to-string model • input is a parse tree • Roles of non-terminals • Constrain derivation to parse constituents • State information • Consistent node label on target derivation • hypotheses with different head NT cannot be recombined

  31. Mixed-Syntax Model • Tree-to-string model • input is a parse tree • Roles of non-terminals • Constrain derivation to parse constituents • Can sometime have no constraints • State information • Consistent node label on target derivation • hypotheses with different head NT cannot be recombined • always X

  32. Mixed-Syntax Model Example Translation Rules • Naïve syntax model • Mixed-Syntax Model VP  VVFIN1zu VVINF2 # to VVFIN2 VVINF1 VP X1zu VVINF2 # X to X2 X1

  33. Mixed-Syntax Model Example Translation Rules • Naïve syntax model • Mixed-Syntax Model VP  VVFIN1zu VVINF2 # to VVFIN2 VVINF1 VP X1zu VVINF2 # X to X2X1

  34. Mixed-Syntax Model Example Translation Rules • Naïve syntax model • Mixed-Syntax Model VP  VVFIN1zu VVINF2 # to VVFIN2 VVINF1 VP X1zu VVINF2 # X to X2 X1

  35. Contents • What’s Wrong with Syntax? • Which syntax model to use? • Why use syntactic models? • Mixed-Syntax Model • Extraction • Decoding • Results • Future Work

  36. Extraction • Allow rules • Max 3 non-terminals • Adjacent non-terminals • At least 1 NT must be syntactic • Non-lexicalized rules

  37. Example Rules Extracted

  38. Contents • What’s Wrong with Syntax? • Which syntax model to use? • Why use syntactic models? • Mixed-Syntax Model • Extraction • Decoding • Results • Future Work

  39. Synchronous CFG Input:

  40. Synchronous CFG Input: Rules: S NP1 VP2 # NP1 VP2 NP je # I PRO lui # him VB vu # see VP ne PRO1 VB2 pas # did not VB2 PRO1

  41. Synchronous CFG Input: Derivation: Rules: S NP1 VP2 # NP1 VP2 NP je # I PRO lui # him VB vu # see VP ne PRO1 VB2 pas # did not VB2 PRO1

  42. Synchronous CFG Input: Derivation: Rules: S NP1 VP2 # NP1 VP2 NP je # I PRO lui # him VB vu # see VP ne PRO1 VB2 pas # did not VB2 PRO1

  43. Synchronous CFG Input: Derivation: Rules: S NP1 VP2 # NP1 VP2 NP je # I PRO lui # him VB vu # see VP ne PRO1 VB2 pas # did not VB2 PRO1

  44. Synchronous CFG Input: Derivation: Rules: S NP1 VP2 # NP1 VP2 NP je # I PRO lui # him VB vu # see VP ne PRO1 VB2 pas # did not VB2 PRO1

  45. Synchronous CFG Input: Derivation: Rules: S NP1 VP2 # NP1 VP2 NP je # I PRO lui # him VB vu # see VP ne PRO1 VB2 pas # did not VB2 PRO1

  46. Synchronous CFG Input: Derivation: Rules: S NP1 VP2 # NP1 VP2 NP je # I PRO lui # him VB vu # see VP ne PRO1 VB2 pas # did not VB2 PRO1

  47. Mixed-Syntax Model Input:

  48. Mixed-Syntax Model Input: Rules: S NP1 VP2 # X  NP1 VP2 PRO je # X  I PRO lui # X him VB vu # X  see VP ne X1 pas # did not X1 X  PRO1 VB2 # X  X2 X1

  49. Mixed-Syntax Model Derivation: Input: Rules: S NP1 VP2 # X  NP1 VP2 PRO je # X  I PRO lui # X him VB vu # X  see VP ne X1 pas # did not X1 X  PRO1 VB2 # X  X2 X1

  50. Mixed-Syntax Model Derivation: Input: Rules: S NP1 VP2 # X  NP1 VP2 PRO je # X  I PRO lui # X him VB vu # X  see VP ne X1 pas # did not X1 X  PRO1 VB2 # X  X2 X1

More Related