A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation Jun Sun┼, Min Zhang╪, Chew Lim Tan┼
Outline • Introduction • Non-contiguous Tree Sequence Modeling • Rule Extraction • Non-contiguous Decoding: the Pisces Decoder • Experiments • Conclusion
Contiguous and Non-contiguous Bilingual Phrases • Non-contiguous translational equivalence • Contiguous translational equivalence
Previous Work on Non-contiguous Phrases • (-) Zhang et al. (2008) acquire non-contiguous phrasal rules from contiguous tree sequence pairs, and find them unhelpful in real syntax-based translation systems. • (+) Wellington et al. (2006) report statistically that discontinuities are very useful for translational equivalence analysis, using binary branching structures under word alignment and parse tree constraints. • (+) Bod (2007) also finds that discontinuous phrasal rules yield significant improvement in a linguistically motivated STSG-based translation model.
Previous Work on Non-contiguous Phrases (cont.) • Non-contiguous tree sequence pair: VP(VV(到),NP(CP[0],NN(时候))) ↔ SBAR(WRB(when),S[0]) • [Figure: two contrasting contiguous tree sequence pairs]
Previous Work on Non-contiguous phrases (cont.) No match in rule set
Proposed Non-contiguous Phrase Modeling • Extracted from non-contiguous tree sequence pairs
Contributions • The proposed model extracts translation rules not only from contiguous tree sequence pairs but also from non-contiguous tree sequence pairs (with gaps). With the help of non-contiguous tree sequences, the model can capture non-contiguous phrases without requiring large applicable contexts, and enhances non-contiguous constituent modeling. • A decoding algorithm for non-contiguous phrase modeling
Outline • Introduction • Non-contiguous Tree Sequence Modeling • Rule Extraction • Non-contiguous Decoding: the Pisces Decoder • Experiments • Conclusion
SncTSSG • Synchronous Tree Substitution Grammar (STSG, Chiang, 2006) • Synchronous Tree Sequence Substitution Grammar (STSSG, Zhang et al., 2008) • Synchronous non-contiguous Tree Sequence Substitution Grammar (SncTSSG)
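The grammar hierarchy above can be made concrete with a small data structure: an SncTSSG rule pairs a source and a target sequence of tree fragments, any of which may be a gap, and an STSSG rule is simply the gap-free special case. A minimal sketch (the class and field names are illustrative, not from the paper):

```python
from dataclasses import dataclass

GAP = "*"  # placeholder for a gap between tree fragments

@dataclass(frozen=True)
class SncTSSGRule:
    """One synchronous rule: a source and a target tree-fragment
    sequence, where any element may be the GAP placeholder."""
    source: tuple     # tree fragments as strings, or GAP
    target: tuple
    alignment: tuple  # pairs linking substitution sites [0], [1], ...

    def is_contiguous(self) -> bool:
        # An STSSG rule is the special case with no gap on either side
        return GAP not in self.source and GAP not in self.target

# A non-contiguous rule: a gap sits between the two source fragments
r = SncTSSGRule(
    source=("VP(VV(到))", GAP, "NN(时候)"),
    target=("SBAR(WRB(when),S[0])",),
    alignment=((0, 0),),
)
print(r.is_contiguous())  # False
```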
Word-Aligned Parse Tree and Two Parse Tree Sequences • [Figure: a word-aligned bi-parsed tree over 给 我 把 钢笔, its abstract substructures, and the corresponding tree sequences (node labels VBA, VO, P, NG, VG, R)] • 1. Word-aligned bi-parsed tree • 2. Two substructures • 3. Two tree sequences
Contiguous Translation Rules r1. Contiguous Tree-to-Tree Rule r2. Contiguous Tree Sequence Rule
Non-contiguous Translation Rules r1. Non-contiguous Tree-to-Tree Rule r2. Non-contiguous Tree Sequence Rule
Outline • Introduction • Non-contiguous Tree Sequence Modeling • Rule Extraction • Non-contiguous Decoding: the Pisces Decoder • Experiments • Conclusion
Example for contiguous rule extraction (4) • Abstract into substructures
Example for non-contiguous rule extraction (1) • Extracted from non-contiguous tree sequence pairs
Example for non-contiguous rule extraction (2) • Abstract into substructures from non-contiguous tree sequence pairs
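The extraction examples above all rest on the usual alignment-consistency condition, generalized from contiguous spans to position sets with gaps: no alignment link may cross the boundary of the candidate pair. A minimal sketch of that check (function name and toy alignment are illustrative, not the paper's actual extractor):

```python
def consistent(src_positions, tgt_positions, alignment):
    """A (possibly non-contiguous) pair of position sets is a valid
    translational equivalence iff every alignment link touching either
    side stays entirely inside the pair -- the classic consistency
    check, applied to gapped sets rather than contiguous spans."""
    src, tgt = set(src_positions), set(tgt_positions)
    for s, t in alignment:
        if (s in src) != (t in tgt):
            return False  # a link crosses the pair's boundary
    return True

# toy word alignment: 0-0, 1-2, 2-1
A = [(0, 0), (1, 2), (2, 1)]
print(consistent({0}, {0}, A))        # True
print(consistent({0, 2}, {0, 1}, A))  # True: gapped source {0,2}
print(consistent({0, 1}, {0, 1}, A))  # False: link 1-2 escapes
```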
Outline • Introduction • Non-contiguous Tree Sequence Modeling • Rule Extraction • Non-contiguous Decoding: the Pisces Decoder • Experiments • Conclusion
The Pisces Decoder • Pisces conducts its search with two modules • The first is a CFG-based chart parser used as a pre-processor, mapping an input sentence to a parse tree Ts (for details of chart parsing, see Charniak (1997)) • The second is a span-based tree decoder with three phases • Contiguous decoding (same as Zhang et al., 2008) • Source-side non-contiguous translation • Tree sequence reordering on the target side
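The three decoding phases above can be sketched as a bottom-up loop over source spans. This is a skeleton only: `translate_span`, `insert_gaps`, and `reorder_gapped` are hypothetical stand-ins for the paper's phases, and all pruning and scoring is omitted:

```python
def pisces_decode(sentence_len, translate_span, insert_gaps, reorder_gapped):
    """Skeleton of span-based decoding: fill a chart bottom-up over all
    source spans, applying the three phases in order on each span."""
    chart = {}
    for width in range(1, sentence_len + 1):
        for start in range(sentence_len - width + 1):
            span = (start, start + width)
            hyps = translate_span(span, chart)    # phase 1: contiguous rules
            hyps += insert_gaps(span, hyps)       # phase 2: source-side gaps
            hyps += reorder_gapped(span, chart)   # phase 3: target reordering
            chart[span] = hyps
    return chart[(0, sentence_len)]

# trivial stub phases, just to show the control flow
result = pisces_decode(
    2,
    lambda span, chart: ["hyp%s" % (span,)],
    lambda span, hyps: [],
    lambda span, chart: [],
)
print(result)  # ['hyp(0, 2)']
```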
Source-side Non-contiguous Translation • Source gap insertion • Right insertion / left insertion of a gap around a matched fragment [Figure: example with NP(...) IN(in) NP(...)]
Tree Sequence Reordering on the Target Side • Binarize each span into a left and a right sub-span. • Generate new translation hypotheses for the span by inserting the candidate translations of the right span into each gap in those of the left span. • Generate translation hypotheses for the span by inserting the candidate translations of the left span into each gap in those of the right span. [Figure: a candidate hypothesis for the target span with gaps; left span; right span]
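The two symmetric insertion steps above can be sketched over plain strings, with `*` marking a gap. This simplified version fills only the first gap of each gapped candidate and ignores scoring, so it illustrates the combination step rather than reproducing the actual decoder:

```python
def combine_via_gaps(left_hyps, right_hyps, gap="*"):
    """Build hypotheses for a span by inserting one side's candidate
    translations into the gaps of the other side's gapped candidates,
    in both directions (right-into-left, then left-into-right)."""
    out = []
    for gapped, fillers in ((left_hyps, right_hyps), (right_hyps, left_hyps)):
        for g in gapped:
            if gap not in g:
                continue  # only gapped candidates accept insertions
            for f in fillers:
                out.append(g.replace(gap, f, 1))  # fill the first gap
    return out

print(combine_via_gaps(["when *", "x"], ["it rains"]))
# ['when it rains']
```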
Modeling • source/target sentence • source/target parse tree • a non-contiguous source/target tree sequence • source/target spans • hm : the feature function
Features • The bi-phrasal translation probabilities • The bi-lexical translation probabilities • The target language model • The # of words in the target sentence • The # of rules utilized • The average tree depth in the source side of the rules adopted • The # of non-contiguous rules utilized • The # of reordering times caused by the utilization of the non-contiguous rules
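These features are combined in a log-linear model, as is standard in SMT and consistent with the use of minimum error rate training (Och, 2003). A sketch in conventional notation, where f/e are the source/target sentences, T(f)/T(e) their parse trees, and λ_m the learned weight of feature h_m (the symbols here are conventional, not necessarily the slide's own):

```latex
\hat{e} = \operatorname*{arg\,max}_{e,\,T(e)} \; \sum_{m=1}^{M} \lambda_m \, h_m\bigl(f, e, T(f), T(e)\bigr)
```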
Outline • Introduction • Non-contiguous Tree Sequence Modeling • Rule Extraction • Non-contiguous Decoding: the Pisces Decoder • Experiments • Conclusion
Experimental settings • Training Corpus: • Chinese-English FBIS corpus • Development Set: • NIST MT 2002 test set • Test Set: • NIST MT 2005 test set • Evaluation Metrics: • case-sensitive BLEU-4 • Parser: • Stanford Parser (Chinese/English) • Evaluation: • mteval-v11b.pl • Language Model: • SRILM 4-gram • Minimum error rate training: • (Och, 2003) • Model Optimization: • Only allow gaps on one side
Model comparison in BLEU Table 1: Translation results of different models (cBP refers to contiguous bilingual phrases without syntactic structural information, as used in Moses)
Rule combination cR: rules derived from contiguous tree sequence pairs (i.e., all STSSG rules) ncPR: non-contiguous rules derived from contiguous tree sequence pairs with at least one non-terminal leaf node between two lexicalized leaf nodes srcncR: non-contiguous rules with gaps in the source side tgtncR: non-contiguous rules with gaps in the target side src&tgtncR : non-contiguous rules with gaps in either side Table 2: Performance of different rule combination
Bilingual Phrasal Rules cR: rules derived from contiguous tree sequence pairs (i.e., all STSSG rules) ncPR: non-contiguous rules derived from contiguous tree sequence pairs with at least one non-terminal leaf node between two lexicalized leaf nodes srcncBP: non-contiguous phrasal rules with gaps in the source side tgtncBP: non-contiguous phrasal rules with gaps in the target side src&tgtncBP : non-contiguous phrasal rules with gaps in either side Table 3: Performance of bilingual phrasal rules
Maximal number of gaps • Table 4: Performance and rule size changing with different maximal number of gaps
Conclusion • The non-contiguous tree sequence alignment model based on SncTSSG attains better modeling of non-contiguous phrases and of the reordering caused by non-contiguous constituents with large gaps. • Observations • In the Chinese-English translation task, gaps are more effective on the Chinese side than on the English side. • Allowing only one gap is effective. • Future Work • Redundant non-contiguous rules • Optimization of the large rule set