Phrase Extraction in PB-SMT

Phrase Extraction in PB-SMT Ankit K Srivastava NCLT/CNGL Presentation: May 6, 2009

About
• Phrase-based statistical machine translation
• Methods for phrase extraction
• Phrase induction via percolated dependencies
• Experimental setup & evaluation results
• Other facts & figures
• Moses customization
• Ongoing & future work
• Endnote
Phrase Extraction | Ankit | 6-May-09

PB-SMT Modeling

PB-SMT
• Process sequences of words (phrases) rather than single words
• Segment input, translate phrases, reorder output
• Translation model, language model, decoder
• $\hat{e} = \arg\max_e p(e \mid f) = \arg\max_e p(f \mid e)\, p(e)$
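
The noisy-channel decision rule on this slide can be illustrated with a toy sketch. The candidate translations and all probabilities below are made up for illustration, and the function name is hypothetical; a real decoder searches a far larger space.

```python
import math

def noisy_channel_best(candidates, tm_logprob, lm_logprob):
    """Return the candidate e that maximises log p(f|e) + log p(e)."""
    return max(candidates, key=lambda e: tm_logprob[e] + lm_logprob[e])

# Toy values (assumed): the translation model slightly prefers the wrong
# word order, but the language model penalises it heavily.
tm = {"the house": math.log(0.4), "house the": math.log(0.5)}  # log p(f|e)
lm = {"the house": math.log(0.3), "house the": math.log(0.01)}  # log p(e)

best = noisy_channel_best(list(tm), tm, lm)
print(best)
```

Working in log space turns the product p(f|e) p(e) into a sum, which avoids numerical underflow on longer sentences.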

Learning Phrase Translations

Extraction I
• Input is a sentence-aligned parallel corpus
• Most approaches use word alignments
• Extract (learn) phrase pairs
• Build a phrase translation table

Extraction II [Koehn et al., ’03]
• Get word alignments in both directions (src2tgt, tgt2src)
• Symmetrize them with the grow-diag-final heuristic
• Extract phrase pairs consistent with the word alignments
• Non-syntactic phrases :: STR
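
The consistency criterion in the third bullet can be sketched as follows. This is a minimal brute-force illustration, not the Moses implementation: it assumes the symmetrized alignment is already available as a set of (source index, target index) links, and the function name and `max_len` parameter are hypothetical.

```python
def extract_phrases(src, tgt, alignment, max_len=7):
    """Return all phrase pairs consistent with a word alignment.

    A source span and a target span form a consistent pair when at least
    one alignment link lies inside both spans and no link connects a word
    inside one span to a word outside the other.
    """
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            for j1 in range(len(tgt)):
                for j2 in range(j1, min(j1 + max_len, len(tgt))):
                    inside = any(i1 <= i <= i2 and j1 <= j <= j2
                                 for i, j in alignment)
                    crossing = any((i1 <= i <= i2) != (j1 <= j <= j2)
                                   for i, j in alignment)
                    if inside and not crossing:
                        pairs.add((" ".join(src[i1:i2 + 1]),
                                   " ".join(tgt[j1:j2 + 1])))
    return pairs

# Toy sentence pair with a monotone one-to-one alignment (assumed data).
src = "das Haus ist klein".split()
tgt = "the house is small".split()
alignment = {(0, 0), (1, 1), (2, 2), (3, 3)}

pairs = extract_phrases(src, tgt, alignment)
for pair in sorted(pairs):
    print(pair)
```

The four nested loops keep the sketch short; practical extractors derive the target span from the source span instead of enumerating every combination.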

Extraction III
• Sentence-aligned and word-aligned text
• Monolingual parsing of both SRC & TGT
• Align subtrees and extract string pairs
• Syntactic phrases

Extraction IV [Tinsley et al., ’07]
• Parse using constituency parser
• Phrases are syntactic constituents :: CON
(ROOT (S (NP (NNP Vinken)) (VP (MD will) (VP (VB join) (NP (DT the) (NN board)) (PP (IN as) (NP (DT a) (JJ nonexecutive) (NN director))) (NP (NNP Nov) (CD 29))))))

Extraction V [Hearne et al., ’08]
• Parse using dependency parser
• Phrases have head-dependent relationships :: DEP
HEAD        DEPENDENT
join        Vinken
join        will
board       the
join        board
join        as
director    a
director    nonexecutive
as          director
29          Nov
join        29

Extraction VI
• Numerous other phrase extraction methods exist
• Estimate phrase translations directly [Marcu & Wong ’02]
• Use a heuristic other than grow-diag-final
• Use marker-based chunks [Groves & Way ’05]
• Only string-to-string translation models are considered here

Head Percolation and Phrase Extraction

Percolation I
• It is straightforward to convert a constituency tree to an unlabeled dependency tree [Gaifman ’65]
• Use head percolation tables to identify the head child in a constituency representation [Magerman ’95]
• The dependency tree is obtained by recursively applying head child and non-head child heuristics [Xia & Palmer ’01]

Percolation II
Constituent: (NP (DT the) (NN board))
Head rule: NP right NN/NNP/CD/JJ
Head-annotated constituent: (NP-board (DT the) (NN board))
Result: "the" is dependent on "board"

Percolation III
INPUT
(ROOT (S (NP (NNP Vinken)) (VP (MD will) (VP (VB join) (NP (DT the) (NN board)) (PP (IN as) (NP (DT a) (JJ nonexecutive) (NN director))) (NP (NNP Nov) (CD 29))))))
Head percolation table
NP    right    NN / NNP / CD / JJ
PP    left     IN / PP
S     right    VP / S
VP    left     VB / VP
OUTPUT
HEAD        DEPENDENT
join        Vinken
join        will
board       the
join        board
join        as
director    a
director    nonexecutive
as          director
29          Nov
join        29
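
The conversion on this slide can be reproduced with a short sketch. It assumes that each table entry means "scan the children from the given side and take the first one whose label is in the list", and that a constituent with no matching rule takes the first child scanned; both readings are assumptions chosen because they reproduce the slide's output. The function names are hypothetical, and words are used as node identifiers, which only works when no word repeats in the sentence.

```python
# Head percolation table from the slide: label -> (scan side, head labels).
HEAD_TABLE = {
    "NP": ("right", {"NN", "NNP", "CD", "JJ"}),
    "PP": ("left", {"IN", "PP"}),
    "S": ("right", {"VP", "S"}),
    "VP": ("left", {"VB", "VP"}),
}

def parse(text):
    """Parse a bracketed constituency tree into (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        label = tokens[pos + 1]          # tokens[pos] is "("
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])  # a word under a POS tag
                pos += 1
        pos += 1                         # skip ")"
        return (label, children)

    return read()

def percolate(node, deps):
    """Return the head word of node; append (head, dependent) pairs to deps."""
    label, children = node
    if isinstance(children[0], str):     # preterminal: the word is the head
        return children[0]
    heads = [percolate(child, deps) for child in children]
    side, labels = HEAD_TABLE.get(label, ("left", set()))
    order = range(len(children)) if side == "left" \
        else range(len(children) - 1, -1, -1)
    # First child in scan order with a listed label; else first child scanned.
    head_idx = next((k for k in order if children[k][0] in labels), order[0])
    for k, head in enumerate(heads):
        if k != head_idx:
            deps.append((heads[head_idx], head))
    return heads[head_idx]

TREE = ("(ROOT (S (NP (NNP Vinken)) "
        "(VP (MD will) "
        "(VP (VB join) (NP (DT the) (NN board)) "
        "(PP (IN as) (NP (DT a) (JJ nonexecutive) (NN director))) "
        "(NP (NNP Nov) (CD 29))))))")

deps = []
root_head = percolate(parse(TREE), deps)
for head, dependent in deps:
    print(head, dependent)
```

The pairs are printed in bottom-up traversal order, so their order differs from the slide, but the set of head-dependent pairs is the same.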

Percolation IV
• cf. slide Extraction III (syntactic phrases)
• Parse by applying head percolation tables on constituency-annotated trees
• Align trees, extract surface chunks
• Phrases have head-dependent relations :: PERC

Tools, Resources, and MT System Performance

System setup I

System setup II
• All 4 “systems” are run with the same configurations (with MERT tuning) on 2 different datasets
• They only differ in their phrase tables (# chunks)

System setup III

Analyzing Str, Con, Dep, and Perc
Analysis w.r.t. Europarl data only

Analysis I
• Number of common & unique phrase pairs
• Maybe we should combine the phrase tables…

Analysis II
• Concatenate phrase tables and re-estimate probabilities
• 15 different phrase table combinations: $\sum_{r=1}^{4} \binom{4}{r} = 15$
• STR + CON + DEP + PERC
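
The count of 15 and the concatenate-and-re-estimate step can be sketched as below. The slide does not say how probabilities are re-estimated; relative frequency over pooled phrase-pair counts is assumed here, and the toy counts and the `combine` helper are hypothetical.

```python
from collections import Counter
from itertools import combinations

names = ["STR", "CON", "DEP", "PERC"]

# Every non-empty subset of the four phrase tables: 4 + 6 + 4 + 1 = 15.
combos = [c for r in range(1, len(names) + 1)
          for c in combinations(names, r)]
print(len(combos))

def combine(tables):
    """Pool phrase-pair counts and re-estimate p(e|f) by relative frequency."""
    counts = Counter()
    for table in tables:
        counts.update(table)             # adds counts for shared pairs
    source_totals = Counter()
    for (f, e), c in counts.items():
        source_totals[f] += c
    return {(f, e): c / source_totals[f] for (f, e), c in counts.items()}

# Toy phrase-pair counts (assumed data), keyed by (source, target).
str_table = Counter({("maison", "house"): 3, ("maison", "home"): 1})
con_table = Counter({("maison", "house"): 1})

probs = combine([str_table, con_table])
print(probs)
```

Pooling counts before normalising means a phrase pair found by several extraction methods gains probability mass, which is one plausible reading of "re-estimate".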

Analysis III
• All 15 “systems” are run with the same configurations (with MERT tuning)
• They only differ in their phrase tables
• This is combining at “translation model” level

Analysis IV
Performance on Europarl

Analysis V
• REF: Does the commission intend to seek more transparency in this area?
• S: Will the commission ensure that more than transparency in this respect?
• C: The commission will the commission ensure greater transparency in this respect?
• D: The commission will the commission ensure greater transparency in this respect?
• P: Does the commission intend to ensure greater transparency in this regard?
• SC: Will the commission ensure that more transparent in this respect?
• SD: Will the commission ensure that more transparent in this respect?
• SP: Does the commission intend to take to ensure that more than openness in this regard?
• CD: The commission will the commission ensure greater transparency in this respect?
• CP: The commission will the commission ensure greater transparency in this respect?
• DP: The commission will the commission ensure greater transparency in this respect?
• SCD: Does the commission intend to take to ensure that more transparent commit?
• SCP: Does the commission intend to take in this regard to ensure greater transparency?
• SDP: Does the commission intend to take in this regard to ensure greater transparency?
• CDP: The commission will the commission ensure greater transparency in this respect?
• SCDP: Does the commission intend to take to ensure that more transparent suspected?

Analysis VI
• Which phrases does the decoder use?
• Decoder trace on S+C+D+P
• Out of 11,748 phrases: S (5204); C (2441); D (2319); P (2368)

Analysis VII
• Automatic per-sentence evaluation using TER [Snover et al., ’06] on a test set of 2000 sentences: C (1120); P (331); D (301); S (248)
• Manual per-sentence evaluation on a random test set of 100 sentences using pairwise system comparison: P=C (27%); P>D (5%); SC>SCP (11%)

Analysis VIII
• Treat the different phrase table combinations as individual MT systems
• Perform system combination using MBR-CN framework [Du et al., 2009]
• This is combining at “system” level

Analysis IX
• Using Moses baseline phrases (STR) is essential for coverage. SIZE matters!
• However, adding any system to STR improves on the baseline score. Symbiotic!
• Hence, do not replace STR, but supplement it.

Analysis X
• CON seems to be the best combination with STR (S+C seems to be the best performing system)
• CON has the most chunks in common with PERC
• Does PERC harm a CON system? Needs more analysis (bias between CON & PERC)

Analysis XI
• DEP chunks differ from PERC chunks, despite the two being equivalent in syntactic representation
• DEP can be substituted by PERC
• Difference between knowledge induced from dependency and constituency. A different aligner?

Analysis XII
• PERC is a unique knowledge source. Is it just a simple case of parser combination?
• Sometimes, it helps.
• Needs more work on finding connection with CON / DEP

Customizing Moses for syntax-supplemented phrase tables

Moses customization
• Incorporating syntax (CON, DEP, PERC)
• Reordering model
• Phrase scoring (new features)
• Decoder parameters
• Log-linear combination of T-tables
• Good phrase translations may be lost by the decoder. How can we ensure they remain intact?

Work in Progress and Future Plans

Ongoing & future work
• Scaling: data size, language pair, language direction
• Bias between CON & PERC
• Combining phrase pairs
• Combining systems
• Classify performance into sentence types
• Improve quality of phrase pairs in PB-SMT

Endnote…

Endnote
• Explored 3 linguistically motivated phrase extractions against Moses phrases
• They improve on the baseline. Highest recorded is a 10% relative increase in BLEU on 100K
• Rather than pursuing ONE way, combine options
• Need more analysis of supplementing the phrase table with multiple syntactic T-tables

Thank You!

Phrase Extraction in PB-SMT
Phrase-based Statistical Machine Translation (PB-SMT) models, the most widely researched paradigm in MT today, rely heavily on the quality of phrase pairs induced from large amounts of training data. There are numerous methods for extracting these phrase translations from parallel corpora. In this talk I will describe phrase pairs induced from percolated dependencies and contrast them with three pre-existing phrase extractions. I will also present the performance of the individual phrase tables and their combinations in a PB-SMT system. I will then conclude with ongoing experiments and future research directions.

Thanks!
Andy Way, John Tinsley, Sylwia Ozdowska, Sergio Penkale, Patrik Lambert, Jinhua Du, Ventsislav Zhechev
