MT in the NCLT • Andy Way • NCLT, School of Computing, • Dublin City University, • Dublin 9, Ireland • away@computing.dcu.ie • www.nclt.dcu.ie/mt/
MT in the NCLT: Recent History • Marker-Based EBMT • [Nano Gough, PhD 2005] • Computational Linguistics 2003 • NLE 2005; Machine Translation 2005 • AMTA 02, MT Summit 03; TMI 04, EAMT 04 … • Data-Oriented Translation • [Mary Hearne, PhD 2005] • MT Summit 03, COLING 04, IJCNLP 04, EAMT 05, EAMT 06 … • Hybrid Approaches (EBMT & SMT) • [Declan Groves, PhD 2007] • Machine Translation 2006 • ACL 05, EAMT 06, …
MT in the NCLT: Recent History • Improving Online MT Systems (TransBooster) • [Bart Mellebeek, PhD 2007] • [Karolina Owczarzak] • MT Summit 05, AMTA 06, EAMT 05, 06 … • Automatic Translation of DVD subtitles • [Steve Armstrong, MSc 2007] • [Other students’ ongoing PhD work in SALIS] • Perspectives 06 • ASLIB 06 …
Current Research • Hybrid MT (MaTrEx) • Nicolas Stroppa et al. • AMTA 06, OpenLab 06, IWSLT 06, NIST 06, MT Summit 07 • Dependency-Based Automatic Evaluation Metrics • Karolina Owczarzak, Josef Van Genabith • MT Summit 07, Workshops at NAACL 07, ACL 07 • Integrating Syntax into SMT (Using Supertags) • Hany Hassan [& Khalil Sima’an] • IEEE SLT 06, ACL 07 … • Sign Language MT • Sara Morrissey [& RWTH Aachen] • MT Summit 05, LREC 06, MT Summit 07 …
Current Research • Word and Phrase Alignment in SMT • Yanjun Ma, Nicolas Stroppa • ACL 07 … • Sub-Tree Alignment • John Tinsley, Ventzi Zhechev, Mary Hearne • MT Summit 07 … • Parameter Estimation in MT • John Tinsley, Ventzi Zhechev, Mary Hearne [& Khalil Sima’an] • Constraint-Based MT • Yvette Graham, Josef Van Genabith
Language Pairs • French-English (EBMT) • English-German (EBMT) • Spanish-English (SMT, Hybrid) • Spanish-Basque (Hybrid) • Chinese-English (SMT, EBMT) • Arabic-English (SMT, Hybrid) • Italian-English (Hybrid) • Japanese-English (EBMT, Hybrid) • Dutch-English (Hybrid, SMT) • Sign Language-English (Hybrid) • …
Collaboration • Tilburg (Memory-based Decoding) • Donostia (Basque MT) • Aachen (Sign-Language MT) • Amsterdam (Integrating Syntax & SMT) • Edinburgh (SMT) • CMU (Hybrid SMT—EBMT) • Toshiba Beijing (Chinese MT) • …
Future Work • MT via SMS • Automatic Interpreting • Enhanced hybrid models • Scalability • Tuning MT to text type & genre • MT using Pivot languages • Better quality phrases (cf. CoNLL monolingual chunking shared task) • …
Current and Future Funding • Irish Government Sources • Science Foundation Ireland • Enterprise Ireland • IRCSET • Companies • IBM • Microsoft • Under Review • EU STREP (MT for Minority Languages) • UPC, FBK-IRST, Edinburgh … • SFI CSET in Next Generation Localisation • TCD, UCD, UL, IBM, Microsoft, Symantec …