Tight Coupling between ASR and MT in Speech-to-Speech Translation. Arthur Chan Prepared for Advanced Machine Translation Seminar. This Seminar. Introduction (4 slides). A Conceptual Model of Speech-to-Speech Translation. Speech Recognizer. Machine Translator. Speech Synthesizer.
Tight Coupling between ASR and MT in Speech-to-Speech Translation Arthur Chan Prepared for Advanced Machine Translation Seminar
This Seminar • Introduction (4 slides)
A Conceptual Model of Speech-to-Speech Translation • waveforms → Speech Recognizer → decoding result(s) → Machine Translator → translation → Speech Synthesizer → waveforms
Motivation of Tight Coupling between ASR and MT • The 1-best ASR output could be wrong • MT could benefit from the wide range of supplementary information provided by ASR • N-best list • Lattice • Sentence-/word-based confidence scores • E.g. word posterior probability • Confusion network • Or consensus decoding (Mangu 1999) • Some have observed that MT quality depends on WER.
Scope of this talk • waveforms → Speech Recognizer → Machine Translator → translation → Speech Synthesizer → waveforms • What should pass from the recognizer to the translator: 1-best? N-best? Lattice? Confusion network? • 1. Should we combine the two? • 2. How tight should the coupling be?
Topics Covered Today • The concept of Coupling • The “tightness” of coupling between ASR and X • (Ringger 95) • Interfaces between ASR and MT in loose coupling • What could ASR provide? • What could MT use? • Very tight coupling • Ney’s formulae • AT&T Approach • Combination of features of ASR and MT • Direct Modeling
Classification of Coupling of ASR and Natural Language Understanding (NLU) • Proposed in Ringger 95, Harper 94 • 3 dimensions of ASR/NLU coupling • Complexity of the search algorithm • Simple N-gram? • Incrementality of the coupling • On-line? Left-to-right? • Tightness of the coupling • Tight? Loose? Semi-tight?
Tightness of Coupling • Tight • Semi-Tight • Loose
Implications for ASR/MT coupling • The classification generalizes to many systems • Loose coupling • Any system that uses a 1-best, N-best, or lattice output for one-way module communication • Tight coupling • AT&T FST-based system • Semi-tight coupling • [Filled in a quote here]
Perspectives • What outputs can an ASR system generate? • Not all of them are used today, but the unused ones could be opportunities in the future. • What algorithms can MT use given a certain input? • On-line algorithms are a focus
Decoding of HMM-based ASR • Decoding means searching for the best path in a huge HMM-state lattice. • 1-best ASR result • The best path found by backtracking through the lattice. • State Lattice (Next page)
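To make the "best path plus backtracking" idea concrete, here is a minimal Viterbi sketch over a toy two-state HMM. The state names, observation symbols, and all probabilities are invented for illustration and do not come from the talk; a real recognizer searches a far larger state space with pruning.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_score, best_state_path) for an observation sequence."""
    # trellis[t][s]: best log score of any path that ends in state s at time t
    trellis = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    backptr = [{}]
    for t in range(1, len(obs)):
        trellis.append({})
        backptr.append({})
        for s in states:
            # Pick the best predecessor state for s at time t
            prev, score = max(
                ((p, trellis[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            trellis[t][s] = score + log_emit[s][obs[t]]
            backptr[t][s] = prev
    # Backtracking: start from the best final state and follow the pointers
    last = max(states, key=lambda s: trellis[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(backptr[t][path[-1]])
    path.reverse()
    return trellis[-1][last], path

def to_log(table):
    """Convert a nested dict of probabilities to log probabilities."""
    return {k: {k2: math.log(v2) for k2, v2 in v.items()} for k, v in table.items()}

# Hypothetical toy model (all numbers are assumptions)
states = ["A", "B"]
log_start = {"A": math.log(0.6), "B": math.log(0.4)}
log_trans = to_log({"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}})
log_emit = to_log({"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}})

best_score, best_path = viterbi(["x", "x", "y"], states, log_start, log_trans, log_emit)
print(best_path, math.exp(best_score))
```

The trellis and back-pointer tables together play the role of the state lattice: the 1-best result only reads one path out of them, which is why richer outputs can be extracted from the same search.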
Things one could extract from the state lattice • From the backtracking information: • N-best list • The N best decoding results from the state lattice • Lattice • A lattice of decoding hypotheses at the word level • From the lattice: • N-best list • Confusion network • Or “consensus decoding” (Mangu 99)
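As a rough illustration of how an N-best list can be read off a word lattice, the sketch below runs a best-first search over a small hand-made lattice. The lattice layout, the words, and the edge costs (treated as negative log probabilities) are all hypothetical; production decoders use more efficient algorithms and also carry timing and acoustic/language-model scores separately.

```python
import heapq

def n_best(lattice, start, end, n):
    """Return up to n (cost, words) pairs from start to end, lowest cost first.

    lattice maps a node to a list of (next_node, word, edge_cost) tuples.
    Edge costs must be non-negative (e.g. negative log probabilities), so
    complete paths are popped from the heap in order of increasing cost.
    """
    # Heap entries: (accumulated cost, tie-breaker, node, words so far)
    heap = [(0.0, 0, start, ())]
    counter = 1
    results = []
    while heap and len(results) < n:
        cost, _, node, words = heapq.heappop(heap)
        if node == end:
            results.append((cost, list(words)))
            continue
        for nxt, word, edge_cost in lattice.get(node, []):
            heapq.heappush(heap, (cost + edge_cost, counter, nxt, words + (word,)))
            counter += 1
    return results

# Hypothetical word lattice with invented costs
lattice = {
    0: [(1, "recognize", 1.0), (2, "wreck", 1.5)],
    1: [(5, "speech", 0.5), (5, "peach", 2.0)],
    2: [(3, "a", 0.2)],
    3: [(4, "nice", 0.3)],
    4: [(5, "beach", 0.4)],
}

top2 = n_best(lattice, start=0, end=5, n=2)
for cost, words in top2:
    print(round(cost, 2), " ".join(words))
```

This enumerates whole paths, so it is only practical for small lattices or small N; it is meant to show the relationship between the two representations, not a recommended implementation.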
Other things one could extract from the decoder • Begin time and end time of each word • Useful in time-sensitive applications • E.g. multi-modal applications • Sentence-/word-based confidence scores • Found to be quite useful in many other settings
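One simple way to picture a word-level confidence score is a word posterior estimated from an N-best list: normalize the hypothesis scores and sum the mass of every hypothesis that contains the word. The sketch below does exactly that under strong simplifying assumptions: it ignores word position and timing, and the hypothesis scores are invented. Lattice-based posteriors computed with a forward-backward pass are the more usual choice in practice.

```python
import math
from collections import defaultdict

def word_posteriors(nbest):
    """Estimate word posteriors from a list of (log_score, words) hypotheses."""
    # Normalize hypothesis scores with a numerically stable softmax
    max_log = max(score for score, _ in nbest)
    weights = [math.exp(score - max_log) for score, _ in nbest]
    total = sum(weights)
    posteriors = defaultdict(float)
    for weight, (_, words) in zip(weights, nbest):
        # Count each word once per hypothesis (position is ignored here)
        for word in set(words):
            posteriors[word] += weight / total
    return dict(posteriors)

# Hypothetical N-best list; the scores are assumptions for illustration
nbest = [
    (math.log(0.6), ["recognize", "speech"]),
    (math.log(0.3), ["wreck", "a", "nice", "beach"]),
    (math.log(0.1), ["recognize", "peach"]),
]

conf = word_posteriors(nbest)
print(sorted(conf.items(), key=lambda kv: -kv[1]))
```

A downstream MT component could use such scores to down-weight or skip low-confidence words, which is one of the looser forms of coupling listed above.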
How does MT use the output? • What decoding algorithms are used?
Literature • Eric K. Ringger, “A Robust Loose Coupling for Speech Recognition and Natural Language Understanding”, Technical Report 592, Computer Science Department, University of Rochester, 1995 • [The AT&T paper]