Discriminative Training Based On An Integrated View Of MPE And MMI In Margin And Error Space
Erik McDermott, Shinji Watanabe and Atsushi Nakamura, ICASSP 2010
Pei-ning Chen, NTNU CSIE SLP Lab
Outline
• Introduction
• Margin-based MPE, MMI, and dMMI
• Macroscopic analysis using the error-indexed forward-backward algorithm
• Experimental results
• Conclusions
Introduction
• MPE (Minimum Phone Error), or MPFE (Minimum Phone Frame Error), was shown to correspond to the derivative of the margin-modified MMI (Maximum Mutual Information) objective function with respect to the margin term.
• A new framework, “differenced MMI” (dMMI), was proposed in which the objective function is an integral of MPE-style loss over a given margin interval.
Margin-based MPE
• Rewrite the cost function in terms of pair-wise comparisons.
• The modified MPE loss can then be expressed in terms of these pair-wise comparisons.
Margin-based MMI
• The margin-based MMI objective is written using the same pair-wise comparisons.
• It is easy to show that MPE (margin-based or not) is the derivative of margin-based MMI with respect to the margin term σ.
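The derivative relation can be checked numerically. The sketch below is a toy illustration, not the authors' implementation. It assumes a small hypothetical set of competing strings with combined log scores `scores` and error counts `errors`, and a margin-modified normalizer Z(σ) = Σ_j exp(score_j + σ · error_j). Under that assumption, the derivative of log Z(σ) with respect to σ equals the margin-modified expected error, which is an MPE-style loss. If MMI is written as an objective to be maximized, log Z enters with a minus sign, so the derivative changes sign accordingly.

```python
import math


def log_partition(scores, errors, sigma):
    """log Z(sigma) = log sum_j exp(score_j + sigma * error_j), computed stably."""
    terms = [s + sigma * e for s, e in zip(scores, errors)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def expected_error(scores, errors, sigma):
    """Margin-modified expected error: sum_j posterior_j(sigma) * error_j."""
    log_z = log_partition(scores, errors, sigma)
    return sum(math.exp(s + sigma * e - log_z) * e for s, e in zip(scores, errors))


# Hypothetical toy data: four competing strings; the first is the reference (0 errors).
scores = [-10.0, -11.5, -12.0, -14.0]
errors = [0, 1, 2, 4]

sigma, h = 0.3, 1e-5
# Central finite difference of log Z with respect to sigma.
finite_difference = (log_partition(scores, errors, sigma + h)
                     - log_partition(scores, errors, sigma - h)) / (2 * h)
analytic = expected_error(scores, errors, sigma)
print(finite_difference, analytic)
```

Setting `sigma = 0.0` gives the case without a margin, which is the "margin-based or not" remark on the slide.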
Differenced MMI
• dMMI is defined in terms of an integral of the MPE loss over a given margin interval [σ1, σ2].
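The integral definition can be sketched on toy data. The example below assumes the same hypothetical setup as a log-sum-exp normalizer, Z(σ) = Σ_j exp(score_j + σ · error_j), and treats dMMI as a loss. The names `dmmi_loss` and `averaged_mpe_loss` are illustrative and not taken from the paper. Because the expected error is the derivative of log Z, its integral over [σ1, σ2] has a closed form, a difference of two log normalizers, which is here divided by σ2 − σ1. For a narrow interval around zero the value approaches the standard MPE expected error.

```python
import math


def log_partition(scores, errors, sigma):
    """log Z(sigma) = log sum_j exp(score_j + sigma * error_j), computed stably."""
    terms = [s + sigma * e for s, e in zip(scores, errors)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def expected_error(scores, errors, sigma):
    """Margin-modified expected error (MPE-style loss) at margin sigma."""
    log_z = log_partition(scores, errors, sigma)
    return sum(math.exp(s + sigma * e - log_z) * e for s, e in zip(scores, errors))


def dmmi_loss(scores, errors, sigma1, sigma2):
    """Closed form: difference of log normalizers divided by the interval width."""
    return (log_partition(scores, errors, sigma2)
            - log_partition(scores, errors, sigma1)) / (sigma2 - sigma1)


def averaged_mpe_loss(scores, errors, sigma1, sigma2, steps=2000):
    """Midpoint-rule integral of the expected error, divided by the interval width."""
    width = (sigma2 - sigma1) / steps
    total = sum(expected_error(scores, errors, sigma1 + (k + 0.5) * width)
                for k in range(steps))
    return total * width / (sigma2 - sigma1)


# Hypothetical toy data: four competing strings with their error counts.
scores = [-10.0, -11.5, -12.0, -14.0]
errors = [0, 1, 2, 4]

wide = dmmi_loss(scores, errors, -1.0, 1.0)
wide_numeric = averaged_mpe_loss(scores, errors, -1.0, 1.0)
narrow = dmmi_loss(scores, errors, -0.001, 0.001)
mpe_at_zero = expected_error(scores, errors, 0.0)
print(wide, wide_numeric)
print(narrow, mpe_at_zero)
```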
Optimization based on dMMI
• For a given arc q in a recognition lattice for utterance Xr, the derivative is built from the standard arc posterior probability, or occupancy, calculated with the Forward-Backward algorithm.
• The corresponding lattice arc occupancies at the two margin values are subtracted and divided by σ2 − σ1.
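The occupancy difference can be illustrated on a tiny hypothetical lattice. This is a sketch under stated assumptions, not the authors' code. Paths are enumerated explicitly instead of running Forward-Backward, each arc carries an assumed log score and error contribution, and dMMI is written as a loss, so the weight of arc q is taken as (occupancy at σ2 − occupancy at σ1) / (σ2 − σ1). With an objective to be maximized the sign flips. All names (`ARCS`, `PATHS`, `dmmi_arc_weights`) are illustrative.

```python
import math

# Hypothetical toy lattice: arc name -> (log score, error contribution).
ARCS = {"a1": (-2.0, 0), "a2": (-2.5, 1), "b1": (-1.0, 0), "b2": (-1.2, 1)}
# Each path lists the arcs it traverses; enumeration stands in for Forward-Backward.
PATHS = [("a1", "b1"), ("a1", "b2"), ("a2", "b2")]


def path_terms(arcs, paths, sigma):
    """Per-path exponent: summed arc log scores plus sigma times summed arc errors."""
    return [sum(arcs[q][0] + sigma * arcs[q][1] for q in path) for path in paths]


def log_partition(arcs, paths, sigma):
    """log of the margin-modified sum over all lattice paths."""
    terms = path_terms(arcs, paths, sigma)
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def arc_occupancies(arcs, paths, sigma):
    """Margin-modified posterior of each arc: total posterior of the paths through it."""
    terms = path_terms(arcs, paths, sigma)
    log_z = log_partition(arcs, paths, sigma)
    occupancy = {q: 0.0 for q in arcs}
    for path, term in zip(paths, terms):
        for q in path:
            occupancy[q] += math.exp(term - log_z)
    return occupancy


def dmmi_arc_weights(arcs, paths, sigma1, sigma2):
    """Occupancies at the two margins, subtracted and divided by sigma2 - sigma1."""
    low = arc_occupancies(arcs, paths, sigma1)
    high = arc_occupancies(arcs, paths, sigma2)
    return {q: (high[q] - low[q]) / (sigma2 - sigma1) for q in arcs}


def dmmi_loss(arcs, paths, sigma1, sigma2):
    """dMMI in loss form: difference of log normalizers over the interval width."""
    return (log_partition(arcs, paths, sigma2)
            - log_partition(arcs, paths, sigma1)) / (sigma2 - sigma1)


weights = dmmi_arc_weights(ARCS, PATHS, -0.5, 0.5)
print(weights)
```

In this toy setup each weight equals the derivative of `dmmi_loss` with respect to that arc's log score, which can be confirmed by perturbing one arc score and recomputing the loss.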
Optimization based on dMMI
• The total gradient for all parameter components Λi, summed over all training utterances and all Qr arcs in each utterance’s recognition lattice, can then be calculated.
The error-indexed forward-backward algorithm
• An aggregate probability mass is computed for all lattice strings with the same total error count j.
• Each such error group has a corresponding margin-modified error group occupancy.
• The standard (σ = 0) error group MPE derivative and the aggregated dMMI derivative are then computed per error group.
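The error-group view can be sketched as follows. This is a toy illustration under assumptions, not the paper's algorithm. Strings are enumerated directly rather than processed with an error-indexed forward-backward pass, and the derivative formulas are derived from the toy definitions (taken with respect to each group's log mass, in loss form) rather than transcribed from the slides. The group occupancy is assumed to be mass_j · exp(σ j) normalized over all error counts.

```python
import math
from collections import defaultdict


def error_group_log_mass(scores, errors):
    """log of the summed unnormalized probability of all strings sharing an error count."""
    grouped = defaultdict(list)
    for s, e in zip(scores, errors):
        grouped[e].append(s)
    log_mass = {}
    for e, group in grouped.items():
        peak = max(group)
        log_mass[e] = peak + math.log(sum(math.exp(s - peak) for s in group))
    return log_mass


def group_occupancy(log_mass, sigma):
    """Margin-modified posterior of each error group j at margin sigma."""
    terms = {j: m + sigma * j for j, m in log_mass.items()}
    peak = max(terms.values())
    log_z = peak + math.log(sum(math.exp(t - peak) for t in terms.values()))
    return {j: math.exp(t - log_z) for j, t in terms.items()}


def mpe_group_derivative(log_mass):
    """sigma = 0 derivative of the expected error w.r.t. each group's log mass."""
    gamma = group_occupancy(log_mass, 0.0)
    mean_error = sum(g * j for j, g in gamma.items())
    return {j: g * (j - mean_error) for j, g in gamma.items()}


def dmmi_group_derivative(log_mass, sigma1, sigma2):
    """Group occupancies at the two margins, subtracted and divided by the width."""
    low = group_occupancy(log_mass, sigma1)
    high = group_occupancy(log_mass, sigma2)
    return {j: (high[j] - low[j]) / (sigma2 - sigma1) for j in log_mass}


# Hypothetical toy data: six strings falling into four error groups.
scores = [-10.0, -11.5, -11.7, -12.0, -12.4, -14.0]
errors = [0, 1, 1, 2, 2, 4]

log_mass = error_group_log_mass(scores, errors)
mpe = mpe_group_derivative(log_mass)
narrow = dmmi_group_derivative(log_mass, -0.001, 0.001)
wide = dmmi_group_derivative(log_mass, -1.0, 1.0)
for j in sorted(log_mass):
    print(j, mpe[j], narrow[j], wide[j])
```

Comparing the `narrow` and `wide` columns shows how the interval changes the relative weight given to each error level, while the narrow interval stays close to the MPE values.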
Conclusion
• A new approach to discriminative training (DT), “differenced MMI” (dMMI), was presented.
• Experiments confirmed that a close approximation to MPE can be implemented using dMMI.
• Aggregate error-group statistics show that the choice of margin interval affects the relative weighting of different error levels during training.
• The proper choice of margin interval is a topic for future research.