Robust Speech Recognition V. Barreaud, LORIA
Mismatch Between Training and Testing • mismatch between training and testing conditions degrades recognition scores • causes of mismatch • Speech Variation • Inter-Speaker Variation
Robust Approaches • three categories • noise resistant features (Speech var.) • speech enhancement (Speech var. + Inter-speaker var.) • model adaptation for noise (Speech var. + Inter-speaker var.) [Diagram: recognition system — speaker A's speech is encoded as features to train the models; speaker B's speech is encoded and decoded against those models at test time, producing a word sequence]
Contents • Overview • Noise resistant features • Speech enhancement • Model adaptation • Stochastic Matching • Our current work
Noise resistant features • Acoustic representation • Emphasis on evidence less affected by noise • Auditory-system-inspired models • Filter banks, loudness curve, lateral inhibition • Slow variation removal • Cepstrum Mean Normalization, time derivatives (sketched below) • Linear Discriminant Analysis • Searches for the best parameterization
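Cepstral Mean Normalization and time derivatives are simple to illustrate. A minimal sketch (not from the original slides), assuming MFCC frames are already available as a NumPy array of shape (T, D):

```python
import numpy as np

def cepstral_mean_normalization(cepstra):
    """Remove slowly varying channel effects by subtracting the
    per-utterance mean from each cepstral coefficient.

    cepstra: array of shape (T, D) -- T frames of D cepstral coefficients.
    """
    return cepstra - cepstra.mean(axis=0, keepdims=True)

def delta_features(cepstra, window=2):
    """Append first-order time derivatives (delta features), which also
    de-emphasize slowly varying components of the signal."""
    T, D = cepstra.shape
    padded = np.pad(cepstra, ((window, window), (0, 0)), mode="edge")
    num = sum(k * (padded[window + k:window + k + T] -
                   padded[window - k:window - k + T])
              for k in range(1, window + 1))
    denom = 2 * sum(k * k for k in range(1, window + 1))
    return np.hstack([cepstra, num / denom])
```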
Speech enhancement • Parameter mapping • stereo data • observation subspace • Bayesian estimation • stochastic modeling of speech and noise • Template-based estimation • restriction to a subspace • output is noise-free • various templates and combination methods • Spectral Subtraction (sketched below) • noise and speech uncorrelated • slowly varying noise
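Spectral subtraction follows directly from the two assumptions listed under it (noise uncorrelated with speech, slowly varying noise). A minimal sketch, not the author's implementation, assuming the noise spectrum can be estimated from leading noise-only frames:

```python
import numpy as np

def spectral_subtraction(noisy_stft, noise_frames=10, floor=0.01, oversub=1.0):
    """Basic magnitude spectral subtraction.

    noisy_stft   : complex STFT of shape (frames, bins).
    noise_frames : number of leading frames assumed to contain noise only.
    floor        : spectral floor to limit "musical noise" artifacts.
    oversub      : over-subtraction factor.
    """
    mag = np.abs(noisy_stft)
    phase = np.angle(noisy_stft)
    noise_mag = mag[:noise_frames].mean(axis=0)           # noise estimate (slowly varying)
    clean_mag = mag - oversub * noise_mag                 # subtract the noise magnitude
    clean_mag = np.maximum(clean_mag, floor * noise_mag)  # flooring
    return clean_mag * np.exp(1j * phase)                 # reuse the noisy phase
```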
Model Adaptation for Noise • HMM decomposition or PMC (Parallel Model Combination) • the Viterbi algorithm searches a combined N×M-state HMM • noise and speech recognized simultaneously • complex noises can be handled • State-dependent Wiener filtering (sketched below) • Wiener filtering in the spectral domain faces non-stationarity • HMMs divide speech into quasi-stationary segments • Wiener filters specific to each state • Discriminative training • the classical technique trains models independently • error-corrective training • minimum classification error training • Training data contamination • training set corrupted with noisy speech • depends on the test environment • lower discriminative scores
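State-dependent Wiener filtering can be sketched as follows; the per-state clean-speech power spectrum and the noise power spectrum (the names `state_speech_psd` and `noise_psd` are hypothetical) are assumed to be supplied by the HMM state and a noise estimator:

```python
import numpy as np

def state_wiener_gain(state_speech_psd, noise_psd):
    """Wiener gain H = S / (S + N) for one HMM state: the state supplies
    the clean-speech power spectrum S of its quasi-stationary segment,
    N is the noise power spectrum."""
    return state_speech_psd / (state_speech_psd + noise_psd)

def filter_segment(noisy_stft_segment, state_speech_psd, noise_psd):
    """Apply the state-specific Wiener filter to the frames aligned with
    that state (one filter per quasi-stationary segment)."""
    gain = state_wiener_gain(state_speech_psd, noise_psd)
    return noisy_stft_segment * gain  # gain broadcasts over the frames
```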
Stochastic Matching : Introduction • General framework • in feature space • in model space
Stochastic Matching : General framework • HMM models Λ_X trained in the training space X • Y = {y_1, …, y_T}, the observation sequence in the testing space • a feature-space transform X = F_ν(Y) and a model-space transform Λ_Y = G_η(Λ_X) relate the two spaces (stated below)
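The objective itself is not reproduced in the slide text; a hedged restatement of the usual stochastic-matching objective (the formulation of Sankar and Lee, which this framework appears to follow), where the word sequence and the distortion parameters are estimated jointly by maximum likelihood:

```latex
% Feature-space view: map the test observations back to the training space,
%   X = F_\nu(Y);   model-space view: map the models, \Lambda_Y = G_\eta(\Lambda_X).
% Joint ML estimation of the word sequence W and the distortion parameters:
(W', \nu') = \arg\max_{W,\nu}\; p\big(F_\nu(Y), W \mid \Lambda_X, \nu\big)
\qquad\text{or}\qquad
(W', \eta') = \arg\max_{W,\eta}\; p\big(Y, W \mid G_\eta(\Lambda_X)\big)
```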
Stochastic Matching : In Feature Space • Estimation step : auxiliary function (sketched below) • Maximization step
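The E and M steps are not written out on the slide; a hedged sketch of the usual EM auxiliary function for a feature-space transform F_ν (the Jacobian term is dropped, since it vanishes for a simple bias), with γ_t(n, m) the posterior of state n and mixture component m at time t:

```latex
% E-step: auxiliary function over state sequences S and mixture-component sequences C
Q(\nu' \mid \nu) \;=\; \sum_{S}\sum_{C} p(S, C \mid Y, \Lambda_X, \nu)\,
                       \log p\big(F_{\nu'}(Y), S, C \mid \Lambda_X\big)

% For diagonal-covariance Gaussian mixtures this reduces, up to constants, to
Q(\nu' \mid \nu) \;\simeq\; -\tfrac{1}{2}\sum_{t}\sum_{n}\sum_{m}
    \gamma_t(n,m)\sum_{i}
    \frac{\big(f_{\nu'}(y_t)_i - \mu_{n,m,i}\big)^2}{\sigma_{n,m,i}^2}

% M-step: \nu' = \arg\max_{\nu'} Q(\nu' \mid \nu)
```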
Stochastic Matching : In Feature Space (2) • Simple distortion function • Computation of the simple bias (sketched below)
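Assuming the common choice of a single additive cepstral bias, f_b(y_t) = y_t − b, the M-step has a closed form: the bias is the variance-weighted average of the residuals. A hedged sketch:

```latex
% Simple distortion function: one additive bias in the cepstral domain
f_b(y_t) \;=\; y_t - b

% Closed-form M-step for each component i of the bias
b_i \;=\; \frac{\displaystyle\sum_{t,n,m} \gamma_t(n,m)\,
                \frac{y_{t,i}-\mu_{n,m,i}}{\sigma_{n,m,i}^2}}
               {\displaystyle\sum_{t,n,m} \frac{\gamma_t(n,m)}{\sigma_{n,m,i}^2}}
```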
Stochastic Matching : In Model Space • random additive bias sequence B = {b_1, …, b_T}, independent of speech, modeled as a stochastic process with mean μ_b and diagonal covariance Σ_b (sketched below)
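The corresponding compensation of the models is not written out on the slide; a hedged sketch, assuming the standard additive combination of the bias statistics with each Gaussian of Λ_X:

```latex
% Model-space compensation: each Gaussian (n, m) of \Lambda_X is shifted by the bias statistics
\mu_{n,m}' \;=\; \mu_{n,m} + \mu_b, \qquad
\Sigma_{n,m}' \;=\; \Sigma_{n,m} + \Sigma_b
```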
On-Line Frame-Synchronous Noise Compensation • Builds on the stochastic matching method • Transformation parameter estimated along with the optimal path • Uses forward probabilities [Diagram: each observation y_t is transformed into z_t with the current bias b_t and recognized; the recognition result feeds the computation of the next bias b_{t+1}]
Theoretical framework and issue • On-line frame-synchronous estimation: risk of a cascade of errors • Classical Stochastic Matching (a sketch of the loop follows): 1. Initialize the bias for the first frame, b_0 = 0 2. Compute the forward probabilities and then the bias b 3. Transform the next frame with b 4. Go to the next frame
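A minimal sketch of the frame-synchronous loop above, assuming diagonal-covariance Gaussian state models; the helper names (`gaussian_pdf`, the `.mean`/`.var` fields of the states) are hypothetical, and this illustrates the idea rather than reproducing the author's implementation:

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Diagonal-covariance Gaussian density."""
    return np.exp(-0.5 * np.sum((x - mean) ** 2 / var)) / np.sqrt(np.prod(2 * np.pi * var))

def online_bias_compensation(frames, states, transitions, initial):
    """Frame-synchronous bias compensation (sketch).

    frames      : array (T, D) of cepstral observation vectors y_t.
    states      : list of per-state Gaussians with .mean (D,) and .var (D,)  (hypothetical API).
    transitions : (N, N) transition matrix; initial: (N,) initial state probabilities.

    Each frame is compensated with the current bias estimate, then used to
    update the forward probabilities, which in turn refresh the bias for
    the next frame.
    """
    D = frames.shape[1]
    bias = np.zeros(D)                        # b_0 = 0
    num, den = np.zeros(D), np.zeros(D)       # running bias statistics
    compensated, alpha, first = [], None, True

    for y in frames:
        z = y - bias                          # transform the current frame with the current bias
        compensated.append(z)

        # forward update with the compensated frame
        likes = np.array([gaussian_pdf(z, s.mean, s.var) for s in states])
        alpha = (initial if first else alpha @ transitions) * likes
        first = False
        alpha /= alpha.sum()                  # normalized forward probabilities

        # accumulate variance-weighted residuals, weighted by the forward probabilities
        for w, s in zip(alpha, states):
            num += w * (y - s.mean) / s.var
            den += w / s.var
        bias = num / den                      # bias used for the next frame

    return np.array(compensated)
```

The point of the design is that the bias applied to frame t is computed only from frames 1..t-1, which is what makes the method frame-synchronous but also exposes it to the cascade-of-errors issue mentioned above.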
Viterbi Hypothesis vs Linear Combination • The Viterbi hypothesis takes into account only the « most probable » state and Gaussian component • The linear combination combines the contributions of all states and components (contrast written out below) [Diagram: trellis of states between frames t and t+1]
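One hedged way to write the contrast, using γ_t(n, m) for the per-frame state/component posteriors obtained from the forward probabilities (variance weighting omitted for brevity):

```latex
% Viterbi hypothesis: only the most probable state/component contributes
b_t \;\propto\; y_t - \mu_{\hat n, \hat m},
\qquad (\hat n, \hat m) = \arg\max_{n,m}\, \gamma_t(n,m)

% Linear combination: all states and components contribute, weighted by their posteriors
b_t \;\propto\; \sum_{n,m} \gamma_t(n,m)\,\big(y_t - \mu_{n,m}\big)
```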
Experiments • Phone numbers recorded in a running car • Forced Align • transcription + optimum path • Free Align • optimum path only • Wild Align • no data
Perspectives • Error recovery problem • a forgetting process • a model of the distortion function • environmental clues • A more elaborate transform