
Higher Order Cepstral Moment Normalization (HOCMN) for Robust Speech Recognition

Higher Order Cepstral Moment Normalization (HOCMN) for Robust Speech Recognition. Speaker: Chang-wen Hsu Advisor: Lin-shan Lee 2007/02/08. Outline. Introduction CMS/CMVN/HEQ Higher Order Cepstral Moment Normalization (HOCMN) Even order HOCMN Odd order HOCMN Cascade system


Presentation Transcript


  1. Higher Order Cepstral Moment Normalization (HOCMN) for Robust Speech Recognition Speaker: Chang-wen Hsu Advisor: Lin-shan Lee 2007/02/08

  2. Outline • Introduction • CMS/CMVN/HEQ • Higher Order Cepstral Moment Normalization (HOCMN) • Even order HOCMN • Odd order HOCMN • Cascade system • Fundamental principles • Experimental Results • Conclusions

  3. Introduction • Feature normalization in cepstral domain is widely used in robust speech recognition: • CMS: normalizing the first moment • CMVN: normalizing the first and second moments • Cepstrum Third-order Normalization (CTN): normalizing the first three moments (Electronics Letters, 1999) • HEQ: normalizing the full distribution (all order moments) • How about normalizing a few higher order moments only? • Higher order moments are more dominated by higher value samples • Normalizing only a few higher order moments may be good enough, while avoiding over-normalization

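  4. Introduction • Cepstral Normalization • CMS: $X_{\mathrm{CMS}}(n) = X(n) - \mu_X$, with $\mu_X$ the time average of $X(n)$ • CMVN: $X_{\mathrm{CMVN}}(n) = (X(n) - \mu_X)/\sigma_X$, with $\sigma_X$ the standard deviation of $X(n)$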
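As a rough illustration of the two normalizations named on this slide, the sketch below applies CMS and CMVN to one cepstral coefficient track over a full utterance. The function names `cms` and `cmvn` and the sample values are illustrative assumptions, not taken from the presentation.

```python
import math

def cms(track):
    """Cepstral mean subtraction: remove the time average (first moment)."""
    mean = sum(track) / len(track)
    return [x - mean for x in track]

def cmvn(track):
    """Cepstral mean and variance normalization: zero mean, unit variance."""
    centered = cms(track)
    var = sum(x * x for x in centered) / len(centered)
    std = math.sqrt(var)
    if std == 0.0:
        return centered  # constant track: nothing to scale
    return [x / std for x in centered]

# Hypothetical values of one cepstral coefficient over six frames.
c1 = [1.0, 2.0, 4.0, 3.0, 0.0, 2.0]
c1_cms = cms(c1)
c1_cmvn = cmvn(c1)
```

In practice each of the cepstral coefficients would be normalized independently in the same way.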

  5. Introduction • Histogram Equalization
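The slide gives no formula for histogram equalization. One common way to realize it, shown here as a minimal sketch, is to pass each sample through the empirical CDF of the utterance and then through the inverse CDF of a reference distribution. The standard Gaussian reference, the rank-based CDF estimate, and the name `heq_to_gaussian` are assumptions made for illustration.

```python
from statistics import NormalDist

def heq_to_gaussian(track):
    """Map each sample through its empirical CDF, then through the inverse
    CDF of a standard Gaussian reference distribution."""
    n = len(track)
    order = sorted(range(n), key=lambda i: track[i])
    ref = NormalDist(mu=0.0, sigma=1.0)
    out = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the CDF estimate strictly inside (0, 1).
        out[i] = ref.inv_cdf((rank + 0.5) / n)
    return out

# Hypothetical values of one cepstral coefficient over five frames.
c0 = [5.1, 3.2, 9.7, 4.4, 6.0]
c0_heq = heq_to_gaussian(c0)
```

Because only the rank of each sample matters, the output follows the reference distribution regardless of the input shape, which is the sense in which HEQ normalizes moments of all orders at once.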

  6. Higher Order Cepstral Moment Normalization • If the distribution of the cepstral coefficients can be assumed to be quasi-Gaussian: • Odd order moments can be normalized to zero • Even order moments can be normalized to some specific values • Define notation: • X(n): a certain cepstral coefficient of the n-th frame • X[k](n): with the k-th moment normalized • X[k,l](n): with both the k-th and l-th moments normalized • X[k,l,m](n): with the k-th, l-th and m-th moments normalized • HOCMN[k,l,m]: an operator normalizing the k-th, l-th and m-th moments • For example

  7. Cepstral Moment Normalization • Moment estimation: • Time average of MFCC parameters • Purpose: • For odd order L • For even order N
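The formulas for this slide did not survive in the transcript. A plausible reading, offered only as an assumption, is that expectations are estimated by time averages over the T frames of the reference interval, and that the goal is to drive an odd order moment to zero and an even order moment to a target value. The symbols T and m_N are introduced here for illustration.

```latex
E\left[X^{k}\right] \approx \frac{1}{T}\sum_{n=1}^{T} X(n)^{k},
\qquad
E\left[X_{[L]}^{L}\right] = 0 \quad (\text{odd } L),
\qquad
E\left[X_{[N]}^{N}\right] = m_{N} \quad (\text{even } N).
```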

  8. Even order HOCMN • Only the moment for a single even order N can be normalized, and CMS can always be performed in advance • Therefore, the new feature coefficients can be expressed as • Let the N-th moment of the new feature coefficients be set to a desired value, that is
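The expression for the new coefficients is missing from the transcript. The sketch below assumes the simplest form consistent with the text: mean subtraction followed by a single scale factor `a`, chosen so that the N-th moment hits a target. The target is assumed here to be the N-th moment of N(0,1), which equals (N-1)!!; for N=2 this reduces to unit variance, matching the identity CMVN=HOCMN[1,2] stated on the next slide. The names `hocmn_even` and `double_factorial_odd` are hypothetical.

```python
def double_factorial_odd(k):
    """Return k!! for odd k >= 1 (for example 5!! = 5 * 3 * 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def hocmn_even(track, order):
    """Sketch of HOCMN[1, N]: subtract the mean, then rescale so that the
    N-th moment equals the target (assumed: the N(0,1) moment (N-1)!!)."""
    if order < 2 or order % 2 != 0:
        raise ValueError("order must be a positive even integer")
    mean = sum(track) / len(track)
    centered = [x - mean for x in track]
    moment = sum(x ** order for x in centered) / len(centered)
    if moment == 0.0:
        return centered  # constant track: nothing to scale
    target = double_factorial_odd(order - 1)
    a = (target / moment) ** (1.0 / order)  # assumed scale factor
    return [a * x for x in centered]

# Hypothetical values of one cepstral coefficient over six frames.
c1 = [1.0, 2.0, 4.0, 3.0, 0.0, 2.0]
c1_n4 = hocmn_even(c1, 4)
fourth_moment = sum(x ** 4 for x in c1_n4) / len(c1_n4)
```

A pure rescaling leaves the first moment at zero, which is why CMS can be applied beforehand without being undone.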

  9. Even order HOCMN • Aurora 2, clean condition training, word accuracy averaged over 0~20dB and all types of noise (sets A,B,C) CMVN=HOCMN[1,2]

  10. Even order HOCMN • Evaluation of the expectation value for the moments • Sample average over a reference interval • Full utterance • Moving window of l frames (…, X(n-3), X(n-2), X(n-1), X(n), X(n+1), X(n+2), X(n+3), …) around the frame to be normalized • [Figure, labelled [1,100]: Acc. (80.40 to 82.40) versus l (60 to 120); l=86 is best]
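The moving-window estimate can be sketched as follows, using CMVN as the simplest case. The window placement (roughly centered on the frame being normalized) and the truncation at utterance edges are assumptions; the slide only states that a window of l frames is used. The name `windowed_cmvn` is hypothetical.

```python
def windowed_cmvn(track, window):
    """CMVN where the mean and variance for frame n are estimated from a
    window of `window` frames around n, truncated at the utterance edges."""
    out = []
    for n, x in enumerate(track):
        start = n - window // 2
        lo = max(0, start)
        hi = min(len(track), start + window)
        seg = track[lo:hi]
        mean = sum(seg) / len(seg)
        var = sum((v - mean) ** 2 for v in seg) / len(seg)
        # Fall back to mean subtraction only when the window is constant.
        out.append((x - mean) / var ** 0.5 if var > 0.0 else x - mean)
    return out

# Hypothetical coefficient track; a real utterance would use e.g. window=86.
track = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
local = windowed_cmvn(track, 3)
full = windowed_cmvn(track, 2 * len(track))  # window covers every frame
```

With a window at least twice the utterance length, every frame sees the whole utterance and the result coincides with full-utterance CMVN.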

  11. Experimental results • Aurora 2, clean condition training, word accuracy averaged over 0~20dB and all types of noise (sets A,B,C) CMVN (l=86) CMVN (full-utterance)

  12. Odd order HOCMN (1/3) • Besides the first moment (CMS), only another single moment of odd order L can be normalized in addition • The L-th HOCMN can be obtained from the (L-1)-th HOCMN (which is for an even number as discussed previously) • Then, the new feature coefficients can be expressed as “a” and “c” are to be solved

  13. Odd order HOCMN (2/3) • To solve “a” and “c” • The first moment is set to zero • The L-th moment is set to zero • After some mathematics and approximation

  14. Odd order HOCMN (3/3) • Because the formula for “a” above is only an approximation, a recursive solution can be obtained in about two iterations
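The transcript does not preserve the expression involving “a” and “c”, so the following is only a sketch under a stated assumption: the new coefficient is taken to be Y = X + a·X² + c with X already mean-subtracted. Setting the first moment to zero then forces c = -a·E[X²], and “a” is refined iteratively so that the L-th moment vanishes. Newton steps are used here as a stand-in for the recursive solution mentioned on the slide; the first step from a = 0 is the first-order approximation. The names `hocmn_odd` and `mean_power`, and the synthetic test signal, are hypothetical.

```python
from statistics import NormalDist

def mean_power(values, order):
    """Time-average estimate of the moment of the given integer order."""
    return sum(v ** order for v in values) / len(values)

def hocmn_odd(track, order, max_iter=20, tol=1e-12):
    """Sketch of HOCMN[1, L] under an ASSUMED form Y = X + a * (X^2 - E[X^2]),
    where X is the mean-subtracted track. The offset c = -a * E[X^2] keeps the
    first moment at zero; a is refined by Newton steps on E[Y^L] = 0."""
    if order < 3 or order % 2 == 0:
        raise ValueError("order must be an odd integer >= 3")
    mean = sum(track) / len(track)
    x = [v - mean for v in track]
    m2 = mean_power(x, 2)
    q = [v * v - m2 for v in x]  # zero-mean quadratic term
    a = 0.0
    for _ in range(max_iter):
        y = [xi + a * qi for xi, qi in zip(x, q)]
        f = mean_power(y, order)
        if abs(f) < tol:
            break
        # d/da E[Y^L] = L * E[Y^(L-1) * (X^2 - E[X^2])]
        slope = order * sum(yi ** (order - 1) * qi for yi, qi in zip(y, q)) / len(y)
        if slope == 0.0:
            break
        a -= f / slope
    return [xi + a * qi for xi, qi in zip(x, q)], a

# Synthetic right-skewed track: Gaussian quantiles plus a small quadratic term.
ref = NormalDist()
base = [ref.inv_cdf((i + 0.5) / 200) for i in range(200)]
skewed = [s + 0.1 * (s * s - 1.0) for s in base]
skewed_mean = sum(skewed) / len(skewed)
third_before = mean_power([v - skewed_mean for v in skewed], 3)
normalized, a = hocmn_odd(skewed, 3)
third_after = mean_power(normalized, 3)
```

For mildly skewed data such as this example, the first step already removes most of the L-th moment and a second step brings it close to zero, which is in line with the "about two iterations" remark. Strongly non-Gaussian data may have no small solution for “a”, a case this sketch does not handle.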

  15. Cascade system • Cascading an odd order operator HOCMN[1,L] (L is an odd number) with an even order operator HOCMN[1,N] (N is an even number) yields an operator HOCMN[1,L,N]
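A short worked argument for why the cascade keeps all three moments normalized, assuming (as the transcript suggests but does not show) that the even order operator acts on zero-mean input as a pure rescaling by a factor b. The symbols Y, Z, b and m_N are introduced here for illustration.

```latex
Y = \mathrm{HOCMN}_{[1,L]}(X): \quad E[Y] = 0, \quad E[Y^{L}] = 0;
\qquad
Z = bY, \quad b = \left(\frac{m_{N}}{E[Y^{N}]}\right)^{1/N}
\;\Longrightarrow\;
E[Z] = 0, \quad
E[Z^{L}] = b^{L}\,E[Y^{L}] = 0, \quad
E[Z^{N}] = b^{N}\,E[Y^{N}] = m_{N}.
```

The order matters under this assumption: rescaling preserves moments that are already zero, whereas applying the odd order operator last would generally disturb the even order moment.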

  16. Experimental results • Aurora 2, clean condition training, word accuracy averaged over 0~20dB and all types of noise (sets A,B,C) • [Table labels: CTN=HOCMN[1,2,3]; CN (l=86); CMVN (l=86); CN; CMVN]

  17. Skewness and Kurtosis • Skewness • Third moment about the mean, normalized by the cube of the standard deviation • Measures the departure of the pdf from symmetry • Positive/negative values indicate skew to the right/left • Zero indicates a symmetric pdf • Kurtosis • Fourth moment about the mean, normalized by the fourth power of the standard deviation • Peaked or “flat with tails of large size” as compared to standard Gaussian • “3” is the fourth moment of N(0,1) • Positive/negative values, relative to 3, indicate heavier/lighter tails than the Gaussian
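The two statistics described above can be computed directly from time averages, as in this small sketch. The function names and sample values are illustrative; the excess kurtosis subtracts the Gaussian reference value of 3.

```python
def central_moment(values, order):
    """Sample moment of the given order about the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Third central moment divided by the cube of the standard deviation."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Fourth central moment over the squared variance, minus 3 (the value
    for a Gaussian)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_skewed = [0.0, 0.0, 0.0, 0.0, 10.0]
skew_sym = skewness(symmetric)
skew_right = skewness(right_skewed)
kurt_sym = excess_kurtosis(symmetric)
```

The evenly spaced symmetric sample has zero skewness and a negative excess kurtosis, reflecting tails lighter than the Gaussian.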

  18. Skewness and Kurtosis • 1st-moment always normalized • Define: Generalized skewness of odd order L • L is not necessarily 3 • Similar meaning as skewness (skew to right or left) except in the sense of the L-th moment • Define: Generalized kurtosis of even order N • N is not necessarily 4 • Similar meaning as kurtosis (peaked or flat) except in the sense of the N-th moment

  19. Skewness and Kurtosis • Normalizing an odd order moment constrains the pdf to be symmetric about the origin • Except in the sense of the L-th moment • Normalizing an even order moment constrains the pdf to be “equally flat with tails of equal size” • Except in the sense of the N-th moment

  20. Generalized Moments • The orders of the normalized moments are not necessarily integers • Generalized moment • Type 1: • Reduces to an odd order moment when u is an odd integer L (e.g. L=1 or 3) • Type 2: • Reduces to an even order moment when u is an even integer N (e.g. N=2 or 4) • HOCMN with non-integer moment orders
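The Type 1 and Type 2 formulas are missing from the transcript. One pair of definitions that satisfies the two reduction properties stated on the slide is E[sign(X)·|X|^u] for Type 1 and E[|X|^u] for Type 2; the sketch below uses these as an assumption, and the function names are hypothetical.

```python
import math

def generalized_moment_type1(values, u):
    """Assumed form E[sign(X) * |X|^u]; equals E[X^u] for odd integer u."""
    return sum(math.copysign(abs(v) ** u, v) for v in values) / len(values)

def generalized_moment_type2(values, u):
    """Assumed form E[|X|^u]; equals E[X^u] for even integer u."""
    return sum(abs(v) ** u for v in values) / len(values)

# Hypothetical mean-subtracted coefficient values.
x = [-1.5, 0.5, 2.0, -1.0]
odd_like = generalized_moment_type1(x, 3)      # same as the third moment
even_like = generalized_moment_type2(x, 2)     # same as the second moment
fractional = generalized_moment_type1(x, 2.5)  # non-integer order
```

Taking the absolute value before raising to the power u keeps the result real for non-integer u, which a plain X^u would not do for negative samples.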

  21. Experimental Setup • Aurora2 database • Training: Clean condition training • Testing: Set A, B and C • Development: All from clean training data • 39-dimension feature coefficients • C0~C12 MFCC, Δ, Δ2 • Normalization performed on C0~C12

  22. Experimental Results • Normalizing higher order moments can yield more robust features • Normalizing only three orders of moments is better than normalizing the full distribution

  23. Experimental Results

  24. Experimental Results

  25. PDF Analysis • HEQ • Overfits to the Gaussian • Loses the original statistics • HOCMN • Fits the generalized skewness and kurtosis • Retains more of the nature of speech • [Figure panels: Original C0 & C1, HEQ, HOCMN]

  26. Distance Analysis • Distance definition: • HOCMN yields a smaller distance between clean and noisy speech • Distance reduction follows a trend similar to that of error rate reduction

  27. Experimental Results • Slight improvement for HOCMN with non-integer order moments • Especially for lower SNR values • Other robust techniques can be combined with it

  28. Experimental Results

  29. Experimental Results • For multi-condition training: • HOCMN performs better than CMVN for all SNR values • Better than HEQ for higher SNR values

  30. Conclusions • We proposed a unified framework for higher order cepstral moment normalization • Normalizing higher order moments gives more robust features • The parameter set can be selected appropriately using a development set • Skewness/kurtosis/distance analysis further illustrates the concepts behind the normalization techniques
