Analysis of the Effects of Train noise on Recognition Rate using Formants and MFCC

Esfandiar Zavarehei, Department of Electronic and Computer Engineering, Brunel University, 28 January 2004.


Presentation Transcript


  1. Analysis of the Effects of Train Noise on Recognition Rate Using Formants and MFCC. Esfandiar Zavarehei, Department of Electronic and Computer Engineering, Brunel University, 28 January 2004

  2. Contents • The effect of noise on LP-model poles • Formant extraction using the LP model of speech • Recognition: formants vs. MFCC features • The effect of maximum normalization and mean subtraction • Static vs. dynamic features

  3. Histogram of Pole Frequencies for Different Phonemes (male speaker, train noise, SNR = 0)

  4. Formant Extraction Using LP-Model Poles. Flowchart: Signal pre-processing → Windowing → LP modelling and LP-pole extraction → Do the poles meet the conditions? If no, increase the LP order and repeat the LP modelling; if yes, save the formants, move to the next segment, and repeat the procedure until the end of the signal is reached. • Maximum bandwidth (BW) of formants • Limited frequency range • Fixed number of formants • Candidate sets • Distance measure • Procedure in consonants
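The loop in the flowchart can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the Hamming window, the autocorrelation LPC method, and every threshold (`max_bw`, `f_min`, `f_max`, `start_order`, `max_order`) are hypothetical choices, since the slide names the conditions but gives no values. The candidate-set selection, distance measure, and consonant procedure from the bullet list are not reproduced here; the sketch simply keeps the lowest-frequency poles that pass the bandwidth and frequency-range checks.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC; returns [1, a_1, ..., a_p] for A(z) = 1 + sum a_k z^-k."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]  # lags 0..order
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * r[0] * np.eye(order)  # small ridge for numerical stability
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))

def pole_candidates(a, fs, max_bw, f_min, f_max):
    """Convert LP poles to (frequency, bandwidth) pairs and apply the slide's conditions."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]                # one pole per conjugate pair
    freqs = np.angle(roots) * fs / (2 * np.pi)       # pole angle -> frequency in Hz
    bws = -np.log(np.abs(roots)) * fs / np.pi        # pole radius -> 3 dB bandwidth in Hz
    keep = (bws < max_bw) & (freqs > f_min) & (freqs < f_max)
    idx = np.argsort(freqs[keep])
    return freqs[keep][idx], bws[keep][idx]

def extract_formants(frame, fs, n_formants=3, start_order=10, max_order=20,
                     max_bw=400.0, f_min=90.0, f_max=None):
    """Raise the LP order until enough poles meet the conditions (all thresholds assumed)."""
    if f_max is None:
        f_max = fs / 2 - f_min
    windowed = frame * np.hamming(len(frame))        # windowing step
    for order in range(start_order, max_order + 1, 2):
        a = lpc_coefficients(windowed, order)
        freqs, bws = pole_candidates(a, fs, max_bw, f_min, f_max)
        if len(freqs) >= n_formants:                 # "Do poles meet conditions?" -> yes
            return freqs[:n_formants], bws[:n_formants], order
    return None                                      # no order up to max_order qualified
```

Calling `extract_formants` on each successive segment and storing the result corresponds to the "save the formants, move to the next segment" step.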

  5. Using LP Formants as Features for Recognition • In addition to the frequencies of the poles, their bandwidths and magnitudes are also used as features • The HMMs are trained on monophones.
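For reference, the standard mapping from an LP pole to a frequency and a bandwidth is given below. The symbols are notation introduced here, not taken from the slides: $z_k = r_k e^{j\theta_k}$ is a pole of the LP model and $f_s$ is the sampling frequency. The slide does not say how the "magnitude" feature is defined (for example, the pole radius $r_k$ or the spectral level at $F_k$), so that is left open.

$$
F_k = \frac{f_s}{2\pi}\,\theta_k, \qquad B_k = -\frac{f_s}{\pi}\,\ln r_k
$$

A pole close to the unit circle ($r_k \to 1$) therefore gives a narrow bandwidth, which is why a maximum-bandwidth condition filters out poles that do not correspond to sharp resonances.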

  6. Recognition Results: Formants vs. MFCC • MFCC features contain C0, delta, and delta-delta features • Appended features are vectors of MFCC appended to formants (length = 75)

  7. Maximum Normalizing and Mean Subtracting the Features • In maximum normalization, each row is divided by the maximum absolute value of that row. • In mean subtraction, the mean of each row is subtracted so that the row mean becomes zero. • When the two are combined, the features are first mean subtracted, then maximum normalized.
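The two operations and their combination can be written in a few lines of NumPy. This is a sketch that assumes the feature matrix has shape `(n_coefficients, n_frames)`, so that each row holds one coefficient's values across all frames; the slide says "row" without defining the layout, and the function names and the `eps` guard against all-zero rows are additions made here.

```python
import numpy as np

def mean_subtract(features):
    """Subtract each row's mean so that every row has zero mean."""
    return features - features.mean(axis=1, keepdims=True)

def max_normalize(features, eps=1e-12):
    """Divide each row by its maximum absolute value (eps avoids division by zero)."""
    peak = np.abs(features).max(axis=1, keepdims=True)
    return features / np.maximum(peak, eps)

def mean_subtract_then_max_normalize(features):
    """Combined scheme from the slide: mean subtraction first, then max normalization."""
    return max_normalize(mean_subtract(features))
```

The order matters: dividing by a positive constant keeps a zero-mean row zero-mean, so the combined output has both zero row means and a peak absolute value of 1 per row.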

  8. Recognition Results: MFCC vs. Mean-Subtracted, Max-Normalized MFCC, with C0 • C0 is badly affected by noise.

  9. Recognition Results: MFCC vs. Mean-Subtracted, Max-Normalized MFCC, without C0 • The effect of noise on C0 can be compensated to some extent by normalizing the features

  10. Recognition Results: Formants vs. Mean-Subtracted, Max-Normalized Formants • Normalization increases the recognition rate by 10% in noisy conditions

  11. MFCC: Dynamic vs. ‘Static’ Features • Dynamic values are the delta and acceleration values • ‘Static’ values are the actual values

  12. Formants: Dynamic vs. ‘Static’ Features • Dynamic values are the delta and acceleration values • ‘Static’ values are the actual values
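To make the static/dynamic distinction concrete, the sketch below computes delta and acceleration (delta-delta) values from a static feature matrix. It uses the common regression formula over a window of ±`width` frames with edge replication; the slides do not say which delta formula or window width was used, so both are assumptions, as are the function names and the `(n_coefficients, n_frames)` layout.

```python
import numpy as np

def delta(features, width=2):
    """Regression-based delta over +/- width frames; edge frames are replicated."""
    n_frames = features.shape[1]
    padded = np.pad(features, ((0, 0), (width, width)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[:, width + n : width + n + n_frames]    # c[t + n]
        behind = padded[:, width - n : width - n + n_frames]   # c[t - n]
        out += n * (ahead - behind)
    return out / denom

def add_dynamic_features(static, width=2):
    """Stack static, delta, and acceleration rows into one feature matrix."""
    d = delta(static, width)
    dd = delta(d, width)          # acceleration = delta of the delta
    return np.vstack([static, d, dd])
```

The same routine applies to either feature type: pass MFCC rows or formant rows as `static`, and the delta and acceleration rows are appended below them.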