Mel-spectrum to Mel-cepstrum Computation
A Speech Recognition presentation, October 1, 2003
Ji Gu, J.Gu@umail.LeidenUniv.nl
Mel-spectrum to Mel-cepstrum Computation
What we know so far:
• The FFT processing step converts each frame of N samples from the time domain into the frequency domain.
• The Mel-spectrum computation then maps that spectrum onto the Mel filter bank, producing one real-valued Mel-spectrum coefficient per filter.
Mel-spectrum to Mel-cepstrum Computation
To compute the Mel-cepstrum:
• Because the Mel-spectrum coefficients and their logarithms are real numbers, the log Mel-spectrum is converted back to the time domain with the Discrete Cosine Transform (DCT).
• The resulting values are called the Mel Frequency Cepstrum Coefficients (MFCC).
Mel-spectrum to Mel-cepstrum Computation
Therefore: a DCT is applied to the natural logarithm of the Mel-spectrum to obtain the Mel-cepstrum c[n], for n = 0, 1, ..., C-1, where C is the number of cepstral coefficients.
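A form of this DCT that is consistent with the cosine table built in the SPHINX III code in this presentation can be written as follows. The symbols are assumptions made for illustration: \tilde{S}[i] stands for the i-th Mel-spectrum coefficient and L for the number of Mel filters.

```latex
c[n] = \sum_{i=0}^{L-1} \ln\!\bigl(\tilde{S}[i]\bigr)\,
       \cos\!\left(\frac{\pi n}{2L}\,(2i+1)\right),
\qquad n = 0, 1, \dots, C-1
```

The code in fe_mel_cep() additionally weights the i = 0 term by 0.5 and divides the sum by L, so its output differs from this expression by that weighting and scaling.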
Mel-spectrum to Mel-cepstrum Computation
In the SPHINX III Signal Processing Front End Specification:
• First, the cosine section of c[n] is computed:
    int32 fe_compute_melcosine(melfb_t *MEL_FB)
    {
        float period, freq;
        int32 i, j;

        period = (float)2 * MEL_FB->num_filters;

        if ((MEL_FB->mel_cosine = (float **) fe_create_2d(MEL_FB->num_cepstra,
                                                          MEL_FB->num_filters,
                                                          sizeof(float))) == NULL) {
            fprintf(stderr, "memory alloc failed in fe_compute_melcosine()\n...exiting\n");
            exit(0);
        }
Mel-spectrum to Mel-cepstrum Computation
        for (i = 0; i < MEL_FB->num_cepstra; i++) {
            freq = 2 * (float)M_PI * (float)i / period;
            for (j = 0; j < MEL_FB->num_filters; j++)
                MEL_FB->mel_cosine[i][j] = (float)cos((double)(freq * (j + 0.5)));
        }

        return(0);
    }
• Second, a cosine transform of the logarithm of the Mel-spectrum is computed:
Mel-spectrum to Mel-cepstrum Computation
    void fe_mel_cep(fe_t *FE, double *mfspec, double *mfcep)
    {
        int32 i, j;
        /* static int first_run=1; */ /* unreferenced variable */
        int32 period;
        float beta;

        period = FE->MEL_FB->num_filters;

        for (i = 0; i < FE->MEL_FB->num_filters; ++i) {
            if (mfspec[i] > 0)
                mfspec[i] = log(mfspec[i]);
            else
                mfspec[i] = -1.0e+5;
        }
Mel-spectrum to Mel-cepstrum Computation
        for (i = 0; i < FE->NUM_CEPSTRA; ++i) {
            mfcep[i] = 0;
            for (j = 0; j < FE->MEL_FB->num_filters; j++) {
                if (j == 0)
                    beta = 0.5;
                else
                    beta = 1.0;
                mfcep[i] += beta * mfspec[j] * FE->MEL_FB->mel_cosine[i][j];
            }
            mfcep[i] /= (float)period;
        }
        return;
    }
Mel-spectrum to Mel-cepstrum Computation
By applying the procedure described above:
• For each speech frame, a set of Mel-frequency cepstrum coefficients (MFCC) is computed.
• This set of coefficients is called an acoustic vector. It represents the phonetically important characteristics of speech and serves as the input for further analysis and processing in speech recognition.