HIWIRE MEETING Torino, March 9-10, 2006

HIWIRE MEETING Torino, March 9-10, 2006. José C. Segura, Javier Ramírez. Schedule. HIWIRE database evaluations New results: HEQ and PEQ Non-linear feature normalization Using temporal redundancy HEQ integration in Loquendo platform Recursive estimation of the equalization function


Presentation Transcript


  1. HIWIRE MEETING Torino, March 9-10, 2006 José C. Segura, Javier Ramírez

  2. Schedule • HIWIRE database evaluations • New results: HEQ and PEQ • Non-linear feature normalization • Using temporal redundancy • HEQ integration in Loquendo platform • Recursive estimation of the equalization function • New improvements in robust VAD • Bispectrum-based VAD • SVM-enabled VAD

  3. HIWIRE database evaluations

  4. Schedule • HIWIRE database evaluations • New results: HEQ and PEQ • Non-linear feature normalization • Using temporal redundancy • HEQ integration in Loquendo platform • Recursive estimation of the equalization function • New improvements in robust VAD • Bispectrum-based VAD • SVM-enabled VAD

  5. Temporal redundancy in HEQ • Enhance the normalization by adding a linear transformation that restores temporal correlations • Each equalized cepstral coefficient is time-filtered with an ARMA filter that restores the autocorrelation of clean data • [Result panels: AURORA4; AURORA2 (clean test)]

  6. HEQ integration in Loquendo platform • Currently implemented: segmental HEQ (diagram label: HIGH MISMATCH) • New proposal: sentence-by-sentence recursive HEQ

  7. HEQ integration (recursive estimation) (1) • Current approach: Gaussian HEQ using the ECDF • Using quantiles

  8. HEQ integration (recursive estimation) (2) • Equalization by linear interpolation • Mapping corresponding quantiles: reference quantiles averaged over training data, test quantiles from the current utterance
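The quantile mapping on this slide can be illustrated with a minimal sketch. It assumes a single cepstral coefficient per call, equally spaced quantile probabilities, and NumPy; the function names `estimate_quantiles` and `equalize` and the number of quantiles are hypothetical choices, not taken from the slides.

```python
import numpy as np

def estimate_quantiles(x, n_quantiles=5):
    """Quantiles of one feature trajectory at equally spaced probabilities.

    The probabilities run from 0 to 1, so the first and last quantiles are
    the minimum and maximum of x (an assumption of this sketch).
    """
    probs = np.linspace(0.0, 1.0, n_quantiles)
    return np.quantile(np.asarray(x, dtype=float), probs)

def equalize(x, test_q, ref_q):
    """Piecewise-linear map that sends each test quantile to the
    corresponding reference quantile; values in between are interpolated."""
    return np.interp(np.asarray(x, dtype=float), test_q, ref_q)

# Toy data: the "utterance" is a scaled and shifted copy of the reference.
rng = np.random.default_rng(0)
reference = rng.standard_normal(1000)   # stands in for training data
utterance = 2.0 * reference + 3.0       # stands in for a mismatched utterance

ref_q = estimate_quantiles(reference)
test_q = estimate_quantiles(utterance)
equalized = equalize(utterance, test_q, ref_q)
```

In this toy case the distortion is affine, so the interpolated mapping undoes it almost exactly; for a non-linear distortion the mapping is only exact at the quantile points themselves.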

  9. HEQ integration (recursive estimation) (3)

  10. HEQ integration (recursive estimation) (4) • Utterances are equalized WITHOUT delay • Quantiles are updated AFTER the equalization
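The two rules on this slide (equalize without delay, update the quantiles afterwards) can be sketched as follows. The update formulas of the preceding slide did not survive extraction, so the exponential-forgetting update and the `alpha` parameter below are assumptions, and the class name `RecursiveQuantileEqualizer` is hypothetical.

```python
import numpy as np

class RecursiveQuantileEqualizer:
    """Sentence-by-sentence equalizer with recursively updated quantiles."""

    def __init__(self, ref_q, init_q, alpha=0.9):
        self.ref_q = np.asarray(ref_q, dtype=float)  # reference quantiles
        self.q = np.asarray(init_q, dtype=float)     # running test quantiles
        self.alpha = alpha                           # forgetting factor (assumed)
        self.probs = np.linspace(0.0, 1.0, len(self.ref_q))

    def process(self, x):
        x = np.asarray(x, dtype=float)
        # 1) Equalize immediately, using only quantiles known before this
        #    utterance, so no look-ahead delay is introduced.
        y = np.interp(x, self.q, self.ref_q)
        # 2) Only afterwards, fold this utterance's quantiles into the
        #    running estimate (assumed exponential forgetting).
        q_utt = np.quantile(x, self.probs)
        self.q = self.alpha * self.q + (1.0 - self.alpha) * q_utt
        return y

eq = RecursiveQuantileEqualizer(ref_q=[-1.0, 0.0, 1.0],
                                init_q=[-1.0, 0.0, 1.0], alpha=0.5)
first = eq.process([2.0, 4.0, 6.0])
```

The first utterance is mapped with the initial quantiles, so values beyond the known range are clamped to the outermost reference quantile; the updated quantiles only take effect from the next utterance on.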

  11. HIWIRE MEETING Torino, March 9-10, 2006 José C. Segura, Javier Ramírez

  12. Schedule • HIWIRE database evaluations • New results: HEQ and PEQ • Non-linear feature normalization • Using temporal redundancy • HEQ integration in Loquendo platform • Recursive estimation of the equalization function • New improvements in robust VAD • Bispectrum-based VAD • SVM-enabled VAD

  13. Bispectrum-based VAD (1) • Motivations: • Ability of HOS methods to detect signals in noise • Knowledge of the input processes (Gaussian) • Issues to be addressed: • Computationally expensive • Variance of bispectrum estimators much higher than that of power spectral estimators (identical data record size) • Solution: Integrated bispectrum • J. K. Tugnait, “Detection of non-Gaussian signals using integrated polyspectrum,” IEEE Trans. on Signal Processing, vol. 42, no. 11, pp. 3137–3149, 1994.

  14. Bispectrum-based VAD (2) • Definitions: Let x(t) be a discrete-time signal • Bispectrum: • Third order cumulants: • Inverse transform:
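The equations of this slide were lost in the transcript. For orientation, the standard textbook definitions for a zero-mean, stationary, real-valued signal x(t) are given below; the notation (C_{3x}, B_x) is an assumption and may differ from the symbols used on the original slide.

```latex
% Third-order cumulants (zero-mean, stationary, real-valued x(t))
C_{3x}(i,k) = E\left[\, x(t)\, x(t+i)\, x(t+k) \,\right]

% Bispectrum: two-dimensional Fourier transform of the third-order cumulants
B_x(\omega_1,\omega_2) = \sum_{i=-\infty}^{\infty} \sum_{k=-\infty}^{\infty}
    C_{3x}(i,k)\, e^{-j(\omega_1 i + \omega_2 k)}

% Inverse transform
C_{3x}(i,k) = \frac{1}{(2\pi)^2} \int_{-\pi}^{\pi}\!\int_{-\pi}^{\pi}
    B_x(\omega_1,\omega_2)\, e^{\,j(\omega_1 i + \omega_2 k)}\, d\omega_1\, d\omega_2
```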

  15. Bispectrum-based VAD (3) • [Figure panels: noise only; noise + speech]

  16. Bispectrum-based VAD (4) • Integrated bispectrum (IBI): • Cross-spectrum Syx(ω) • Bispectrum inverse transform: • Bispectrum – cross-spectrum relation (i = 0):

  17. Bispectrum-based VAD (5) • Integrated bispectrum (IBI): • Defined as a cross spectrum between the signal and its square, and is therefore a function of a single frequency variable • Benefits: • Lower computational cost • Computed as a cross spectrum • Variance of the same order as that of the power spectrum estimator • Properties • For Gaussian processes: • Bispectrum is zero • Integrated bispectrum is zero as well
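The cross-spectrum view of the integrated bispectrum can be sketched with a simple block-averaged estimator. This is an illustration under assumptions, not the authors' implementation: the function name `integrated_bispectrum`, the block length, the rectangular window, and the conjugation convention are all choices made here.

```python
import numpy as np

def integrated_bispectrum(x, block_len=256):
    """Block-averaged cross spectrum between y(t) = x(t)^2 - mean(x^2)
    and x(t), evaluated at the non-negative FFT frequencies."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # work with a zero-mean signal
    y = x ** 2 - np.mean(x ** 2)          # centred squared signal
    n_blocks = len(x) // block_len
    if n_blocks == 0:
        raise ValueError("signal shorter than one block")
    xb = x[: n_blocks * block_len].reshape(n_blocks, block_len)
    yb = y[: n_blocks * block_len].reshape(n_blocks, block_len)
    X = np.fft.rfft(xb, axis=1)
    Y = np.fft.rfft(yb, axis=1)
    # Average the per-block cross periodograms over all blocks.
    return np.mean(Y * np.conj(X), axis=0) / block_len

rng = np.random.default_rng(0)
gaussian = rng.standard_normal(2 ** 16)           # symmetric, Gaussian
skewed = rng.exponential(size=2 ** 16) - 1.0      # zero-mean but skewed

ibi_gaussian = integrated_bispectrum(gaussian)
ibi_skewed = integrated_bispectrum(skewed)
```

For the Gaussian input the estimate fluctuates around zero, in line with the property stated on the slide, while the skewed input gives a clearly non-zero value at every frequency. A single FFT pair per block is all that is needed, which is the computational benefit over estimating the full two-dimensional bispectrum.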

  18. Bispectrum-based VAD (6) • Two alternatives explored for formulating the decision rule: • Estimation by block averaging (BA): • MO-LRT: • Given a set of N= 2m+1 consecutive observations:

  19. Bispectrum-based VAD (7) • LRT evaluation: Gaussian model for the IBI • Variances • Defined in terms of • Sss (clean speech power spectrum) • Snn (noise power spectrum)

  20. Bispectrum-based VAD (8) • Denoising: • [Block diagram: 1st WF design, 1st WF stage, 2nd WF design, 2nd WF stage, smoothed spectral subtraction, 1-frame delay]

  21. Bispectrum VAD Analysis (1) • MO-LRT VAD

  22. Bispectrum-based VAD results (2)

  23. Bispectrum-based VAD results (3)

  24. Bispectrum-based VAD results (4) • WF: Wiener filtering • FD: Frame-dropping

  25. SVM-enabled VAD (1) • Motivation: • Ability of SVMs for learning from experimental data • SVMs enable defining a function: using training data: • Classify unseen examples (x, y) • Statistical learning theory restricts the class of functions the learning machine can implement.

  26. SVM-enabled VAD (2) • Hyperplane classifiers: • Training: w and b define maximal margin hyperplane • Kernels:
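The formulas of this slide are missing from the transcript. As a reminder of the standard forms, a hyperplane classifier decides with sign(w·x + b), and the kernel version replaces the dot product with a kernel sum over support vectors. The sketch below assumes an RBF kernel and toy values; the slide does not say which kernel was used, and all names and numbers here are hypothetical.

```python
import numpy as np

def linear_decision(x, w, b):
    """Hyperplane classifier: sign(w . x + b)."""
    return np.sign(np.dot(w, x) + b)

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    d = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return np.exp(-gamma * np.dot(d, d))

def kernel_decision(x, support_vectors, alphas, labels, b, gamma=0.5):
    """Kernel classifier: sign(sum_i alpha_i * y_i * K(x_i, x) + b)."""
    s = sum(a * y * rbf_kernel(sv, x, gamma)
            for a, y, sv in zip(alphas, labels, support_vectors))
    return np.sign(s + b)

# Toy values only; a trained SVM would supply w, b, alphas and support vectors.
w, b = np.array([1.0, -1.0]), 0.0
svs = [np.array([1.0, 0.0]), np.array([-1.0, 0.0])]
alphas, labels = [1.0, 1.0], [1.0, -1.0]

speech_like = kernel_decision(np.array([0.9, 0.0]), svs, alphas, labels, b=0.0)
noise_like = kernel_decision(np.array([-0.9, 0.0]), svs, alphas, labels, b=0.0)
```

In a VAD setting the two classes would correspond to speech and non-speech frames, with x being the feature vector extracted from each frame.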

  27. SVM-enabled VAD (3)

  28. SVM-enabled VAD (4) • Feature extraction: • Training:

  29. SVM-enabled VAD (5) • Feature extraction: • Decision function • 2-band features

  30. SVM-enabled VAD (6) • Analysis: • 4 subbands • Noise reduction • Improvements: • Contextual speech features • Better results without noise reduction

  31. Dissemination (VAD) • Integrated bispectrum: • J.M. Górriz, J. Ramírez, C.G. Puntonet, J.C. Segura, “Generalized-LRT based voice activity detector”, IEEE Signal Processing Letters, 2006. • J. Ramírez, J.M. Górriz, J.C. Segura, C.G. Puntonet, A. Rubio, “Speech/Non-speech Discrimination based on Contextual Information Integrated Bispectrum LRT”, IEEE Signal Processing Letters, 2006. • J.M. Górriz, J. Ramírez, J.C. Segura, C.G. Puntonet, L. García, “Effective Speech/Pause Discrimination Using an Integrated Bispectrum Likelihood Ratio Test”, ICASSP 2006. • SVM VAD: • J. Ramírez, P. Yélamos, J.M. Górriz, J.C. Segura, “SVM-based Speech Endpoint Detection Using Contextual Speech Features”, IEE Electronics Letters, 2006. • J. Ramírez, P. Yélamos, J.M. Górriz, C.G. Puntonet, J.C. Segura, “SVM-enabled Voice Activity Detection”, ISNN 2006. • P. Yélamos, J. Ramírez, J.M. Górriz, C.G. Puntonet, J.C. Segura, “Speech Event Detection Using Support Vector Machines”, ICCS 2006.

  32. HIWIRE MEETING Athens, November 3-4, 2005 José C. Segura, Javier Ramírez
