Survey of INTERSPEECH 2013 Reporter: Yi-Ting Wang 2013/09/10
Outline • Exemplar-based Individuality-Preserving Voice Conversion for Articulation Disorders in Noisy Environments • Robust Speech Enhancement Techniques for ASR in Non-stationary Noise and Dynamic Environments • NMF-based Temporal Feature Integration for Acoustic Event Classification
Exemplar-based Individuality-Preserving Voice Conversion for Articulation Disorders in Noisy Environments Ryo AIHARA, Ryoichi TAKASHIMA, Tetsuya TAKIGUCHI, Yasuo ARIKI Graduate School of System Informatics, Kobe University, Japan
Introduction • This paper presents a noise-robust voice conversion (VC) method for a person with an articulation disorder resulting from athetoid cerebral palsy. • Exemplar-based spectral conversion using NMF is applied to a voice with an articulation disorder in real noisy environments. • NMF is a well-known approach for source separation and speech enhancement. • Poorly articulated noisy speech -> clean articulation
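The exemplar-based conversion idea can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes parallel source and target exemplar dictionaries (`A_src`, `A_tgt`) and an optional noise dictionary (`A_noise`), all hypothetical names, and it assumes KL-divergence multiplicative updates for the activation estimate. The dictionaries stay fixed; only the activations are estimated, and the speech part of the activations is then applied to the target dictionary.

```python
import numpy as np

def estimate_activations(X, A, n_iter=300, eps=1e-12):
    """Estimate non-negative activations H with X ~= A @ H, keeping A fixed.

    Uses multiplicative updates for the KL divergence (an assumption here).
    X: (n_bins, n_frames) magnitude spectra, A: (n_bins, n_exemplars).
    """
    rng = np.random.default_rng(0)
    H = rng.random((A.shape[1], X.shape[1])) + eps
    col_sums = A.sum(axis=0)[:, None] + eps  # equals A.T @ ones
    for _ in range(n_iter):
        H *= (A.T @ (X / (A @ H + eps))) / col_sums
    return H

def convert(X_noisy, A_src, A_tgt, A_noise=None):
    """Map source spectra to target spectra through shared activations."""
    # Stack speech and (optional) noise exemplars into one dictionary.
    A = A_src if A_noise is None else np.hstack([A_src, A_noise])
    H = estimate_activations(X_noisy, A)
    # Keep only the activations of the speech exemplars; noise is discarded.
    H_speech = H[:A_src.shape[1]]
    return A_tgt @ H_speech
```

Because the source and target dictionaries are assumed to be aligned exemplar by exemplar, reusing the source activations with the target dictionary yields the converted spectra.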
Experimental Results • ATR Japanese speech database.
Conclusions • We proposed a noise-robust spectral conversion method based on NMF for a voice with an articulation disorder. • Our VC method can improve the listening intelligibility of words uttered by a person with an articulation disorder in noisy environments.
Robust Speech Enhancement Techniques for ASR in Non-stationary Noise and Dynamic Environments Gang Liu, Dimitrios Dimitriadis, Enrico Bocchieri Center for Robust Speech Systems, University of Texas at Dallas
Introduction • In current ASR systems, the presence of competing speakers greatly degrades recognition performance. • Furthermore, speakers are most often not standing still while speaking. • We use Time Difference of Arrival (TDOA) estimation, multi-channel Wiener filtering, NMF, multi-condition training, and robust feature extraction.
Proposed cascaded system • The problem of source localization/separation is often addressed by TDOA estimation.
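As an illustration of TDOA estimation between two microphone channels, the sketch below uses GCC-PHAT, a common estimator. The slides do not say which estimator the authors used, so the choice of GCC-PHAT and the function name `gcc_phat_tdoa` are assumptions.

```python
import numpy as np

def gcc_phat_tdoa(x, y, fs, max_tau=None):
    """Estimate the delay of y relative to x in seconds (positive if y lags x)."""
    n = len(x) + len(y)  # zero-pad so the correlation is linear, not circular
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[n - max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs
```

With a known microphone spacing, the estimated delay can be turned into a direction of arrival; tracking it over time is one way to follow a moving speaker.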
Experiment and results • NMF provides the largest boost, due to the suppression of the non-stationary interfering signals.
Conclusion • We propose a cascaded system for speech recognition that deals with non-stationary noise in reverberant environments. • The proposed system offers average relative improvements of 50% and 45% for the two above-mentioned scenarios.
NMF-based Temporal Feature Integration for Acoustic Event Classification Jimmy Ludena-Choez, Ascension Gallardo-Antolin Dep. of Signal Theory and Communications, Universidad Carlos III de Madrid, Avda de la Universidad 30, 28911 – Leganes (Madrid), Spain
Introduction • This paper proposes a new front-end for Acoustic Event Classification (AEC) tasks based on the combination of the temporal feature integration technique called Filter Bank Coefficients (FC) and Non-Negative Matrix Factorization (NMF). • FC captures the dynamic structure in the short-time features. • We present an unsupervised method based on NMF for the design of a filter bank more suitable for AEC.
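A rough sketch of FC-style temporal feature integration is given below, assuming the usual formulation in which the periodogram of each short-time feature trajectory is computed over a window of frames and then summarized by a filter bank. The window length, hop, frame rate, and band edges are illustrative assumptions, and the fixed rectangular filter bank stands in for the baseline; in the proposed method the filter bank would instead be learned with NMF.

```python
import numpy as np

def periodograms(features, win_len, hop):
    """features: (n_frames, n_dims). Returns (n_windows, n_dims, n_bins)."""
    starts = range(0, features.shape[0] - win_len + 1, hop)
    segs = np.stack([features[s:s + win_len] for s in starts])
    spec = np.fft.rfft(segs, axis=1)  # FFT along time within each window
    return (np.abs(spec) ** 2 / win_len).transpose(0, 2, 1)

def rectangular_filter_bank(win_len, frame_rate, band_edges_hz):
    """Fixed filter bank over modulation frequency (band edges are hypothetical)."""
    freqs = np.fft.rfftfreq(win_len, d=1.0 / frame_rate)
    U = np.zeros((len(band_edges_hz), freqs.size))
    for i, (lo, hi) in enumerate(band_edges_hz):
        U[i, (freqs >= lo) & (freqs <= hi)] = 1.0
    return U

def fc_features(features, win_len, hop, U):
    """Sum periodogram energy per band, then flatten per window."""
    P = periodograms(features, win_len, hop)  # (n_windows, n_dims, n_bins)
    fc = P @ U.T                              # (n_windows, n_dims, n_filters)
    return fc.reshape(fc.shape[0], -1)
```

For example, with 12-dimensional short-time features at 100 frames per second, `fc_features(feats, 50, 25, rectangular_filter_bank(50, 100, [(0, 0), (1, 4), (5, 15), (16, 50)]))` gives one 48-dimensional vector per half-second window.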
Experiments and results • Here, NMF is computed using the Kullback-Leibler (KL) divergence as the cost function.
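For reference, NMF under the KL divergence can be computed with the standard multiplicative update rules of Lee and Seung. The sketch below is a generic version, not the authors' code; the function names, iteration count, and random initialization are assumptions.

```python
import numpy as np

def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence D(V || V_hat) summed over all entries."""
    return np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat)

def nmf_kl(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a non-negative matrix V (n x m) as W (n x rank) @ H (rank x m)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative at every step.
        H *= (W.T @ (V / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
        W *= ((V / (W @ H + eps)) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H
```

Applied to a matrix of non-negative periodograms, one of the two factors plays the role of a learned filter bank; which factor that is depends on how the data matrix is arranged.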
Conclusions • We have presented a new front-end for AEC based on the combination of FC features and NMF. • NMF is used for the unsupervised learning of the filter bank, which captures the most relevant temporal behavior in the short-time features. • Low modulation frequencies are more important than high ones for distinguishing between different acoustic events. • The experiments have shown that the features obtained with this method achieve significant improvements in the classification performance of an SVM-based AEC system in comparison with the baseline FC parameters.