Detection of Vowel Onset Point in Speech
S.R. Mahadeva Prasanna & Jinu Mariam Zachariah
Department of Computer Science & Engineering, Indian Institute of Technology Madras, India
Email: {prasanna,jinu}@cs.iitm.ernet.in
http://speech.cs.iitm.ernet.in
Objective & Organization
• Significance of VOP for speech analysis
• Issues in VOP detection
• Acoustic-phonetic description of VOP
• Acoustic cues for VOP detection
• Detection of VOP in speech
• Begin-end detection using VOP
• Summary and conclusions
Significance of VOP for Speech Analysis
• Syllabification
• Begin-end detection
• Speech recognition
• Speech enhancement
• Vowel/Non-vowel classification
Issues in VOP Detection
• CVs with voiced consonants
• Nasals, semivowels and aspirated sounds
Acoustic-Phonetic Description of VOP
• Changes in excitation source and vocal tract system characteristics
• C and V regions in terms of acoustic features

Acoustic Cues for VOP Detection
• Formant transition (Ftr(t))
• Epoch intervals (Ei(t))
• Strength of instants (Si(t))
• Itakura distance (Id(t))
• Ratio of signal energy to residual energy (Sr(t))
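The last cue in the list, the ratio of signal energy to residual energy, can be illustrated with a short sketch. This is not the authors' implementation: it assumes the residual is the linear prediction (LP) residual, which the slide does not state, and the LP order of 10, the Hamming window, the 400-sample frame and the function names `lp_coefficients` and `signal_to_residual_ratio` are all hypothetical choices made for illustration.

```python
import numpy as np

def lp_coefficients(frame, order=10):
    """Autocorrelation-method LP coefficients a[1..order] (assumed order),
    where sample s[n] is predicted as sum_k a[k] * s[n - k]."""
    r_full = np.correlate(frame, frame, mode="full")
    mid = len(frame) - 1
    r = r_full[mid : mid + order + 1]            # autocorrelation lags 0..order
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    # Toeplitz normal equations with a tiny diagonal load for stability.
    R = r[lags] + 1e-9 * (r[0] + 1e-12) * np.eye(order)
    return np.linalg.solve(R, r[1:])

def signal_to_residual_ratio(frame, order=10):
    """Energy of the windowed frame divided by the energy of its LP residual."""
    x = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    a = lp_coefficients(x, order)
    predicted = np.zeros_like(x)
    for k in range(1, order + 1):
        predicted[k:] += a[k - 1] * x[:-k]       # contribution of lag k
    residual = x - predicted
    return float(np.sum(x ** 2) / (np.sum(residual ** 2) + 1e-12))

# Synthetic stand-ins (not real speech): a resonant, vowel-like frame
# and a noise-like frame, 400 samples each.
rng = np.random.default_rng(0)
n = np.arange(400)
vowel_like = (np.sin(2 * np.pi * 0.05 * n) + 0.5 * np.sin(2 * np.pi * 0.12 * n)
              + 0.01 * rng.standard_normal(400))
noise_like = rng.standard_normal(400)
print(signal_to_residual_ratio(vowel_like), signal_to_residual_ratio(noise_like))
```

A strongly resonant frame is well predicted by the LP model, so its ratio is much larger than that of a noise-like frame. Tracking such a ratio frame by frame is one plausible way to obtain a contour like Sr(t), though the exact procedure used in the slides is not given here.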
VOP Detection in Isolated CVs
[Performance results: not recovered from the slide]
Performance of VOP Detection Algorithm
[Panels: Clean Speech, Degraded Speech]
Summary & Conclusions
• Automatic VOP detection algorithm using source features
• Acoustic-phonetic description for VOP
• Acoustic cues for VOP detection based on instants of significant excitation
• Begin-end detection using the knowledge of VOP
• Need for a method to combine spectral features with proposed source features for robust detection of VOP
ABSTRACT
Sound units in many languages are syllabic in nature, and frequently used syllables are of the consonant-vowel (CV) type. The vowel onset point (VOP) is an important event in CV units. Knowledge of VOPs helps in many applications, such as speech recognition, speaker recognition, speech enhancement, begin-end detection, segmentation of speech into vowel-like and nonvowel-like units, and finding the duration of vowels. In this paper we describe parameters or features useful for manually identifying the VOPs for different types of CV units. We propose an automatic algorithm for detecting VOPs in continuous speech, motivated by the nature of speech production and perception. The speech signal is the result of exciting a time-varying vocal tract system with a time-varying excitation. Changes in both the source and the system characteristics around the VOP are useful for detecting VOPs; in this paper we use the changes in the source characteristics. The performance of the proposed algorithm is evaluated using 25 sentences, for which a total of 236 VOPs were identified manually. Of these, 216 VOPs were detected within a resolution of ±30 ms. Compared to the energy-based approach, VOP-based begin-end detection significantly improves the performance of a text-dependent speaker verification system. For a telephone database of 32 speakers, consisting of 480 genuine utterances and 1984 impostor utterances, the equal error rate (EER) of the system improved from 5.9% to 2.6%.
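The evaluation in the abstract counts a manually marked VOP as detected when an automatic detection falls within ±30 ms of it. A minimal sketch of such scoring follows. The one-to-one matching rule (each detection may account for at most one reference VOP) is an assumption, since the abstract does not say how multiple or spurious detections were handled, and the function name `count_detected_vops` and the example times are hypothetical.

```python
def count_detected_vops(reference_s, detected_s, tolerance_s=0.030):
    """Count reference VOPs (in seconds) that have a detection within
    +/- tolerance_s. Assumed rule: each detection is used at most once."""
    ref = sorted(reference_s)
    det = sorted(detected_s)
    matched = 0
    j = 0
    for r in ref:
        # Skip detections that are too early for this and all later references.
        while j < len(det) and det[j] < r - tolerance_s:
            j += 1
        # The earliest remaining detection matches if it is not too late.
        if j < len(det) and det[j] <= r + tolerance_s:
            matched += 1
            j += 1
    return matched

# Made-up example times, not data from the paper.
reference = [0.10, 0.50, 0.90]
detected = [0.12, 0.56, 0.88, 1.40]
hits = count_detected_vops(reference, detected)
print(hits, "of", len(reference), "reference VOPs detected")

# Detection rate implied by the figures reported in the abstract.
print(f"{216 / 236:.1%}")
```

With the reported figures, 216 of 236 VOPs corresponds to a detection rate of about 91.5%, with 20 VOPs missed at the ±30 ms resolution.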