A Tone Recognition Framework for Continuous Mandarin Speech. Lei He, Jie Hao. Toshiba (China) Research and Development Center. INTERSPEECH 2006 - ICSLP. Hsiao-Tsung Hung. Introduction. Combining LVCSR with tone recognition. Embedded tone modeling: [MFCC + F0]. Model the tone pattern separately. System Framework.
A Tone Recognition Framework for Continuous Mandarin Speech
Lei He, Jie Hao
Toshiba (China) Research and Development Center
INTERSPEECH 2006 - ICSLP
Hsiao-Tsung Hung
Introduction
• Combining LVCSR with tone recognition
• Embedded tone modeling:
• [MFCC + F0]
• Model the tone pattern separately
F0 detection
• Normalized short-time autocorrelation function
K. Hirose, H. Fujisaki, S. Seto, "A scheme for pitch extraction for speech using autocorrelation function with frame length proportional to the time lag", Proc. ICASSP, Vol. I, pp. 149-152, 1992.
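The normalized short-time autocorrelation idea can be sketched as below. This is a generic textbook version with a fixed frame length, not the scheme of Hirose et al., whose frame length varies with the time lag. The function name `estimate_f0`, the 60-400 Hz search range, and the tie tolerance are assumptions made for illustration.

```python
import numpy as np

def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Return (f0_hz, peak_correlation) for one frame.

    ASSUMPTION: fixed-length frame and a 60-400 Hz search range;
    neither value comes from the slides.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()          # remove DC offset
    lag_min = int(sample_rate / f0_max)   # shortest period searched
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)

    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        a = frame[:-lag]                  # frame without its last `lag` samples
        b = frame[lag:]                   # same frame shifted by `lag`
        denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
        if denom == 0.0:
            continue                      # silent segment, no correlation defined
        corr = np.dot(a, b) / denom       # normalized to the range [-1, 1]
        # Small tolerance so that exact multiples of the period
        # (equal correlation) do not replace the shortest lag.
        if corr > best_corr + 1e-6:
            best_lag, best_corr = lag, corr

    if best_lag == 0:
        return 0.0, 0.0                   # treated as unvoiced
    return sample_rate / best_lag, best_corr


# Usage: a 40 ms frame of a 200 Hz sine sampled at 16 kHz.
sr = 16000
t = np.arange(640) / sr
f0, peak = estimate_f0(np.sin(2 * np.pi * 200.0 * t), sr)
```

The peak correlation is returned alongside F0 because a per-frame correlation coefficient of this kind is what the slides use as the voicing level.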
Subsection outlined features (baseline)
• E(F0): average F0 value
• Movement of the F0 value
• E(VL): average voicing level. "the correlation coefficient of each frame is used to represent the voicing level."
• 3 features × 4 subsections + duration = 13 dimensions
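The 13-dimensional baseline vector can be sketched as follows, reading the count as three values for each of 4 subsections plus the duration (3 × 4 + 1 = 13). The slide does not say how the subsections are cut or how F0 movement is measured, so the equal-length split and the "last minus first" movement below are assumptions, and `subsection_features` is a hypothetical name.

```python
import numpy as np

def subsection_features(f0, voicing, n_sub=4):
    """Build [E(F0), F0 movement, E(VL)] per subsection, plus duration.

    f0, voicing: per-frame F0 values and voicing levels of one syllable.
    ASSUMPTION: subsections are (near-)equal-length frame ranges.
    """
    f0 = np.asarray(f0, dtype=float)
    voicing = np.asarray(voicing, dtype=float)
    if len(f0) != len(voicing) or len(f0) < n_sub:
        raise ValueError("need matching tracks with at least n_sub frames")

    feats = []
    for f0_part, vl_part in zip(np.array_split(f0, n_sub),
                                np.array_split(voicing, n_sub)):
        feats.append(f0_part.mean())             # E(F0): average F0
        feats.append(f0_part[-1] - f0_part[0])   # ASSUMPTION: movement = end - start
        feats.append(vl_part.mean())             # E(VL): average voicing level
    feats.append(float(len(f0)))                 # duration, in frames
    return np.array(feats)


# Usage: an 8-frame syllable with rising pitch.
vec = subsection_features(np.linspace(100.0, 180.0, 8), np.full(8, 0.9))
```

With `n_sub=4` the result always has 13 entries, matching the dimension stated on the slide.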
Contextual Features Expansion
• Describe co-articulation effects
• *6 + duration = 13 (dimensions)
Phonetic category information
• All phonetic units are clustered into 7 classes according to their phonetic attributes.
• The class ID is used as a feature.
• Adds 5 feature dimensions: [pre-Final + Initial + Final + next-Initial + next-Final]
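The 5 class-ID features can be sketched as a simple table lookup. The slides do not list the 7 classes, so the `PHONE_CLASS` table below is purely illustrative, and using ID 0 for a missing neighbour at an utterance boundary is likewise an assumption.

```python
# HYPOTHETICAL class table: the real 7-class clustering is not given in the slides.
PHONE_CLASS = {
    "m": 1, "n": 1,
    "b": 2, "d": 2,
    "sh": 3,
    "a": 4, "i": 4,
    "ao": 5,
    "an": 6, "ang": 6,
    "": 7,            # zero initial
}
NO_CONTEXT = 0        # ASSUMPTION: ID used when there is no neighbouring syllable


def class_id(unit):
    """Map an Initial or Final to its class ID; None means no neighbour."""
    return NO_CONTEXT if unit is None else PHONE_CLASS[unit]


def category_features(prev_syl, cur_syl, next_syl):
    """Return [pre-Final, Initial, Final, next-Initial, next-Final] class IDs.

    Each syllable is an (initial, final) tuple; prev_syl and next_syl
    may be None at the start or end of an utterance.
    """
    prev_final = prev_syl[1] if prev_syl else None
    next_initial, next_final = next_syl if next_syl else (None, None)
    initial, final = cur_syl
    units = (prev_final, initial, final, next_initial, next_final)
    return [class_id(u) for u in units]


# Usage: "shang" preceded by "ni", with no following syllable.
ids = category_features(("n", "i"), ("sh", "ang"), None)
```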