Tone Recognition With Fractionized Models and Outlined Features

Ye Tian, Jian-Lai Zhou, Min Chu, Eric Chang • ICASSP 2004 • Hsiao-Tsung Hung, Department of Computer Science and Information Engineering, National Taiwan Normal University

Outline • Introduction • Features • Detailed features • Outlined features • Experiments and analysis • Tone Modeling • Experiments and analysis • Conclusions

Introduction • 2 questions • Is the detailed information of F0 curve useful for tone discrimination in continuous speech? • Are phoneme-independent tone models sufficient for continuous speech recognition?

Detailed features • Detailed features use the entire F0 curve. • Observation vector is • If the phoneme spans N frames in total, the total number of parameters used for tone recognition is 2*N.

Outlined features • To reduce the number of parameters and improve the robustness. • Curve fitting features • Subsection Outlined features

Curve fitting features • First-order • Second-order
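The first-order and second-order fits named on this slide can be illustrated with an ordinary least-squares polynomial fit of the F0 values against the frame index. This is only a sketch: the transcript does not preserve the slide's formulas, so the function name `polyfit_f0`, the use of frame indices 0..N-1 as the x-axis, and the plain least-squares criterion are all assumptions rather than the authors' definition.

```python
def polyfit_f0(f0, order):
    """Least-squares polynomial fit of F0 against frame index 0..N-1.

    Hypothetical helper (not from the paper). Returns coefficients
    [c0, c1, ..., c_order] of c0 + c1*x + c2*x**2 + ...
    """
    n = len(f0)
    if n <= order:
        raise ValueError("need more frames than the polynomial order")
    m = order + 1
    # Normal equations A c = b with A[i][j] = sum of x**(i + j).
    a = [[float(sum(x ** (i + j) for x in range(n))) for j in range(m)]
         for i in range(m)]
    b = [float(sum(y * x ** i for x, y in enumerate(f0))) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


# Toy F0 contours in Hz (made-up values, for illustration only).
rising = [100 + 2 * x for x in range(10)]
curved = [100 + 2 * x - 0.5 * x * x for x in range(10)]
first_order = polyfit_f0(rising, 1)    # intercept and slope
second_order = polyfit_f0(curved, 2)   # intercept, slope, curvature
```

Whatever the phoneme length, a first-order fit yields two parameters and a second-order fit yields three, which is the reduction in parameter count that the outlined features aim for.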

Subsection Outlined features • The F0 curve of the entire phoneme is divided into several subsections, and each subsection is represented by certain parameters. • Extract parameters for each subsection • 1. subsection slope and intercept • 2. subsection and (Assume that time frames belong to the subsection k.)

Figure: F0 curve, with X = {0, 1, …} the frame index and Y the F0 value.

Subsection Outlined features 1. subsection slope and intercept
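As a rough illustration of the slope-and-intercept features, the sketch below splits an F0 curve into subsections and fits a least-squares line to each one. Several details are assumptions, because the slide's formulas are missing from the transcript: the subsections are taken to be of (nearly) equal length, the frame index restarts at 0 inside each subsection, and the names `line_fit` and `subsection_features` are hypothetical.

```python
def line_fit(ys):
    """Least-squares (slope, intercept) of ys against x = 0..len(ys)-1."""
    n = len(ys)
    if n == 1:
        # A single frame has no direction; treat it as a flat line.
        return 0.0, float(ys[0])
    mean_x = (n - 1) / 2.0
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def subsection_features(f0, num_subsections):
    """Split f0 into roughly equal subsections and fit a line to each.

    Assumption: equal-length split with a local frame index per subsection.
    """
    n = len(f0)
    if not 1 <= num_subsections <= n:
        raise ValueError("need between 1 and len(f0) subsections")
    features = []
    for k in range(num_subsections):
        start = k * n // num_subsections
        end = (k + 1) * n // num_subsections
        features.append(line_fit(f0[start:end]))
    return features


# Toy rise-fall contour in Hz (made-up values, for illustration only).
contour = [100, 102, 104, 106, 106, 104, 102, 100]
feats = subsection_features(contour, 2)
# feats holds one (slope, intercept) pair per subsection.
```

With two subsections this eight-frame contour is described by 4 numbers instead of the 2*N = 16 used by the detailed features.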

Subsection Outlined features 2.subsection and

Experiments and analysis 1. Main value and direction are the most important characteristics. 2. Detailed information is useless for tone discrimination.

Tone Modeling • One-tone-one-model tone models (5) • Monophone-dependent tone models (54): the same tone in different tonal phonemes is modeled differently. • Triphone-dependent tone models (12824)
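The three levels of model granularity can be pictured as three ways of keying a tone model. The sketch below is only illustrative: the function `model_key`, the tuple layout, and the toy phoneme labels are hypothetical, and the transcript does not say how the 5, 54 and 12824 models were actually enumerated or tied.

```python
def model_key(tone, phoneme, left, right, granularity):
    """Return the key under which a tonal phoneme's tone model is stored.

    Hypothetical helper; granularity is "tone", "monophone" or "triphone".
    """
    if granularity == "tone":
        # One model per tone, shared by all phonemes.
        return (tone,)
    if granularity == "monophone":
        # The same tone on different tonal phonemes gets different models.
        return (phoneme, tone)
    if granularity == "triphone":
        # The model also depends on the left and right neighbours.
        return (left, phoneme, right, tone)
    raise ValueError("unknown granularity: " + granularity)


# Toy samples as (tone, phoneme, left context, right context).
# The labels are made up and not taken from the paper's phone set.
samples = [
    (1, "a", "m", "n"),
    (1, "a", "b", "n"),
    (1, "i", "m", "n"),
    (4, "a", "m", "n"),
]

counts = {
    g: len({model_key(t, p, l, r, g) for t, p, l, r in samples})
    for g in ("tone", "monophone", "triphone")
}
```

On these four samples the number of distinct models grows from 2 to 3 to 4 as the context gets finer, which mirrors the growth from 5 to 54 to 12824 models on the slide.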

Experiments and analysis • Feature vector :

Conclusions • Using fractionized models and outlined features for tone recognition. • Outlined features can reduce the interference caused by co-articulation effect, syllable stress, and sentence intonation.
