A Hybrid Model of HMM and RBFN for Speech Recognition 길이만, 김수연, 김성호, 원윤정, 윤아림 Department of Applied Mathematics, KAIST
Automatic Speech Recognition • Message Encoding/Decoding
Hidden Markov Models • The Markov Generation Model
Hidden Markov Models • An HMM is defined by: 1. A set $S = \{s_1, \dots, s_Q\}$ of $Q$ states of a time-discrete Markov chain of order 1 2. An initial probability distribution over the states: $\pi_i = P(q_1 = s_i)$ 3. A transition probability distribution between states: $a_{ij} = P(q_t = s_j \mid q_{t-1} = s_i)$ 4. An emission probability distribution of the acoustic observations $X$ within each state: $b_j(x_t) = p(x_t \mid q_t = s_j)$
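The four ingredients above can be collected into a small generative sketch. This is a toy discrete-emission HMM with hypothetical numbers chosen only for illustration; real ASR systems use continuous acoustic emissions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy HMM with Q = 3 states and 3 discrete symbols (hypothetical values).
pi = np.array([1.0, 0.0, 0.0])       # initial distribution: pi_i = P(q_1 = s_i)
A = np.array([[0.6, 0.4, 0.0],       # transitions: a_ij = P(q_t = s_j | q_{t-1} = s_i)
              [0.0, 0.7, 0.3],       # left-to-right topology, common in ASR
              [0.0, 0.0, 1.0]])
B = np.array([[0.8, 0.1, 0.1],       # emissions: b_j(x) = P(x | q_t = s_j)
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])

def sample(T):
    """Run the Markov generation model for T steps: pick a start state from pi,
    emit a symbol from the current state's row of B, move using the rows of A."""
    q = rng.choice(3, p=pi)
    states, obs = [], []
    for _ in range(T):
        states.append(int(q))
        obs.append(int(rng.choice(3, p=B[q])))
        q = rng.choice(3, p=A[q])
    return states, obs

states, obs = sample(10)
```

Because `pi` puts all mass on state 0, every sampled sequence starts there; the left-to-right transition matrix then only moves forward through the states.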
Hidden Markov Models • Major problems of HMMs – Training – Decoding • Solutions: – Baum-Welch algorithm – Viterbi algorithm
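As a minimal illustration of the decoding step, here is a log-domain Viterbi sketch for a discrete-emission HMM; the parameters in the usage example are toy values, not the paper's models:

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely state path for a discrete-emission HMM (log domain)."""
    T, Q = len(obs), len(pi)
    logd = np.zeros((T, Q))               # logd[t, j]: best log-score ending in j
    back = np.zeros((T, Q), dtype=int)    # back[t, j]: best predecessor of j
    logd[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, T):
        scores = logd[t - 1][:, None] + np.log(A)   # (from-state, to-state)
        back[t] = scores.argmax(axis=0)
        logd[t] = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(logd[-1].argmax())]                 # backtrack from the best end
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Two sticky states that mostly emit "their own" symbol.
path = viterbi(np.array([0.5, 0.5]),
               np.array([[0.9, 0.1], [0.1, 0.9]]),
               np.array([[0.9, 0.1], [0.1, 0.9]]),
               [0, 0, 1, 1])
```

Working in log probabilities avoids the numerical underflow that products of many small probabilities would cause on long utterances.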
Hidden Markov Models • Advantages of standard HMMs – provide a natural and highly reliable way of recognizing speech for a wide range of applications – integrate well into systems incorporating both task syntax and semantics • Limitations of standard HMMs – non-discriminative training/decoding criterion – arbitrary assumptions on the parametric form of probability distributions – high sensitivity to environmental conditions
Artificial Neural Networks • Nice properties of ANNs – learning capability from examples – generalization ability – non-parametric estimation • Limitations of ANNs – restricted to local decisions: generally used for classification of static input with no sequential processing – not well suited to time-varying input patterns and segmentation of sequential inputs
Hybrid Models of HMM/ANN • ANNs that emulate HMMs • Connectionist probability estimation for continuous HMMs • Hybrids with "global optimization" • Connectionist Vector Quantizers for discrete HMMs • ANNs as acoustic front-ends for continuous HMMs
Hybrid Models of HMM/ANN 1. Initialization: – Initial segmentation of the training set – Labeling of the acoustic vectors with "0" or "1", according to the segmentation – ANN training via Back-Propagation (BP) or other algorithms 2. Iteration: – New segmentation of the training set according to the Viterbi algorithm computed over the ANN outputs – Labeling of the acoustic vectors with "0" or "1" – ANN retraining by BP
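The labeling step above can be sketched as follows: given a frame-level segmentation (one state index per acoustic vector, e.g. from Viterbi alignment), build the 0/1 target matrix the ANN is then trained against. The function name and shapes are our own illustration, not the paper's notation:

```python
import numpy as np

def targets_from_segmentation(seg, n_frames, n_states):
    """One-hot 0/1 targets for ANN (re)training: row t has a single "1"
    in the column of the state that the segmentation assigns to frame t."""
    T = np.zeros((n_frames, n_states))
    T[np.arange(n_frames), seg] = 1.0
    return T

# Hypothetical 4-frame segmentation over 3 states.
seg = np.array([0, 0, 1, 2])
T = targets_from_segmentation(seg, 4, 3)
```

Each training iteration then alternates: re-segment with Viterbi over the current ANN outputs, rebuild these targets, and retrain the ANN by BP.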
Proposed HMM/RBFN Model • Training 1. LBG clustering – sets the centers and variances of the radial basis functions 2. RLS algorithm – trains the weights – Target:
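Assuming Gaussian radial basis functions, the two training stages can be sketched as below. Hand-placed centers stand in for LBG clustering, and batch least squares stands in for the recursive (RLS) weight update, which converges to the same solution; all data and values are hypothetical:

```python
import numpy as np

def rbf_design(X, centers, sigma):
    """Radial basis activations: phi_ij = exp(-||x_i - c_j||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

# Toy data: two well-separated clusters with 0/1 class targets.
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)

# Stage 1: centers/widths would come from LBG clustering; placed by hand here.
centers = np.array([[0.05, 0.0], [0.95, 1.05]])

# Stage 2: solve for the output weights in the least-squares sense.
Phi = rbf_design(X, centers, sigma=0.5)
W, *_ = np.linalg.lstsq(Phi, y, rcond=None)
pred = Phi @ W
```

Because the output layer is linear in the weights, the weight estimation is a linear least-squares problem, which is exactly what RLS solves incrementally as training vectors arrive.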
Simulation 1. Database – TIMIT1: five-class phoneme recognition (C, L, N, S, V); acoustic features: 26-dimensional MFCC features – TIMIT2: digits (0, 1, 2, ..., 9); acoustic features: 16-dimensional ZCPA features
Simulation 2. Results – TIMIT1 Table 1: results of 5-class phoneme recognition – TIMIT2 Table 2: results of digit recognition
Conclusion 1. Results – Non-parametric estimation: no a priori assumptions on the form of the distributions – Better initialization than other hybrid systems – Discriminative training – Improved performance over the standard HMM 2. Further Work – Performance degradation in noisy environments – Clustering/parameter training – GPD is not stable