State Tying for Acoustic Modeling. S.J. Young, J.J. Odell, P.C. Woodland. Presented by Hsu Ting-Wei, 2007.06.11.
Making tied-state triphone HMMs
• Step 1: Making monophone HMMs
• Step 2: Making triphones from monophones
• Step 3: Making tied-state triphones
• Step 4: Splitting tied-state triphone mixtures
Step 1: Making monophone HMMs
[Diagram: prototype monophone HMMs (aa, ax, …), together with the features extracted from the signals and the transcriptions, are re-estimated to give new monophone HMMs (aa, ax, …).]
Step 2: Making triphones from monophones
[Diagram: the new monophone HMMs (aa, ax, …), a triphone list and the transcriptions give initial triphone HMMs (k-aa+g, l-ax+m, k-aa+b, …), each taken from its centre phone; these are re-estimated with the features and transcriptions to give new triphone HMMs.]
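The label conversion behind Step 2 can be sketched as follows. This is a minimal illustration, not the HTK tooling itself: the function names `to_triphones` and `centre_phone` are hypothetical, and word boundaries and silence models are ignored for simplicity.

```python
def to_triphones(phones):
    """Expand a monophone sequence into l-c+r style triphone labels."""
    labels = []
    for i, centre in enumerate(phones):
        label = centre
        if i > 0:                      # add left context when there is one
            label = f"{phones[i - 1]}-{label}"
        if i < len(phones) - 1:        # add right context when there is one
            label = f"{label}+{phones[i + 1]}"
        labels.append(label)
    return labels


def centre_phone(triphone):
    """Return the centre phone, i.e. the monophone HMM to copy from."""
    return triphone.split("-")[-1].split("+")[0]


if __name__ == "__main__":
    for label in to_triphones(["k", "aa", "g"]):
        print(label, "<- initialised from", centre_phone(label))
```

Each triphone starts as a copy of its centre-phone model, so `k-aa+g` and `k-aa+b` begin with identical parameters and only diverge after re-estimation.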
Step 3: Making tied-state triphones
[Diagram: state tying turns the new triphone HMMs (k-aa+g, l-ax+m, k-aa+b, …) into tied-state triphone HMMs.]
Step 4: Splitting tied-state triphone mixtures
[Diagram: the mixtures of the tied-state triphone HMMs (k-aa+g, l-ax+m, k-aa+b, …) are split, then the models are re-estimated with the features and transcriptions to give new tied-state triphone HMMs ("Tied-state Triphone HMM 2").]
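One way to picture the "split mixtures" step is the sketch below. It assumes a diagonal-covariance Gaussian mixture per state and splits the heaviest component by halving its weight and shifting the two copies' means by plus and minus 0.2 standard deviations. The function name and the 0.2 factor are assumptions made for illustration; the slides do not state the exact rule.

```python
import math


def split_heaviest(weights, means, variances, perturb=0.2):
    """Split the heaviest mixture component into two perturbed copies.

    weights:   list of mixture weights
    means:     list of mean vectors (lists of floats)
    variances: list of diagonal variance vectors (lists of floats)
    perturb:   mean shift in standard deviations (assumed value)
    """
    k = max(range(len(weights)), key=lambda i: weights[i])
    offsets = [perturb * math.sqrt(v) for v in variances[k]]
    half = weights[k] / 2.0

    new_weights = list(weights)
    new_means = [list(m) for m in means]
    new_vars = [list(v) for v in variances]

    # First copy replaces the original component, shifted upwards.
    new_weights[k] = half
    new_means[k] = [m + o for m, o in zip(means[k], offsets)]
    # Second copy is appended, shifted downwards, with the same variance.
    new_weights.append(half)
    new_means.append([m - o for m, o in zip(means[k], offsets)])
    new_vars.append(list(variances[k]))
    return new_weights, new_means, new_vars
```

The weights still sum to one after the split, and the re-estimation pass that follows is what lets the two copies settle on different parts of the data.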
Tying
• The single biggest problem in building context-dependent HMM systems is insufficient training data.
• Model complexity has to be balanced against the amount of available training data. For continuous density systems, this balance is achieved by tying parameters together.
• Model-based sharing
  • Traditional methods of dealing with these problems share models across differing contexts to form so-called generalized triphones and use a posteriori smoothing techniques.
  • Smoothing triphones by biphones and monophones
  • These will be rather too broad when large training sets are used.
• Making tied-state triphones
  • Data-Driven Clustering
  • Tree-Based Clustering
Data-Driven Clustering
• Bottom-up method ↑
• For single Gaussians, a weighted Euclidean distance between the means is used; for tied-mixture systems, a Euclidean distance between the mixture weights is used.
• Algorithm: [algorithm figure not preserved]
[Diagram: state tying turns the new triphone HMMs (k-aa+b, k-aa+g, l-ax+m, …) into tied-state triphone HMMs.]
Data-Driven Clustering (cont.)
• Here g(i,j) is the inter-group distance between clusters i and j, defined as the maximum distance between any state in cluster i and any state in cluster j.
• Single-mixture Gaussians use: [equation not preserved]
• One limitation is that it does not deal with triphones for which there are no examples in the training data.
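The bottom-up procedure can be sketched roughly as below, using the inter-group distance g(i,j) as defined on the slide (the maximum distance between any pair of states drawn from the two clusters). Several details are assumptions made for illustration: the function names, the stopping rule (stop once the closest pair of clusters is further apart than `threshold`), and the particular variance-weighted form of the distance between single-Gaussian states.

```python
import math


def state_distance(a, b):
    """Variance-weighted Euclidean distance between two single-Gaussian
    states, each given as (mean_vector, variance_vector). The weighting
    used here is an assumed form, not taken from the slides."""
    (mu_a, var_a), (mu_b, var_b) = a, b
    total = sum((ma - mb) ** 2 / math.sqrt(va * vb)
                for ma, mb, va, vb in zip(mu_a, mu_b, var_a, var_b))
    return math.sqrt(total / len(mu_a))


def group_distance(ci, cj, states):
    """g(i,j): maximum distance between any state in ci and any in cj."""
    return max(state_distance(states[a], states[b]) for a in ci for b in cj)


def cluster_states(states, threshold):
    """Greedy bottom-up clustering of states (dict: name -> (mean, var))."""
    clusters = [[name] for name in states]      # one cluster per state
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                g = group_distance(clusters[i], clusters[j], states)
                if best is None or g < best[0]:
                    best = (g, i, j)
        g, i, j = best
        if g > threshold:                        # assumed stopping rule
            break
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


if __name__ == "__main__":
    states = {
        "k-aa+g": ([0.0], [1.0]),
        "k-aa+b": ([0.1], [1.0]),
        "s-aa+t": ([5.0], [1.0]),
    }
    print(cluster_states(states, threshold=1.0))
```

Only states that already have trained parameters can enter the clustering, which is why this approach has nothing to say about triphones that never occur in the training data.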
Tree-Based Clustering
• Top-down method ↓
• One tree is constructed for each state of each phone to cluster all of the corresponding states of all of the associated triphones.
• One of the advantages of using decision tree clustering is that it allows previously unseen triphones to be synthesised.
[Diagram: state tying turns the new triphone HMMs (k-aa+b, k-aa+g, l-ax+m, …) into tied-state triphone HMMs.]
Tree-Based Clustering (cont.)
• Steps:
  • Each set of states is pooled to form a single cluster.
  • Each question in the question set loaded by the QS commands is used to split the pool into two sets.
  • Using two sets rather than one allows the log likelihood of the training data to be increased; the question that maximizes this increase is selected for the first branch of the tree.
  • The process is then repeated until the increase in log likelihood achievable by any question at any node is less than the threshold.
  • All states in the same leaf node are then tied.
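The question-selection step can be sketched as follows, assuming each state is a single diagonal Gaussian summarised by an occupancy count, a mean vector and a variance vector. Under that assumption the log likelihood of a pooled cluster depends only on the total occupancy and the pooled variance. The names `pooled_loglik` and `best_question`, and the use of shell-style patterns to stand in for QS patterns, are illustrative assumptions rather than HTK's own code.

```python
import math
from fnmatch import fnmatchcase


def pooled_loglik(stats):
    """Approximate log likelihood of the data when all states in `stats`
    share one Gaussian. stats: list of (occupancy, mean_vec, var_vec)."""
    occ = sum(g for g, _, _ in stats)
    dim = len(stats[0][1])
    log_det = 0.0
    for d in range(dim):
        mean = sum(g * mu[d] for g, mu, _ in stats) / occ
        second = sum(g * (var[d] + mu[d] ** 2) for g, mu, var in stats) / occ
        log_det += math.log(second - mean ** 2)   # pooled variance, dim d
    return -0.5 * occ * (dim * (1.0 + math.log(2.0 * math.pi)) + log_det)


def best_question(pool, questions, threshold=0.0):
    """Pick the question with the largest log-likelihood gain.

    pool:      dict triphone name -> (occupancy, mean_vec, var_vec)
    questions: dict question name -> list of patterns such as "n-*"
    Returns (name, gain), or (None, 0.0) if no gain reaches `threshold`.
    """
    base = pooled_loglik(list(pool.values()))
    best = (None, 0.0)
    for name, patterns in questions.items():
        yes, no = [], []
        for tri, s in pool.items():
            hit = any(fnmatchcase(tri, p) for p in patterns)
            (yes if hit else no).append(s)
        if not yes or not no:
            continue                              # question does not split
        gain = pooled_loglik(yes) + pooled_loglik(no) - base
        if gain >= threshold and gain > best[1]:
            best = (name, gain)
    return best


if __name__ == "__main__":
    pool = {
        "s-aa+t": (10.0, [0.0], [1.0]),
        "t-aa+n": (10.0, [0.2], [1.0]),
        "n-aa+t": (10.0, [4.0], [1.0]),
    }
    questions = {
        "L_Nasal": ["m-*", "n-*", "ng-*"],
        "R_Nasal": ["*+m", "*+n", "*+ng"],
    }
    print(best_question(pool, questions))
```

Applying `best_question` recursively to the "yes" and "no" subsets, and stopping when it returns `None`, gives the tree whose leaves are the tied states.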
Reference: HTK HHEd (cont.)
• Tree-Based Clustering:
  • RO: outlier threshold
  • TR: trace flag in HTK
  • QS: questions
  • TB: decision tree clustering of states
  • AU: synthesise new unseen triphones
  • CO: compact a set of HMMs
  • ST: save tree
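Cluster state[4] of phone /aa/ (s-aa+t, t-aa+n, s-aa+n, etc.)
[Decision tree diagram: the root node asks "L = Class-Stop?"; lower nodes ask "R = Nasal?", "L = Nasal?" and "R = Glide?"; each node has yes (y) and no (n) branches.]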
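How such a tree assigns a tied state to any triphone, including one never seen in training, can be sketched as below. The questions are taken from the diagram, but the arrangement of nodes, the leaf names (`aa_s4_1` and so on) and the phone classes are hypothetical, since the exact layout of the tree is not recoverable from the slide.

```python
# Phone classes are assumptions chosen for illustration.
STOPS = {"p", "b", "t", "d", "k", "g"}
NASALS = {"m", "n", "ng"}
GLIDES = {"y", "w"}

# Each question inspects the left (l) or right (r) context phone.
QUESTIONS = {
    "L = Class-Stop?": lambda l, r: l in STOPS,
    "R = Nasal?": lambda l, r: r in NASALS,
    "L = Nasal?": lambda l, r: l in NASALS,
    "R = Glide?": lambda l, r: r in GLIDES,
}

# A node is (question, yes_branch, no_branch); a leaf is a tied-state name.
# This arrangement is hypothetical.
TREE = (
    "L = Class-Stop?",
    ("R = Nasal?", "aa_s4_1", "aa_s4_2"),
    ("L = Nasal?", "aa_s4_3", ("R = Glide?", "aa_s4_4", "aa_s4_5")),
)


def tied_state(triphone, tree=TREE):
    """Walk the tree for a full l-c+r triphone and return its leaf name."""
    left, rest = triphone.split("-")
    right = rest.split("+")[1]
    node = tree
    while not isinstance(node, str):
        question, yes_branch, no_branch = node
        node = yes_branch if QUESTIONS[question](left, right) else no_branch
    return node


if __name__ == "__main__":
    for tri in ["s-aa+t", "t-aa+n", "s-aa+n", "z-aa+w"]:
        print(tri, "->", tied_state(tri))
```

Because the questions only look at the context phones, a triphone such as `z-aa+w` still reaches a leaf even if it never appeared in the training data, which is the property that lets unseen triphones be synthesised.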