1 / 16

Discriminatively Trained Region Dependent Feature Transforms for Speech Recognition

Discriminatively Trained Region Dependent Feature Transforms for Speech Recognition. Bing Zhang, Spyros Matsoukas, Richard Schwartz BBN Technologies ICASSP 2006. Presented by Shih-Hung Liu 2006/06/05. Outline. Introduction MPE-HLDA and fMPE reviewed Region dependent transform Corpus

vic
Download Presentation

Discriminatively Trained Region Dependent Feature Transforms for Speech Recognition

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Discriminatively Trained Region Dependent Feature Transforms for Speech Recognition Bing Zhang, Spyros Matsoukas, Richard Schwartz BBN Technologies ICASSP 2006 Presented by Shih-Hung Liu 2006/06/05

  2. Outline • Introduction • MPE-HLDA and fMPE reviewed • Region dependent transform • Corpus • Experimental results • Conclusions

  3. Introduction • Discriminative feature optimization has been a research focus in the last few years • Various works have been published, which share the idea of training a feature transform under discriminative criteria such as MPE, MMI, and MCE ,(MBRDT) • By using discriminative criteria, the feature optimization is better correlated with the reduction of recognition errors, hence offers better accuracy than standard feature transforms like LDA+MLLT or HLDA

  4. Introduction • In MPE-HLDA, a global linear projection is optimized, selecting compact features from concatenated cepstral coefficients across several frames (long span features) • Being a linear projection, it offers moderate gains over LDA+MLLT but is quite limited

  5. Introduction • Feature space MPE (fMPE) is a nonlinear transform also trained using MPE criterion • A global Gaussian mixture model (GMM) is trained, and the posterior vectors of those Gaussians are projected into a lower dimensional space, in order to correct some predefined features • The idea of using Gaussian posteriors in fMPE is not new. A similar algorithm called SPLICE has been used previously for noise compensation

  6. Introduction • In this work, we combine the idea of piece-wise transforms, as in fMPE and SPLICE, together with the idea of using long span feature projections, as in MPE-HLDA, forming a kind of more general transformation, which is referred to as region dependent transform (RDT) • In RDT, either a linear or a nonlinear feature transform can be used in each region • As a special case of RDT, a linear projection of long span features is used for each region. It is referred to as region dependent linear transform (RDLT)

  7. MPE-HLDA Reviewed • Both MPE-HLDA and fMPE aims at optimizing the objective function of MPE. • The feature transform in MPE-HLDA is a global linear projection: where is long span feature obtained by concatenating several frames of cepstral features from to

  8. fMPE Reviewed • In fMPE, the feature transform is where is a projection matrix, and is an N-dimensional vector of Gaussian posterior probabilities, computed using a GMM trained in the same space of , is some low dimensional feature vector Typically is formed by linearly projecting

  9. fMPE Reviewed • fMPE feature transform can be rewritten as where denotes the ith column of M, and is theith element of • fMPE can be viewed as weighted sum of some feature vectors, each obtained by shifting by a constant term • Since the posterior tells which Gaussian the frame is close to, fMPE can be viewed as a region dependent feature correction function. Here the division of regions in the acoustic space is performed by the GMM

  10. Region Dependent Transform • It seems at a first glance that two very different function are used in MPE-HLDA and fMPE • We rewrite MPE-HLDA in a similar form as fMPE • From above, it is natural to come up with a generalized form of region dependent linear transform (RDLT) in which both linear projection and bias are specific to region (or equivalently to Gaussian )

  11. Region Dependent Transform • Conceptually, can be further generalized by removing the linear constraint, which leads to the general region dependent transform as • In this paper, we will only focus on the regionally linear transform where is a vector-to-vector mapping function whose parameters depend on region

  12. Region Dependent Transform • As variations of RDLT, one can think of using parameter tying for , since it is possible that fewer distinctive projections are really needed than biases • The tying could be implemented by clustering the Gaussians in the GMM into fewer groups, as we normally do for fast Gaussian computation • Within each group, parameters of the projections could be shared. The parameter tied RDLT (tRDLT) can be express as

  13. Corpus • Training corpus • 2300hrs EARS R04 CTS • 370hrs Switchboard and Callhome data • 1970hrs Fisher data • Testing data • RT03 evaluation set (Eval03) • 3hrs Switchboard-II and 3hrs Fisher data • RT04 development set (Dev04) • 3hrs Fisher data

  14. Experimental Results • Projections vs. Offsets MPE-HLDA fMPE without context expansion

  15. Experimental Results

  16. Conclusions • We have introduced the concept of region dependent transform by extending MPE-HLDA with the idea of dividing the acoustic space into regions and having a feature transform for each region • Showing that both fMPE and MPE-HLDA are special cases of the region dependent linear transform

More Related