
Duration modeling for speech recognition

Duration modeling for speech recognition. Presented for BBN by Dr. Andrey Nikiforov, Department of Applied Mathematics and Statistics, State University of New York at Stony Brook. Additional topics: computational and modeling issues in improving the performance of speech recognition algorithms.


Presentation Transcript


  1. Duration modeling for speech recognition. Presented for BBN. Dr. Andrey Nikiforov, Department of Applied Mathematics and Statistics, State University of New York at Stony Brook

  2. Additional topics: computational and modeling issues in improving the performance of speech recognition algorithms • Partial classification techniques • Tree-dependence covariance models in HMM • Fast search and computations for codebooks • Interpolation for acoustic space

  3. State duration in HMM

  4. Duration distributions

  5. From …

  6. … to

  7. Progressive model

  8. Time calculation (diagram labels: A, B, t, t+1)

  9. Time calculation (continued) (diagram labels: A, B, t, t+1)

  10. Probability calculations: from …

  11. …to

  12. Hazard function

  13. Hazard function estimation

  14. “Nonparametric estimate”

  15. “Trajectories”

  16. State duration correction (Fant et al., 1991)

  17. Word duration

  18. State duration correction

  19. State duration correction (continued)

  20. Conclusions • Representing the duration distribution via the hazard function is simple, effective, and convenient to program • Speech recognition errors dropped by 20-25% across different tasks • Time spent purely in Viterbi search or full probability calculation increased on average by 20% compared to the conventional HMM (almost completely compensated by the reduction in computation due to more adequate modeling)

  21. Partial classification techniques for speech recognition • Helps to create structure in speech HMMs • Useful in codebook(s) estimation • Initial estimates for HMMs and codebooks • More accurate estimates
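
The hazard-function representation of state duration named in slides 12 to 14 and in the conclusions can be illustrated with a minimal sketch. The transcript does not preserve the slides' formulas, so this assumes the standard discrete-time definition, h(d) = P(D = d | D >= d), and the usual count-based nonparametric estimate; the function names and the sample durations are hypothetical and are not taken from the presentation.

```python
from collections import Counter


def hazard_from_durations(durations):
    """Count-based hazard estimate (assumed standard form, not the slides' own).

    h(d) = (samples with duration exactly d) / (samples with duration >= d).
    `durations` must be a non-empty list of positive integers (frames).
    """
    counts = Counter(durations)
    at_risk = len(durations)  # samples that have lasted at least d frames
    hazard = {}
    for d in range(1, max(durations) + 1):
        n_d = counts.get(d, 0)
        hazard[d] = n_d / at_risk
        at_risk -= n_d
    return hazard


def pmf_from_hazard(hazard):
    """Recover the duration distribution: p(d) = h(d) * prod_{k<d} (1 - h(k))."""
    survival = 1.0  # probability of still being in the state before frame d
    pmf = {}
    for d in sorted(hazard):
        pmf[d] = hazard[d] * survival
        survival *= 1.0 - hazard[d]
    return pmf


# Hypothetical state durations, in frames, for one HMM state.
durations = [1, 2, 2, 3, 3, 3, 4]
h = hazard_from_durations(durations)
p = pmf_from_hazard(h)
```

Under this reading, h(d) acts as a duration-dependent exit probability: a state that has already lasted d frames leaves with probability h(d) and stays with probability 1 - h(d), which is one reason the representation is convenient to program.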
