
Posterior Probability as a Confidence Measure



Presentation Transcript


  1. Posterior Probability as a Confidence Measure Reporter: CHEN TZAN HWEI

  2. Reference • [1] F. K. Soong and W. K. Lo, “Generalized Posterior Probability for Minimum Error Verification of Recognized Sentences,” ICASSP 2005 • [2] J. Razik, O. Mella, D. Fohr and J.-P. Haton, “Local Word Confidence Measure Using Word Graph and N-Best List,” EuroSpeech 2005

  3. Introduction (1/2) • Applications of confidence measures • A speech translation system can put more weight on reliable words. • A spoken dialogue system can decide whether to prompt the user to speak the whole utterance again, or to confirm only the uncertain part.

  4. Introduction (2/2) • Approaches proposed for measuring the confidence of speech recognition output • Feature based: assess confidence from selected features (e.g. word duration, acoustic score, etc.) • Explicit (extra) model based: employ a candidate model together with some competitor models. • Posterior probability based: estimate the posterior probability of a recognized entity.

  5. Generalized Word Posterior Probability (1/5) • In MAP-based speech recognition, the best recognized word string is the one with maximal posterior probability. • The word posterior probability (WPP) of a recognized word sums the posteriors of all sentence hypotheses containing it. Both are sketched below.
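A sketch of the standard MAP decision rule and WPP definition, following [1]; the notation (X the acoustic observation sequence, W = w_1 ... w_N a word string with word boundaries (s_n, t_n)) is assumed, not taken verbatim from the slide:

\hat{W} = \arg\max_{W} P(W \mid X) = \arg\max_{W} p(X \mid W)\, P(W)

P(w; s, t \mid X) = \frac{\sum_{W:\, \exists n,\ w_n = w,\ s_n = s,\ t_n = t}\ \prod_{n=1}^{N} p\big(x_{s_n}^{t_n} \mid w_n\big)\, P\big(w_n \mid w_1^{\,n-1}\big)}{p(X)}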

  6. Generalized Word Posterior Probability (2/5) • Three issues are addressed and the WPP is generalized: • Reduced search space • Relaxed time registration • Optimal acoustic and language model weights

  7. Generalized Word Posterior Probability (3/5) • Reduced search space: in LVCSR, the search space is always pruned to keep the search tractable. The reduced search space can also be conveniently used when computing the WPP. • Relaxed time registration: the starting and ending times of a word in LVCSR output can be affected by various factors (see the overlap sketch below).
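A minimal sketch of the relaxed time registration idea, assuming frame-index word boundaries (names are illustrative, not from [1]): instead of requiring identical boundaries, word hypotheses whose time spans merely overlap the recognized word's span are pooled together.

def spans_overlap(s1, t1, s2, t2):
    # Relaxed time registration: treat two word hypotheses as the same
    # word occurrence if their time spans share at least one frame,
    # rather than requiring identical start/end times.
    return max(s1, s2) <= min(t1, t2)

# e.g. spans_overlap(10, 42, 38, 60) -> True, even though the
# boundaries (10, 42) and (38, 60) differ.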

  8. Generalized Word Posterior Probability (4/5) • Optimal acoustic and language model weights, motivated by: • the difference in dynamic range • the difference in frequency of computation • the independence assumption • the reduced search space (weighted form shown below)
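Concretely, the two knowledge sources enter the posterior with separate exponential weights; this is the standard weighted form and the notation is assumed:

p^{\alpha}\big(x_{s_n}^{t_n} \mid w_n\big)\; P^{\beta}\big(w_n \mid w_1^{\,n-1}\big), \qquad \alpha, \beta \ge 0

Here α compresses the acoustic likelihood, whose dynamic range is large because it is computed every frame, while β weights the language model probability, which is computed only once per word.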

  9. Generalized Word Posterior Probability (5/5) • The generalized WPP (GWPP) • The generalized utterance posterior probability (GUPP), both sketched below
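A sketch of the GWPP following [1], with α and β the acoustic and language model weights; the sum runs over the pruned search space and over all word strings containing w with a time span overlapping (s, t). The exact notation is assumed:

\mathrm{GWPP}(w; s, t \mid X) = \frac{\sum_{W:\, \exists n,\ w_n = w,\ (s_n, t_n)\,\cap\,(s, t) \neq \emptyset}\ \prod_{n=1}^{N} p^{\alpha}\big(x_{s_n}^{t_n} \mid w_n\big)\, P^{\beta}\big(w_n \mid w_1^{\,n-1}\big)}{p(X)}

The GUPP applies the same generalization with the whole utterance as the unit to be verified; slide 13 alternatively scores an utterance by the product of its words' GWPPs.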

  10. Experiment (1/4) • Corpus: Chinese Basic Travel Expression Corpus • Features: 12 MFCCs, 12 delta MFCCs and delta power • The word recognition accuracy is 91%

  11. Experiment (2/4) • Evaluation measure • Baseline was obtained

  12. Experiment (3/4) Figure 1. Total error surfaces for word verification using GWPP. Figure 2. Total error surfaces for utterance verification using GUPP.

  13. Experiment (4/4) Figure 3. Total error surfaces for utterance verification using the product of GWPPs. Figure 4. CER at word and utterance levels.

  14. Local word confidence measure – unigram based measure • The first measure uses only local and simple information • This constraint is too tight, so we relax the timing constraint (see the sketch below):
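A rough illustration (not the exact measure of [2]) of a unigram-style local confidence over an N-best list with a relaxed timing constraint; the function name, the window definition and the data layout are all assumptions:

def local_unigram_confidence(word, start, end, nbest, relax=0.2):
    # Fraction of N-best hypotheses that contain `word` with boundaries
    # inside a window of +/- relax * word_length around (start, end).
    # Illustrative sketch only; see [2] for the exact definition.
    margin = relax * (end - start)
    hits = 0
    for hyp in nbest:  # each hyp is a list of (word, start, end) triples
        if any(w == word
               and abs(s - start) <= margin
               and abs(e - end) <= margin
               for (w, s, e) in hyp):
            hits += 1
    return hits / len(nbest)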

  15. Local word confidence measure – bigram based measure (1/2) • Bigram information with the previous word • Bigram information with the previous and next words (shown schematically below)
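Schematically (a paraphrase, not the exact equations of [2]), the unigram probability of the hypothesis word is replaced by bigram probabilities conditioned on its neighbours:

P(w_i \mid w_{i-1}) \qquad \text{and} \qquad P(w_i \mid w_{i-1})\, P(w_{i+1} \mid w_i)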

  16. Local word confidence measure – bigram based measure (2/2) • Using all possible previous or next words: • the candidate set may be too vast • it disturbs the confidence measure with words that will never appear in any alternative sentence • Eq. 8 and Eq. 9 therefore compute the bigram only with previous and next words belonging to the N-best list (sketched below)
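A sketch of this restriction, with hypothetical names: when averaging the bigram score over candidate predecessor words, keep only predecessors that actually occur in the N-best list.

def nbest_restricted_bigram(word, prev_candidates, bigram_prob, nbest_vocab):
    # Keep only predecessor words that appear in some N-best hypothesis,
    # so words that never occur in any alternative sentence cannot
    # disturb the measure. `bigram_prob` maps (prev, word) -> P(word|prev);
    # all names here are illustrative, not from [2].
    preds = [p for p in prev_candidates if p in nbest_vocab]
    if not preds:
        return 0.0
    return sum(bigram_prob.get((p, word), 0.0) for p in preds) / len(preds)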

  17. Experiment (1/4) • Corpus: French broadcast news • 56 minutes (10,651 words) are used to tune the parameters • 53 minutes (10,841 words) for the test • The average recognition rate for both corpora is 70.9% • Evaluation metric: equal error rate (EER; sketched below)
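The equal error rate is the threshold setting at which the false-acceptance rate (misrecognized words accepted) equals the false-rejection rate (correct words rejected); a minimal sketch, assuming binary correctness labels (1 = correctly recognized, 0 = error):

def equal_error_rate(scores, labels):
    # Sweep a threshold over the confidence scores and report the point
    # where false acceptance (label 0 scored >= threshold) and false
    # rejection (label 1 scored < threshold) are closest; return their mean.
    pos = sum(labels)
    neg = len(labels) - pos
    best_gap, eer = float("inf"), 1.0
    for thr in sorted(set(scores)):
        fa = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 0) / neg
        fr = sum(1 for s, l in zip(scores, labels) if s < thr and l == 1) / pos
        if abs(fa - fr) < best_gap:
            best_gap, eer = abs(fa - fr), (fa + fr) / 2
    return eer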

  18. Experiment (2/4) • Relaxation rate: for a hypothesis word, we take into account words whose beginning time, ending time and length fall within a window around those of the hypothesis word, scaled by the relaxation rate. Table 1: Results using unigram score

  19. Experiment (3/4) Table 2: Evolution of score with different scale rates (unigram measure, relaxation rate = 0.2). Table 3: Results using previous word bigram

  20. Experiment (4/4) Table 4: Results using previous and next word bigram Table 5: Results using previous word bigram with N-Best list

  21. Summary • The local word CM can be computed directly during first-pass decoding. • Could it therefore be used for CM-based pruning?
