Confidence Measures for Automatic Speech Recognition National Taiwan Normal University Spoken Language Processing Lab Advisors: Hsin-Min Wang, Berlin Chen Presented by Tzan-Hwei Chen
Outline • Introduction • The categories of estimation methods for the confidence measure (CM) • Feature based • Posterior probability based • Explicit model based • Incorporation of high-level information for CM* • The application of CM to improve speech recognition • Summary
Introduction (1/9) • It is extremely important to be able to make appropriate and reliable judgements based on error-prone ASR results. • Researchers have proposed computing a score (preferably between 0 and 1), called a confidence measure (CM), to indicate the reliability of any recognition decision made by an ASR system.
Introduction (2/9) • Some applications of CM [Figure: ASR pipeline with CM-based verification: speech signal → feature extraction → feature vector → decoding (using the lexicon, acoustic model and language model) → recognized word sequence → verification with the confidence measure; e.g., hypotheses 1. 臺北到魚籃 2. 臺北到宜蘭]
Introduction (3/9) • Early research on CM can be traced back to rejection in word-spotting systems. • Other early CM-related work lies in the automatic detection of new words in LVCSR. • Over the past few years, CM has been applied to more and more research areas, e.g., • To improve speech recognition • Look-ahead algorithms in LVCSR • To guide the system in performing unsupervised learning • …
Introduction (4/9) • The general procedure of CM for verification: a confidence score is estimated for each recognized unit and compared against a predefined threshold; units whose confidence exceeds the threshold are accepted, and the rest are rejected.
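The accept/reject step above can be sketched as follows (the threshold value is illustrative, not taken from the slides):

```python
def verify(confidences, threshold=0.5):
    """Accept each recognized unit whose confidence exceeds a predefined
    threshold; reject the rest.  The threshold here is illustrative."""
    return ["accept" if c > threshold else "reject" for c in confidences]

# Three recognized words with confidence scores in [0, 1].
decisions = verify([0.92, 0.31, 0.55])
```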
Introduction (5/9) • Four situations when judging a hypothesis • Correct hypothesis (e.g., hyp 宜蘭 = ref 宜蘭) accepted → correct acceptance • Incorrect hypothesis (e.g., hyp 魚籃 vs. ref 宜蘭) rejected → correct rejection • Correct hypothesis rejected → false rejection • Incorrect hypothesis accepted → false acceptance
Introduction (6/9) • The evaluation metric: • Confidence error rate (CER): the number of false acceptances (FA) plus false rejections (FR), divided by the number of hypothesized words • Example: ref 有 三名 候選人 通過 審查; hyp 三民 候選人 通過 審查 了; judgments FA CA FR CA FA
Introduction (7/9) • The evaluation metric: • Confidence error rate (cont): ref 有 三名 候選人 通過 審查; hyp 三民 候選人 通過 審查 了; judgments FA CA CA CA FA
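A minimal sketch of the confidence error rate over judgment labels like those in the examples (assuming the usual definition: confidence-decision errors over all judged words):

```python
def confidence_error_rate(judgments):
    """Fraction of confidence decisions that are wrong: false
    acceptances (FA) plus false rejections (FR) over all judged words.
    `judgments` contains labels from {"CA", "CR", "FA", "FR"}."""
    errors = sum(1 for j in judgments if j in ("FA", "FR"))
    return errors / len(judgments)

# The first example (FA CA FR CA FA) gives a CER of 3/5.
cer = confidence_error_rate(["FA", "CA", "FR", "CA", "FA"])
```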
Introduction (8/9) • The evaluation metric (cont): • Receiver operating characteristic (ROC) curve: a plot of the false acceptance rate against the detection rate.
Introduction (9/9) • All methods proposed for computing CMs can be roughly classified into three major categories [7]: • Feature based • Posterior probability based • Explicit model based (utterance verification, UV) • Incorporation of high-level information for CM*
Feature-based confidence measure (1/8) • The features can be collected during the decoding procedure and may include acoustic, language and syntactic information • Any feature can be called a predictor if its p.d.f. over correctly recognized words is clearly distinct from that over misrecognized words [Figure: feature distributions for misrecognized vs. correctly recognized words]
Feature-based confidence measure (2/8) • Some common predictor features • Pure normalized likelihood score related: acoustic score per frame • N-best related: count in the N-best list, N-best homogeneity score • Duration related: word duration divided by its number of phones
Feature-based confidence measure (3/8) • Some common predictor features (cont) • Hypothesis density: the number of competing word hypotheses in the word graph that span a given time frame [Figure: word graph with competing hypotheses such as 三名, 候選人, 沒有, 通過, 審查]
Feature-based confidence measure (4/8) • Some common predictor features (cont) • Acoustic stability: the utterance is rescored with varied language model weights, and a word's stability is the fraction of the resulting hypothesized word sequences that contain it [Figure: hypothesized word sequences 今天 天氣 很好 / 今天 天氣 不佳 / 今天 天氣]
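Acoustic stability can be sketched as follows (a simplified version that uses set membership instead of a proper word alignment between the rescored outputs):

```python
def acoustic_stability(word, rescored_outputs):
    """Fraction of the re-decoded hypothesized word sequences (obtained
    with different language model weights) that contain `word`.
    A real implementation would align the sequences instead of using
    simple membership."""
    hits = sum(1 for output in rescored_outputs if word in output)
    return hits / len(rescored_outputs)

outputs = [["今天", "天氣", "很好"],
           ["今天", "天氣", "不佳"],
           ["今天", "天氣"]]
stability = acoustic_stability("天氣", outputs)  # appears in all three
```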
Feature-based confidence measure (6/8) • We can combine the above features with any one of the following classifiers • Linear discriminant function • Generalized linear model • Neural networks • Decision tree • Support vector machine • Boosting • Naïve Bayes classifier
Feature-based confidence measure (7/8) • Naïve Bayes Classifier [3]
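The slide's equation from [3] is not preserved; as a sketch, a standard naïve Bayes formulation combines the predictor features under an independence assumption (the feature names and probability tables below are invented for illustration):

```python
import math

def naive_bayes_confidence(features, prior_correct, likelihoods):
    """Posterior probability that a recognized word is correct given
    independent predictor features.  `likelihoods[i]` maps the i-th
    feature value to (P(value | correct), P(value | incorrect)).
    Computed in the log domain for numerical stability."""
    log_c = math.log(prior_correct)
    log_i = math.log(1.0 - prior_correct)
    for value, table in zip(features, likelihoods):
        p_c, p_i = table[value]
        log_c += math.log(p_c)
        log_i += math.log(p_i)
    m = max(log_c, log_i)
    num = math.exp(log_c - m)
    return num / (num + math.exp(log_i - m))
```

With a prior of 0.5, this reduces to the familiar product-of-likelihood-ratios form.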
Feature-based confidence measure (8/8) • Experiments [3] • Corpus : an Italian speech corpus of phone calls to the front desk of a hotel
Posterior probability based confidence measure (1/11) • Posterior probability of a word sequence: impossible to estimate in a precise manner • Hence some approximation methods are adopted
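The slide's equation is not preserved; in standard notation, the posterior of a word sequence W given the acoustics X is, by Bayes' rule:

```latex
P(W \mid X) = \frac{p(X \mid W)\, P(W)}{\sum_{W'} p(X \mid W')\, P(W')}
```

The denominator sums over all possible word sequences, which is why the posterior cannot be computed exactly and must be approximated, e.g., over an N-best list or a word graph.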
Posterior probability based confidence measure (2/11) • Word graph based approximation [Figure: word graph with competing word hypotheses such as 三名, 候選人, 沒有, 通過, 建國, 靜音 (silence)]
Posterior probability based confidence measure (3/11) • Posterior probability of a word arc : • Some issues are addressed and the word posterior probability is generalized • Reduced search space • Relaxed time registration • Optimal acoustic and language model weights
Posterior probability based confidence measure (4/11)-(7/11) • Posterior probability of a word arc [6]: [Figure, animated across four slides: a word graph of competing hypotheses (三名, 候選人, 沒有, 通過, 建國, 靜音, …), showing the paths through a given word arc being accumulated]
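A sketch of the word-graph approximation: the posterior of a word arc is the probability mass of all paths through that arc, computed with a forward/backward pass over the lattice. The representation below is simplified (probability-domain weights, integer node ids in topological order); real systems work in the log domain with separate acoustic and language model scores.

```python
from collections import defaultdict

def arc_posteriors(arcs, start, end):
    """Forward/backward over a word graph.  Each arc is
    (from_node, to_node, word, weight), with the weight a combined
    acoustic/language score in the probability domain.  Node ids are
    assumed to be topologically ordered integers."""
    fwd = defaultdict(float)
    bwd = defaultdict(float)
    fwd[start] = 1.0
    bwd[end] = 1.0
    nodes = sorted({n for a in arcs for n in (a[0], a[1])})
    for n in nodes:                       # forward pass
        for u, v, _, wt in arcs:
            if u == n:
                fwd[v] += fwd[u] * wt
    for n in reversed(nodes):             # backward pass
        for u, v, _, wt in arcs:
            if v == n:
                bwd[u] += wt * bwd[v]
    total = fwd[end]                      # total path mass through the graph
    return {(u, v, w): fwd[u] * wt * bwd[v] / total
            for u, v, w, wt in arcs}
```

For example, two competing arcs from node 0 to node 1 with weights 0.6 and 0.4 receive posteriors 0.6 and 0.4, while an arc every path must traverse receives posterior 1.0.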
Posterior probability based confidence measure (8/11) • The drawbacks of the above methods – all need an additional pass. • In [8], the “local word confidence measure” is proposed 今天 今天 今天 今天
Posterior probability based confidence measure (8/11) • Local word confidence measure (cont) [Figure: a forward/backward bigram is applied around the word under consideration]
Posterior probability based confidence measure (9/11) • Impact of word graph density on the quality of the posterior probability [9] [Table: results; baseline values 27.3 and 15.4]
Posterior probability based confidence measure (10/11) • Experiments [6]
Explicit model based confidence measure (1/10) • The CM problem is formulated as a statistical hypothesis testing problem. • Under the framework of binary hypothesis testing, there are two complementary hypotheses: the null hypothesis (the recognized unit is correct) and the alternative hypothesis (it is not). • We test the null hypothesis against the alternative.
Explicit model based confidence measure (2/10) • The above LRT score can be transformed into a CM by a monotonic one-to-one mapping function. • The major difficulty with the LRT is how to model the alternative hypothesis. • In practice, the same HMM structure is adopted to model the alternative hypothesis. • A discriminative training procedure plays a crucial role in improving modeling performance.
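A minimal sketch of this likelihood-ratio-test CM (the calibration constants `tau` and `alpha` are illustrative, not values from the slides):

```python
import math

def lrt_confidence(loglik_target, loglik_anti, n_frames, tau=0.0, alpha=1.0):
    """Frame-normalized log-likelihood ratio between the target HMM and
    the alternative (anti) model, mapped into [0, 1] with a monotonic
    one-to-one sigmoid function."""
    llr = (loglik_target - loglik_anti) / n_frames
    return 1.0 / (1.0 + math.exp(-alpha * (llr - tau)))
```

A score of 0.5 means both models explain the segment equally well; higher values favor the target model, i.e., the null hypothesis that the word is correct.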
Explicit model based confidence measure (3/10) • Two-pass procedure: [Figure: a recognition pass producing 今天 天氣 很好, followed by a separate verification pass]
Explicit model based confidence measure (4/10) • One-pass procedure: [Figure: verification integrated into the decoding of 今天 天氣 很好]
Explicit model based confidence measure (5/10) • How to calculate the confidence of a recognized word?
Explicit model based confidence measure (6/10) • How to calculate the confidence of a recognized word (cont)?
Explicit model based confidence measure (7/10) • Discriminative training [10] • The goal of the training procedure is to increase the average LRT score for correct hypotheses and to decrease it for false acceptances.
Explicit model based confidence measure (8/10) • Discriminative training (cont)
Explicit model based confidence measure (9/10) • Why does discriminative training work?
Explicit model based confidence measure (10/10) • Experiments [10] • This task is referred to as the "movie locator".
Incorporation of high-level information for CM (1/4) • LSA (latent semantic analysis) • The key property of LSA is that words whose vectors are close to each other are semantically similar. • These similarities can be used to estimate the likelihood of the words co-occurring within the same utterance.
Incorporation of high-level information for CM (2/4) • LSA (cont) • The entry of matrix : • The confidence of a recognized word :
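The slide's formulas are not preserved; as a sketch of one plausible reading, a recognized word can be scored by its average LSA-space similarity to the other words in the utterance (the vectors below are invented for illustration):

```python
import math

def cosine(u, v):
    """Cosine similarity between two LSA word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def lsa_confidence(word, utterance, vectors):
    """Average similarity of `word` to the other words in the utterance:
    semantically coherent words score high, outliers score low."""
    others = [w for w in utterance if w != word and w in vectors]
    if not others:
        return 0.0
    return sum(cosine(vectors[word], vectors[w]) for w in others) / len(others)
```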
Incorporation of high-level information for CM (3/4) • Inter-word mutual information :
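The slide's formula is not preserved; pointwise mutual information between two words is commonly estimated from corpus counts as follows:

```python
import math

def mutual_information(count_xy, count_x, count_y, n_pairs, n_words):
    """Pointwise mutual information MI(x, y) = log(P(x, y) / (P(x) P(y))),
    with probabilities estimated from co-occurrence and unigram counts."""
    p_xy = count_xy / n_pairs
    p_x = count_x / n_words
    p_y = count_y / n_words
    return math.log(p_xy / (p_x * p_y))
```

MI near zero means the words are roughly independent; a large positive value means they tend to co-occur, which raises the confidence of both words in the utterance.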
Incorporation of high-level information for CM (4/4) • Experiments [14]
The application of CM to improve speech recognition (1/10) • Statistical decision theory aims at minimizing the expected cost of making an error [Figure: word graph of competing hypotheses (三名, 候選人, 沒有, 通過, 建國, 靜音, …)]
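In standard notation (a reconstruction; the slide's equation is not preserved), the minimum-expected-error decision rule chooses:

```latex
W^{*} = \arg\min_{W} \sum_{W'} L(W, W')\, P(W' \mid X)
```

With a 0-1 sequence-level loss this reduces to ordinary MAP decoding; word-level losses lead to posterior- and confidence-based decoding rules.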
The application of CM to improve speech recognition (2/10) • Method 1 [16]:
The application of CM to improve speech recognition (3/10) • Method 2 [18] :