A Dynamic In-Search Data Selection Method With Its Application to Acoustic Modeling and Utterance Verification (about training)
Dynamic data selection in search (1/4) • In continuous speech recognition, it becomes much more difficult to define the competing-token (CT) or true-token (TT) set because the unit boundaries are unknown. • As a result, any possible segmentation of an utterance could potentially become a competing token. However, an exhaustive search over all segmentations is too expensive to be affordable.
Dynamic data selection in search (2/4) • In this paper, every utterance is recognized with the Viterbi beam search algorithm. • All partial paths that survive the beam search have relatively large likelihood values and therefore potentially compete with the true path, as sketched below.
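A minimal sketch of this selection idea, assuming a simple hypothesis structure; `PartialPath`, `select_competing_tokens`, and `beam_width` are hypothetical names, not the paper's implementation:

```python
from dataclasses import dataclass

@dataclass
class PartialPath:
    """A hypothesis surviving the Viterbi beam search (hypothetical structure)."""
    words: list            # word/phone sequence hypothesized so far
    log_likelihood: float  # accumulated Viterbi log-likelihood

def select_competing_tokens(partial_paths, true_path, beam_width=50.0):
    """Collect beam-surviving partial paths that differ from the true path.

    Paths inside the beam have likelihoods close to the best one, so they
    naturally serve as competing tokens for discriminative training.
    """
    best = max(p.log_likelihood for p in partial_paths)
    return [
        p for p in partial_paths
        if best - p.log_likelihood <= beam_width   # survives beam pruning
        and p.words != true_path.words             # not the true token
    ]
```

Because only paths inside the beam are kept, the competing tokens are gathered as a by-product of recognition rather than by an exhaustive search over all segmentations.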
Dynamic data selection in search (3/4) • [Figure: reference phone segmentation alignment]
Dynamic data selection in search (4/4) • [Figure: true tokens and competing tokens identified during the search]
Application I : acoustic modeling (1/20) • When the true tokens are given => MAP • MAP : training on the observation sequences with prior information
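For reference, the standard MAP estimate of a Gaussian mean combines the prior mean with the observed statistics of the true tokens; the symbols below follow the usual convention and are not taken from the slides:

```latex
\hat{\mu} \;=\; \frac{\tau\,\mu_{0} + \sum_{t}\gamma_{t}\,o_{t}}{\tau + \sum_{t}\gamma_{t}}
```

where \mu_0 is the prior mean, \tau the prior weight, o_t the observation at frame t, and \gamma_t the Gaussian occupation probability over the frames of the true token.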
Application I : acoustic modeling (8/20) • [Equations (1)–(3) shown on the slide]
Application I : acoustic modeling (11/20) • When the true tokens are given => MCE • Imposter word : a hypothesized word that is wrong, yet whose likelihood exceeds the likelihood given its reference model. • If we can minimize the total number of imposter words, we can reduce the WER.
Application I : acoustic modeling (12/20) MCE • The misclassification distance measure for a wrong word W is defined as shown below. • A general form of the "smoothed" count of imposter words is then defined from it (see the sketch below).
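The formulas on this slide are not recoverable from the page; a sketch of the standard MCE formulation, which matches the imposter-word definition above, is:

```latex
d_W(O;\Lambda) \;=\; g(O;\lambda_W) \;-\; g(O;\lambda_{\mathrm{ref}}),
\qquad
\ell\!\left(d_W(O;\Lambda)\right) \;=\; \frac{1}{1 + e^{-\gamma\, d_W(O;\Lambda)}}
```

Here g(O;\lambda) is the log-likelihood score, \lambda_W the model of the hypothesized (wrong) word, \lambda_{\mathrm{ref}} its reference model, and \gamma a smoothing constant; summing \ell over the competing tokens gives the smoothed imposter-word count. Whether the paper uses exactly this form is an assumption.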
Application I : acoustic modeling (13/20) MCE • The so-called GPD algorithm is adopted to minimize the "smoothed" count of imposter words (see the update sketch below).
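Assuming the usual generalized probabilistic descent (GPD) update, the model parameters are adjusted token by token along the negative gradient of the smoothed count; the step-size schedule \epsilon_n is my notation, not the slide's:

```latex
\Lambda_{n+1} \;=\; \Lambda_{n} \;-\; \epsilon_{n}\,
\nabla_{\Lambda}\,\ell\!\left(d_W(O_{n};\Lambda)\right)\Big|_{\Lambda=\Lambda_{n}}
```

with \epsilon_n decreasing over iterations so that the updates converge.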
Application I : acoustic modeling (16/20) • [Figure: reference model]
Application I : acoustic modeling (17/20) • [Figure: reference model]