Proposal: Continue exploring factored tandem models and prosody
Arthur Kantor
Factored tandem observations
[Figure: two observation models side by side. "tandem": phoneState, with PLPs and the log MLP outputs, concatenated + KLT. "factored tandem": phoneState, with dg1, pl1, rd, . . . (log outputs of separate MLPs) and PLPs.]
• Goals:
  • Find appropriate weights for dg1, pl1, rd, PLP … to optimize word error rate
  • Find some clustering to optimize word error rate
Weight tuning in detail
• Explore the interaction between observation stream weights and the language model
  • As the stream weights increase, the language model weight should also be increased
  • The increases are not proportional
• As the observation becomes more factored, there are more mixture weights, but there is less ability to represent correlation
  • Separate the effects of factoring the observations from increasing the number of weight parameters
  • Can be tested by keeping the mixture weights constant in all the factors
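The interaction between stream weights and the language model weight can be illustrated with a small sketch. It assumes the common log-linear scoring scheme, in which each stream's log-likelihood is scaled by its weight and the language model log-probability is scaled by its own weight; the slides do not spell out the scoring formula, so the function name `combined_log_score` and all numbers are hypothetical.

```python
def combined_log_score(stream_log_likes, stream_weights, lm_log_prob, lm_weight):
    """Assumed log-linear score:
    sum_i w_i * log p(o_i | state) + lm_weight * log P(words)."""
    if stream_log_likes.keys() != stream_weights.keys():
        raise ValueError("every stream needs exactly one weight")
    acoustic = sum(stream_weights[name] * stream_log_likes[name]
                   for name in stream_log_likes)
    return acoustic + lm_weight * lm_log_prob


# Hypothetical per-frame log-likelihoods for the four streams on the slide.
log_likes = {"dg1": -2.0, "pl1": -1.0, "rd": -3.0, "PLP": -4.0}
unit_weights = {name: 1.0 for name in log_likes}
half_weights = {name: 0.5 for name in log_likes}

score_unit = combined_log_score(log_likes, unit_weights, lm_log_prob=-5.0, lm_weight=2.0)
score_half = combined_log_score(log_likes, half_weights, lm_log_prob=-5.0, lm_weight=2.0)
```

Under this assumption, changing the stream weights while holding the language model weight fixed shifts the balance between the acoustic and language model terms, which is why the two would be tuned together.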
Observation factoring in detail
[Figure: three observation models ordered from "unfactored" to "fully factored". Unfactored: phoneState, with PLPs and the log MLP outputs, concatenated + KLT. "semi factored": phoneState, with dg1+pl1, rd, . . . and PLPs. Fully factored: phoneState, with dg1, pl1, rd, . . . (log outputs of separate MLPs) and PLPs.]
• There is a range of partially factored observation models
• How to cluster?
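To get a feel for the size of that range, the candidate clusterings can be enumerated as set partitions of the MLP output streams. This is only a sketch: it treats the three streams named on the slide (dg1, pl1, rd) as the full set and leaves the PLPs aside, both of which are simplifying assumptions, and the helper `partitions` is hypothetical.

```python
def partitions(items):
    """Yield every way to split `items` into non-empty groups (set partitions)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing group in turn...
        for i, group in enumerate(smaller):
            yield smaller[:i] + [[first] + group] + smaller[i + 1:]
        # ...or give it a group of its own.
        yield [[first]] + smaller


streams = ["dg1", "pl1", "rd"]  # assumption: only the streams named on the slide
factorings = list(partitions(streams))

for factoring in factorings:
    print(" | ".join("+".join(group) for group in factoring))
```

The single-group partition corresponds to the unfactored model, the all-singletons partition to the fully factored one, and a grouping such as dg1+pl1 | rd to the semi-factored case. The number of partitions grows quickly with the number of streams, so exhaustive search would only be practical for a handful of them.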
Add a PROSODY feature to feature-based models
• PROSODY feature takes on 4 values:
  • onset
  • reduced nucleus
  • regular nucleus
  • coda
• Prosody combined with Lips, Tongue, and Glottis uniquely specifies all of the phones used in our phone-based models
  • This allows for a fairer comparison with the phone-based features
• Goals:
  • Repeat the workshop experiments with the added PROSODY feature
  • Explore higher-level prosodic structure, such as phrasal stress and prosodic phrase boundaries
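The uniqueness claim can be checked mechanically once a phone-to-feature table is available. The sketch below shows one way to do that; the `Prosody` values come from the slide, but the helper `ambiguous_phones`, the phone names, and the Lips/Tongue/Glottis values are placeholders invented for illustration, not the actual feature set.

```python
from enum import Enum


class Prosody(Enum):
    ONSET = "onset"
    REDUCED_NUCLEUS = "reduced nucleus"
    REGULAR_NUCLEUS = "regular nucleus"
    CODA = "coda"


def ambiguous_phones(phone_to_features):
    """Return groups of phones that share an identical feature tuple."""
    by_features = {}
    for phone, features in phone_to_features.items():
        by_features.setdefault(features, []).append(phone)
    return [sorted(group) for group in by_features.values() if len(group) > 1]


# Placeholder table: (lips, tongue, glottis) values are hypothetical labels.
without_prosody = {
    "phoneA": ("lips0", "tongue0", "glottis0"),
    "phoneB": ("lips0", "tongue0", "glottis0"),
    "phoneC": ("lips1", "tongue0", "glottis0"),
}
# The same table with a PROSODY value appended to each tuple.
with_prosody = {
    "phoneA": ("lips0", "tongue0", "glottis0", Prosody.ONSET),
    "phoneB": ("lips0", "tongue0", "glottis0", Prosody.CODA),
    "phoneC": ("lips1", "tongue0", "glottis0", Prosody.REGULAR_NUCLEUS),
}

clashes_before = ambiguous_phones(without_prosody)
clashes_after = ambiguous_phones(with_prosody)
```

In this toy table two phones collide until the PROSODY value is appended; running the same check on the real table would confirm or refute the claim for the full phone set.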
Proposal
• Continue to explore
  • Tandem observation factoring
  • Feature substitution in the pronunciation model
  • Prosody
Feature substitution in the pronunciation model
• Feature-based pronunciation modeling is promising
• Goal: Make use of this in a speech recognizer
Asynchrony between underlying (dictionary) feature values
[Figure: diagram over the variables word, ind1, ind2, ind3, U1, U2, U3, sync1,2, sync2,3, and Obs; each variable appears twice.]
Asynchrony with feature substitution
[Figure: the same diagram with added variables S1, S2, S3; variables shown are word, ind1, ind2, ind3, U1, U2, U3, S1, S2, S3, sync1,2, sync2,3, and Obs, each appearing twice.]