Proposal: Continue exploring factored tandem models and prosody


Presentation Transcript


  1. Proposal: Continue exploring factored tandem models and prosody
     Arthur Kantor

  2. Factored tandem observations
     [Figure: two observation models, each conditioned on phoneState. Tandem: PLPs plus log MLP outputs, concatenated + KLT. Factored tandem: separate streams dg1, pl1, rd, . . ., PLPs, where the MLP streams are log outputs of separate MLPs.]
     • Goals:
       • Find appropriate weights for dg1, pl1, rd, PLP … to optimize word error rate
       • Find some clustering to optimize word error rate

  3. Weight tuning in detail
     • Explore the interaction between observation stream weights and the language model
     • As the stream weights increase, the language model weight should also be increased
     • The increases are not proportional
     • As the observation becomes more factored, there are more mixture weights, but there is less ability to represent correlation
     • Separate the effects of factoring the observations from increasing the number of weight parameters
     • Can be tested by keeping the mixture weights constant in all the factors
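The coupling between stream weights and the language model weight can be illustrated with a small sketch. It assumes the usual log-linear combination, in which each stream's log-likelihood is scaled by its weight and the language model log-probability by its own weight; the slides do not spell out the scoring rule, so this form is an assumption. The function names and all numbers below are hypothetical and chosen only for illustration.

```python
def combined_log_score(stream_loglikes, stream_weights, lm_logprob, lm_weight):
    """Log-linear score: sum_i w_i * log p(o_i | state) + lm_weight * log P(words)."""
    if stream_loglikes.keys() != stream_weights.keys():
        raise ValueError("every stream needs exactly one weight")
    acoustic = sum(stream_weights[name] * ll for name, ll in stream_loglikes.items())
    return acoustic + lm_weight * lm_logprob


def best_hypothesis(hyps, stream_weights, lm_weight):
    """Return the word string of the highest-scoring hypothesis."""
    def score(h):
        return combined_log_score(h["streams"], stream_weights, h["lm"], lm_weight)
    return max(hyps, key=score)["words"]


# Made-up log scores for two competing hypotheses (not experimental data).
hyps = [
    {"words": "A", "streams": {"dg1": -10, "pl1": -12, "rd": -11, "plp": -30}, "lm": -8},
    {"words": "B", "streams": {"dg1": -9, "pl1": -11, "rd": -10, "plp": -29}, "lm": -14},
]

ones = {name: 1.0 for name in ("dg1", "pl1", "rd", "plp")}
twos = {name: 2.0 for name in ones}

print(best_hypothesis(hyps, ones, lm_weight=1.0))  # language model decides
print(best_hypothesis(hyps, twos, lm_weight=1.0))  # acoustics now dominate
print(best_hypothesis(hyps, twos, lm_weight=2.0))  # raising the LM weight restores the balance
```

Doubling the stream weights while leaving the language model weight alone flips the winner in this toy case, which is the interaction the slide describes. The sketch rescales both by the same factor only for simplicity; it does not demonstrate the slide's point that the required increases are not proportional.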

  4. Observation factoring in detail
     • There is a range of partially factored observation models
     [Figure: three observation models, each conditioned on phoneState. Unfactored: PLPs plus log MLP outputs, concatenated + KLT. Semi factored: dg1+pl1, rd, . . ., PLPs. Fully factored: dg1, pl1, rd, . . ., PLPs, with log outputs of separate MLPs.]
     How to cluster?
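One way to picture the range between unfactored and fully factored models is as the set of ways to group the MLP streams into blocks, where each block is modeled jointly. The sketch below enumerates those groupings for the three streams named on the slide. Treating the candidate models as set partitions is an assumption made here for illustration, as is restricting the example to dg1, pl1, and rd; the slide's ". . ." indicates further streams that are not listed.

```python
def set_partitions(items):
    """Yield every way to split items into non-empty, unordered blocks."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Add `first` to each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partition


streams = ["dg1", "pl1", "rd"]  # only the streams named on the slide
for partition in set_partitions(streams):
    print(" | ".join("+".join(block) for block in partition))
```

The single-block partition corresponds to the unfactored model, the all-singletons partition to the fully factored one, and a grouping such as dg1+pl1 | rd to the semi-factored example in the figure. The number of partitions grows quickly with the number of streams, which is one reason the clustering question matters.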

  5. Add a PROSODY feature to feature-based models
     • PROSODY feature takes on 4 values:
       • Onset
       • Reduced nucleus
       • Regular nucleus
       • Coda
     • Prosody combined with Lips, Tongue, and Glottis uniquely specifies all of the phones used in our phone-based models
     • This allows for a fairer comparison with the phone-based features
     • Goals:
       • Repeat the workshop experiments with the added PROSODY feature
       • Explore higher-level prosodic structure, such as phrasal stress and prosodic phrase boundaries
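The claim that PROSODY plus Lips, Tongue, and Glottis uniquely specifies every phone amounts to saying that no two phones share the same feature bundle. A minimal sketch of such a check follows. The four PROSODY values come from the slide; the phone names and articulatory values in the toy tables are hypothetical placeholders, not the actual feature inventory.

```python
from collections import defaultdict
from enum import Enum


class Prosody(Enum):
    ONSET = "onset"
    REDUCED_NUCLEUS = "reduced nucleus"
    REGULAR_NUCLEUS = "regular nucleus"
    CODA = "coda"


def ambiguous_bundles(table):
    """Map each feature bundle shared by more than one phone to those phones."""
    by_bundle = defaultdict(list)
    for phone, bundle in table.items():
        by_bundle[bundle].append(phone)
    return {bundle: sorted(phones) for bundle, phones in by_bundle.items() if len(phones) > 1}


# Hypothetical placeholder values: two phones with identical (lips, tongue, glottis).
without_prosody = {
    "phoneA": ("lips0", "tongue0", "glottis0"),
    "phoneB": ("lips0", "tongue0", "glottis0"),
}
# The same phones once a PROSODY value is appended to each bundle.
with_prosody = {
    "phoneA": ("lips0", "tongue0", "glottis0", Prosody.ONSET),
    "phoneB": ("lips0", "tongue0", "glottis0", Prosody.CODA),
}

print(ambiguous_bundles(without_prosody))  # one shared bundle
print(ambiguous_bundles(with_prosody))     # empty: every bundle is unique
```

Running the same check over a real phone-to-feature table would show whether the added feature is enough to separate every phone in the inventory.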

  6. Questions

  7. Proposal
     • Continue to explore
       • Tandem observation factoring
       • Feature substitution in the pronunciation model
       • Prosody

  8. Feature substitution in the pronunciation model
     • Feature-based pronunciation modeling is promising
     • Goal: Make use of this in a speech recognizer

  9. Asynchrony between underlying (dictionary) feature values
     [Figure: graphical model with nodes word; ind1, ind2, ind3; U1, U2, U3; sync1,2; sync2,3; Obs. Each node appears twice in the figure.]

  10. Asynchrony with feature substitution
      [Figure: graphical model with nodes word; ind1, ind2, ind3; U1, U2, U3; S1, S2, S3; sync1,2; sync2,3; Obs. Each node appears twice in the figure.]
