Letter to Phoneme Alignment Reihaneh Rabbany Shahin Jabbari
Outline • Motivation • Problem and its Challenges • Relevant Works • Our Work • Formal Model • EM • Dynamic Bayesian Network • Evaluation • Letter to Phoneme Generator • AER • Result
Text to Speech Problem • Conversion of Text to Speech: TTS • Automated Telecom Services • E-mail by Phone • Banking Systems • Handicapped People
Word Pronunciation • Pronunciation of the words • Dictionary words → dictionary look-up • Non-dictionary words → phonetic analysis • Language is alive: new words are always being added • Proper nouns
L2P Problem • Letter to Phoneme Alignment • Letter: c a k e • Phoneme: k ei k
Challenges • No Consistency • City / s / • Cake / k / • Kid / k / • No Transparency • K i d (3) / k i d / (3) • S i x (3) / s i k s / (4) • Q u e u e (5) / k j u: / (3) • A x e (3) / a k s / (3)
One-to-One EM (Daelemans et al., 1996) • Assumes length of word = length of pronunciation • Produce all possible alignments by inserting null letters/phonemes • Compute alignment probabilities
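The null-insertion step above can be sketched as follows. This is a minimal illustration, not Daelemans et al.'s actual implementation; the null symbol `-` and the function name are assumptions:

```python
from itertools import combinations

EPS = "-"  # null symbol (an assumed placeholder)

def one_to_one_alignments(letters, phonemes):
    """Enumerate all one-to-one alignments by inserting null symbols
    into the shorter sequence until both sequences have equal length."""
    short, long_ = ((letters, phonemes) if len(letters) <= len(phonemes)
                    else (phonemes, letters))
    total = len(long_)
    # Choose which positions of the padded sequence hold real symbols;
    # the remaining positions are filled with nulls.
    for positions in combinations(range(total), len(short)):
        padded = [EPS] * total
        for pos, sym in zip(positions, short):
            padded[pos] = sym
        if short is letters:
            yield list(zip(padded, long_))
        else:
            yield list(zip(long_, padded))
```

For "six" /s i k s/ (3 letters vs. 4 phonemes) this yields the C(4,3) = 4 ways of placing one null letter.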
Decision Tree (Black et al., 1996) • Train a CART using an aligned dictionary • Why CART? • A single tree for each letter
Kondrak • Alignments are not always one-to-one • Axe → /a k s/ • Book → /b ʊ k/ • Only null phonemes (no null letters) • Similar to one-to-one EM • Produce all possible alignments • Compute the probabilities
Formal Model • Word: a sequence of letters • Pronunciation: a sequence of phonemes • Alignment: a sequence of subalignments, each pairing a letter substring with a phoneme substring • Problem: find the most probable alignment
Many-to-Many EM
1. Initialize prob(subalignments)
// Expectation step
2. For each word in training_set
 2.1. Produce all possible alignments
 2.2. Choose the most probable alignment
// Maximization step
3. For all subalignments
 3.1. Compute new prob(subalignments)
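The hard-EM loop above can be sketched as follows. This is a simplified illustration, not the authors' implementation; `candidate_alignments` stands in for step 2.1 (enumerating all possible alignments of a word), and all names are assumptions:

```python
from collections import defaultdict

def score(alignment, prob):
    """Probability of an alignment as the product of its subalignments."""
    p = 1.0
    for sub in alignment:
        p *= prob[sub]
    return p

def hard_em(training_set, candidate_alignments, n_iters=10):
    """Hard (Viterbi) EM over subalignment probabilities.
    training_set: list of (word, pronunciation) pairs.
    candidate_alignments(word, pron) -> list of alignments, each a
    list of (letter_chunk, phoneme_chunk) subalignment pairs."""
    # 1. Initialize subalignment probabilities (uniform)
    prob = defaultdict(lambda: 1.0)
    for _ in range(n_iters):
        counts = defaultdict(float)
        # Expectation step: keep only the most probable alignment per word
        for word, pron in training_set:
            best = max(candidate_alignments(word, pron),
                       key=lambda a: score(a, prob))
            for sub in best:
                counts[sub] += 1.0
        # Maximization step: re-estimate subalignment probabilities
        total = sum(counts.values())
        prob = defaultdict(lambda: 1e-9,
                           {s: c / total for s, c in counts.items()})
    return dict(prob)
```

Choosing only the single best alignment per word (hard EM) is a common simplification; a soft E-step would instead weight every candidate alignment by its probability.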
Dynamic Bayesian Network • Model: subalignments are treated as hidden variables • Learn the DBN parameters with EM • [Figure: one slice of the DBN, with observed letter li and phoneme pi generated by the hidden subalignment ai]
Context-Dependent DBN • The context-independence assumption makes the model simpler • But it is not always correct • Example: Chat vs. Hat — the letter "h" is pronounced differently depending on whether a "c" precedes it • [Figure: DBN slice with li, pi, and hidden ai conditioned on ai-1]
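The difference between the two models can be written out as the joint probability each one factorizes. A minimal sketch, assuming each hidden subalignment ai emits the observed letter li and phoneme pi (all names and probability tables here are illustrative):

```python
from math import prod  # Python 3.8+

def joint_prob_independent(alignment, p_a, p_l, p_p):
    """Context-independent DBN:
    P(l, p, a) = prod_i P(a_i) * P(l_i | a_i) * P(p_i | a_i).
    alignment is a list of (letter, phoneme, subalignment) triples."""
    return prod(p_a[ai] * p_l[(li, ai)] * p_p[(pi, ai)]
                for (li, pi, ai) in alignment)

def joint_prob_context(alignment, p_a_trans, p_l, p_p, start="<s>"):
    """Context-dependent DBN: a_i additionally conditions on a_{i-1},
    so P(a_i) is replaced by P(a_i | a_{i-1})."""
    prev = start
    total = 1.0
    for (li, pi, ai) in alignment:
        total *= p_a_trans[(ai, prev)] * p_l[(li, ai)] * p_p[(pi, ai)]
        prev = ai
    return total
```

The only change from the first model to the second is the extra edge ai-1 → ai, which is what lets the model distinguish "h" after "c" from "h" at the start of a word.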
Evaluation Difficulties • Unsupervised evaluation: no aligned dictionary to compare against • Solutions • Measure how much the alignment boosts a supervised module: letter to phoneme generator • Compare the result with a gold alignment: AER
Letter to Phoneme Generator • Measures the percentage of correctly generated phonemes and words • How does it work? • Finding chunks: binary classification using instance-based learning • Phoneme prediction: a phoneme is predicted independently for each letter, or for each chunk • Hidden Markov Model
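Instance-based learning in the prediction step amounts to looking up the stored training instance nearest to the query. A minimal sketch, assuming a fixed-length string encoding of a chunk and its context (the encoding and both function names are hypothetical, not the generator's actual features):

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length feature strings."""
    return sum(x != y for x, y in zip(a, b))

def predict_phoneme(features, instances):
    """1-nearest-neighbour prediction: return the phoneme of the stored
    training instance whose features are closest to the query.
    instances maps a feature string to the phoneme observed in the
    aligned dictionary."""
    best = min(instances, key=lambda feat: hamming(feat, features))
    return instances[best]
```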
Alignment Error Rate • AER • Evaluate by counting the pairs common to our aligned output and the gold alignment • Calculate AER from that overlap
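The slides do not spell out the formula, but with a single gold alignment the standard AER definition (Och and Ney's, with the sure and possible sets both equal to the gold set) reduces to one minus the harmonic overlap of the two link sets:

```python
def aer(predicted, gold):
    """Alignment Error Rate between a predicted and a gold alignment,
    each given as a set of (letter_position, phoneme_position) links.
    AER = 1 - 2|A ∩ G| / (|A| + |G|); 0.0 means a perfect match."""
    predicted, gold = set(predicted), set(gold)
    common = len(predicted & gold)
    return 1.0 - 2.0 * common / (len(predicted) + len(gold))
```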
Results • 10-fold cross-validation