Introduction to Graphones • Dong Wang • 05/05/2008
Contents • Graphones • Graphone-based LVCSR (large-vocabulary continuous speech recognition) • Graphone-based STD (spoken term detection)
Graphones • Assume that graphemes and phonemes are two streams emitted by a single stochastic process • Example: "speaking" [spi:king] is segmented as s→[s], p→[p], ea→[i:], k→[k], i→[i], ng→[ng]
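Graphones • The joint grapheme-phoneme units are called graphones. • Assuming no context dependence among graphones gives the simplest graphone model. With a known alignment L that splits the grapheme sequence g and the phoneme sequence φ into graphones u_1, ..., u_K, the joint probability can be written as $p(g, \varphi, L) = \prod_{i=1}^{K} p(u_i)$ • The whole task is then to define the units u and to estimate p(u)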
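The context-independent model can be illustrated with a minimal Python sketch. It assumes a graphone is stored as a (grapheme segment, phoneme segment) pair; the probability table `toy_p` and the function name `joint_log_prob` are hypothetical and are not taken from the slides.

```python
import math

# Hypothetical toy probabilities (NOT values from the slides).
# Each key is a graphone: (grapheme segment, phoneme segment).
toy_p = {
    ("s", "s"): 0.1, ("p", "p"): 0.1, ("ea", "i:"): 0.1,
    ("k", "k"): 0.1, ("i", "i"): 0.1, ("ng", "ng"): 0.1,
}

def joint_log_prob(alignment, p):
    """Log of the product of p(u_i) over the graphones of a known alignment."""
    return sum(math.log(p[u]) for u in alignment)

# The alignment of "speaking" used as the example in the slides.
speaking = [("s", "s"), ("p", "p"), ("ea", "i:"),
            ("k", "k"), ("i", "i"), ("ng", "ng")]
log_p = joint_log_prob(speaking, toy_p)
```

Working in the log domain avoids underflow when many small probabilities are multiplied.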
Graphones • Once the graphone model is trained, we can estimate the phoneme sequence from a grapheme sequence, and vice versa. Deligne, Sabine / Yvon, Francois / Bimbot, Frédéric (1995): "Variable-length sequence matching for phonetic transcription using joint multigrams", In EUROSPEECH-1995, 2243-22
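One way to go from graphemes to phonemes with such a model is a best-path (Viterbi) search over segmentations of the word. The sketch below is an assumption-laden illustration, not the method of the cited paper: it scores only the single best alignment instead of summing over all alignments, it skips graphones with a null grapheme side to keep the search a simple left-to-right pass, and the name `g2p` and the table `toy_table` are hypothetical.

```python
import math

def g2p(word, table, max_g=3):
    """Best-path phoneme sequence for `word`, or None if no segmentation exists.

    `table` maps (grapheme segment, phoneme tuple) -> probability.
    Simplification: graphones with an empty grapheme side are ignored.
    """
    by_letters = {}
    for (letters, phones), p in table.items():
        if letters and p > 0:
            by_letters.setdefault(letters, []).append((phones, p))

    n = len(word)
    # best[i] = (log score, phonemes) of the best segmentation of word[:i]
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for i in range(1, n + 1):
        for a in range(1, min(max_g, i) + 1):
            score, phones_so_far = best[i - a]
            if score == -math.inf:
                continue
            for phones, p in by_letters.get(word[i - a:i], []):
                cand = score + math.log(p)
                if cand > best[i][0]:
                    best[i] = (cand, phones_so_far + list(phones))
    return best[n][1] if best[n][0] > -math.inf else None

# Hypothetical toy model (NOT values from the slides).
toy_table = {
    ("s", ("s",)): 0.2, ("p", ("p",)): 0.2, ("ea", ("i:",)): 0.1,
    ("e", ("e",)): 0.1, ("a", ("ax",)): 0.1, ("k", ("k",)): 0.1,
    ("i", ("i",)): 0.1, ("ng", ("ng",)): 0.05,
    ("n", ("n",)): 0.03, ("g", ("g",)): 0.02,
}
result = g2p("speaking", toy_table)
```

With these toy values the multi-letter graphones `ea` and `ng` beat their single-letter decompositions, because one unit costs less than the product of two.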
Graphones • As the alignment is unknown in the training corpus (the dictionary), an EM procedure can be used, with the alignment as the latent variable. • Example: graphemes Z E E E E P, phonemes z i p
Graphones • Each iteration re-estimates p(u) from c, the expected count of occurrences of each graphone under the current model • A forward-backward procedure is used to avoid redundant computation
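The EM loop with a forward-backward pass can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure behind the slides: it assumes the standard update in which the new p(u) is the expected count of u (summed over dictionary entries and weighted by the posterior of each alignment) divided by the total expected count of all graphones, it starts from a flat model, and all names (`moves`, `forward`, `backward`, `em_step`, `train`) are hypothetical.

```python
from collections import defaultdict
from itertools import product

MAX_G, MAX_P = 3, 1  # segment length limits: graphemes 0-3, phonemes 0-1

def moves(i, j):
    """Segment lengths (a, b) that fit into i letters and j phonemes; no null-null."""
    for a, b in product(range(MAX_G + 1), range(MAX_P + 1)):
        if (a, b) != (0, 0) and a <= i and b <= j:
            yield a, b

def forward(letters, phones, prob):
    """alpha[i][j] = total weight of all alignments of letters[:i] with phones[:j]."""
    n, m = len(letters), len(phones)
    alpha = [[0.0] * (m + 1) for _ in range(n + 1)]
    alpha[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            for a, b in moves(i, j):
                u = (letters[i - a:i], tuple(phones[j - b:j]))
                alpha[i][j] += alpha[i - a][j - b] * prob(u)
    return alpha

def backward(letters, phones, prob):
    """beta[i][j] = total weight of all alignments of letters[i:] with phones[j:]."""
    n, m = len(letters), len(phones)
    beta = [[0.0] * (m + 1) for _ in range(n + 1)]
    beta[n][m] = 1.0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            for a, b in moves(n - i, m - j):
                u = (letters[i:i + a], tuple(phones[j:j + b]))
                beta[i][j] += prob(u) * beta[i + a][j + b]
    return beta

def em_step(lexicon, prob):
    """One EM iteration: expected graphone counts, then relative frequencies."""
    counts = defaultdict(float)
    for letters, phones in lexicon:
        n, m = len(letters), len(phones)
        alpha = forward(letters, phones, prob)
        beta = backward(letters, phones, prob)
        total = alpha[n][m]
        if total == 0.0:
            continue  # entry cannot be aligned under the current model
        for i in range(n + 1):
            for j in range(m + 1):
                for a, b in moves(i, j):
                    u = (letters[i - a:i], tuple(phones[j - b:j]))
                    counts[u] += alpha[i - a][j - b] * prob(u) * beta[i][j] / total
    z = sum(counts.values())
    return {u: c / z for u, c in counts.items()}

def train(lexicon, iterations=10):
    """Run EM from a flat start (every candidate graphone weighted equally)."""
    table = None
    for _ in range(iterations):
        if table is None:
            prob = lambda u: 1.0
        else:
            prob = lambda u, t=table: t.get(u, 0.0)
        table = em_step(lexicon, prob)
    return table

# Hypothetical three-entry dictionary, for illustration only.
toy_lexicon = [("ab", ["x", "y"]), ("a", ["x"]), ("b", ["y"])]
model = train(toy_lexicon)
```

The forward and backward tables let each graphone's expected count be read off in one sweep per entry, instead of enumerating every alignment explicitly.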
Graphones • Some tricks • A null grapheme segment or a null phoneme segment is allowed, but a graphone with both sides null is not • Mutual information can be used to compare model accuracy across different segment-length settings • For English, I used grapheme segment lengths of 0-3 and phoneme segment lengths of 0-1.
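The length limits and the null-null restriction can be made concrete by listing the candidate graphones for a single dictionary entry. This is only a sketch: the function name `candidate_graphones` is hypothetical, and it pairs every grapheme segment with every phoneme segment without checking that the pair fits into a complete alignment, so it returns a superset of the usable units.

```python
def candidate_graphones(letters, phones, max_g=3, max_p=1):
    """Candidate (grapheme segment, phoneme tuple) pairs within the length limits.

    Either side may be empty (a null segment), but never both at once.
    """
    units = set()
    for i in range(len(letters) + 1):
        for a in range(max_g + 1):
            if i + a > len(letters):
                break
            for j in range(len(phones) + 1):
                for b in range(max_p + 1):
                    if j + b > len(phones):
                        break
                    if a == 0 and b == 0:
                        continue  # null-null graphones are not allowed
                    units.add((letters[i:i + a], tuple(phones[j:j + b])))
    return units

units = candidate_graphones("ab", ["x"])
```

Raising `max_g` enlarges the inventory quickly, which is one reason a criterion such as mutual information is useful for choosing between length settings.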
Graphones • Some experiments
• Mutual information: gr(0-3) ph(0-1): 0.86; gr(0-1) ph(0-1): 0.58
• High-probability graphones (grapheme+phoneme, probability):
A+ax 7.587430e-01
E+iy 6.197528e-02
I+ay 4.753983e-02
O+ow 4.640152e-02
A+ 1.886822e-02
+ax 1.882734e-02
VE+v 1.074523e-02
ER+er 8.942506e-03
LL+l 8.298488e-03
CH+ch 5.454664e-03
SS+s 2.709674e-03
Graphone-based LVCSR • Graphone models are used to transcribe new words for the lexicon. M. Bisani, H. Ney, "Multigram-based Grapheme-to-Phoneme Conversion for LVCSR", in Proc. Eurospeech, Geneva, Switzerland, 2003.
Graphone-based STD • Use a multigram model to generate graphone forms for out-of-vocabulary (OOV) words. • Train hybrid language models that contain both in-vocabulary (INV) words and graphones. • Decode using the lexicon expanded with graphones. • Search for INV words in word lattices, and for OOV words in phoneme lattices. Murat Akbacak, Dimitra Vergyri, Andreas Stolcke, "Open-Vocabulary Spoken Term Detection Using Graphone-Based Hybrid Recognition Systems", ICASSP 2008, Las Vegas, USA.
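The last step, searching INV and OOV terms in different lattices, amounts to routing each query term. The sketch below is an illustrative assumption, not the cited system: `route_query`, the lattice labels, and the `g2p` callable (any function that maps a spelling to a phoneme sequence) are hypothetical names.

```python
def route_query(terms, vocabulary, g2p):
    """Decide, per query term, which lattice type to search.

    INV terms are looked up as words; OOV terms are first converted to a
    phoneme sequence by the supplied `g2p` callable.
    """
    plan = []
    for term in terms:
        word = term.lower()
        if word in vocabulary:
            plan.append(("word_lattice", word))
        else:
            plan.append(("phoneme_lattice", tuple(g2p(word))))
    return plan

# Stand-in converter for illustration only: one "phoneme" per letter.
plan = route_query(["Speech", "graphone"], {"speech"}, lambda w: list(w))
```

In a real system the stand-in converter would be replaced by a trained graphone model.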
Graphone-based STD • Only the hybrid system can detect OOV words • For INV words, the hybrid system also performs better. (Akbacak, Vergyri, Stolcke, ICASSP 2008)
Graphone-based STD • What is the difference between graphone-based and phoneme-based STD with respect to OOV words? • What if we use decision trees to perform the letter-to-sound (LTS) conversion? • What if we train the multigram model on the whole text corpus instead of the dictionary, so that word frequency information is included?
Conclusions • The graphone model is an alternative to decision trees for performing LTS conversion. • Graphone models can be used to detect multi-letter graphemes. • Word-subword hybrid systems seem interesting.