Semantic n-gram language modeling with the latent maximum entropy principle Shaojun Wang, Dale Schuurmans, Fuchun Peng and Yunxin Zhao NTNU Speech Lab
Reference • Shaojun Wang, Dale Schuurmans, Yunxin Zhao, "The Latent Maximum Entropy Principle"
Introduction • There are various kinds of language models that capture different aspects of natural language • Several techniques for combining language models have been investigated
Latent Maximum Entropy • Given features $f_1, \ldots, f_N$ over $X = (Y, Z)$, where $Y$ is observed and $Z$ is latent, choose the model $p(x)$ maximizing the entropy $H(p) = -\sum_{x} p(x) \log p(x)$ subject to $\sum_{x} p(x) f_i(x) = \sum_{y} \tilde{p}(y) \sum_{z} p(z \mid y) f_i(y, z), \quad i = 1, \ldots, N$ • Regularized LME principle: each constraint may be violated by a slack $c_i$, penalized by a convex function such as $U(c) = \sum_i c_i^2 / (2\sigma_i^2)$; maximize $H(p) - U(c)$ subject to $\sum_{x} p(x) f_i(x) + c_i = \sum_{y} \tilde{p}(y) \sum_{z} p(z \mid y) f_i(y, z), \quad i = 1, \ldots, N$
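By the usual Lagrangian argument (a standard maximum-entropy step the slides leave implicit), feasible solutions have log-linear form:

$$
p_\Lambda(x) = \frac{1}{Z_\Lambda} \exp\Big(\sum_{i=1}^{N} \lambda_i f_i(x)\Big),
\qquad
Z_\Lambda = \sum_{x} \exp\Big(\sum_{i=1}^{N} \lambda_i f_i(x)\Big)
$$

Unlike standard maximum entropy, the constraint right-hand sides depend on $p$ itself through $p(z \mid y)$, so the problem is non-convex and is attacked iteratively, which is what the EM-IS algorithm on the next slide does.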
EM-IS algorithm • Alternates an E-step that computes the latent-variable feature expectations under the current model with an M-step that runs several iterations of iterative scaling to update the parameters $\Lambda$
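A runnable toy sketch of this alternation on a tiny discrete model; all sizes, the random binary features, and the GIS-style multiplicative update are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

# Toy EM-IS loop on a tiny discrete model X = (Y, Z) with Y observed, Z latent.
rng = np.random.default_rng(0)
N_FEATS, Y_VALS, Z_VALS = 5, 3, 2
F = rng.integers(0, 2, size=(N_FEATS, Y_VALS, Z_VALS)).astype(float)
p_tilde = np.array([0.5, 0.3, 0.2])   # empirical distribution over observed Y
lam = np.zeros(N_FEATS)               # log-linear parameters Lambda

def model_joint(lam):
    """p_Lambda(y, z) proportional to exp(sum_i lam_i * f_i(y, z))."""
    score = np.einsum('i,iyz->yz', lam, F)
    p = np.exp(score - score.max())
    return p / p.sum()

# GIS constant; proper GIS adds a slack feature so feature totals are
# constant across x, omitted here for brevity.
C = F.sum(axis=0).max()

for em_iter in range(5):              # outer EM loop (the paper uses 5)
    # E-step: targets use p(z | y) under the *current* model.
    p = model_joint(lam)
    p_z_given_y = p / p.sum(axis=1, keepdims=True)
    target = np.einsum('iyz,y,yz->i', F, p_tilde, p_z_given_y)

    for iis_iter in range(20):        # inner scaling loop (the paper uses 20)
        expect = np.einsum('iyz,yz->i', F, model_joint(lam))
        lam += np.log((target + 1e-12) / (expect + 1e-12)) / C
```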
Combining N-gram and PLSA Models • X = (W2, W1, W0, D, T2, T1, T0), where W0 is the current word, W1 and W2 are the two preceding words, D is the document, and T2, T1, T0 are latent topics associated with the words
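To make the variable layout concrete, a minimal representation of one joint configuration; the field names and example values are illustrative assumptions, not from the paper:

```python
from collections import namedtuple

# One joint configuration x = (W2, W1, W0, D, T2, T1, T0).
Config = namedtuple('Config', ['w2', 'w1', 'w0', 'd', 't2', 't1', 't0'])
x = Config(w2="the", w1="stock", w0="market", d=42, t2=3, t1=3, t0=7)
```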
Tri-gram constraints • Indicator features that fire when (W2, W1, W0) matches a particular word triple (w2, w1, w0); the constraints force the model's trigram marginals to match the empirical trigram frequencies
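A minimal sketch of such a feature and its empirical target; x is assumed to be the joint configuration tuple (w2, w1, w0, d, t2, t1, t0) from the previous slide:

```python
from collections import Counter

# Trigram indicator feature f_{w2,w1,w0}(x) = 1 iff the word triple matches.
def trigram_feature(w2, w1, w0):
    def f(x):
        return 1.0 if tuple(x[:3]) == (w2, w1, w0) else 0.0
    return f

# The corresponding constraint targets are empirical trigram frequencies.
def empirical_trigram_frequencies(words):
    trigrams = list(zip(words, words[1:], words[2:]))
    counts = Counter(trigrams)
    return {tri: n / len(trigrams) for tri, n in counts.items()}

freqs = empirical_trigram_frequencies(
    "the stock market fell as the stock market opened".split())
```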
PLSA constraints • Indicator features on document-topic pairs (D, T) and topic-word pairs (T, W); because the topics are latent, the target expectations must be computed through p(z | y), as in the LME constraints
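A matching sketch of the PLSA-style features, with the same assumed tuple layout x = (w2, w1, w0, d, t2, t1, t0); since the topics are latent, the constraint targets come from the E-step rather than from raw counts:

```python
# Document-topic feature: fires when the document is d and the current topic is t.
def doc_topic_feature(d, t):
    def f(x):
        return 1.0 if (x[3], x[6]) == (d, t) else 0.0
    return f

# Topic-word feature: fires when topic t emits the current word w0.
def topic_word_feature(t, w0):
    def f(x):
        return 1.0 if (x[6], x[2]) == (t, w0) else 0.0
    return f
```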
Efficient Feature Expectation and Inference • Normalization constant: $Z_\Lambda = \sum_{x} \exp\big(\sum_{i} \lambda_i f_i(x)\big)$ • Feature expectations: $E_{p_\Lambda}[f_i] = \frac{1}{Z_\Lambda} \sum_{x} f_i(x) \exp\big(\sum_{j} \lambda_j f_j(x)\big)$ • Because each feature touches only a few of the variables in X, both sums factor over the graphical structure and can be computed without enumerating the whole joint space
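An illustration of the factorization trick on a cut-down model; the two-term split below is an assumed structure for demonstration, not the paper's exact decomposition:

```python
import numpy as np

# If the exponent splits as a(w, t) + b(d, t), then
#   Z = sum_{w,d,t} exp(a + b) = sum_t (sum_w e^a) * (sum_d e^b),
# which avoids enumerating the full (w, d, t) product space.
rng = np.random.default_rng(1)
W, D, T = 50, 20, 10
a = rng.normal(size=(W, T))
b = rng.normal(size=(D, T))

Z_fact = (np.exp(a).sum(axis=0) * np.exp(b).sum(axis=0)).sum()
Z_full = np.exp(a[:, None, :] + b[None, :, :]).sum()   # brute-force check
assert np.isclose(Z_fact, Z_full)
```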
Computation in Testing • At test time the next-word probability p(W0 | W2, W1, D) is obtained from the joint model by marginalizing out the latent topics
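A minimal sketch of this marginalization; the score array and its shape are assumptions for illustration, not the paper's exact inference procedure:

```python
import numpy as np

# score[w0, t0] holds the log-linear exponent for each candidate word and
# topic, with the history (w2, w1) and document d held fixed.
def next_word_probs(score):
    joint = np.exp(score - score.max())   # unnormalized p(w0, t0 | history, d)
    p_w0 = joint.sum(axis=1)              # marginalize the latent topic t0
    return p_w0 / p_w0.sum()

rng = np.random.default_rng(2)
probs = next_word_probs(rng.normal(size=(5000, 125)))   # |V| = 5000, |T| = 125
```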
Experimental results • Training • NAB corpus: 87,000 documents, 38 million words, 1987-1989 • Testing • 325,000 words from 1989 • 5 outer EM iterations, 20 inner IIS iterations, |T| = 125 topics
Conclusion • LME provides a general statistical framework for incorporating arbitrary aspects of natural language into a parametric model