A Hierarchical Nonparametric Bayesian Approach to Statistical Language Model Domain Adaptation
Frank Wood and Yee Whye Teh, AISTATS 2009
Presented by: Mingyuan Zhou, Duke University, ECE
December 18, 2009
Outline
• Background
• Pitman-Yor Process
• Hierarchical Pitman-Yor Process Language Models
• Doubly Hierarchical Pitman-Yor Process Language Model
• Inference
• Experimental results
• Summary
Background: Language modeling and n-Gram models
• “A language model is usually formulated as a probability distribution p(s) over strings s that attempts to reflect how frequently a string s occurs as a sentence.”
• n-Gram (n=2: bigram, n=3: trigram)
• Smoothing
Reference: S.F. Chen and J.T. Goodman. 1998. An empirical study of smoothing techniques for language modeling. Technical Report TR-10-98, Computer Science Group, Harvard University.
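A minimal sketch of a bigram model with additive (add-delta) smoothing, one of the baseline techniques surveyed by Chen and Goodman. The toy corpus, the function names, and the `<s>`/`</s>` padding tokens are assumptions made for illustration; they are not taken from the slides.

```python
from collections import Counter

BOS, EOS = "<s>", "</s>"  # hypothetical sentence-boundary tokens


def train_bigram(sentences):
    """Count bigrams and their left contexts over tokenized sentences."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        padded = [BOS] + tokens + [EOS]
        vocab.update(padded[1:])  # BOS is never predicted, so it is not in the vocabulary
        for prev, word in zip(padded, padded[1:]):
            bigrams[(prev, word)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab


def bigram_prob(word, prev, model, delta=1.0):
    """Additive smoothing: (c(prev, word) + delta) / (c(prev) + delta * |V|)."""
    bigrams, contexts, vocab = model
    return (bigrams[(prev, word)] + delta) / (contexts[prev] + delta * len(vocab))


def sentence_prob(tokens, model, delta=1.0):
    """p(s) as the product of smoothed bigram probabilities."""
    padded = [BOS] + tokens + [EOS]
    p = 1.0
    for prev, word in zip(padded, padded[1:]):
        p *= bigram_prob(word, prev, model, delta)
    return p


# Toy corpus (illustrative only).
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
model = train_bigram(corpus)
```

With smoothing, a word order never seen in training, such as `sentence_prob(["dog", "the", "sat"], model)`, still receives a nonzero probability, which is the point of the technique.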
Example
• Smoothing
Reference: Chen and Goodman (1998).
Evaluation
• Train the n-Gram model:
• Calculate:
• Cross-entropy:
• Perplexity:
Reference: Chen and Goodman (1998).
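The evaluation quantities can be sketched as follows, using the standard definitions: cross-entropy is the negative average log2 probability per word of the test data, and perplexity is 2 raised to the cross-entropy. The function names and the uniform stand-in model are assumptions for illustration, as is the choice to count only the sentence tokens (no end-of-sentence token) in the word total.

```python
import math


def cross_entropy(test_sentences, sentence_prob):
    """Bits per word: -(1 / W_T) * sum over test sentences of log2 p(sentence)."""
    total_log2 = 0.0
    total_words = 0
    for tokens in test_sentences:
        total_log2 += math.log2(sentence_prob(tokens))
        total_words += len(tokens)  # assumption: W_T counts sentence tokens only
    return -total_log2 / total_words


def perplexity(test_sentences, sentence_prob):
    """Perplexity is 2 raised to the cross-entropy."""
    return 2.0 ** cross_entropy(test_sentences, sentence_prob)


def uniform_sentence_prob(tokens, vocab_size=8):
    """Hypothetical stand-in model: every word has probability 1 / vocab_size."""
    return (1.0 / vocab_size) ** len(tokens)


test_data = [["a", "b", "c"], ["d", "e"]]
h = cross_entropy(test_data, uniform_sentence_prob)
pp = perplexity(test_data, uniform_sentence_prob)
```

For a uniform model over V words the perplexity equals V, which gives a quick sanity check: lower perplexity means the model is, on average, choosing among fewer equally likely words.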
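Dirichlet Process and Pitman-Yor Process
• Dirichlet Process (DP): the number of unique words grows as O(log n) in the number of words n
• Pitman-Yor Process: the number of unique words grows as O(n^d), a power law in n with discount d
• When d=0, the Pitman-Yor Process reduces to the DP
• Both can be understood through the Chinese Restaurant Process
[Table: seating probabilities under the DP and the Pitman-Yor Process for sitting at table k and sitting at a new table]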
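A minimal sketch of the Chinese Restaurant Process seating rule for both processes, using the standard predictive probabilities. The symbol `alpha` for the concentration parameter and the function name are assumptions; `d` is the discount from the slide, and setting `d=0` gives the DP case.

```python
def seating_probabilities(table_counts, alpha, d=0.0):
    """Seating probabilities for the next customer.

    table_counts[k] is the number of customers already at table k.
    Returns (list of P(sit at table k), P(sit at a new table)).
    Assumes 0 <= d < 1 and alpha > -d.
    """
    n = sum(table_counts)   # customers seated so far
    t = len(table_counts)   # occupied tables so far
    norm = alpha + n
    existing = [(c - d) / norm for c in table_counts]
    new_table = (alpha + d * t) / norm
    return existing, new_table


# Two occupied tables holding 3 and 1 customers (illustrative counts).
dp_existing, dp_new = seating_probabilities([3, 1], alpha=1.0, d=0.0)
py_existing, py_new = seating_probabilities([3, 1], alpha=1.0, d=0.5)
```

The discount takes mass `d` from every occupied table and moves it to the new-table option, so the chance of opening a new table grows with the number of tables already in use.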
Power-law properties of the Pitman-Yor Process
[Figure, two panels: number of unique words vs. number of words; proportion of words appearing once vs. number of words]
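The two quantities named on this slide can be explored with a small simulation. This is a sketch under stated assumptions: tables stand in for unique words, customers for word tokens, "proportion of words appearing once" is read as the fraction of unique words seen exactly once, and the parameter values and function names are illustrative.

```python
import random


def simulate_crp(n_customers, alpha, d, seed=0):
    """Seat customers one at a time; return the list of per-table counts."""
    rng = random.Random(seed)
    counts = []
    for n in range(n_customers):
        # A new table is opened with probability (alpha + d * t) / (alpha + n).
        p_new = (alpha + d * len(counts)) / (alpha + n)
        if rng.random() < p_new:
            counts.append(1)
        else:
            # Otherwise join table k with probability proportional to (c_k - d).
            weights = [c - d for c in counts]
            k = rng.choices(range(len(counts)), weights=weights)[0]
            counts[k] += 1
    return counts


def singleton_fraction(counts):
    """Fraction of tables (unique words) that hold exactly one customer."""
    return sum(1 for c in counts if c == 1) / len(counts)


dp_counts = simulate_crp(2000, alpha=1.0, d=0.0)   # Dirichlet process case
py_counts = simulate_crp(2000, alpha=1.0, d=0.8)   # Pitman-Yor with discount 0.8
```

Comparing `len(dp_counts)` with `len(py_counts)` shows the slow logarithmic growth of the DP against the power-law growth of the Pitman-Yor process.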
Inference
• Dirichlet Process, Chinese Restaurant Process
• Hierarchical Dirichlet Process, Chinese Restaurant Franchise
• Pitman-Yor Process, Chinese Restaurant Process
• Hierarchical Pitman-Yor Process, Chinese Restaurant Franchise
• Doubly Hierarchical Pitman-Yor Language Model, Graphical Pitman-Yor Process, Multi-floor Chinese Restaurant Process, Multi-floor Chinese Restaurant Franchise
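For the hierarchical Pitman-Yor case, the Chinese Restaurant Franchise view gives a recursive predictive probability once a seating arrangement is fixed: each context's restaurant discounts its own counts and backs off to its parent context. The sketch below assumes a single `alpha` and `d` shared across all levels, a uniform base distribution, and a hypothetical `seating` dictionary; a sampler would normally resample the seating and use per-level parameters. It does not cover the multi-floor representation of the doubly hierarchical model.

```python
def hpy_predictive(word, context, seating, vocab_size, alpha=1.0, d=0.5):
    """P(word | context) under a fixed seating arrangement.

    seating maps a context tuple to {word: (customers, tables)}.
    The empty tuple () is the unigram restaurant; its parent is the
    uniform base distribution, signalled here by context=None.
    Assumes alpha > 0 and 0 <= d < 1.
    """
    if context is None:
        return 1.0 / vocab_size  # uniform base distribution (assumption)
    seats = seating.get(context, {})
    c_total = sum(c for c, _ in seats.values())
    t_total = sum(t for _, t in seats.values())
    c_w, t_w = seats.get(word, (0, 0))
    # The parent context drops the earliest word; () backs off to the base.
    parent = context[1:] if len(context) > 0 else None
    p_parent = hpy_predictive(word, parent, seating, vocab_size, alpha, d)
    norm = alpha + c_total
    return (c_w - d * t_w) / norm + (alpha + d * t_total) / norm * p_parent


# Hypothetical seating counts for a three-word vocabulary.
vocab = ["cat", "dog", "fish"]
seating = {
    (): {"cat": (2, 1), "dog": (1, 1)},
    ("the",): {"cat": (2, 1)},
}
p_fish = hpy_predictive("fish", ("the",), seating, len(vocab))
```

Here `p_fish` is positive even though "fish" never appears in the counts, because every level passes some mass down to its parent and ultimately to the base distribution.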
Summary
• DHPYLM achieves encouraging domain adaptation results.
• A graphical Pitman-Yor process is constructed, and a multi-floor Chinese restaurant representation is proposed for sampling.
• DHPYLM may be integrated into topic models to eliminate “bag-of-words” assumptions.