
Presentation Transcript


  1. A Hierarchical Nonparametric Bayesian Approach to Statistical Language Model Domain Adaptation • Frank Wood and Yee Whye Teh, AISTATS 2009 • Presented by: Mingyuan Zhou, Duke University, ECE, December 18, 2009

  2. Outline • Background • Pitman-Yor Process • Hierarchical Pitman-Yor Process Language Models • Doubly Hierarchical Pitman-Yor Process Language Model • Inference • Experimental results • Summary

  3. Background: Language modeling and n-Gram models • "A language model is usually formulated as a probability distribution p(s) over strings s that attempts to reflect how frequently a string s occurs as a sentence." • n-Gram (n=2: bigram, n=3: trigram) • Smoothing Reference: S. F. Chen and J. T. Goodman. 1998. An empirical study of smoothing techniques for language modeling. Technical Report TR-10-98, Computer Science Group, Harvard University.
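
To make the n-gram idea concrete, here is a minimal bigram model with add-k smoothing on a toy corpus. This is only an illustration: the corpus, the add-k scheme, and the helper names (train_bigram, bigram_prob) are hypothetical, and additive smoothing is far cruder than the methods surveyed by Chen and Goodman or the Pitman-Yor smoothing discussed later.

```python
# Toy bigram model with add-k smoothing (illustration only; the point is
# that some smoothing is needed for unseen word pairs, not this scheme).
from collections import defaultdict

def train_bigram(sentences):
    unigram, bigram = defaultdict(int), defaultdict(int)
    for words in sentences:
        padded = ["<s>"] + words + ["</s>"]
        for w in padded:
            unigram[w] += 1
        for prev, w in zip(padded, padded[1:]):
            bigram[(prev, w)] += 1
    return unigram, bigram

def bigram_prob(prev, w, unigram, bigram, vocab_size, k=0.5):
    # Add-k smoothing: reserve probability mass for unseen (prev, w) pairs.
    return (bigram[(prev, w)] + k) / (unigram[prev] + k * vocab_size)

sentences = [["the", "cat", "sat"], ["the", "dog", "sat"]]
unigram, bigram = train_bigram(sentences)
vocab = {w for s in sentences for w in s} | {"</s>"}
print(bigram_prob("the", "cat", unigram, bigram, len(vocab)))
```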

  4. Example • Smoothing Reference: S. F. Chen and J. T. Goodman. 1998. An empirical study of smoothing techniques for language modeling. Technical Report TR-10-98, Computer Science Group, Harvard University.

  5. Evaluation • Train the n-Gram model • Calculate the probability p(T) of a held-out test text T • Cross-entropy: H(T) = -(1/W_T) log2 p(T), where W_T is the number of words in T • Perplexity: PP(T) = 2^{H(T)} Reference: S. F. Chen and J. T. Goodman. 1998. An empirical study of smoothing techniques for language modeling. Technical Report TR-10-98, Computer Science Group, Harvard University.
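
As a sketch of how these evaluation quantities are computed, the function below (the name evaluate and the sentence-padding convention are assumptions) scores a test set with any conditional model prob_fn(prev, w) and returns per-word cross-entropy in bits together with the corresponding perplexity 2**H.

```python
import math

def evaluate(test_sentences, prob_fn):
    """Per-word cross-entropy (bits) and perplexity of a conditional model.
    prob_fn(prev, w) must return p(w | prev) > 0 for every test event."""
    log_prob, n_events = 0.0, 0
    for words in test_sentences:
        padded = ["<s>"] + words + ["</s>"]
        for prev, w in zip(padded, padded[1:]):
            log_prob += math.log2(prob_fn(prev, w))
            n_events += 1
    entropy = -log_prob / n_events
    return entropy, 2 ** entropy

# Example with a trivial uniform model over a 10-word vocabulary:
# the perplexity of a uniform model equals the vocabulary size (10 here).
H, ppl = evaluate([["the", "cat", "sat"]], lambda prev, w: 1.0 / 10)
print(f"cross-entropy = {H:.2f} bits/word, perplexity = {ppl:.2f}")
```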

  6. Dirichlet Process and Pitman-Yor Process • Dirichlet Process: the number of unique words grows as O(α log n) • Pitman-Yor Process: the number of unique words grows as O(α n^d) • When d = 0, the Pitman-Yor process reduces to the DP • Both can be understood through the Chinese Restaurant process: a new customer sits at an occupied table k with probability proportional to c_k (DP) or c_k − d (Pitman-Yor), and at a new table with probability proportional to α (DP) or α + d·t (Pitman-Yor), where c_k is the number of customers at table k and t is the number of occupied tables.
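
A short simulation of the two-parameter Chinese Restaurant Process makes the seating rules above concrete; setting d = 0 recovers the Dirichlet-process CRP. The function name, parameter names, and the concentration/discount values in the example are arbitrary choices for this sketch.

```python
import random

def pitman_yor_crp(n_customers, alpha=1.0, d=0.5):
    """Simulate the two-parameter (Pitman-Yor) Chinese Restaurant Process.
    A new customer joins table k with probability (c_k - d) / (alpha + n)
    and opens a new table with probability (alpha + d * t) / (alpha + n);
    d = 0 gives the ordinary Dirichlet-process CRP."""
    table_counts = []                          # customers seated at each table
    for n in range(n_customers):
        t = len(table_counts)
        weights = [c - d for c in table_counts] + [alpha + d * t]
        r = random.uniform(0, sum(weights))
        for k, w in enumerate(weights):
            r -= w
            if r <= 0:
                break
        if k == t:
            table_counts.append(1)             # open a new table
        else:
            table_counts[k] += 1               # join existing table k
    return table_counts

print(len(pitman_yor_crp(10_000, alpha=1.0, d=0.5)), "occupied tables")
```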

  7. Power-law properties of the Pitman-Yor Process • [Figures: number of unique words vs. number of words, and proportion of words appearing only once vs. number of words]
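
The power-law behaviour can also be seen numerically with the pitman_yor_crp sketch from the previous slide: with d = 0 the number of occupied tables (unique word types) grows roughly logarithmically in n, while with d near 1 it grows roughly like n^d. The parameter settings below are arbitrary.

```python
# Growth of the number of occupied tables (unique word types) with corpus
# size, reusing pitman_yor_crp from the sketch above.
for n in (1_000, 5_000, 25_000):
    dp_tables = len(pitman_yor_crp(n, alpha=10.0, d=0.0))   # roughly alpha * log(n)
    py_tables = len(pitman_yor_crp(n, alpha=10.0, d=0.8))   # roughly n ** 0.8
    print(f"n = {n:>6}: DP tables = {dp_tables:>5}, Pitman-Yor tables = {py_tables:>6}")
```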

  8. Hierarchical Pitman-Yor Process Language Models
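
As a rough sketch of how a hierarchical Pitman-Yor language model computes a word's predictive probability, the function below backs off recursively from the full context to shorter contexts with discounted counts. The restaurants structure (per-context customer counts c and table counts t) is assumed to come from a seating arrangement maintained by a sampler, and a single discount d and concentration theta are used instead of per-level parameters, so this is a simplified approximation rather than the paper's implementation.

```python
def hpy_predictive(word, context, restaurants, theta, d, vocab_size):
    """Back-off predictive probability in a hierarchical Pitman-Yor LM (sketch).
    restaurants[context][word] = (c, t) gives customer and table counts;
    below the empty context the model falls back to a uniform distribution."""
    if context is None:                        # below the root: uniform base
        return 1.0 / vocab_size
    parent = context[1:] if len(context) > 0 else None
    base = hpy_predictive(word, parent, restaurants, theta, d, vocab_size)
    r = restaurants.get(context)
    if not r:                                  # unseen context: pure back-off
        return base
    c_total = sum(c for c, t in r.values())
    t_total = sum(t for c, t in r.values())
    c_w, t_w = r.get(word, (0, 0))
    return (max(c_w - d * t_w, 0.0) / (theta + c_total)
            + (theta + d * t_total) / (theta + c_total) * base)

# Hypothetical seating counts for the contexts () and ("the",):
restaurants = {(): {"cat": (3, 1), "dog": (2, 1)},
               ("the",): {"cat": (2, 1)}}
print(hpy_predictive("cat", ("the",), restaurants, theta=1.0, d=0.5, vocab_size=1000))
```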

  9. Doubly Hierarchical Pitman-Yor Process Language Model

  10. Doubly Hierarchical Pitman-Yor Process Language Model

  11. Inference • Dirichlet Process: Chinese Restaurant Process • Hierarchical Dirichlet Process: Chinese Restaurant Franchise • Pitman-Yor Process: Chinese Restaurant Process • Hierarchical Pitman-Yor Process: Chinese Restaurant Franchise • Doubly Hierarchical Pitman-Yor Language Model: Graphical Pitman-Yor Process, Multi-floor Chinese Restaurant Process, Multi-floor Chinese Restaurant Franchise

  12. Experimental results (HPYLM)

  13. Experimental results (DHPYLM)

  14. Summary • DHPYLM achieves encouraging domain adaptation results. • A graphical Pitman-Yor process is constructed, and a multi-floor Chinese restaurant representation is proposed for posterior sampling. • DHPYLM may be integrated into topic models to eliminate the "bag-of-words" assumption.
