Constructing Informative Prior Distributions from Domain Knowledge in Text Classification Graduate: Chen, Shao-Pei Authors: Aynur Dayanik, David D. Lewis, David Madigan, Vladimir Menkov, Alexander Genkin SIGIR
Outline • Motivation • Objective • Methodology • Experimental Results • Conclusion
Motivation • In operational text classification settings, small training sets are the rule, because labeling is expensive and inconvenient, or because of skepticism that the labeling effort will be adequately repaid.
Objective • Use domain knowledge texts to greatly improve classifier effectiveness when few training examples are available, without hurting effectiveness when training sets are large.
Methodology • Bayesian Logistic Regression • Gaussian Priors • Laplace Priors
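The methodology slide names Bayesian logistic regression with Gaussian and Laplace priors but gives no formulas. The following is a minimal sketch of what such a model's MAP objective could look like, not the authors' implementation. It assumes labels in {-1, +1}, independent per-feature priors, and a prior mode and variance (or scale) for each coefficient. All function names, the toy data, and the hyperparameter values are hypothetical, and how domain knowledge texts are turned into modes or variances is not specified on the slide.

```python
import numpy as np

def log_likelihood(beta, X, y):
    """Logistic log-likelihood; assumes labels y are in {-1, +1}."""
    margins = y * (X @ beta)
    # log sigmoid(m) = -log(1 + exp(-m)), computed stably with logaddexp.
    return -np.sum(np.logaddexp(0.0, -margins))

def gaussian_log_prior(beta, mode, variance):
    """Log of independent Gaussian priors, up to an additive constant."""
    return -np.sum((beta - mode) ** 2 / (2.0 * variance))

def laplace_log_prior(beta, mode, scale):
    """Log of independent Laplace priors, up to an additive constant."""
    return -np.sum(np.abs(beta - mode) / scale)

def neg_log_posterior(beta, X, y, log_prior):
    """MAP objective: negative (log-likelihood + log-prior)."""
    return -(log_likelihood(beta, X, y) + log_prior(beta))

def fit_map_gaussian(X, y, mode, variance, lr=0.05, steps=2000):
    """Plain gradient descent on the Gaussian-prior MAP objective.

    Starts at the prior mode; lr must be small relative to the variance
    (a very tight prior would need a smaller step).
    """
    mode = np.asarray(mode, dtype=float)
    beta = mode.copy()
    for _ in range(steps):
        margins = y * (X @ beta)
        weights = np.exp(-np.logaddexp(0.0, margins))  # sigmoid(-margins)
        grad = -(X.T @ (y * weights)) + (beta - mode) / variance
        beta -= lr * grad
    return beta

# Hypothetical toy data: 4 documents, 2 term features.
X = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Illustrative assumption: domain knowledge suggests feature 0 indicates
# the positive class, so its prior mode is shifted away from zero.
informative_mode = np.array([1.0, 0.0])
beta_map = fit_map_gaussian(X, y, informative_mode, variance=1.0)
print(beta_map)
```

With no training data the fit stays at the prior mode, which is the sense in which an informative prior matters most for small training sets. The Laplace prior is included only as an objective term here because its absolute value is not differentiable at the mode, so it would need a different optimizer than the plain gradient descent shown.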
Experimental Results • 500 Random Examples • 5 Positive and 5 Random Examples • 5 Positive and 5 Closest Negative Examples
Conclusion • We found large improvements in effectiveness from priors built on domain knowledge, particularly when only small training sets were available.