MedLDA: Maximum Margin Supervised Topic Models for Regression and Classification. J. Zhu, A. Ahmed and E.P. Xing, Carnegie Mellon University, ICML 2009. Presented by Haojun Chen. Sources: http://www.cs.cmu.edu/~junzhu/medlda.htm
Outline • Motivation • Supervised topic model (sLDA) and support vector regression (SVR) • Maximum entropy discrimination LDA (MedLDA) • MedLDA for Regression • MedLDA for Classification • Experimental Results • Conclusion
Motivation • Learning latent topic models with side information, as in sLDA, has attracted increasing attention. • Maximum likelihood estimation is used for posterior inference and parameter estimation in sLDA. • Max-margin methods for classification, such as SVM, have demonstrated success in many applications. • This paper proposes a general principle for learning max-margin discriminative supervised latent topic models for both regression and classification.
Supervised Topic Model (sLDA) • Joint distribution for sLDA • Variational MLE for sLDA
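The slide's equations did not survive extraction. As a reference point, the joint distribution of sLDA in its standard form (Blei and McAuliffe) can be sketched as below. The symbols are assumptions based on that standard presentation, not recovered from the slide: $\theta_d$ are topic proportions, $z_{dn}$ topic assignments, $w_{dn}$ words, $y_d$ the response, $\bar{z}_d$ the empirical topic frequencies, and $q$ a mean-field variational family with parameters $\gamma_d$ and $\phi_{dn}$.

```latex
\begin{align*}
p(\theta, \mathbf{z}, \mathbf{y}, \mathbf{W} \mid \alpha, \beta, \eta, \delta^2)
  &= \prod_{d=1}^{D} p(\theta_d \mid \alpha)
     \Bigl( \prod_{n=1}^{N} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, \beta) \Bigr)
     p(y_d \mid \eta^{\top} \bar{z}_d, \delta^2), \\
\bar{z}_d &= \frac{1}{N} \sum_{n=1}^{N} z_{dn}, \qquad
y_d \mid \bar{z}_d \sim \mathcal{N}(\eta^{\top} \bar{z}_d, \delta^2), \\
q(\theta, \mathbf{z} \mid \gamma, \phi)
  &= \prod_{d=1}^{D} q(\theta_d \mid \gamma_d) \prod_{n=1}^{N} q(z_{dn} \mid \phi_{dn}).
\end{align*}
```

Under this reading, variational MLE alternates between tightening the variational bound on the log-likelihood over $(\gamma, \phi)$ and maximizing that bound over the model parameters $(\alpha, \beta, \eta, \delta^2)$.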
Support Vector Regression (SVR) • Given a training set, the linear SVR finds an optimal linear function by solving the following constrained convex optimization problem
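To make the SVR idea concrete, here is a minimal sketch in Python. It does not solve the constrained problem with a QP solver; instead it runs subgradient descent on the equivalent unconstrained form, a squared-norm penalty plus the epsilon-insensitive loss. The function name `fit_linear_svr`, the hyperparameters `C`, `epsilon`, `lr`, `n_iters`, and the toy data are all illustrative assumptions, not taken from the slides.

```python
import numpy as np

def fit_linear_svr(X, y, C=10.0, epsilon=0.1, lr=0.5, n_iters=3000):
    """Hypothetical helper: linear SVR via subgradient descent.

    Minimizes (1 / (2*C*n)) * ||w||^2 + (1/n) * sum_i max(0, |y_i - w.x_i - b| - epsilon),
    which has the same minimizer as 0.5*||w||^2 + C * sum_i (epsilon-insensitive loss).
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for t in range(n_iters):
        residual = y - (X @ w + b)
        # Only points outside the epsilon-tube contribute to the loss subgradient.
        outside = np.abs(residual) > epsilon
        g = np.where(outside, -np.sign(residual), 0.0)  # d(loss_i) / d(prediction_i)
        grad_w = w / (C * n) + (X.T @ g) / n
        grad_b = g.mean()
        step = lr / np.sqrt(t + 1.0)  # decaying step size
        w -= step * grad_w
        b -= step * grad_b
    return w, b

# Toy data (assumed for illustration): noise-free line y = 2x + 1.
X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y = 2.0 * X[:, 0] + 1.0
w, b = fit_linear_svr(X, y)
```

On this toy data the fitted slope and intercept should land close to 2 and 1; the regularizer pulls the slope slightly toward zero, as far as the epsilon-tube allows.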
Max-Entropy Discrimination LDA (MedLDA) • Maximum entropy discrimination LDA (MedLDA): an integration of max-margin prediction models (e.g. SVR and SVM) and hierarchical Bayesian topic models (e.g. LDA and sLDA) • Specifically, a distribution is learned in a max-margin manner in MedLDA. • MedLDA for regression and classification are considered in this paper.
MedLDA for Regression • For regression, MedLDA is defined as an integration of Bayesian sLDA and SVR, where the distribution being learned is the variational approximation for the posterior
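The optimization problem itself was lost in extraction. A hedged reconstruction, combining the variational bound of Bayesian sLDA with SVR-style constraints as described in the MedLDA paper, is sketched below. The notation is an assumption: $\mathcal{L}(q)$ denotes the variational upper bound on the negative log-likelihood, $\bar{Z}_d$ the average topic assignment of document $d$, $\epsilon$ the precision of the tube, and $\xi_d, \xi_d^*$ the slack variables.

```latex
\begin{align*}
\min_{q,\,\alpha,\,\beta,\,\delta^2,\,\xi,\,\xi^*} \quad
  & \mathcal{L}(q) + C \sum_{d=1}^{D} (\xi_d + \xi_d^*) \\
\text{s.t.}\ \forall d: \quad
  & y_d - \mathbb{E}_q[\eta^{\top} \bar{Z}_d] \le \epsilon + \xi_d, \\
  & -y_d + \mathbb{E}_q[\eta^{\top} \bar{Z}_d] \le \epsilon + \xi_d^*, \\
  & \xi_d \ge 0, \quad \xi_d^* \ge 0.
\end{align*}
```

Read this way, the margin constraints act on the expected prediction under $q$ rather than on a point estimate, which is what couples the topic model to the max-margin predictor.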
EM Algorithm for MedLDA Regression • Variational EM Algorithm: • The key difference between sLDA and MedLDA lies in updating
MedLDA for Classification • Similar to the regression model, the integrated LDA and multi-class classification model is defined as follows:
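The expected-margin idea behind the classification model can be illustrated with a small Python sketch. It assumes a mean-field posterior summarized by point quantities: `phi` holds per-word variational topic probabilities for one document and `mu` holds the mean class weight vectors. These names, the helper functions, and the toy numbers are all hypothetical; the sketch only shows a multi-class hinge slack computed on expected topic proportions, not the full MedLDA training procedure.

```python
import numpy as np

def expected_topic_proportions(phi):
    """phi: (n_words, n_topics) variational topic probabilities for one document.
    Returns the expected average topic assignment, shape (n_topics,)."""
    return phi.mean(axis=0)

def predict(mu, z_bar):
    """mu: (n_classes, n_topics) mean class weights (assumed).
    Predicts the class with the largest expected discriminant score."""
    return int(np.argmax(mu @ z_bar))

def margin_slack(mu, z_bar, y_true):
    """Slack needed so that the true class beats every other class
    by an expected margin of at least 1 (requires n_classes >= 2)."""
    scores = mu @ z_bar
    margins = scores[y_true] - np.delete(scores, y_true)
    return float(max(0.0, 1.0 - margins.min()))

# Toy example (assumed numbers): 3 words, 2 topics, 2 classes.
phi = np.array([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]])
mu = np.array([[2.0, 0.0], [0.0, 2.0]])
z_bar = expected_topic_proportions(phi)
label = predict(mu, z_bar)
```

A document whose true label already wins by a margin of at least 1 needs zero slack; a mislabeled or low-margin document incurs positive slack, which is the quantity a max-margin objective would penalize.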
EM Algorithm for MedLDA Classification • Similar to the EM algorithm for MedLDA regression • Update equation for
Embedding Results • 20 Newsgroups dataset (figure panels: MedLDA, LDA)
Classification Results • 20 Newsgroup Data Relative ratio =
Regression Results • Movie Review Data
Conclusion • MedLDA: an integration of max-margin prediction models and hierarchical Bayesian topic models by optimizing a single objective function with a set of expected margin constraints