Parameter Expanded Variational Bayesian Methods
Yuan (Alan) Qi and Tommi S. Jaakkola, MIT
NIPS 2006
Presented by: John Paisley, Duke University, ECE, 3/13/2009
Outline
• Introduction
• PX-VB algorithm
• Applications
  • Bayesian Probit Regression
  • Automatic Relevance Determination
• Convergence Properties
• Conclusion
Introduction
• Variational Bayes (VB) is a popular method for approximating the posterior distribution of a model.
• VB can be slow to converge when variables are strongly correlated.
• Parameter-expanded methods can speed up convergence by adding auxiliary parameters, which can remove the strong coupling between parameters.
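The slow-convergence point can be illustrated with a toy case that is not from the slides: a factorized approximation q(x1)q(x2) to a zero-mean bivariate Gaussian with unit variances and correlation `rho`. Under these assumptions the standard mean-field updates for the two means reduce to m1 = rho * m2 and m2 = rho * m1, so the error shrinks by a factor of rho^2 per sweep. The function name and starting values below are hypothetical.

```python
def mean_field_sweeps(rho, tol=1e-6, max_sweeps=100_000):
    """Count coordinate-wise VB sweeps until both means are within tol of 0.

    Toy target (an assumption for illustration): zero-mean bivariate Gaussian
    with unit variances and correlation rho, approximated by q(x1) q(x2).
    """
    m1, m2 = 1.0, 1.0  # arbitrary starting means; the true mean is (0, 0)
    for sweep in range(1, max_sweeps + 1):
        m1 = rho * m2  # update q(x1) given the current mean of q(x2)
        m2 = rho * m1  # update q(x2) given the new mean of q(x1)
        if max(abs(m1), abs(m2)) < tol:
            return sweep
    return max_sweeps


weakly_coupled = mean_field_sweeps(0.5)
strongly_coupled = mean_field_sweeps(0.99)
print(weakly_coupled, strongly_coupled)
```

With weak correlation the means settle in a handful of sweeps, while a correlation close to 1 needs far more sweeps to reach the same tolerance.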
PX-VB algorithm
• The model is expanded with auxiliary variables, which are optimized at each iteration.
• After each optimization, an inverse mapping resets the auxiliary variables to the values at which the expanded model reduces to the original model, so that the original parameters are recovered.
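One iteration of the procedure described above can be sketched in generic notation. The symbols are assumptions introduced for illustration and do not come from the slides: $x$ are the latent variables and parameters, $D$ the data, $\alpha$ the auxiliary parameters, $\alpha_0$ the value at which the expanded model $p_\alpha$ coincides with the original model, and $f_\alpha$ the inverse mapping.

```latex
\begin{align*}
\text{1. VB updates with } \alpha = \alpha_0: \quad
  & q(x_s) \propto \exp\!\big( \langle \ln p_{\alpha_0}(x, D) \rangle_{q(x_{\setminus s})} \big)
    \quad \text{for each factor } s \\
\text{2. Optimize the auxiliary parameters:} \quad
  & \alpha^\star = \arg\max_{\alpha} \, \langle \ln p_{\alpha}(x, D) \rangle_{q(x)} \\
\text{3. Map back to the original model:} \quad
  & \hat{x} = f_{\alpha^\star}(x), \qquad \alpha \leftarrow \alpha_0
\end{align*}
```

Step 2 is written as maximizing the expected log joint, which is equivalent to minimizing the KL divergence between $q(x)$ and the expanded joint with respect to $\alpha$, since the entropy of $q$ does not depend on $\alpha$.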
Bayesian Probit Regression
• The original model, where TN denotes the truncated Gaussian.
• The parameter-expanded model, where q(z_n) and q(w) are updated and the update is followed by the inverse mapping.
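The truncated Gaussian enters the VB updates through its moments. The sketch below assumes the standard data-augmented probit setup, which is not spelled out in the text above: a latent z_n with unit variance and mean `mu` (for example the current expected value of w^T x_n), restricted to the side of zero given by the label `t` in {-1, +1}. The function names are hypothetical.

```python
import math


def std_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def truncated_gaussian_moments(mu, t):
    """Return (E[z], E[z^2]) for z ~ N(mu, 1) restricted to t * z > 0.

    Assumes unit variance and a label t in {-1, +1}. Not numerically
    hardened: the ratio below can underflow when t * mu is very negative.
    """
    ratio = std_normal_pdf(mu) / std_normal_cdf(t * mu)
    mean = mu + t * ratio
    second_moment = 1.0 + mu * mean
    return mean, second_moment


mean, second_moment = truncated_gaussian_moments(0.0, +1)
```

For `mu = 0` and `t = +1` the distribution is a half-normal, so the mean equals sqrt(2/pi) and the second moment equals 1, which gives a quick sanity check on the formulas.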
Automatic Relevance Determination (RVM)
• Separate auxiliary variables, together with an auxiliary variable for \alpha, the details of which are omitted.
• Shared auxiliary variable: the auxiliary variable c is optimized at each iteration using Newton's method, as no closed-form solution exists.
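A one-dimensional Newton iteration of the kind used for c can be sketched as follows. The objective here is a stand-in, f(c) = c - ln(c), chosen only because its minimizer c = 1 is known and the result can be checked; it is not the objective from the ARD model, and the function name is hypothetical.

```python
def newton_minimize(grad, hess, c0, tol=1e-10, max_iter=50):
    """Minimize a smooth 1-D function given its first and second derivatives."""
    c = c0
    for _ in range(max_iter):
        step = grad(c) / hess(c)  # Newton step: f'(c) / f''(c)
        c -= step
        if abs(step) < tol:
            break
    return c


# Stand-in objective f(c) = c - ln(c): f'(c) = 1 - 1/c, f''(c) = 1/c**2
c_opt = newton_minimize(
    grad=lambda c: 1.0 - 1.0 / c,
    hess=lambda c: 1.0 / c ** 2,
    c0=0.5,
)
```

In a PX-VB loop, a routine like this would be called once per iteration with derivatives built from the current variational moments, typically warm-started from the previous value of c.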
Convergence Properties
• A general convergence theorem for PX-VB was presented and proven.
Conclusion
• The theorem and its proof show that, as long as the largest eigenvalue of the inverse mapping function, M_a, is smaller than 1, PX-VB is guaranteed to converge faster than VB, and convergence becomes faster as this eigenvalue decreases.
• The approach is a general method for speeding up VB inference, demonstrated here on two popular Bayesian models.
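As a rough illustration of why a smaller eigenvalue means faster convergence, assume the error of an iterative scheme shrinks by a constant factor `rate` at every iteration. This is a simplification: `rate` stands in for the largest eigenvalue of a convergence matrix and is not computed from either model above. The function name is hypothetical.

```python
import math


def iterations_to_tolerance(rate, tol=1e-6):
    """Iterations needed for an error of 1 to fall below tol at a fixed linear rate.

    Assumes 0 < rate < 1 and an error that is multiplied by `rate` each iteration.
    """
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must lie strictly between 0 and 1")
    return math.ceil(math.log(tol) / math.log(rate))


slow = iterations_to_tolerance(0.9)
fast = iterations_to_tolerance(0.5)
```

Lowering the rate from 0.9 to 0.5 cuts the iteration count several times over, which is the sense in which convergence speeds up as the eigenvalue decreases.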