Pattern Recognition and Machine Learning, Chapter 10
Approximate Bayesian Inference I: Structural Approximations
Falk Lieder, December 2, 2010

Outline: Introduction · Variational Inference · Variational Bayes · Applications
Statistical Inference

Hidden states Z generate observations X. Statistical inference inverts this generative process: from the observations X, compute the posterior belief P(Z|X) over the hidden states Z.
When Do You Need Approximations?

The problem with Bayes' theorem is that it often leads to integrals that you don't know how to solve:
• no analytic solution for the evidence p(X) = ∫ p(X|Z) p(Z) dZ
• no analytic solution for posterior expectations
• in the discrete case, computing the evidence has a cost that grows exponentially with the number of hidden variables
• sequential learning is intractable for non-conjugate priors
How to Approximate?

• Numerical integration: approximate the integrals (the evidence p(X), expectations) numerically; infeasible if Z is high-dimensional.
• Samples: approximate the density by a histogram and approximate expectations by averages.
• Structural approximation: approximate by a density of a given form; the evidence and the expectations of the approximate density are easy to compute.
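The infeasibility of grid-based numerical integration in high dimensions can be made concrete with a short sketch. The toy model below (a Gaussian prior and a Gaussian likelihood, both made up for illustration) integrates the evidence on a regular grid; the node count grows exponentially with the dimension of Z:

```python
import numpy as np

# Toy model (made up): prior z ~ N(0, I), likelihood x | z ~ N(z, I).
# Evidence p(x) = integral of p(x|z) p(z) dz, computed on a regular grid.
def evidence_on_grid(x, dim, grid_points=50):
    axis = np.linspace(-5.0, 5.0, grid_points)
    dz = axis[1] - axis[0]
    nodes = np.stack([g.ravel() for g in np.meshgrid(*([axis] * dim))], axis=1)
    log_prior = -0.5 * np.sum(nodes**2, axis=1) - 0.5 * dim * np.log(2 * np.pi)
    log_lik = -0.5 * np.sum((x - nodes)**2, axis=1) - 0.5 * dim * np.log(2 * np.pi)
    return float(np.sum(np.exp(log_prior + log_lik)) * dz**dim)

print(evidence_on_grid(np.zeros(2), dim=2))   # 50**2 = 2,500 nodes: feasible
# At dim=10 the same grid needs 50**10, about 10**17 nodes: hopeless.
```

With 50 grid points per axis, 2 dimensions need 2,500 evaluations, but 10 dimensions would need on the order of 10^17, which is why sampling and structural approximations take over.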
How to Approximate? Structural vs. Stochastic

Structural approximations (variational inference):
+ fast to compute
+ efficient representation
+ learning rules give insight
- systematic error
- application often requires mathematical derivations

Stochastic approximations (Monte Carlo methods, sampling):
+ asymptotically exact
+ easily applicable general-purpose algorithms
- time-intensive
- storage-intensive
Variational Inference—An Intuition

[Figure: the space of probability distributions, containing a tractable target family. The true posterior lies outside the family; the VB approximation is the member of the family closest to it in KL divergence.]
What Does "Closest" Mean?

Intuition: closest means minimal additional surprise on average. The Kullback-Leibler (KL) divergence

KL[p||q] = ∫ p(z) ln( p(z) / q(z) ) dz

measures this average additional surprise: KL[p||q] measures how much less accurate the belief q is than p, if p is the true belief. Equivalently, KL[p||q] is the largest reduction in average surprise that you can achieve (by replacing q with p), if p is the true belief.
KL-Divergence Illustration

[Figure]
Properties of the KL-Divergence

• Zero iff both arguments are identical: KL[p||q] = 0 ⇔ p = q.
• Greater than zero if they are different: KL[p||q] > 0 for p ≠ q.

Disadvantage: the KL divergence is not a metric (distance function), because it is not symmetric (in general KL[p||q] ≠ KL[q||p]) and it does not satisfy the triangle inequality.
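These properties are easy to verify numerically. A minimal sketch for discrete distributions (the vectors p and q are made-up examples):

```python
import numpy as np

def kl(p, q):
    """Discrete KL divergence KL[p||q] = sum_i p_i * ln(p_i / q_i)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

p = np.array([0.5, 0.4, 0.1])     # made-up beliefs over three outcomes
q = np.array([1/3, 1/3, 1/3])

print(kl(p, p))              # 0.0: zero iff the arguments are identical
print(kl(p, q), kl(q, p))    # both > 0, and unequal: KL is not symmetric
```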
How to Find the Closest Target Density?

• Intuition: minimize the "distance" to the posterior.
• Implementations:
  – Variational Bayes: minimize KL[q||p]
  – Expectation propagation: minimize KL[p||q]
• Arbitrariness: different measures lead to different algorithms and different results.
• Alternative schemes are being developed, e.g. the Jaakkola-Jordan variational method and Kikuchi approximations.
Minimizing Functionals

The KL divergence is a functional: it maps a density q to a real number. Finding the density that minimizes a functional is a problem of the calculus of variations; hence the name "variational" inference.
VB and the Free-Energy

Variational Bayes minimizes KL[q || p(Z|X)].
Problem: you can't evaluate this KL divergence, because you can't evaluate the posterior p(Z|X).
Solution: ln p(X) = F(q) + KL[q || p(Z|X)], where the free energy F(q) = ∫ q(Z) ln( p(X,Z) / q(Z) ) dZ involves only the joint density, and ln p(X) = const with respect to q.
Conclusion: you can maximize the free energy instead.
VB: Minimizing the KL-Divergence is Equivalent to Maximizing the Free-Energy F(q)

KL[q || p(Z|X)] = −∫ q(Z) ln( p(Z|X) / q(Z) ) dZ
               = −∫ q(Z) ln( p(X,Z) / q(Z) ) dZ + ln p(X)
               = −F(q) + ln p(X).

Since ln p(X) does not depend on q, argmin_q KL[q || p(Z|X)] = argmax_q F(q).
Constrained Free-Energy Maximization

Definition: F(q) = ∫ q(Z) ln( p(X,Z) / q(Z) ) dZ.

Intuition:
• Maximizing F(q) maximizes a lower bound on the log model evidence: F(q) ≤ ln p(X).
• The maximization is restricted to tractable target densities.

Properties:
• The free energy is maximal for the true posterior: F(q) = ln p(X) iff q(Z) = p(Z|X).
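The bound and the decomposition ln p(X) = F(q) + KL[q||p(Z|X)] can be checked exactly when Z is discrete. A minimal sketch (the joint table and the trial density q are made up):

```python
import numpy as np

# Made-up discrete toy model: latent Z takes three values, X is fixed.
joint = np.array([0.10, 0.25, 0.05])     # p(X, Z=z) for z = 0, 1, 2
evidence = joint.sum()                   # p(X)
posterior = joint / evidence             # p(Z|X)

q = np.array([0.2, 0.5, 0.3])            # an arbitrary trial density over Z

free_energy = np.sum(q * np.log(joint / q))      # F(q) = E_q[ln p(X,Z) - ln q(Z)]
kl_term = np.sum(q * np.log(q / posterior))      # KL[q || p(Z|X)]

# ln p(X) = F(q) + KL[q||p] holds for every q, so maximizing F(q)
# is exactly minimizing the (in general unevaluable) KL term.
print(np.log(evidence), free_energy + kl_term)
```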
Variational Approximations

1. Factorial approximations (mean field)
   • independence assumption: q(Z) = ∏_i q_i(Z_i)
   • optimization with respect to the factor densities
   • no restriction on the functional form of the factors
2. Approximation by parametric distributions
   • optimization w.r.t. the parameters
3. Variational approximations for model comparison
   • variational approximation of the log model evidence
Mean-Field Approximation

Goal: with q(Z) = ∏_i q_i(Z_i), rewrite F(q) as a function of a single factor q_j and optimize; then cycle through the factors, holding the others fixed.

Step 1: Isolating q_j in the free energy gives

F(q) = ∫ q_j(Z_j) E_{i≠j}[ ln p(X,Z) ] dZ_j − ∫ q_j(Z_j) ln q_j(Z_j) dZ_j + const,

which is −KL[ q_j || p̃ ] + const for the density defined by ln p̃(Z_j) = E_{i≠j}[ ln p(X,Z) ] + const.

Step 2: The free energy is therefore maximized with respect to q_j by q_j* = p̃. Notice that ln q_j*(Z_j) = E_{i≠j}[ ln p(X,Z) ] + const, where the constant is fixed by normalization, because q_j* has to integrate to one. Hence

q_j*(Z_j) = exp( E_{i≠j}[ ln p(X,Z) ] ) / ∫ exp( E_{i≠j}[ ln p(X,Z) ] ) dZ_j.
Mean-Field Example

True distribution: p(z) = N(z | μ, Λ⁻¹) with z = (z₁, z₂) and precision matrix Λ.
Target family: q(z) = q₁(z₁) q₂(z₂).

VB mean-field solution:

ln q₁*(z₁) = E_{z₂}[ ln p(z) ] + const = −½ Λ₁₁ z₁² + z₁ ( Λ₁₁ μ₁ − Λ₁₂ ( E[z₂] − μ₂ ) ) + const.

Hence q₁*(z₁) = N(z₁ | m₁, Λ₁₁⁻¹) with m₁ = μ₁ − Λ₁₁⁻¹ Λ₁₂ ( E[z₂] − μ₂ ), and by symmetry q₂*(z₂) = N(z₂ | m₂, Λ₂₂⁻¹) with m₂ = μ₂ − Λ₂₂⁻¹ Λ₂₁ ( E[z₁] − μ₁ ).
Mean-Field Example: Discussion

Observation: the VB approximation is more compact than the true density.
Reason: KL[q||p] does not penalize deviations where q is close to 0, even if p is large there.
[Figure: true density vs. approximation.] Unreasonable independence assumptions therefore yield a poor approximation.
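The over-compactness is visible numerically in the correlated-Gaussian case. This sketch plugs in the standard mean-field fixed point for a factorized Gaussian, q_i(z_i) = N(μ_i, 1/Λ_ii), and compares it with the true marginal variances (the precision matrix is a made-up example):

```python
import numpy as np

# Made-up correlated 2-D Gaussian: p(z) = N(mu, inv(Lam)), Lam = precision matrix.
mu = np.zeros(2)
Lam = np.array([[1.0, 0.9],
                [0.9, 1.0]])             # strong correlation between z1 and z2

# Standard mean-field fixed point for a factorized Gaussian target family:
# q_i(z_i) = N(mu_i, 1 / Lam_ii)  (the factor means coincide with mu here).
vb_var = 1.0 / np.diag(Lam)

# True marginal variances: diagonal of the covariance matrix inv(Lam).
true_var = np.diag(np.linalg.inv(Lam))

print(vb_var, true_var)   # each VB factor is much narrower than the true marginal
```

The stronger the correlation, the worse the mismatch: here each factor's variance is 1, while the true marginal variance exceeds 5.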
KL[q||p] vs. KL[p||q]

Variational Bayes (minimizes KL[q||p]):
• analytically easier
• the approximation is more compact

Expectation propagation (minimizes KL[p||q]):
• more involved
• the approximation is wider
2. Parametric Approximations

Problem: you don't know how to integrate prior times likelihood.
Solution: approximate the posterior by a parametric density q(Z; θ).
• The KL divergence and the free energy become functions of the parameters θ.
• Apply standard optimization techniques: setting the derivatives to zero gives one equation per parameter.
• Solve the system of equations by iterative updating.
Parametric Approximation: Example

Goal: learn the reward probability p from observed rewards x.
• Likelihood: p(x|p) = p^x (1−p)^{1−x}, x ∈ {0, 1}
• Prior: p₀(p)
• Posterior: p(p | x₁, …, x_n) ∝ p₀(p) ∏_i p(x_i | p)

Problem: you cannot derive a learning rule for the expected reward and its variance, because
• there is no analytic formula for the expected reward probability, and
• the form of the prior changes with every observation.

Solution: approximate the posterior by a Gaussian.
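As a rough illustration of a parametric approximation (a sketch, not the derivation from the slides): fit a Gaussian q(p) = N(m, s²) to a Beta posterior by minimizing KL[q||p] with a crude grid search over the variational parameters. The Beta(1,1) prior and the reward counts are made up, and the posterior is evaluated on a grid only so that the KL objective is computable for the demonstration:

```python
import numpy as np

# Bernoulli rewards: k successes in n trials, Beta(a, b) prior on the
# reward probability => Beta(a + k, b + n - k) posterior (all values made up).
a, b, n, k = 1.0, 1.0, 20, 14
grid = np.linspace(1e-4, 1.0 - 1e-4, 2000)
dz = grid[1] - grid[0]
log_post = (a + k - 1) * np.log(grid) + (b + n - k - 1) * np.log(1.0 - grid)
post = np.exp(log_post - log_post.max())
post /= post.sum() * dz                      # normalized posterior on the grid

def kl_to_posterior(m, s):
    """KL[q||p] for the Gaussian trial density q = N(m, s^2), up to grid error."""
    q = np.exp(-0.5 * ((grid - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
    mask = q > 1e-12                         # skip the negligible tails of q
    return float(np.sum(q[mask] * np.log(q[mask] / post[mask])) * dz)

# Crude grid search over the variational parameters (illustration only;
# the slides instead set derivatives of the free energy to zero).
candidates = [(m, s) for m in np.linspace(0.4, 0.9, 51)
                     for s in np.linspace(0.05, 0.20, 31)]
m_star, s_star = min(candidates, key=lambda ms: kl_to_posterior(*ms))
print(m_star, s_star)    # learning-rule output: expected reward and its spread
```

The resulting (m, s) pair plays the role of the learning rule's output: a running estimate of the reward probability together with the uncertainty about it.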
Solution

Solve the stationarity conditions of the free energy for the parameters of the Gaussian approximation q(p) = N(p | μ, σ²): set the derivatives of F with respect to μ and σ² to zero and iterate the resulting update equations.
Result: A Global Approximation

Learning rules for the expected reward probability and the uncertainty about it, i.e. a sequential learning algorithm.
[Figure: true posterior vs. Laplace vs. variational-Bayes approximation.]
VB for Bayesian Model Selection

• p(m|X) ∝ p(X|m) p(m); hence, if p(m) is uniform, the best model is the one with the largest evidence p(X|m).
• Problem: ln p(X|m) is "intractable".
• Solution: use the free energy F_m in place of ln p(X|m).
• Justification: if KL[q || p(Z|X,m)] ≈ 0, then F_m ≈ ln p(X|m).
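For intuition, model selection with a uniform model prior reduces to comparing evidences. This sketch uses a tractable coin-flip case where ln p(X|m) has a closed form (a log-Beta-function ratio); in VB one would substitute the free energy F_m for the intractable log evidence. The two priors and the data are made up:

```python
import math

# Coin-flip data: k heads in n trials (made up). Candidate models differ only
# in their Beta(a, b) prior on the head probability. With a uniform model
# prior, the posterior model probability is proportional to the evidence.
def log_evidence(a, b, n, k):
    """Exact ln p(X|m) for a Beta-Bernoulli model: ln B(a+k, b+n-k) - ln B(a, b)."""
    log_beta = lambda x, y: math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)
    return log_beta(a + k, b + n - k) - log_beta(a, b)

n, k = 30, 24
models = {"Beta(10,10)": (10.0, 10.0),    # prior committed to a fair coin
          "Beta(8,2)": (8.0, 2.0)}        # prior committed to a biased coin
scores = {name: log_evidence(a, b, n, k) for name, (a, b) in models.items()}
best = max(scores, key=scores.get)
print(scores, best)   # the biased-coin model wins on these data
# VB replaces the (generally intractable) ln p(X|m) by the free energy F_m.
```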
Summary

• Approximate Bayesian inference via structural approximations.
• Variational Bayes (ensemble learning): mean-field and parametric approximations; learning rules and model selection.