VIBES: Variational Inference Engine For Bayesian Networks John Winn, Inference Group, Cavendish Laboratory in association with Chris Bishop (Microsoft Research, Cambridge) and David Spiegelhalter (MRC Biostatistics Unit, Cambridge) 30th October 2002
Overview • Bayesian Networks • Variational Inference • VIBES • Digit data demo
Bayesian Networks • Directed graph • Nodes represent variables: Battery, Fuel, Gauge, TurnsOver, Starts • Links show dependencies • Conditional distributions at each node: P(b), P(f), P(g|b,f), P(t|b), P(s|f,t) • Network defines a joint distribution (see the sketch below): P(b,f,g,t,s) = P(b) P(f) P(g|b,f) P(t|b) P(s|f,t)
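As a concrete illustration of how that factorisation defines the joint, here is a minimal Python sketch of the car network. All CPT values are made up for illustration; they do not come from the talk.

```python
# A minimal sketch of the joint distribution the car network defines.
# All CPT values are invented for illustration, not from the talk.
P_b = {True: 0.9, False: 0.1}                    # P(Battery charged)
P_f = {True: 0.9, False: 0.1}                    # P(Fuel present)
P_g = lambda b, f: 0.95 if (b and f) else 0.05   # P(Gauge reads full | b, f)
P_t = lambda b: 0.97 if b else 0.02              # P(TurnsOver | b)
P_s = lambda f, t: 0.99 if (f and t) else 0.0    # P(Starts | f, t)

def joint(b, f, g, t, s):
    """P(b,f,g,t,s) = P(b) P(f) P(g|b,f) P(t|b) P(s|f,t)."""
    pg = P_g(b, f) if g else 1 - P_g(b, f)
    pt = P_t(b) if t else 1 - P_t(b)
    ps = P_s(f, t) if s else 1 - P_s(f, t)
    return P_b[b] * P_f[f] * pg * pt * ps

print(joint(True, True, True, True, True))       # probability of one configuration
```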
Inference in Bayes Nets • Observed variables V and hidden variables H • Here Starts is observed (the car doesn't start); Battery, Fuel, Gauge and TurnsOver are hidden • Inference involves finding: P(H1, H2, … | V)
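In a network this small the posterior can be computed exactly by enumeration, which makes the quantity P(H | V) concrete. A short sketch, reusing joint() from the code above:

```python
# Exact posterior by enumeration, feasible here because the network is tiny.
# Reuses joint() from the sketch above: P(Battery | Starts = False).
from itertools import product

def posterior_battery(starts=False):
    num = {True: 0.0, False: 0.0}
    for b, f, g, t in product([True, False], repeat=4):
        num[b] += joint(b, f, g, t, starts)      # sum out the other hidden variables
    z = num[True] + num[False]                   # P(Starts = starts)
    return {b: p / z for b, p in num.items()}

print(posterior_battery())                       # P(Battery | car doesn't start)
```

For larger networks this enumeration is exponential in the number of hidden variables, which is what motivates the approximate methods on the next slides.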
Approximate Inference • We want to find the posterior P(H | V), but it typically can't be evaluated exactly • Instead, approximate the true posterior with a variational distribution Q
Variational Inference (in three easy steps…) • Choose Q(H|θ), a variational distribution with parameters θ. • Use KL divergence as a measure of ‘distance’ between P and Q. • Change parameters θ to minimise KL(Q||P)
KL Divergence For unimodal Q and bi-modal P: [Figure: two sketches of Q fitted to P; left panel: minimising KL(Q||P), Q locks onto a single mode of P; right panel: minimising KL(P||Q), Q spreads to cover both modes]
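For reference (the deck shows this only graphically), the divergence being minimised is the standard KL divergence between the variational distribution Q and the true posterior:

```latex
\mathrm{KL}(Q \,\|\, P) \;=\; \sum_{H} Q(H)\,\ln \frac{Q(H)}{P(H \mid V)}
```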
Minimising the KL divergence Start with the log evidence and split it into a bound plus a divergence: ln P(V) = Σ_H Q(H) ln [P(H,V)/Q(H)] + Σ_H Q(H) ln [Q(H)/P(H|V)] = L(Q) + KL(Q||P). The first term is L(Q), a lower bound on ln P(V), since KL(Q||P) ≥ 0.
Minimising the KL divergence [Figure: bar diagram showing the evidence ln P(V) as the sum of the bound L(Q) and the ‘distance’ KL(Q||P); since ln P(V) is fixed, maximising the bound minimises the divergence]
Factorised Q distribution One possible choice of Q is fully factorised: Q(H) = ∏_i Q_i(H_i) [Figure: the P network (Battery, Fuel, Gauge, TurnsOver, Starts with their links) beside the corresponding Q network, in which every variable is a separate, disconnected node]
Maximising the bound For any factor Q_i, the bound can be maximised in one step: ln Q_i*(H_i) = ⟨ln P(H,V)⟩_{∏_{j≠i} Q_j} + const.
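The one-step optimum follows from a standard mean-field identity, reconstructed here since the slide's derivation was an image. Viewed as a function of Q_i alone, the bound is a negative KL divergence plus a constant:

```latex
\mathcal{L}(Q) \;=\;
-\,\mathrm{KL}\!\left( Q_i \;\Big\|\; \frac{1}{Z}\exp \langle \ln P(H,V) \rangle_{\prod_{j \neq i} Q_j} \right)
\;+\; \text{const.}
```

so the bound is maximised exactly by setting Q_i equal to the normalised exponential, with no step size or line search required.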
Example: Bayesian PCA [Figure: plate diagram with reduced data x_n, components W, a component ‘selector’ variable, a noise variable, and observed data t_n; the plate N means “N copies of”]
PCA update equations [Image: the variational update equations for Bayesian PCA, from Bishop 1999] Can we do this automatically?
Exponential Family Choose conditionals P(X|Y) to be from the exponential family (e.g. Gaussian): ln P(X|Y) = φ(Y)ᵀ u(X) + f(X) + g(Y), where φ(Y) is the natural parameter vector and u(X) is the u function (the sufficient statistics). [Figure: node X with parents Y1 … Yk]
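For instance, the Gaussian named on the slide can be written in this form (a standard reconstruction; the slide's own equation was an image):

```latex
\ln \mathcal{N}(x \mid \mu, \beta^{-1})
= \underbrace{\begin{bmatrix} \beta\mu \\ -\beta/2 \end{bmatrix}^{\mathsf T}}_{\phi(\mu,\,\beta)}
  \underbrace{\begin{bmatrix} x \\ x^{2} \end{bmatrix}}_{u(x)}
+ \tfrac{1}{2}\left( \ln\beta - \beta\mu^{2} - \ln 2\pi \right)
```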
Conjugate-Exponential Conjugacy requires that ln P(X | Y1 … Yk) be linear in the parent u function u_Y(Y) for each parent, so that each conditional can be rewritten in the same exponential-family form as the parent's own prior. [Figure: node X with parents Y1 … Yk, children Z1 … Zj, and the children's co-parents cp_j]
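Continuing the Gaussian example (again standard material, reconstructed here): regarded as a function of the mean μ, the same log conditional is linear in the parent's u function u_μ(μ) = (μ, μ²)ᵀ, which is exactly what conjugacy demands:

```latex
\ln \mathcal{N}(x \mid \mu, \beta^{-1})
= \begin{bmatrix} \beta x \\ -\beta/2 \end{bmatrix}^{\mathsf T}
  \begin{bmatrix} \mu \\ \mu^{2} \end{bmatrix}
+ \tfrac{1}{2}\left( \ln\beta - \beta x^{2} - \ln 2\pi \right)
```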
Variational message passing [Figure: node X receives a message m_{Y→X} from each parent Y1 … Yk and messages m_{Z1→X} … m_{Zj→X} from each child] Variational update for X: combine the incoming messages into an updated natural parameter vector for Q_X.
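In the VMP scheme (as published in Winn and Bishop's variational message passing work; the slide's own equation was an image), parent messages are expected u functions, child messages are natural-parameter contributions, and the update is a simple sum:

```latex
\phi_X^{*} \;=\; \phi_X\!\big( \{ m_{Y_i \to X} \} \big) \;+\; \sum_{j=1}^{J} m_{Z_j \to X},
\qquad m_{Y \to X} = \langle u_Y \rangle_{Q_Y}
```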
Example: Gaussian [Figure: observed node x with parents μ (the mean, Gaussian) and β (the precision, Gamma); x sends messages m_{x→μ} and m_{x→β} up to its parents]
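A minimal Python sketch of what this example computes (my own illustration, not VIBES code): coordinate-ascent updates for the factorised posterior Q(μ)Q(β), under an assumed N(m0, s0²) prior on μ and Gamma(a0, b0) prior on β.

```python
# A minimal sketch (illustration only, not VIBES) of the factorised
# variational updates for this model: x_n ~ N(mu, 1/beta), with assumed
# priors mu ~ N(m0, s0sq) and beta ~ Gamma(a0, b0).
import numpy as np

def vb_gaussian(x, m0=0.0, s0sq=1.0, a0=1.0, b0=1.0, iters=50):
    """Coordinate-ascent updates for Q(mu) Q(beta)."""
    n, sum_x = len(x), np.sum(x)
    E_beta = a0 / b0                             # initial <beta>
    for _ in range(iters):
        # Update Q(mu) = N(m, s_sq), holding Q(beta) fixed
        s_sq = 1.0 / (1.0 / s0sq + n * E_beta)
        m = (m0 / s0sq + E_beta * sum_x) * s_sq
        # Update Q(beta) = Gamma(a, b), holding Q(mu) fixed
        a = a0 + 0.5 * n
        b = b0 + 0.5 * (np.sum((x - m) ** 2) + n * s_sq)
        E_beta = a / b
    return m, s_sq, a, b

x = np.random.default_rng(0).normal(2.0, 0.5, size=100)
m, s_sq, a, b = vb_gaussian(x)
print(f"E[mu] = {m:.3f}, E[beta] = {a / b:.3f}")
```

Note that each update uses only expectations under the other factor, which is precisely the information the messages m_{x→μ} and m_{x→β} carry.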
Deterministic nodes • Allow addition, multiplication, mixtures… • A function node f(A,B) is tractable if its u function u(f(A,B)) can be expressed in terms of the parents' expectations ⟨u_A(A)⟩ and ⟨u_B(B)⟩ • e.g. P(X | A·B + C, Y) [Figure: A and B feed a × node, its output and C feed a + node, which together with Y parameterises X]
VIBES Variational Inference in Bayesian networks • Assumes factorised Q • Local updates by message passing • Works with any conjugate-exponential model • Just draw your model, specify distributions and press go!
Bayesian PCA (again) [Figure: the Bayesian PCA model as drawn in VIBES, with nodes x_n and t_n inside plate N and components W outside it]
Work in progress • Allowing structure in the Q distribution (no longer fully factorised) • First release version of VIBES