Belief Propagation in a Continuous World
Andrew Frank, 11/02/2009
Joint work with Alex Ihler and Padhraic Smyth
Graphical Models
• Nodes represent random variables.
• Edges represent dependencies.
[Figure: example graphs over nodes A, B, C]
Markov Random Fields
[Figure: undirected graph over nodes A–E]
• A ⊥ C | B
• B ⊥ E | C, D
Factoring Probability Distributions
Independence relations ↔ factorization
[Figure: graph over nodes A, B, C, D]
p(A,B,C,D) ∝ f(A) f(B) f(C) f(D) f(A,B) f(B,C) f(B,D)
Toy Example: A Day in Court
[Figure: graph over A, E, W, and V, with example states "G"/"I" attached to each node]
A, E, W ∈ {"Innocent", "Guilty"}
V ∈ {"Not guilty verdict", "Guilty verdict"}
Inference
• Most probable explanation: x* = argmax over (X1, …, Xn) of p(X1, …, Xn)
• Marginalization: p(Xi) = Σ over all Xj, j ≠ i, of p(X1, …, Xn)
(A brute-force example follows below.)
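To make both operations concrete, here is a brute-force sketch (my own, not from the talk) on the four-variable model from the factorization slide; the factor values are invented for illustration.

```python
import itertools

# Hypothetical binary factors for p(A,B,C,D) ∝ f(A) f(B) f(C) f(D) f(A,B) f(B,C) f(B,D).
f_A, f_B, f_C, f_D = [1.0, 2.0], [1.5, 0.5], [1.0, 1.0], [0.7, 1.3]
f_AB = [[2.0, 0.5], [0.5, 2.0]]      # indexed [a][b]
f_BC = [[1.0, 0.2], [0.2, 1.0]]
f_BD = [[1.0, 3.0], [3.0, 1.0]]

def unnorm(a, b, c, d):
    return f_A[a] * f_B[b] * f_C[c] * f_D[d] * f_AB[a][b] * f_BC[b][c] * f_BD[b][d]

states = list(itertools.product([0, 1], repeat=4))
Z = sum(unnorm(*s) for s in states)                                          # partition function
marg_B = [sum(unnorm(*s) for s in states if s[1] == v) / Z for v in (0, 1)]  # marginal p(B)
mpe = max(states, key=lambda s: unnorm(*s))                                  # most probable explanation
print("p(B) =", marg_B, "  MPE =", mpe)
```

Brute force enumerates all 2^4 states here; on a tree, BP recovers the same marginals without the exponential sum, which is the point of the next slides.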
Belief Propagation
[Figure: the court-example graph, with messages mAE(E) and mWE(E) flowing from A and W into E, and mEV(V) flowing from E to V]
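For reference, the standard sum-product message update that these arrows denote; the notation (ψ for node and edge potentials, Γ(s) for the neighbors of s) is mine, not the slides'.

```latex
\[
m_{s \to t}(x_t) \;=\; \sum_{x_s} \psi_s(x_s)\, \psi_{st}(x_s, x_t)
  \prod_{u \in \Gamma(s) \setminus t} m_{u \to s}(x_s),
\qquad
b_t(x_t) \;\propto\; \psi_t(x_t) \prod_{u \in \Gamma(t)} m_{u \to t}(x_t).
\]
```

In the court example, for instance, mAE(E) sums the product of A's local potential and the A–E compatibility over A's two states.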
Loopy BP
[Figure: a graph with a cycle over nodes A, B, C, D]
Does this work? Does it make any sense?
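Since the later slides build on loopy BP, here is a minimal, self-contained sketch of discrete sum-product loopy BP on the four-node cycle above. This is my own illustration, not the authors' code; the node and edge potentials are made up.

```python
import numpy as np

# Hypothetical 4-cycle over binary variables A, B, C, D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
unary = {n: np.array([1.0, 1.0]) for n in nodes}
unary["A"] = np.array([2.0, 1.0])                              # made-up local evidence on A
pair = {e: np.array([[2.0, 0.5], [0.5, 2.0]]) for e in edges}  # attractive couplings

def neighbors(n):
    return [v for (u, v) in edges if u == n] + [u for (u, v) in edges if v == n]

def psi(s, t):
    # Pairwise potential indexed as psi(s, t)[x_s, x_t].
    return pair[(s, t)] if (s, t) in pair else pair[(t, s)].T

# All directed messages start uniform.
msgs = {(s, t): np.ones(2) for s in nodes for t in neighbors(s)}

for _ in range(50):                                            # synchronous sweeps
    new = {}
    for (s, t) in msgs:
        incoming = np.prod([msgs[(u, s)] for u in neighbors(s) if u != t], axis=0)
        m = psi(s, t).T @ (unary[s] * incoming)                # sum over x_s
        new[(s, t)] = m / m.sum()                              # normalize for stability
    msgs = new

beliefs = {}
for t in nodes:
    b = unary[t] * np.prod([msgs[(u, t)] for u in neighbors(t)], axis=0)
    beliefs[t] = b / b.sum()
print(beliefs)
```

On a tree this recovers exact marginals; on the loopy graph it converges (here) to approximate beliefs, which is exactly the behavior the variational view below tries to explain.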
A Variational Perspective
• Reformulate the problem: approximate the true distribution P by the best member Q of a family of "tractable" distributions.
• Find Q to minimize the divergence.
Choose an Approximating Family
• Desired traits:
• Simple enough to enable easy computation
• Complex enough to represent P
e.g. Fully factored: Q(X1, X2, …, Xn) = f(X1) f(X2) … f(Xn)
Structured: Q retains some of P's dependency structure (e.g. a tree-structured factorization)
Choose a Divergence Measure
Common choices:
• Kullback-Leibler divergence
• Alpha (α) divergence
(Both are written out below.)
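For reference, the two divergences as given in the Minka (2005) report cited on the next slide; whether the slides used exactly this parameterization of the α-divergence is an assumption on my part.

```latex
\[
\mathrm{KL}(p \,\|\, q) \;=\; \int p(x)\, \log \frac{p(x)}{q(x)} \, dx
\]
\[
D_\alpha(p \,\|\, q) \;=\; \frac{1}{\alpha(1-\alpha)}
  \int \Big( \alpha\, p(x) + (1-\alpha)\, q(x) - p(x)^{\alpha} q(x)^{1-\alpha} \Big)\, dx
\]
```

In the limits, α → 1 recovers KL(p‖q) and α → 0 recovers KL(q‖p).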
Behavior of α-Divergence
Source: T. Minka. Divergence measures and message passing. Technical Report MSR-TR-2005-173, Microsoft Research, 2005.
Resulting Algorithms
Assuming a fully-factored form of Q, i.e. Q(X1, X2, …, Xn) = f(X1) f(X2) … f(Xn), we get…*
• Mean field, α = 0
• Belief propagation, α = 1
• Tree-reweighted BP, α ≥ 1
* By minimizing "local divergence".
Local vs. Global Minimization
Source: T. Minka. Divergence measures and message passing. Technical Report MSR-TR-2005-173, Microsoft Research, 2005.
Sensor Localization
[Figure: sensor network with nodes A, B, C]
Protein Side Chain Placement
[Figure: amino acid sequence RTDCYGN + protein structure]
Common Traits?
Both problems have a continuous state space.
Easy Solution: Discretize!
[Figure: a 2-D state space gridded at two resolutions]
• 10 bins per dimension → domain size d = 100
• 20 bins per dimension → domain size d = 400
• Each message: O(d²) (see the sketch below)
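A toy illustration of that message cost (my own sketch): after discretization, a message is a d-vector and each update is a d × d matrix–vector product.

```python
import numpy as np

d = 400                          # e.g. 20 bins per dimension of a 2-D state space
psi_st = np.random.rand(d, d)    # discretized pairwise potential, indexed [x_s, x_t]
h_s = np.random.rand(d)          # local potential times incoming messages at node s
m_st = psi_st.T @ h_s            # one discrete message update: O(d^2) multiply-adds
m_st /= m_st.sum()
```

Doubling the resolution per dimension quadruples d in 2-D, so the per-message cost grows sixteen-fold; this is the motivation for particle-based messages.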
Particle BP
We'd like to pass "continuous messages" mAB(B)…
[Figure: graph over A, B, C, D with message mAB(B) on edge A–B]
Instead, pass discrete messages over sets of particles:
{ b(i) } ~ WB(B), i.e. particles b(1), b(2), …, b(N)
mAB({ b(i) })
PBP: Computing the Messages
• Re-write as an expectation
• Finite-sample approximation
(Both steps are written out below.)
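A plausible reconstruction of the two equations referenced above, following the usual particle BP importance-sampling derivation (the notation is mine, not verbatim from the slides): the integral message is rewritten as an expectation under the proposal Ws, then estimated with N particles.

```latex
\[
m_{s \to t}(x_t)
 \;=\; \int \psi_{st}(x_s, x_t)\, \psi_s(x_s)
        \prod_{u \in \Gamma(s) \setminus t} m_{u \to s}(x_s)\; dx_s
 \;=\; \mathbb{E}_{x_s \sim W_s}\!\left[
        \frac{\psi_{st}(x_s, x_t)\, \psi_s(x_s)
              \prod_{u \in \Gamma(s)\setminus t} m_{u \to s}(x_s)}{W_s(x_s)} \right]
\]
\[
\hat m_{s \to t}(x_t) \;=\; \frac{1}{N} \sum_{i=1}^{N}
   \frac{\psi_{st}\big(x_s^{(i)}, x_t\big)\, \psi_s\big(x_s^{(i)}\big)
         \prod_{u \in \Gamma(s)\setminus t} m_{u \to s}\big(x_s^{(i)}\big)}
        {W_s\big(x_s^{(i)}\big)},
 \qquad x_s^{(i)} \sim W_s(x_s).
\]
```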
Choosing "Good" Proposals
[Figure: graph over A, B, C, D]
• The proposal should "match" the integrand.
• Sample from the belief (written out below).
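"Sample from the belief" presumably means taking the proposal proportional to the current belief estimate at node s, i.e. the local potential times all incoming messages (same notation as above; this reading is an assumption).

```latex
\[
W_s(x_s) \;\propto\; \psi_s(x_s) \prod_{u \in \Gamma(s)} m_{u \to s}(x_s)
\]
```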
Iteratively Refine Particle Sets
[Figure: pairwise potential f(xs, xt) over Xs and Xt, with steps (1)–(3) marked]
1. Draw a set of particles, {xs(i)} ~ Ws(xs).
2. Discrete inference over the particle discretization.
3. Adjust Ws(xs) and repeat (a sketch follows below).
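A high-level sketch of that loop (my own skeleton, not the authors' implementation). The callables psi_node, psi_edge, sample_proposal, and discrete_bp are hypothetical problem-specific hooks; discrete_bp could be the loopy sum-product routine sketched earlier.

```python
def particle_bp(nodes, edges, psi_node, psi_edge, sample_proposal, discrete_bp,
                n_iters=10):
    # (1) Draw an initial particle set {xs(i)} ~ Ws at every node.
    particles = {s: sample_proposal(s, belief=None, old=None) for s in nodes}
    beliefs = None
    for _ in range(n_iters):
        # (2) Discrete inference: evaluate the potentials on the current
        # particle values and run ordinary BP over that finite "grid".
        unary = {s: psi_node(s, particles[s]) for s in nodes}
        pairwise = {(s, t): psi_edge(s, t, particles[s], particles[t])
                    for (s, t) in edges}
        beliefs = discrete_bp(nodes, edges, unary, pairwise)
        # (3) Adjust Ws: e.g. redraw each node's particles guided by its
        # current belief, then repeat.
        particles = {s: sample_proposal(s, belief=beliefs[s], old=particles[s])
                     for s in nodes}
    return particles, beliefs
```

Swapping discrete_bp for mean field or tree-reweighted BP gives the α-variants compared on the results slide.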
Benefits of PBP
• No distributional assumptions.
• Easy accuracy/speed trade-off.
• Relies on an "embedded" discrete algorithm: belief propagation, mean field, tree-reweighted BP…
Exploring PBP: A Simple Example
[Figure: a pairwise model whose potentials depend on the distance ||xs – xt|| between neighboring variables xs and xt]
Continuous Ising Model Marginals
[Figure: exact marginals vs. approximate marginals* from mean field (PBP α = 0), BP (PBP α = 1), and TRW (PBP α = 1.5)]
* Run with 100 particles per node
Estimating the Partition Function
• Mean field provides a lower bound on Z.
• Tree-reweighted BP provides an upper bound on Z.
p(A,B,C,D) = (1/Z) f(A) f(B) f(C) f(D) f(A,B) f(B,C) f(B,D)
Z = Σ_{A,B,C,D} f(A) f(B) f(C) f(D) f(A,B) f(B,C) f(B,D)
Conclusions
• BP and related algorithms are useful!
• Particle BP lets you handle continuous RVs.
• Extensions to BP can work with PBP, too.
Thank You!