
Focused Belief Propagation for Query-Specific Inference



Presentation Transcript


  1. Focused Belief Propagation for Query-Specific Inference. Anton Chechetka, Carlos Guestrin. 14 May 2010

  2. Query-specific inference problem: part of the model is the query, part is known (evidence), and the rest is not interesting in itself. Goal: use information about the query to speed up convergence of belief propagation for the query marginals.

  3. Graphical models • Talk focus: pairwise Markov random fields • Paper: arbitrary factor graphs (more general) • Probability is a product of local potentials • Graphical structure: compact representation, but exact inference is intractable • Approximate inference often works well in practice
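
The factorization mentioned on this slide can be written out explicitly; for a pairwise Markov random field with node potentials phi and edge potentials psi (standard notation, not transcribed from the slide), the distribution is

    P(x_1, \dots, x_n) \;=\; \frac{1}{Z} \prod_{i \in V} \phi_i(x_i) \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j),
    \qquad
    Z \;=\; \sum_{x} \prod_{i \in V} \phi_i(x_i) \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j).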

  4. (Loopy) Belief Propagation • Passing messages along edges • Variable belief and message update rule (the standard forms are reproduced below) • Result: all single-variable beliefs
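
The belief and update formulas did not survive the transcript; for a pairwise MRF, the standard loopy BP equations (in the notation of the previous slide) are

    m_{i \to j}(x_j) \;\propto\; \sum_{x_i} \phi_i(x_i)\, \psi_{ij}(x_i, x_j) \prod_{k \in \Gamma(i) \setminus \{j\}} m_{k \to i}(x_i)
        \qquad \text{(message update rule)}

    P_i(x_i) \;\propto\; \phi_i(x_i) \prod_{k \in \Gamma(i)} m_{k \to i}(x_i)
        \qquad \text{(variable belief)}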

  5. (Loopy) Belief Propagation • Same update rule as above • Message dependencies are local: updating one message only requires the messages incoming to its source variable • Round-robin schedule: fix a message order and apply updates in that order until convergence (see the sketch below)
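
As a rough illustration of the round-robin schedule (not code from the paper; send_message is a hypothetical routine that recomputes one directed message from its local dependencies and returns how much it changed):

    def round_robin_bp(directed_edges, send_message, tol=1e-6, max_sweeps=1000):
        """Fixed-order loopy BP: sweep all directed edges in a fixed order
        until no message changes by more than `tol`."""
        order = list(directed_edges)            # fix the message order once
        for _ in range(max_sweeps):
            max_change = 0.0
            for i, j in order:                  # apply updates in that order
                # send_message recomputes m_{i->j} from the local potentials and
                # the incoming messages m_{k->i}, k != j; returns |new - old|
                max_change = max(max_change, send_message(i, j))
            if max_change < tol:                # no message moved much: converged
                break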

  6. Dynamic update prioritization • A fixed update sequence is not the best option • Dynamic update scheduling can speed up convergence • Tree-Reweighted BP [Wainwright et al., AISTATS 2003] • Residual BP [Elidan et al., UAI 2006] • Residual BP: apply the largest change first, since an update where the message changes a lot is informative, while one where it barely changes is wasted computation

  7. Residual BP [Elidan et al., UAI 2006] • Pick the edge with the largest residual, i.e. the largest change $\|m^{(new)} - m^{(old)}\|$ between the recomputed message and the current one, and apply that update • More effort goes to the difficult parts of the model • But the query is ignored (a sketch of the loop follows below)
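
A minimal sketch of the residual BP loop, assuming the same hypothetical send_message plus compute_residual(e) (the size of the pending change of directed edge e's message) and dependents(e) (the directed edges whose updates read that message); none of these names come from the paper:

    import heapq

    def residual_bp(directed_edges, compute_residual, send_message, dependents,
                    tol=1e-6, max_updates=10**6):
        """Residual BP: always apply the message update with the largest
        residual ||m_new - m_old|| (lazy priority-queue version)."""
        residual = {e: compute_residual(e) for e in directed_edges}
        heap = [(-r, e) for e, r in residual.items()]
        heapq.heapify(heap)
        for _ in range(max_updates):
            if not heap:
                break
            neg_r, e = heapq.heappop(heap)
            if -neg_r != residual[e]:           # stale heap entry, skip it
                continue
            if residual[e] < tol:               # largest residual is tiny: converged
                break
            send_message(*e)                    # apply the pending update to m_e
            residual[e] = 0.0                   # its own residual is now zero
            for d in dependents(e):             # their pending updates changed
                residual[d] = compute_residual(d)
                heapq.heappush(heap, (-residual[d], d))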

  8. Our contributions • Using weighted residuals to prioritize updates • Define message weights reflecting the importance of the message to the query • Computing importance weights efficiently • Experiments: faster convergence on large relational models

  9. Why edge importance weights? Two edges are candidates for an update: one has the smaller residual but influences the query, the other has the larger residual but no influence on the query at all; which one should be updated? • Residual BP updates the edge with no influence on the query: wasted computation • We want to update the edge that will influence the query in the future • Residual BP: maximize the immediate residual reduction • Our work: maximize the approximate eventual effect on P(query)

  10. Query-Specific BP • Same update rule as residual BP • Pick the edge with the largest weighted residual, the residual multiplied by the edge importance $A$: this is the only change! • Rest of the talk: defining and computing edge importance (a minimal snippet of the priority change follows below)
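
Relative to the residual BP sketch above, the only code change is the heap priority; with a hypothetical dict A of edge-importance weights (defined on the following slides), the priority becomes the weighted residual:

    def weighted_residual_priority(e, residual, A):
        """Query-specific BP priority for directed edge e: A_e * r_e.
        A maps each directed edge to its importance weight; edges with no
        influence on the query get weight 0 and are never prioritized."""
        return A.get(e, 0.0) * residual[e]

Every place in the residual BP loop that pushes -residual[d] onto the heap would push -weighted_residual_priority(d, residual, A) instead.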

  11. Edge importance, base case • Pick the edge whose update has the largest approximate eventual effect on P(Q), the query marginal • Base case: an edge $(j \to i)$ directly connected to the query; what is $A_{ji}$? • The change in the query belief is bounded by the change in the message, $\|P^{(new)}(Q) - P^{(old)}(Q)\| \le 1 \cdot \|m_{ji}^{(new)} - m_{ji}^{(old)}\|$, and the bound is tight, so $A_{ji} = 1$

  12. Edge importance, one step away • An edge $(r \to j)$ one step away from the query: what is $A_{rj}$? • Chain the bounds: the change in the query belief is $\le \|m_{ji}^{(new)} - m_{ji}^{(old)}\| \le \sup \|\partial m_{ji} / \partial m_{rj}\| \cdot \|m_{rj}^{(new)} - m_{rj}^{(old)}\|$, where the supremum is over the values of all other messages • So $A_{rj} = \sup \|\partial m_{ji} / \partial m_{rj}\|$, the message importance, which can be computed in closed form looking only at $f_{ji}$ [Mooij, Kappen, 2007]

  13. Edge importance, general case • Base case: $A_{ji} = 1$ • One step away: $A_{rj} = \sup \|\partial m_{ji} / \partial m_{rj}\|$ • Direct generalization via $\sup \|\partial P(Q) / \partial m_{sh}\|$? Expensive to compute, and the bound may be infinite • Instead chain the one-step bounds along a path: $\|P^{(new)}(Q) - P^{(old)}(Q)\| \le \sup \|\partial m_{ji} / \partial m_{rj}\| \cdot \sup \|\partial m_{rj} / \partial m_{hr}\| \cdot \sup \|\partial m_{hr} / \partial m_{sh}\| \cdot \|m_{sh}^{(new)} - m_{sh}^{(old)}\|$ • sensitivity($\pi$): the maximal impact along the path $\pi$

  14. Edge importance, general case (continued) • sensitivity($\pi$): the maximal impact along the path $\pi$, the product of the one-step message bounds along the path (written out below) • $A_{sh}$ = the maximum of sensitivity($\pi$) over all paths $\pi$ from the edge $(s \to h)$ to the query • There are a lot of paths in a graph; trying out every one is intractable
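
In the notation of the previous two slides, with a path $\pi = (s \to h,\, h \to r,\, r \to j,\, j \to i)$ ending at the query variable $i$, the two definitions read (reconstructed display form):

    \mathrm{sensitivity}(\pi) \;=\;
        \sup\left\|\frac{\partial m_{ji}}{\partial m_{rj}}\right\|
        \cdot \sup\left\|\frac{\partial m_{rj}}{\partial m_{hr}}\right\|
        \cdot \sup\left\|\frac{\partial m_{hr}}{\partial m_{sh}}\right\|,
    \qquad
    A_{sh} \;=\; \max_{\pi:\,(s \to h)\,\leadsto\,Q} \mathrm{sensitivity}(\pi)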

  15. Efficient edge importance computation • There are a lot of paths in a graph; trying out every one is intractable • But sensitivity($\pi$) decomposes into individual edge contributions, each of which is always $\le 1$, so sensitivity always decreases as the path grows • Therefore Dijkstra's (shortest-paths) algorithm will efficiently find the max-sensitivity path, and hence $A$ = max over all paths from the edge to the query of sensitivity($\pi$), for every edge (a sketch follows below)
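
A sketch of the Dijkstra-style pass under these assumptions: step_bound(i, j, k) is a hypothetical routine returning the one-step bound sup ||d m_{jk} / d m_{ij}|| (the closed-form quantity from slide 12), and neighbors(j) lists the variables adjacent to j:

    import heapq

    def edge_importance_weights(query_vars, neighbors, step_bound):
        """Best-first (Dijkstra-style) computation of edge importance weights.
        Every one-step bound is <= 1, so path sensitivity only shrinks as a
        path grows; expanding edges in decreasing weight order therefore
        finalizes A for each directed edge the first time it is popped."""
        A = {}                                         # A[(i, j)] for message i -> j
        heap = []
        for q in query_vars:                           # base case: edges into the query
            for i in neighbors(q):
                A[(i, q)] = 1.0
                heapq.heappush(heap, (-1.0, (i, q)))
        while heap:
            neg_a, (j, k) = heapq.heappop(heap)
            if -neg_a < A[(j, k)]:                     # stale entry, skip
                continue
            for i in neighbors(j):                     # extend the best path by one edge
                if i == k:
                    continue
                a = A[(j, k)] * step_bound(i, j, k)    # m_{i->j} feeds into m_{j->k}
                if a > A.get((i, j), 0.0):
                    A[(i, j)] = a
                    heapq.heappush(heap, (-a, (i, j)))
        return A

Taking the negative logarithm of the weights turns this max-product search into a standard shortest-path problem, which is why the slide refers to Dijkstra's algorithm.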

  16. Query-Specific BP • Run Dijkstra's algorithm starting at the query to get the edge weights $A_{ji}$ = max over all paths from the edge to the query of sensitivity($\pi$) • Pick the edge with the largest weighted residual and apply its update • More effort goes to the parts of the model that are both difficult and relevant • Takes into account not only the graphical structure, but also the strength of dependencies

  17. Outline • Using weighted residuals to prioritize updates • Define message weights reflecting the importance of the message to the query • Computing importance weights efficiently • As an initialization step before residual BP • Restore anytime behavior of residual BP • Experiments: faster convergence on large relational models

  18. Big picture • Before: residual BP computes all marginals and has the anytime property (stop it at any point and read off the current beliefs) • After: Dijkstra's algorithm runs to completion as a long initialization step, and only then does BP compute the query marginals; the anytime property is broken!

  19. Interleaving BP and Dijkstra's • Dijkstra's expands the highest-weight edges first • Can pause it at any time and get the most relevant submodel around the query • So alternate: expand with Dijkstra's, run BP on the expanded submodel, expand some more, and so on • When to stop BP and resume Dijkstra's?

  20. Interleaving • Dijkstra's expands the highest-weight edges first, so any not-yet-expanded edge has importance weight $A \le \min_{\text{expanded edges}} A$ • Suppose $M$ is an upper bound on the residuals; then the actual priority (weighted residual) of any not-yet-expanded edge is $\le M \cdot \min_{\text{expanded}} A$ • While some already-expanded edge has a weighted residual above this bound, there is no need to expand further at this point (the check is sketched below)
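
A small sketch of that check (all names hypothetical): top_weighted_residual is the best priority among edges Dijkstra's has already expanded, residual_bound is the upper bound M on any single residual, and min_expanded_weight is the smallest importance weight expanded so far:

    def can_keep_running_bp(top_weighted_residual, residual_bound, min_expanded_weight):
        """Return True while BP can safely keep updating expanded edges only.
        Any edge not yet expanded has weight <= min_expanded_weight, so its
        weighted residual is <= residual_bound * min_expanded_weight; while
        some expanded edge beats that bound, the globally best update is
        already inside the expanded submodel and Dijkstra's need not resume."""
        return top_weighted_residual >= residual_bound * min_expanded_weight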

  21. Anytime query-specific BP • Query-specific BP: run Dijkstra's algorithm to completion, then run the BP updates • Anytime QSBP: interleave partial Dijkstra's expansion with BP updates on the expanded submodel • Both schemes apply the same BP update sequence!

  22. Experiments – the setup • Relational model: semi-supervised people recognition in a collection of images [with Denver Dash and Matthai Philipose @ Intel] • One variable per person • Agreement potentials for people who look alike • Disagreement potentials for people in the same image • 2K variables, 900K factors • Error measure: KL divergence from the fixed point of BP

  23. Experiments – convergence speed • [Figure: convergence speed comparison of Residual BP, Query-Specific BP, and Anytime Query-Specific BP (with Dijkstra's interleaving)]

  24. Conclusions • Prioritize updates by weighted residuals • Takes query info into account • Importance weights depend on both the graphical structure and strength of local dependencies • Efficient computation of importance weights • Much faster convergence on large relational models Thank you!
