
Probabilistic Inference Lecture 6 – Part 1


Presentation Transcript


1. Probabilistic Inference, Lecture 6 – Part 1. M. Pawan Kumar, pawan.kumar@ecp.fr. Slides available online: http://cvc.centrale-ponts.fr/personnel/pawan/

2. Questions? Next Lecture!!

3. Tree Re-Weighted Message Passing (TRW) vs. Dual Decomposition (DD)

4. Dual of the LP Relaxation (Wainwright et al., 2001). [Figure: a 3×3 grid of variables Va–Vi decomposed into six tree subproblems (three rows, three columns), with optimal tree energies q*(1), …, q*(6).] Dual of LP: max Σ_i q*(i), the maximum of the sum of the optimal tree energies, taken over tree parameters θ^i that sum to the original parameters.

5. TRW (Kolmogorov, 2006). Initialize the θ^i, taking care of the reparameterization constraint. REPEAT: choose a random variable Va; compute the min-marginals of Va for all trees; node-average the min-marginals.
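Slide 5's loop is easiest to see on a toy problem. Below is a minimal sketch, assuming two chain subproblems that share Va and uniform tree weights; the helper names and the random potentials are illustrative, not from the lecture.

```python
# Minimal sketch of one TRW node-averaging step on toy chain subproblems.
import numpy as np

def chain_min_marginals(unary, pairwise, node):
    """Min-marginals of variable `node` in a chain, via forward/backward passes."""
    n = len(unary)
    fwd = [unary[0]]
    for i in range(1, n):
        fwd.append(unary[i] + np.min(fwd[-1][:, None] + pairwise[i - 1], axis=0))
    bwd = [np.zeros_like(unary[-1])]
    for i in range(n - 2, -1, -1):
        bwd.insert(0, np.min(pairwise[i] + bwd[0][None, :], axis=1))
    return fwd[node] + bwd[node]

def average_shared_node(trees):
    """One TRW step: equalize Va's min-marginals across all trees."""
    mms = [chain_min_marginals(u, p, i) for (u, p, i) in trees]
    avg = np.mean(mms, axis=0)
    for (u, p, i), m in zip(trees, mms):
        u[i] += avg - m   # corrections sum to zero, so the sum of the theta^i is preserved
    return avg

# Toy example: two 2-variable chains sharing Va (index 0 in both), two labels.
rng = np.random.default_rng(0)
tree1 = ([rng.normal(size=2), rng.normal(size=2)], [rng.normal(size=(2, 2))], 0)
tree2 = ([rng.normal(size=2), rng.normal(size=2)], [rng.normal(size=(2, 2))], 0)
print(average_shared_node([tree1, tree2]))
```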

6. DD. max_{λ_i} min_{x, x_i} Σ_i g_i(x_i) + Σ_i λ_i^T (x_i − x), s.t. x_i ∈ C. KKT condition: Σ_i λ_i = 0.

7. DD. max_{λ_i} min_{x_i} Σ_i g_i(x_i) + Σ_i λ_i^T x_i, s.t. x_i ∈ C. (The inner minimization over the unconstrained x is unbounded unless Σ_i λ_i = 0; under that KKT condition the −Σ_i λ_i^T x term vanishes, leaving this simpler form.)

8. DD (Komodakis et al., 2007). Initialize λ_i^0 = 0. REPEAT: compute the supergradients s_i = argmin_{x_i} (g_i(x_i) + (λ_i^t)^T x_i); project them: p_i = s_i − Σ_j s_j / m; update the dual variables: λ_i^{t+1} = λ_i^t + η_t p_i.
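A hedged sketch of slide 8's loop, on the simplest possible decomposition: m subproblems that each pick a label for one shared variable with h labels. The cost vectors g, the 1/t step-size schedule, and all names are toy assumptions, not the lecture's setup.

```python
import numpy as np

def dd_supergradient(g, iters=200, eta0=1.0):
    """g[i] is subproblem i's cost vector over the h labels of the shared variable."""
    m, h = g.shape
    lam = np.zeros((m, h))                         # lambda_i^0 = 0
    for t in range(1, iters + 1):
        s = np.eye(h)[np.argmin(g + lam, axis=1)]  # one-hot argmin per subproblem
        p = s - s.mean(axis=0)                     # projection keeps sum_i lambda_i = 0
        lam += (eta0 / t) * p                      # lambda_i^{t+1} = lambda_i^t + eta_t p_i
    return lam

# Toy usage: three subproblems; label l1 is globally cheapest (total 5 vs 7).
g = np.array([[1.0, 3.0], [4.0, 0.0], [2.0, 2.0]])
lam = dd_supergradient(g)
print(np.argmin(g + lam, axis=1))   # subproblems pushed toward agreeing on l1
```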

9. TRW. [Figure: toy problem over Va, Vb, Vc with labels l0, l1, decomposed into tree subproblems; the potentials are shown in the figure, with subproblem values 6.5, 6.5, and 7.]

10. TRW. [Same figure as slide 9.] Each tree's optimal labeling: f1(a) = 0, f1(b) = 0; f2(b) = 0, f2(c) = 0; f3(c) = 0, f3(a) = 0. All trees agree on every shared variable: Strong Tree Agreement.

11. DD. [Same figure.] Optimal LP solution:
    ya;0  ya;1  yb;0  yb;1  yc;0  yc;1
     1     0     1     0     -     -
Values of yab;ik not shown, but we know yab;ik = ya;i yb;k.

12. Supergradients. [Same figure.] One row per tree; '-' where the tree does not contain the variable:
            sa;0  sa;1  sb;0  sb;1  sc;0  sc;1
    tree 1:  1     0     1     0     -     -
    tree 2:  -     -     1     0     1     0
    tree 3:  1     0     -     -     1     0

13. Projected Supergradients. [Same figure.] All projected supergradients are zero:
            pa;0  pa;1  pb;0  pb;1  pc;0  pc;1
    tree 1:  0     0     0     0     -     -
    tree 2:  -     -     0     0     0     0
    tree 3:  0     0     -     -     0     0

14. Objective. [Same figure; subproblem values 6.5, 6.5, and 7.] No further increase in the dual objective.

15. DD. [Same figure; subproblem values 6.5, 6.5, and 7.] No further increase in the dual objective: Strong Tree Agreement implies that DD stops.
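Why agreement stops DD: when every subproblem returns the same optimal labeling, the indicator vectors s_i coincide on every shared coordinate, so each projection p_i = s_i − Σ_j s_j / m is zero and the λ update halts. A tiny numeric check (the one-hot encoding is an assumed convention):

```python
import numpy as np

# Three trees all pick label l0 for the shared variable (strong agreement).
s = np.eye(2)[[0, 0, 0]]      # one-hot supergradients, shape (3, 2)
p = s - s.mean(axis=0)        # projection: p_i = s_i - average
print(np.allclose(p, 0.0))    # True -> lambda stops changing, DD halts
```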

16. TRW. [Figure: a second toy problem over Va, Vb, Vc with labels l0, l1; subproblem values 4, 0, and 4, i.e. dual objective 8.]

17. TRW. [Same figure as slide 16.] Tree optima: f1(a) = 1, f1(b) = 1; f2(b) = 1, f2(c) = 0 or f2(b) = 0, f2(c) = 1; f3(c) = 1, f3(a) = 1. Tree 2 has two optimal labelings, one of which is consistent with the others: Weak Tree Agreement.

18. DD. [Same figure.] Optimal LP solution:
    ya;0  ya;1  yb;0  yb;1  yc;0  yc;1
     0     1     0     1     -     -
Values of yab;ik not shown, but we know yab;ik = ya;i yb;k.

19. Supergradients. [Same figure.]
            sa;0  sa;1  sb;0  sb;1  sc;0  sc;1
    tree 1:  0     1     0     1     -     -
    tree 2:  -     -     0     1     1     0
    tree 3:  0     1     -     -     0     1

20. Projected Supergradients. [Same figure.]
            pa;0  pa;1  pb;0  pb;1  pc;0  pc;1
    tree 1:  0     0     0     0     -     -
    tree 2:  -     -     0     0     0.5  -0.5
    tree 3:  0     0     -     -    -0.5   0.5

21. Update with Learning Rate η_t = 1. [Same figure.] The update λ_i^{t+1} = λ_i^t + η_t p_i is applied with the projected supergradients from slide 20.

22. Objective. [Figure: the reparameterized potentials after the step; subproblem values −0.5, 4, and 4.3.] The dual objective drops from 8 to 7.8: a decrease in the dual objective.

23. Supergradients. [Same updated figure.]
            sa;0  sa;1  sb;0  sb;1  sc;0  sc;1
    tree 1:  0     1     0     1     -     -
    tree 2:  -     -     1     0     0     1
    tree 3:  0     1     -     -     1     0

24. Projected Supergradients. [Same figure.]
            pa;0  pa;1  pb;0  pb;1  pc;0  pc;1
    tree 1:  0     0    -0.5   0.5   -     -
    tree 2:  -     -     0.5  -0.5  -0.5   0.5
    tree 3:  0     0     -     -     0.5  -0.5

25. Update with Learning Rate η_t = 1/2. [Same figure.] The same update rule, now with step size 1/2, is applied to the projected supergradients above.

26. Updated Subproblems. [Figure: the reparameterized potentials after the second update.]

27. Objective. [Same figure; subproblem values 0, 4.25, and 4.25.] The dual objective rises to 8.5: an increase in the dual objective. DD goes beyond TRW.

28. DD. [Same figure; subproblem values 0, 4.25, and 4.25.] Increase in the dual objective: DD provides the optimal dual objective.

29. Comparison.
    TRW                       DD
    Fast                      Slow
    Local maximum             Global maximum
    Requires min-marginals    Requires MAP estimates
Other forms of subproblems, tighter relaxations, and sparse high-order potentials are easier in the DD framework (also possible in the TRW framework).

30. Subproblems. [Figure: the 3×3 grid over Va–Vi split into subproblems.] Binary labeling problem; black edges are submodular, red edges are supermodular.

31. Subproblems. [Figure: the same grid split so that each subproblem contains only submodular (black) edges.] Each subproblem remains submodular over the iterations, since the dual updates modify only the unary potentials.

32. Tighter Relaxations. [Figure: the grid covered by its 4-cycles, e.g. (Va, Vb, Ve, Vd), (Vb, Vc, Vf, Ve), (Vd, Ve, Vh, Vg), (Ve, Vf, Vi, Vh).] Choosing 4-cycle subproblems gives a relaxation that is tight for these cycles: LP-S plus cycle inequalities.

33. High-Order Potentials. [Figure: the grid over Va–Vi decomposed into subproblems, one of which is a high-order clique.]

34. High-Order Potentials. [Figure: a clique over Vb, Vc, Ve, Vf; the potential θ_{c;y} assigns a value to each labeling y of the clique.] Subproblem: min_y θ_{c;y} + λ^T y, which naively costs O(h^|C|)!!
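A literal implementation of this subproblem enumerates every labeling of the clique, which is exactly the O(h^|C|) cost the slide warns about. A sketch, assuming θ_{c;y} is given as a dictionary from labeling tuples to values:

```python
import itertools
import numpy as np

def clique_subproblem(theta, lam, n_vars, n_labels):
    """Minimize theta[y] + sum_a lam[a, y[a]] over all labelings y of the clique."""
    best_val, best_y = np.inf, None
    for y in itertools.product(range(n_labels), repeat=n_vars):  # h^|C| terms
        val = theta[y] + sum(lam[a, y[a]] for a in range(n_vars))
        if val < best_val:
            best_val, best_y = val, y
    return best_val, best_y

# Toy usage: a random potential over a 4-variable binary clique.
rng = np.random.default_rng(0)
theta = {y: rng.normal() for y in itertools.product(range(2), repeat=4)}
print(clique_subproblem(theta, rng.normal(size=(4, 2)), 4, 2))
```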

35. Sparse High-Order Potentials. [Figure: the same clique; the potential takes one value on labelings with Σ_a ya;0 = 0 and another on labelings with Σ_a ya;0 > 0.] The subproblem min_y θ_{c;y} + λ^T y costs O(h^|C|) only in the worst case: sparsity can be exploited.
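If the potential equals a constant θ_max everywhere except on a short list of special labelings (as in the Σ_a ya;0 pattern above), the minimization splits into a scan over that list plus an unconstrained per-variable minimum on the constant region. A sketch under that assumption; the names and the collision handling are simplifications:

```python
import numpy as np

def sparse_clique_subproblem(special, theta_max, lam):
    """special maps a few labeling tuples to their own potential values;
    every other labeling has the constant potential theta_max."""
    n_vars = lam.shape[0]
    best_val, best_y = np.inf, None
    # (a) scan the short list of special labelings explicitly
    for y, th in special.items():
        val = th + sum(lam[a, y[a]] for a in range(n_vars))
        if val < best_val:
            best_val, best_y = val, y
    # (b) on the constant region, min_y lambda^T y decomposes per variable
    y_free = tuple(np.argmin(lam, axis=1))
    val = theta_max + lam.min(axis=1).sum()
    # sketch only: if y_free collides with a special labeling, a careful
    # implementation would fall back to the next-best free labeling
    if y_free not in special and val < best_val:
        best_val, best_y = val, y_free
    return best_val, best_y

# Toy usage: penalize every labeling except the two all-equal ones.
lam = np.random.default_rng(1).normal(size=(4, 2))
special = {(0, 0, 0, 0): 0.0, (1, 1, 1, 1): 0.0}
print(sparse_clique_subproblem(special, theta_max=10.0, lam=lam))
```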

36. Sparse High-Order Potentials. Many useful potentials are sparse: the Pn Potts model, uniqueness constraints, covering constraints, and pattern-based potentials. And now you can solve them efficiently!!
