Tractable Higher Order Models in Computer Vision (Part II) Presented by Xiaodan Liang Slides from Carsten Rother, Sebastian Nowozin, Pushmeet Kohli Microsoft Research Cambridge
Part II • Submodularity • Move making algorithms • Higher-order model: Pn Potts model
Factoring distributions: the problem is inherently combinatorial!
Key property: diminishing returns. Compare selection A = {} with selection B = {X2 "Rash", X3 "Male"} when predicting Y "Sick": adding X1 "Fever" to the empty selection A helps a lot, while adding X1 to B doesn't help much. Theorem [Krause, Guestrin UAI '05]: the information gain F(A) in Naïve Bayes models is submodular! Submodularity: adding a new feature s to a small set A gives a large improvement; adding s to a superset B gives only a small improvement.
Why is submodularity useful? Theorem [Nemhauser et al. '78]: the greedy maximization algorithm returns Agreedy with F(Agreedy) ≥ (1 − 1/e) max_{|A| ≤ k} F(A), i.e., at least ~63% of the optimal value. • The greedy algorithm gives a near-optimal solution! • For info-gain: this guarantee is the best possible unless P = NP! [Krause, Guestrin UAI '05]
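To make the greedy guarantee concrete, here is a minimal Python sketch of the greedy maximization loop; the coverage function used as F is an illustrative assumption, and any monotone submodular F could be plugged in instead.

```python
# A minimal sketch of the greedy algorithm behind the (1 - 1/e) guarantee
# of Nemhauser et al. '78. The coverage function below is an assumption
# for illustration; any monotone submodular F works.

def greedy_maximize(F, V, k):
    """Greedily pick k elements of V, each time adding the element
    with the largest marginal gain F(A + {s}) - F(A)."""
    A = set()
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for s in V - A:
            gain = F(A | {s}) - F(A)
            if gain > best_gain:
                best, best_gain = s, gain
        A.add(best)
    return A

# Example: a coverage function (monotone submodular).
groups = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}, 4: {"a", "d"}}
F = lambda A: len(set().union(*(groups[i] for i in A))) if A else 0
print(greedy_maximize(F, set(groups), k=2))   # a greedily chosen 2-element subset
```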
Submodularity in Machine Learning • Many ML problems involve a submodular F: • Minimization: A* = argmin_A F(A) • Structure learning (A* = argmin_A I(XA; XV\A)) • Clustering • MAP inference in Markov Random Fields • … • Maximization: A* = argmax_A F(A) • Feature selection • Active learning • Ranking • …
Submodular set functions • A set function F on V is called submodular if for all A, B ⊆ V: F(A) + F(B) ≥ F(A ∪ B) + F(A ∩ B) • Equivalent diminishing-returns characterization: for all A ⊆ B ⊆ V and s ∉ B, F(A ∪ {s}) − F(A) ≥ F(B ∪ {s}) − F(B) (adding s to the small set A gives a large improvement; adding s to the superset B gives a small improvement)
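Both characterizations can be verified mechanically on small ground sets; the brute-force checker below is only a sanity-check sketch, and the function F = min(|A|, 2) used in the example is an illustrative assumption.

```python
# Brute-force checks of the two equivalent definitions of submodularity
# on a small ground set. A sketch for sanity checking, not an efficient
# test (it enumerates all pairs of subsets).
from itertools import chain, combinations

def subsets(V):
    V = list(V)
    return [set(c) for c in chain.from_iterable(
        combinations(V, r) for r in range(len(V) + 1))]

def is_submodular(F, V):
    """Check F(A) + F(B) >= F(A | B) + F(A & B) for all A, B."""
    return all(F(A) + F(B) >= F(A | B) + F(A & B)
               for A in subsets(V) for B in subsets(V))

def has_diminishing_returns(F, V):
    """Check F(A+{s}) - F(A) >= F(B+{s}) - F(B) for all A subset of B, s not in B."""
    return all(F(A | {s}) - F(A) >= F(B | {s}) - F(B)
               for B in subsets(V) for A in subsets(B)
               for s in set(V) - B)

V = {"a", "b", "c"}
F = lambda A: min(len(A), 2)   # a concave function of |A|, hence submodular
print(is_submodular(F, V), has_diminishing_returns(F, V))   # True True
```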
Closedness properties. Let F1, …, Fm be submodular functions on V and λ1, …, λm > 0. Then F(A) = Σi λi Fi(A) is submodular! Submodularity is closed under nonnegative linear combinations, an extremely useful fact! • Fθ(A) submodular for every θ ⇒ Σθ P(θ) Fθ(A) submodular! • Multicriterion optimization: F1, …, Fm submodular, λi ≥ 0 ⇒ Σi λi Fi(A) submodular
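The closedness claim follows in one line from the set-function definition of submodularity above; a short worked derivation:

```latex
% Using submodularity of each F_i and \lambda_i \ge 0, the weighted sum
% F(A) = \sum_i \lambda_i F_i(A) satisfies the same inequality.
\begin{align*}
F(A) + F(B)
  &= \sum_i \lambda_i \bigl( F_i(A) + F_i(B) \bigr) \\
  &\ge \sum_i \lambda_i \bigl( F_i(A \cup B) + F_i(A \cap B) \bigr)
   = F(A \cup B) + F(A \cap B).
\end{align*}
```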
Submodularity and Concavity [plot of a function g(|A|) against |A|]
Maximum of submodular functions. Suppose F1(A) and F2(A) are submodular. Is F(A) = max(F1(A), F2(A)) submodular? [plot of F1(A), F2(A) and their pointwise maximum against |A|] max(F1, F2) is not submodular in general!
Minimum of submodular functions. Well, maybe F(A) = min(F1(A), F2(A)) instead? Counterexample: F({b}) − F(∅) = 0 < F({a,b}) − F({a}) = 1, so min(F1, F2) is not submodular in general! But stay tuned…
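The slide's numbers F({b}) − F(∅) = 0 < F({a,b}) − F({a}) = 1 are reproduced by the standard counterexample below; the concrete F1 and F2 are an assumption, since the slide does not spell them out.

```python
# Minimum of two (modular, hence submodular) functions that violates
# diminishing returns. F1, F2 are assumed for illustration.
F1 = lambda A: 1 if "a" in A else 0
F2 = lambda A: 1 if "b" in A else 0
F  = lambda A: min(F1(A), F2(A))

print(F({"b"}) - F(set()))            # 0
print(F({"a", "b"}) - F({"a"}))       # 1  -> diminishing returns violated
```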
The submodular polyhedron PF: PF = {x ∈ R^V : x(A) ≤ F(A) for all A ⊆ V}. Example with V = {a, b}: the constraints are x({a}) ≤ F({a}), x({b}) ≤ F({b}), x({a,b}) ≤ F({a,b}) [plot of PF in the (x_a, x_b) plane]
Example: Lovász extension. g(w) = max {wᵀx : x ∈ PF}. For V = {a, b}: g([0,1]) = [0,1]ᵀ[−2,2] = 2 = F({b}) and g([1,1]) = [1,1]ᵀ[−1,1] = 0 = F({a,b}). For w = [0,1], g(w) is computed greedily: ordering e1 = b, e2 = a (since w(e1) = 1 > w(e2) = 0), xw(e1) = F({b}) − F(∅) = 2, xw(e2) = F({b,a}) − F({b}) = −2, so xw = [−2, 2].
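A sketch of the greedy (Edmonds) evaluation of the Lovász extension, reproducing the worked example above; the concrete values F(∅) = 0, F({a}) = −1, F({b}) = 2, F({a,b}) = 0 are filled in from the numbers shown on the slide and should be read as an assumption.

```python
# Greedy evaluation of the Lovász extension g(w) = max { w.x : x in P_F }.
# F below is assumed so as to match the slide's worked example on V = {a, b}.

def lovasz_extension(F, w):
    """Sort the coordinates of w in decreasing order and accumulate
    marginal gains; returns g(w) and the maximizing vertex x_w."""
    order = sorted(w, key=w.get, reverse=True)      # e1, e2, ...
    x, prefix, g = {}, set(), 0.0
    for e in order:
        x[e] = F(prefix | {e}) - F(prefix)          # marginal gain of e
        g += w[e] * x[e]
        prefix.add(e)
    return g, x

F = lambda A: {(): 0, ("a",): -1, ("b",): 2, ("a", "b"): 0}[tuple(sorted(A))]
print(lovasz_extension(F, {"a": 0.0, "b": 1.0}))    # (2.0, {'b': 2, 'a': -2})
print(lovasz_extension(F, {"a": 1.0, "b": 1.0}))    # (0.0, {'a': -1, 'b': 1})
```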
Why is this useful? Theorem [Lovász '83]: g(w) attains its minimum over [0,1]^n at a corner! If we can minimize g on [0,1]^n, we can minimize F (at corners, g and F take the same values). F(A) submodular ⇒ g(w) convex (and efficient to evaluate). Does the converse also hold, i.e., does a convex extension imply submodularity? No: consider g(w1, w2, w3) = max(w1, w2 + w3) over {a, b, c}; its values at the corners define an F with F({a,b}) − F({a}) = 0 < F({a,b,c}) − F({a,c}) = 1, which is not submodular.
Minimizing a submodular function • Ellipsoid algorithm • Interior-point algorithm
Example: Image denoising with a pairwise Markov Random Field. [figure: grid MRF in which each hidden pixel Yi is connected to its observation Xi and to its grid neighbors] P(x1,…,xn, y1,…,yn) = Πi,j ψi,j(yi, yj) Πi φi(xi, yi), with Xi the noisy pixels and Yi the "true" pixels. We want argmax_y P(y | x) = argmax_y log P(x, y) = argmin_y Σi,j Ei,j(yi, yj) + Σi Ei(yi), where Ei,j(yi, yj) = −log ψi,j(yi, yj). When is this MAP inference efficiently solvable (in high-treewidth graphical models)?
MAP inference in Markov Random Fields [Kolmogorov et al., PAMI '04; see also: Hammer, Ops Res '65]
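As an illustration of why submodular pairwise energies are tractable, here is a sketch of exact binary MAP inference by a minimum s-t cut, in the spirit of Hammer '65 and Kolmogorov et al. The tiny problem instance, the attractive (Potts-style) pairwise terms, and the use of networkx are assumptions for illustration, not part of the original slides.

```python
# Exact MAP inference for a binary pairwise MRF with submodular
# (attractive) interactions via the standard source/sink min-cut gadget.
import networkx as nx

def map_by_graph_cut(unary, pairwise):
    """unary[i] = (E_i(0), E_i(1)); pairwise[(i, j)] = w_ij >= 0 penalizing
    y_i != y_j. Returns the minimizing labeling and its energy."""
    G = nx.DiGraph()
    for i, (e0, e1) in unary.items():
        G.add_edge("s", i, capacity=e1)   # cut when y_i = 1, pays E_i(1)
        G.add_edge(i, "t", capacity=e0)   # cut when y_i = 0, pays E_i(0)
    for (i, j), w in pairwise.items():
        G.add_edge(i, j, capacity=w)      # cut when y_i = 0, y_j = 1
        G.add_edge(j, i, capacity=w)      # cut when y_j = 0, y_i = 1
    cut_value, (source_side, sink_side) = nx.minimum_cut(G, "s", "t")
    return {i: int(i in sink_side) for i in unary}, cut_value

unary = {0: (4.0, 1.0), 1: (2.0, 3.0), 2: (5.0, 1.0)}   # (cost of label 0, cost of label 1)
pairwise = {(0, 1): 2.0, (1, 2): 2.0}
print(map_by_graph_cut(unary, pairwise))                # ({0: 1, 1: 1, 2: 1}, 5.0)
```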
Part II • Submodularity • Move making algorithms • Higher-order model: Pn Potts model
Move making: expansion moves and swap moves for this problem
• If the pairwise potential functions define a metric, then the energy function in equation (8) can be approximately minimized using alpha expansions. • If the pairwise potential functions define a semi-metric, it can be approximately minimized using alpha-beta swaps.
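A small helper, as a sketch, for checking which case applies: the semi-metric and metric conditions below are the standard ones (symmetry, non-negativity, zero only on the diagonal, plus the triangle inequality for a metric), and the Potts potential in the example is an illustrative assumption.

```python
# Classify a pairwise potential theta(a, b) over a finite label set as a
# metric (alpha expansions apply), a semi-metric (alpha-beta swaps apply),
# or neither.
from itertools import product

def classify_potential(theta, labels):
    semi = all(theta(a, b) == theta(b, a) >= 0 and
               (theta(a, b) == 0) == (a == b)
               for a, b in product(labels, repeat=2))
    metric = semi and all(theta(a, c) <= theta(a, b) + theta(b, c)
                          for a, b, c in product(labels, repeat=3))
    return "metric" if metric else "semi-metric" if semi else "neither"

potts = lambda a, b: 0.0 if a == b else 1.0   # assumed example potential
print(classify_potential(potts, labels=[0, 1, 2]))   # metric
```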
Move Energy • Each move is parameterized by a vector of binary move variables t • A transformation function T(x, t) maps the current labeling x and the move t to a new labeling • The energy of a move t: Em(t) = E(T(x, t)) • The optimal move: t* = argmin_t Em(t) • Submodular set functions play an important role in energy minimization, as they can be minimized in polynomial time
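A minimal sketch of the move-energy idea, under the assumed convention that the move variable t_i = 1 means "pixel i switches to label alpha" and t_i = 0 means "keep its current label"; the toy energy is illustrative, not the one from the paper.

```python
# Transformation function and move energy for an expansion move.

def expansion_transform(x, t, alpha):
    """Map the current labeling x and binary move vector t to a new labeling."""
    return [alpha if ti == 1 else xi for xi, ti in zip(x, t)]

def move_energy(E, x, alpha):
    """The move energy is a pseudo-Boolean function of t; its minimizer is
    the optimal move (found by graph cut in practice)."""
    return lambda t: E(expansion_transform(x, t, alpha))

# Toy usage: a 3-pixel labeling and a simple Potts-style energy.
E = lambda y: sum(1 for a, b in zip(y, y[1:]) if a != b)
Em = move_energy(E, x=[0, 1, 2], alpha=1)
print(Em([0, 0, 0]), Em([1, 0, 1]))   # energy of "no move" vs. a specific move
```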
Higher order potentials • The class of higher-order clique potentials for which the expansion and swap moves can be computed in polynomial time. The clique potentials take the form:
Question you should be asking: • Can my higher-order potential be solved using α-expansions? • Show that the move energy is submodular for all xc
Moves for Higher Order Potentials • Form of the higher-order potentials [figure: clique c over pixels xi, xj, xk, xl, xm] • Clique inconsistency function: • Pairwise potential: • Two variants: Sum form and Max form
Theoretical Results: Swap • The move energy is always submodular if the clique potential's function is non-decreasing and concave. See paper for proofs.
Condition for the swap move • Concave function: f(λa + (1 − λ)b) ≥ λ f(a) + (1 − λ) f(b) for all a, b and λ ∈ [0, 1]
To prove: • All projections onto two variables of any alpha-beta swap move energy are submodular • The cost of any configuration:
Substitute: • Constraint 1: • Lemma 1: • Constraint 2: Together these show that the theorem is true.
Condition for alpha expansion • Metric: θ(α, β) = 0 ⟺ α = β; θ(α, β) = θ(β, α) ≥ 0; θ(α, β) ≤ θ(α, γ) + θ(γ, β) (triangle inequality)
Moves for Higher Order Potentials • Form of the higher-order potentials [figure: clique c over pixels xi, xj, xk, xl, xm] • Clique inconsistency function: • Pairwise potential: • Two variants: Sum form and Max form
Part II • Submodularity • Move making algorithms • Higher-order model: Pn Potts model
Image Segmentation. E: {0,1}^n → R, n = number of pixels, 0 → fg, 1 → bg. E(X) = Σi ci xi + Σi,j dij |xi − xj| (unary cost plus pairwise smoothness). [Boykov and Jolly '01] [Blake et al. '04] [Rother et al. '04]
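A direct transcription of this energy into code, as a sketch on a 4-connected grid; the unary costs and the constant smoothness weight in the example are illustrative assumptions.

```python
# Segmentation energy E(X) = sum_i c_i x_i + sum_{ij} d_ij |x_i - x_j|
# on a 4-neighborhood grid, with x_i in {0, 1} (0 = fg, 1 = bg).
import numpy as np

def segmentation_energy(x, c, d=1.0):
    """x, c: 2-D arrays of the same shape; d: constant pairwise weight."""
    unary = np.sum(c * x)
    pairwise = d * (np.sum(np.abs(np.diff(x, axis=0))) +
                    np.sum(np.abs(np.diff(x, axis=1))))
    return unary + pairwise

x = np.array([[0, 0, 1], [0, 1, 1]])       # a candidate fg/bg labeling
c = np.array([[2, 1, -1], [1, -2, -3]])    # assumed unary costs c_i
print(segmentation_energy(x, c))           # unary cost plus boundary penalty
```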
Pn Potts Potentials. Patch dictionary (tree): for a patch p, h(Xp) = 0 if xi = 0 for all i ∈ p, and Cmax otherwise. [slide credits: Kohli]
Pn Potts Potentials. E: {0,1}^n → R, n = number of pixels, 0 → fg, 1 → bg. E(X) = Σi ci xi + Σi,j dij |xi − xj| + Σp hp(Xp), where hp(Xp) = 0 if xi = 0 for all i ∈ p, and Cmax otherwise. [slide credits: Kohli]
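Extending the segmentation sketch above with the Pn Potts higher-order term: each patch p pays 0 if all of its pixels take label 0 and Cmax otherwise. The patch list and Cmax below are illustrative assumptions.

```python
# The higher-order term sum_p h_p(X_p) of the Pn Potts model.

def pn_potts_term(x, patches, c_max):
    """x: mapping from pixel index to binary label; patches: list of
    pixel-index lists; returns sum_p h_p(X_p)."""
    return sum(0 if all(x[i] == 0 for i in p) else c_max for p in patches)

x = {0: 0, 1: 0, 2: 1, 3: 0}
patches = [[0, 1], [1, 2, 3]]              # one consistent patch, one violated
print(pn_potts_term(x, patches, c_max=5))  # 0 + 5 = 5
```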
Theoretical Results: Expansion • The move energy is always submodular if the potential function is increasing and linear. See paper for proofs.