Filter-based Mean-Field Inference for Random Fields with Higher-Order Terms and Product Label-Spaces
Vibhav Vineet*, Jonathan Warrell*, Philip H.S. Torr
http://cms.brookes.ac.uk/research/visiongroup/
*Joint first authors
Labelling problems
Many vision problems can be expressed as dense image labelling problems:
• Object segmentation
• Stereo
• Optical flow
Overview
• Graph cuts have so far proved the method of choice for CRFs
• Recently, message passing methods have started to achieve equal performance with much faster run times
• But only for pairwise CRFs
• Some problems require higher order information
  • Co-occurrence terms
  • Product label spaces
• Our contribution is to develop fast message passing based methods for certain classes of higher order information
Importance of co-occurrence terms
• Context is an important cue for global scene understanding
• Can you identify this object?
• We can identify it as a keyboard through scene context
• The keyboard, table and monitor often co-occur
• Shown recently to improve accuracy in Ladický et al (ECCV ’10)
Slide courtesy A Torralba
Importance of P^N Potts terms
• P^N Potts terms enforce region consistency
• Detector-based P^N potentials are formed by applying GrabCut to a bounding box to create a clique
• Improves over using pairwise terms only
[Figure panels: result without detections; set of detections; final result]
Slide courtesy L Ladický
Importance of higher order terms
• We use higher order information to improve object class segmentation …
• … and also to improve joint object and stereo labelling using product label spaces
[Figure panels: image; object labels; disparity labels]
CRF formulation
Standard CRF energy formulation for pairwise CRF inference: data term + smoothness term
CRF formulation
Standard CRF energy formulation for higher order CRF inference: data term + smoothness term + higher order terms + co-occurrence term
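The energy itself was shown as an image on these slides. A standard form consistent with the four labelled terms might be written as below. The symbols \psi_u, \psi_p, \psi_c, \mathcal{C} and the cost function C are assumed notation, not taken from the slides; \Lambda(\mathbf{x}) is the set of labels used by the labelling \mathbf{x}, as on the co-occurrence slides.

```latex
E(\mathbf{x}) =
  \underbrace{\sum_{i} \psi_u(x_i)}_{\text{data term}}
+ \underbrace{\sum_{i<j} \psi_p(x_i, x_j)}_{\text{smoothness term}}
+ \underbrace{\sum_{c \in \mathcal{C}} \psi_c(\mathbf{x}_c)}_{\text{higher order terms}}
+ \underbrace{C\big(\Lambda(\mathbf{x})\big)}_{\text{co-occurrence term}}
```

Dropping the last two terms gives the pairwise CRF energy.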
Inference
Standard CRF energy: data term + smoothness term + higher order term + co-occurrence term
• Can be solved using graph-cut based methods
• But with co-occurrence terms this is ~10 times slower than with pairwise terms only
• Relatively fast, but still computationally expensive!
Our inference
Standard CRF energy: data term + smoothness term + higher order term + co-occurrence term
• We use a filter-based mean-field inference approach
• Our method achieves a roughly 10-40x speed-up over graph-cut based methods
• Much faster due to efficient filtering
Efficient inference in pairwise CRF
• Krähenbühl et al (NIPS ’11) propose an efficient method for inference in pairwise CRFs under two assumptions:
  • Mean-field approximation to the CRF
  • Pairwise weights take the form of a linear combination of Gaussian kernels
• They achieve almost a 5x speed-up over graph cuts and also allow dense connectivity
[Figure: fully connected (dense) pairwise CRF inference]
Slide courtesy P Krähenbühl
Mean-field based inference
• Mean-field approximation
  • Approximate the intractable P with Q from a tractable family
  • Minimize the KL-divergence between Q and P
Slide courtesy S Nowozin
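For reference, the approximation named on this slide is usually written as follows. The fully factorised form of Q is an assumption here (it is the standard mean-field choice); the slide itself only says "a tractable family".

```latex
Q(\mathbf{x}) = \prod_{i} Q_i(x_i),
\qquad
\mathrm{KL}(Q \,\|\, P) = \sum_{\mathbf{x}} Q(\mathbf{x}) \log \frac{Q(\mathbf{x})}{P(\mathbf{x})}
```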
Mean-field based inference
• The mean-field update for pairwise terms can be evaluated using Gaussian convolutions
• We evaluate two approaches for Gaussian convolution:
  • Permutohedral lattice based filtering**
  • Domain transform based filtering***
**Adams et al. Fast high-dimensional filtering using the permutohedral lattice. Computer Graphics Forum, 2010
***Gastal et al. Domain transform for edge-aware image and video processing. ACM Trans. Graph., 2011
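To make the link between the pairwise update and Gaussian convolution concrete, here is a minimal sketch in pure Python. It assumes a Potts compatibility, a single Gaussian kernel over a scalar feature, and parallel updates; the function name and parameters are hypothetical. The `filtered` step is a brute-force O(N^2) Gaussian convolution, which is the part a permutohedral lattice or domain transform filter would approximate much faster.

```python
import math


def mean_field_dense_potts(unary, feats, weight=1.0, sigma=1.0, iters=5):
    """Naive mean-field for a fully connected Potts CRF (illustrative only).

    unary[i][l] is the cost of label l at variable i; feats[i] is a scalar
    feature used by the single Gaussian kernel (an assumption of this sketch).
    """
    n, num_labels = len(unary), len(unary[0])

    # Gaussian kernel between every pair of distinct variables.
    kernel = [
        [0.0 if i == j else
         math.exp(-((feats[i] - feats[j]) ** 2) / (2.0 * sigma ** 2))
         for j in range(n)]
        for i in range(n)
    ]

    def normalise(costs):
        # Q(l) proportional to exp(-cost), shifted for numerical stability.
        lowest = min(costs)
        weights = [math.exp(-(c - lowest)) for c in costs]
        total = sum(weights)
        return [w / total for w in weights]

    q = [normalise(row) for row in unary]
    for _ in range(iters):
        # "Filtering": filtered[i][l] = sum_j k(i, j) * Q_j(l).
        filtered = [
            [sum(kernel[i][j] * q[j][l] for j in range(n))
             for l in range(num_labels)]
            for i in range(n)
        ]
        new_q = []
        for i in range(n):
            mass = sum(filtered[i])
            # Potts: label l is penalised by the filtered mass on other labels.
            costs = [unary[i][l] + weight * (mass - filtered[i][l])
                     for l in range(num_labels)]
            new_q.append(normalise(costs))
        q = new_q  # parallel update of all marginals
    return q


# Five variables on a line; the middle one weakly prefers label 1.
example_unary = [[0.0, 2.0], [0.0, 2.0], [0.6, 0.4], [0.0, 2.0], [0.0, 2.0]]
example_q = mean_field_dense_potts(example_unary, [0.0, 1.0, 2.0, 3.0, 4.0],
                                   weight=1.0, sigma=2.0)
```

In this toy example the neighbours pull the middle variable towards label 0, which is the smoothing behaviour the dense pairwise term is meant to give.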
Q distribution
Q distribution for different classes across different iterations
[Figure: Q shown at iterations 0, 1, 2 and 10; scale from 0 to 1]
Higher order mean-field update
• Marginal update in mean-field
[Figure legend: labels 1, 2, 3]
• High time complexity for general higher order terms: O(L^|C|)
• We show how these can be solved efficiently for P^N Potts and co-occurrence terms
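The marginal update was shown as an image. The generic mean-field update for a model with clique potentials is usually written as below; the symbols \psi_u, \psi_c and Z_i are assumed notation. The inner sum runs over all joint states of the other variables in the clique, which is where the exponential cost for general higher order terms comes from.

```latex
Q_i(x_i = l) = \frac{1}{Z_i} \exp\Big\{
  -\psi_u(x_i = l)
  - \sum_{c \ni i} \; \sum_{\mathbf{x}_c \,:\, x_i = l}
    \Big( \prod_{j \in c \setminus \{i\}} Q_j(x_j) \Big)\, \psi_c(\mathbf{x}_c)
\Big\}
```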
P^N Potts example
• P^N Potts enforces region-consistent labellings
• Label set consists of 3 labels
• Potts patterns
• Clique of 6 variables
• Example: detector potentials
Expectation update
• Sum across possible states of the clique, split into two cases: the clique takes label l, or the clique does not take label l
• By rearranging the expectation in this way, we reduce the time complexity from O(L^N) to O(NL)
• Can be extended to pattern-based potentials (Komodakis et al CVPR ’09)
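The rearrangement can be checked numerically. The sketch below assumes a Potts-type clique potential that costs gamma[l] when every variable in the clique takes label l and gamma_max otherwise; those names and the function names are hypothetical. It compares a brute-force expectation over all joint states with the two-case closed form, using prefix and suffix products so that all N x L values cost O(NL) in total.

```python
import itertools


def potts_cost(labels, gamma, gamma_max):
    """gamma[l] if the whole clique takes label l, otherwise gamma_max."""
    first = labels[0]
    return gamma[first] if all(x == first for x in labels) else gamma_max


def expected_cost_brute(q, i, l, gamma, gamma_max):
    """Expected clique cost given x_i = l, summing over every joint state."""
    n, num_labels = len(q), len(q[0])
    others = [j for j in range(n) if j != i]
    total = 0.0
    for states in itertools.product(range(num_labels), repeat=len(others)):
        labels = [l] * n
        prob = 1.0
        for j, s in zip(others, states):
            labels[j] = s
            prob *= q[j][s]
        total += prob * potts_cost(labels, gamma, gamma_max)
    return total


def expected_cost_fast(q, gamma, gamma_max):
    """Same expectation for every (i, l) using the two-case split."""
    n, num_labels = len(q), len(q[0])
    out = [[0.0] * num_labels for _ in range(n)]
    for l in range(num_labels):
        # prefix[j] = product of q[0..j-1][l]; suffix[j] = product of q[j..n-1][l]
        prefix = [1.0] * (n + 1)
        suffix = [1.0] * (n + 1)
        for j in range(n):
            prefix[j + 1] = prefix[j] * q[j][l]
        for j in range(n - 1, -1, -1):
            suffix[j] = suffix[j + 1] * q[j][l]
        for i in range(n):
            # Probability that all other clique variables also take label l.
            p_all = prefix[i] * suffix[i + 1]
            out[i][l] = gamma[l] * p_all + gamma_max * (1.0 - p_all)
    return out
```

Using prefix and suffix products rather than dividing the full product by Q_i(l) avoids division by zero when a marginal is exactly 0.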
Global co-occurrence terms
Co-occurrence models which objects belong together
• Λ(x) = { aeroplane, tree, flower, building, boat, grass, sky }
• Λ(x) = { building, tree, grass, sky }
Global co-occurrence terms
• Associates a cost with each possible label subset
• We use a second-order assumption on the cost function
Our model
• We define a cost over a set of latent variables Y_1 … Y_L
• Each latent variable represents a label
• Latent variables have binary states: off and on
• Costs include unary and pairwise costs
• Each latent variable node is connected to each image variable node
[Figure: latent layer Y connected to image layer X, with edges labelled K]
Global co-occurrence constraints
• Constraint on the model: if a latent variable is off, no image variable should take that label
• Pay cost K for each constraint violation
• Overall complexity: O(NL + L^2)
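One way to see where the O(NL + L^2) figure comes from is the sketch below of a single parallel mean-field step for the co-occurrence part of the model. The exact form of the costs is an assumption of this sketch: a unary cost for switching a latent variable on, a pairwise cost paid when two latent variables are both on, and cost K per pixel that takes a label whose latent variable is off. All names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cooccurrence_step(q_img, q_on, unary_on, pair_on, K):
    """One mean-field step for the latent label variables (illustrative).

    q_img[i][l]: current marginal of image variable i for label l.
    q_on[l]:     current probability that latent variable l is on.
    Returns the new q_on and the extra cost each label adds to every image
    variable's update.
    """
    n, num_labels = len(q_img), len(q_img[0])

    # O(NL): expected number of image variables taking each label.
    mass = [sum(q_img[i][l] for i in range(n)) for l in range(num_labels)]

    new_on = []
    for l in range(num_labels):  # O(L^2) over pairs of latent variables
        cost_on = unary_on[l] + sum(
            pair_on[l][m] * q_on[m] for m in range(num_labels) if m != l)
        cost_off = K * mass[l]  # expected violation cost if l is switched off
        # Q(on) = exp(-cost_on) / (exp(-cost_on) + exp(-cost_off))
        new_on.append(sigmoid(cost_off - cost_on))

    # Each image variable pays K for label l with probability Q(l is off).
    extra_cost = [K * (1.0 - q_on[l]) for l in range(num_labels)]
    return new_on, extra_cost
```

The per-label totals in `mass` are shared by all latent variables, so the image variables are visited once per step rather than once per latent variable.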
Product label space
• Assign an object label and a disparity label to each pixel
• Joint energy function defined over the product label space: data term + smoothness term + higher order term
[Figure: inference in product label space, from left and right camera images to object class segmentation and dense stereo reconstruction]
PascalVOC-10 dataset - qualitative
[Figure columns: image; ground truth; ours; alpha-expansion**; fully connected pairwise CRF*]
Observe an improvement over alternative methods
*Krähenbühl et al. Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, NIPS 11
**Ladický L. et al. Graph cut based inference with co-occurrence statistics, ECCV-10
PascalVOC - quantitative
• Observe an improvement of 2.3% in I/U score over Ladický et al.*
• Achieve an 8-9x speed-up compared to the alpha-expansion based method of Ladický et al.*
*Ladický L. et al. Graph cut based inference with co-occurrence statistics, ECCV-10
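For readers unfamiliar with the I/U (intersection over union) score, the sketch below shows how it can be computed per class from flat lists of predicted and ground-truth labels. It is a simplification: the function names are hypothetical, and details of the benchmark protocol such as ignoring unlabelled pixels and accumulating counts over the whole dataset are left out.

```python
def iu_scores(pred, truth, num_classes):
    """Per-class intersection over union; None for classes that never occur."""
    scores = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        scores.append(inter / union if union else None)
    return scores


def mean_iu(scores):
    """Average over the classes that actually occur."""
    present = [s for s in scores if s is not None]
    return sum(present) / len(present)


# Four pixels, three possible classes; class 2 never appears.
example_scores = iu_scores([0, 0, 1, 1], [0, 1, 1, 1], 3)
```

Because the union counts both false positives and false negatives, the score penalises over-segmentation and under-segmentation of a class alike.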
Leuven dataset - qualitative
[Figure panels: left image; right image; two pairs of ours vs. ground truth]
Leuven dataset - quantitative
Achieve a 12-35x speed-up compared to the alpha-expansion based method of Ladický et al.*
*Ladický L. et al. Joint optimisation for object class segmentation and dense stereo reconstruction. BMVC-2010
Conclusion
• We provide efficient ways of incorporating higher-order terms into fully connected pairwise CRF models
• Demonstrate improved efficiency compared to previous models with higher-order terms
• Also demonstrate improved accuracy over previous approaches
• Similar methods are applicable to a broad range of vision problems
• Code is available for download: http://cms.brookes.ac.uk/staff/VibhavVineet/
Joint object-stereo model
• Introduce two different sets of variables
  • X_i: object variable
  • Y_i: disparity variable
  • Z_i = [x_i, y_i]
• Messages exchanged between object and stereo variables
• Joint energy function: unary + pairwise + higher order
Marginal update for object variables
• Message from disparity variables to object variables
• Filtering is done using a permutohedral lattice based filtering* strategy
*Adams A. et al. Fast high-dimensional filtering using the permutohedral lattice. Computer Graphics Forum-2010
Marginal update for disparity variables
• Message from object variables to disparity variables
• Filtering is done using a domain transform based filtering* strategy
*Gastal E.S.L. et al. Domain transform for edge-aware image and video processing. ACM Trans. Graph.-2011
Mean-field Vs. Graph-cuts
• Measure I/U score on PascalVOC-10 segmentation
• Increase standard deviation for mean-field
• Increase window size for graph-cuts method
• Both achieve similar accuracy
• Time complexity is very high, making it infeasible to work with a large neighbourhood system
Window sizes • Comparison on matched energy Impact of adding more complex costs and increasing window size