Efficient Inference for Fully-Connected CRFs with Stationarity. Yimeng Zhang, Tsuhan Chen. CVPR 2012.
Summary • Explore object-class segmentation with fully-connected CRF models • The only restriction on the pairwise terms is 'spatial stationarity' (i.e. they depend only on relative locations) • Show how efficient inference can be achieved by • Using a QP formulation • Using the FFT to calculate gradients with complexity O(N log N) (near-linear in the number of pixels N)
Fully-connected CRF model • General pairwise CRF model: • Image I • Class labeling, X: • Label set, L: • V = set of pixels, N_i = neighbourhood of pixel i, Z(I) = partition function, psi = potential functions
Fully-connected CRF model • General pairwise CRF model: • In fully-connected CRF, for all i, N_i = V
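The model equation itself did not survive in this transcript. Given the symbols listed above (V, N_i, Z(I), psi), a pairwise CRF is conventionally written as the Gibbs distribution below. This is the standard textbook form and an assumption about what the slide showed, not a copy of it; the label count K and the sign convention are likewise assumed.

```latex
% Assumed standard form of a pairwise CRF; x_i is the label of pixel i, x_i \in L = \{1, \dots, K\}
P(X \mid I) = \frac{1}{Z(I)} \exp\Big( -\sum_{i \in V} \psi_i(x_i)
              \;-\; \sum_{i \in V} \sum_{j \in N_i} \psi_{ij}(x_i, x_j) \Big)
% Fully-connected case: N_i = V for all i, so the inner sum ranges over every pixel.
```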
Unary Potential • Unary potential generates a score for each object class per pixel (TextonBoost)
Pairwise Potential • Pairwise potential measures compatibility of the labels at each pair of pixels • Combines spatial and colour contrast factors
Pairwise Potential • Colour contrast: • Spatial term:
Pairwise Potential • Learning the spatial term
MAP inference using QP relaxation • Introduce a binary indicator variable for each pixel and label • MAP inference expressed as a quadratic integer program, and relaxed to give the QP
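The two steps above can be written out as follows. The notation (mu for the indicators, theta for the negated potentials) is assumed here, loosely following the usual presentation of QP relaxations for MAP, and is not taken from the slide.

```latex
% Integer program: \mu_i(k) = 1 iff pixel i takes label k (notation assumed)
\max_{\mu} \; \sum_{i \in V} \sum_{k \in L} \theta_i(k)\, \mu_i(k)
  + \sum_{i \in V} \sum_{j \in N_i} \sum_{k, l \in L} \theta_{ij}(k, l)\, \mu_i(k)\, \mu_j(l)
\quad \text{s.t.} \quad \sum_{k \in L} \mu_i(k) = 1, \;\; \mu_i(k) \in \{0, 1\}

% QP relaxation: keep the objective and the sum-to-one constraint,
% but replace \mu_i(k) \in \{0, 1\} with \mu_i(k) \ge 0.
```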
MAP inference using QP relaxation • The QP relaxation has been proved to be tight in all cases (Ravikumar ICML 2006 [24]) • Moreover, it is convex whenever the matrix of edge weights is negative-definite • Additive bound for the non-convex case • QP requires O(KN) variables, LP requires O(K^2 E) (K labels, N pixels, E edges)
MAP inference using QP relaxation • Gradient • Derive fixed-point update by forming Lagrangian and setting its derivative to 0
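The gradient expression is missing from this transcript. As a minimal sketch, assuming the relaxed objective is a linear unary term plus a quadratic pairwise term, the gradient can be computed as below. The array layout and the names `qp_objective` and `qp_gradient` are hypothetical, and the slide's fixed-point update is not reproduced because its exact form is not recoverable here.

```python
import numpy as np


def qp_objective(mu, unary, pairwise):
    """Relaxed QP objective (assumed form).

    mu, unary: arrays of shape (N, K), one row per pixel, one column per label.
    pairwise:  array of shape (N, N, K, K); pairwise[i, j, k, l] weights the
               product mu[i, k] * mu[j, l].
    """
    linear = np.sum(unary * mu)
    quadratic = np.einsum("ik,ijkl,jl->", mu, pairwise, mu)
    return linear + quadratic


def qp_gradient(mu, unary, pairwise):
    """Gradient of qp_objective with respect to mu, shape (N, K).

    Each mu[i, k] appears once as the left factor and once as the right
    factor of the quadratic form, hence the two einsum terms.
    """
    left = np.einsum("ijkl,jl->ik", pairwise, mu)
    right = np.einsum("jilk,jl->ik", pairwise, mu)
    return unary + left + right
```

With a dense `pairwise` array this costs O(N^2 K^2) per evaluation, which is the cost the FFT-based evaluation in the following slides is meant to avoid.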
Efficiently evaluating the gradient • Required summation • Would be a convolution without the color term • With the color term, it requires 5D filtering • Can be approximated by clustering into C color clusters, => C convolutions across
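The remark that the summation would be a convolution without the color term can be illustrated with a small sketch. It ignores the color term entirely and assumes circular (wrap-around) image boundaries for simplicity; the function names are hypothetical. A weight that depends only on the relative location p - q turns the sum over all pixels into a convolution, which the FFT evaluates in O(N log N) instead of O(N^2).

```python
import numpy as np


def stationary_sum_direct(mu, w):
    """Brute force: out[p] = sum over q of w[(p - q) mod shape] * mu[q]."""
    H, W = mu.shape
    out = np.zeros((H, W), dtype=float)
    for py in range(H):
        for px in range(W):
            for qy in range(H):
                for qx in range(W):
                    out[py, px] += w[(py - qy) % H, (px - qx) % W] * mu[qy, qx]
    return out


def stationary_sum_fft(mu, w):
    """Same sum via the convolution theorem (circular convolution)."""
    return np.real(np.fft.ifft2(np.fft.fft2(w) * np.fft.fft2(mu)))


# Usage: both routes agree up to floating-point error.
rng = np.random.default_rng(0)
mu_demo = rng.random((6, 5))   # one label's relaxed indicator map
w_demo = rng.random((6, 5))    # stationary weight indexed by relative offset
agree = np.allclose(stationary_sum_direct(mu_demo, w_demo),
                    stationary_sum_fft(mu_demo, w_demo))
```

A non-circular boundary would need zero-padding of both arrays before the transforms; that detail is omitted here.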
Efficiently evaluating the gradient • Hence, for the case x_i = x_j, we need to evaluate • Instead, evaluate for C clusters (C = 10 to 15) • where • Finally, interpolate
Update complexity • FFTs of the spatial filters can be computed in advance (K^2 filters) • Each update requires computing C FFTs, O(CN log N) • K^2 convolutions are needed, each requiring a pointwise multiplication, O(K^2CN) • Terms can be added in the Fourier domain, => only KC inverse FFTs needed, O(KCN log N) • Run-time per iteration < 0.1s for 213x320 pixels (+ downsampling by factor of 5)
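The saving from adding terms in the Fourier domain follows from the linearity of the inverse FFT. The sketch below shows it for K labels with no color clustering (so K^2 pointwise products and K inverse FFTs, rather than the slide's counts with C clusters), again assuming circular boundaries. The function name and array layout are hypothetical.

```python
import numpy as np


def messages_fourier(mu, filters_hat):
    """Sum of K^2 convolutions using only K inverse FFTs.

    mu:          (K, H, W) relaxed indicator map per label.
    filters_hat: (K, K, H, W) precomputed FFTs of the spatial filters,
                 filters_hat[k, l] = fft2(w[k, l]).
    Returns out of shape (K, H, W) with out[k] = sum_l conv(w[k, l], mu[l]).
    """
    mu_hat = np.fft.fft2(mu, axes=(-2, -1))            # K forward FFTs
    # K^2 pointwise products, accumulated over l while still in Fourier space.
    summed = np.einsum("klhw,lhw->khw", filters_hat, mu_hat)
    return np.real(np.fft.ifft2(summed, axes=(-2, -1)))  # K inverse FFTs
```

Inverting each of the K^2 products separately would give the same result but need K^2 inverse FFTs.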
MSRC synthetic experiment • Unary terms randomized • Spatial distributions set to ground-truth
MSRC synthetic experiment • Running times
MSRC full experiment • Use TextonBoost unary potentials • Compare with several other CRFs using the same unaries • Grid only • Grid + P^N (Kohli, CVPR 2008) • Grid + P^N + Cooccurrence (Ladický, ECCV 2010) • Fully-connected + Gaussian spatial (Krähenbühl, NIPS 2011)
MSRC full experiment • Qualitative comparison
MSRC full experiment • Quantitative comparison • Overall • Per-class • Timing: 2-8s per image