Learning a Scale-Invariant Model for Curvilinear Continuity Xiaofeng Ren 1
The Quest of Boundary Detection • Widely used for mid/high-level vision tasks • Huge literature on edge detection [Canny 86] • Typically measuring local contrast • Approaching human performance? [Martin, Fowlkes & Malik 02] [Fowlkes, Martin & Malik 03] 2
Limit of Local Boundary Detection [figure: numbered example image patches 1–4] 3
Curvilinear Continuity • Good Continuation • Visual Completion • Illusory Contours 4
Continuity in Human Vision • [Wertheimer 23] • [Kanizsa 55] • [von der Heydt et al 84]: evidence in V2 • [Kellman & Shipley 91]: geometric conditions of completion • [Field, Hayes & Hess 93]: quantitative analysis of factors • [Kapadia, Westheimer & Gilbert 00]: evidence in V1 • [Geisler et al 01]: evidence from ecological statistics • … 5
Continuity in Computer Vision • Extensive literature on curvilinear continuity: [Shashua & Ullman 88], [Parent & Zucker 89], [Heitger & von der Heydt 93], [Mumford 94], [Williams & Jacobs 95], [Elder & Zucker 96], [Williams & Thornber 99], [Jermyn & Ishikawa 99], [Mahamud et al 03], … • Problems with most of the previous approaches: • no support from groundtruth data • usually demonstrated on a few simple/synthetic images • no quantitative evaluation 6
Outline • Ecological Statistics of Contours • A Scale-Invariant Representation • Learning Models of Curvilinear Continuity • Quantitative Evaluation • Discussion and Future Work 7
Outline • Ecological Statistics of Contours • Groundtruth boundary contours • Power law in contours • A multi-scale Markov model • A Scale-Invariant Representation • Learning Models of Curvilinear Continuity • Quantitative Evaluation • Discussion and Future Work 8
Human-Segmented Natural Images [Martin et al, ICCV 2001] 1,000 images, >14,000 segmentations 9
Contour Geometry • First-Order Markov Model [Mumford 94, Williams & Jacobs 95] • Curvature: white noise ( independent from position to position ) • Tangent t(s): random walk • Markov assumption: the tangent at the next position, t(s+1), only depends on the current tangent t(s) 10
Contours are Smooth • P( t(s+1) | t(s) ): marginal distribution of tangent change 11
Testing the Markov Assumption Segment the contours at high-curvature positions 12
Prediction: Exponential Distribution If the first-order Markov assumption holds… • At every step, there is a constant probability p that a high-curvature event will occur • High-curvature events are independent from step to step Let L be the length of a segment between high-curvature points • P( L >= k ) = (1-p)^k • P( L = k ) = p(1-p)^k • L has an exponential distribution 13
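A minimal simulation sketch (not from the talk; p and the step count are arbitrary choices) of the prediction above: if high-curvature events occur independently with probability p at every step, the segment lengths between events should fall off exponentially as p(1-p)^k.

```python
import numpy as np

# Sketch: simulate the first-order Markov assumption -- a high-curvature event
# occurs independently with probability p at every step -- and compare the
# empirical distribution of segment lengths L with P(L = k) = p (1 - p)^k.
rng = np.random.default_rng(0)
p, steps = 0.1, 1_000_000
events = rng.random(steps) < p

positions = np.flatnonzero(events)
lengths = np.diff(positions) - 1          # non-event steps between consecutive events

for k in range(5):
    empirical = (lengths == k).mean()
    predicted = p * (1 - p) ** k
    print(f"k={k}  empirical={empirical:.4f}  predicted={predicted:.4f}")
```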
Empirical Distribution: Power Law [figure: probability vs. contour segment length L] 14
Power Laws in Nature • Power laws widely exist in nature • Brightness of stars • Magnitude of earthquakes • Population of cities • Word frequency in natural languages • Revenue of commercial corporations • Connectivity in Internet topology … … • Usually characterized by self-similarity and scale-invariant phenomena 15
Multi-scale Markov Models [Ren & Malik 02] • Coarse-to-fine contour completion • Assume knowledge of contour orientation at coarser scales • 2nd-order Markov: P( t(s+1) | t(s), t(1)(s+1) ) • Higher-order models: P( t(s+1) | t(s), t(1)(s+1), t(2)(s+1), … ) 16
Contour Synthesis [Ren & Malik 02] • First-Order Markov: P( t(s+1) | t(s) ) • Multi-scale Markov: P( t(s+1) | t(s), t(1)(s+1), t(2)(s+1), … ) 17
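A rough sketch of what first-order synthesis looks like in code. The turn distribution below is a made-up stand-in for the empirical P( t(s+1) | t(s) ); a multi-scale version would additionally condition on tangents sampled at coarser scales, as on the previous slide.

```python
import numpy as np

# Hedged sketch: synthesize a contour from a first-order Markov model of the
# tangent.  At each step the tangent direction t(s+1) is t(s) plus a random
# turn; positions follow by unit steps.  Mostly-straight steps with occasional
# sharp turns stand in for the empirical tangent-change distribution.
rng = np.random.default_rng(1)

def synthesize_contour(n_steps=200, p_turn=0.1, turn_sigma=0.6):
    theta = 0.0                      # current tangent direction (radians)
    x, y = 0.0, 0.0
    points = [(x, y)]
    for _ in range(n_steps):
        if rng.random() < p_turn:    # occasional high-curvature event
            theta += rng.normal(0.0, turn_sigma)
        else:                        # otherwise nearly straight
            theta += rng.normal(0.0, 0.02)
        x += np.cos(theta)
        y += np.sin(theta)
        points.append((x, y))
    return np.array(points)

contour = synthesize_contour()
print(contour.shape)                 # (201, 2): a polyline to plot or rasterize
```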
Outline • Ecological Statistics of Contours • A Scale-Invariant Representation • Piecewise linear approximation • Constrained Delaunay Triangulation • Learning Models of Curvilinear Continuity • Quantitative Evaluation • Discussion and Future Work 18
Local “Probability of Boundary” [Martin, Fowlkes & Malik 02] • Use Pb (probability of boundary) as input • Combines local brightness, texture and color cues • Trained from human-marked segmentation boundaries • Outperforms existing local boundary detectors, including Canny 19
Piecewise Linear Approximation • Threshold Pb and find connected boundary pixels • Recursively split the boundaries until each piece is approximately straight • Each split is made at an interior point c of a piece with endpoints a and b, chosen to minimize the approximation error 20
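One standard way to implement this kind of recursive splitting is a Douglas-Peucker-style recursion; the sketch below is an assumption about the procedure, since the exact split criterion is not spelled out on the slide.

```python
import numpy as np

# Hedged sketch: recursive piecewise-linear approximation of a pixel chain.
# A piece a..b is split at its worst-fit interior point c while it is not yet
# "approximately straight" (maximum deviation above a tolerance).

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    if ax == bx and ay == by:
        return ((px - ax) ** 2 + (py - ay) ** 2) ** 0.5
    return abs((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / \
        ((bx - ax) ** 2 + (by - ay) ** 2) ** 0.5

def approximate(points, tol=2.0):
    """Return indices of the polyline vertices kept after recursive splitting."""
    points = np.asarray(points, dtype=float)
    a, b = 0, len(points) - 1
    dists = [point_line_distance(points[i], points[a], points[b])
             for i in range(a + 1, b)]
    if not dists or max(dists) <= tol:          # already approximately straight
        return [a, b]
    c = a + 1 + int(np.argmax(dists))           # split at the worst-fit point c
    left = approximate(points[:c + 1], tol)
    right = approximate(points[c:], tol)
    return left[:-1] + [i + c for i in right]   # merge, avoiding duplicate c

chain = [(x, 0.05 * x * x) for x in range(50)]  # a gently curving pixel chain
print(approximate(chain, tol=1.0))
```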
Delaunay Triangulation • Standard in computational geometry • Dual of the Voronoi diagram • Unique triangulation that maximizes the minimum angle, avoiding long skinny triangles • Efficient and simple randomized algorithm 21
Constrained Delaunay Triangulation [Chew 87] [Shewchuk 96] • A variant of the standard Delaunay triangulation • Keeps a given set of edges in the triangulation • Still maximizes the minimum angle • Widely used in geometric modeling and finite elements 22
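A sketch of building a CDT in practice, assuming the Python `triangle` package (a wrapper of Shewchuk's Triangle, cited above); the vertex coordinates and segment indices are made-up placeholders standing in for the piecewise-linear contour approximation.

```python
import numpy as np
import triangle  # assumed: Python wrapper of Shewchuk's Triangle (pip install triangle)

# Hedged sketch: constrained Delaunay triangulation of contour vertices, keeping
# the detected contour pieces as edges.  Coordinates and indices are placeholders.
vertices = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 3.0], [0.0, 3.0],
                     [1.5, 1.0], [2.5, 2.0]])
segments = np.array([[0, 4], [4, 5], [5, 2]])   # contour pieces to preserve

# 'p' treats the input as a planar straight-line graph, so the listed segments
# are kept as edges of the triangulation (the "constrained" part).
cdt = triangle.triangulate({'vertices': vertices, 'segments': segments}, 'p')
print(cdt['triangles'])   # edges not among 'segments' are candidate completions
```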
The “Gap-filling” Property of CDT • A typical scenario of contour completion: two high-contrast contour pieces separated by a low-contrast gap • CDT picks the “right” edge, completing the gap 23
Examples [figure panels: Image | Pb | CDT] 24
Examples • Black: gradient edges (G-edges) • Green: completed edges (C-edges) 25
Outline • Ecological Statistics of Contours • A Scale-Invariant Representation • Learning Models of Curvilinear Continuity • Transferring Groundtruth to CDT • A simple model of local continuity • A global model w/ Conditional Random Fields • Quantitative Evaluation • Discussion and Future Work 26
Transferring Groundtruth to CDT • Human-marked boundaries are given on the pixel-grid • Label the CDT edges by bipartite matching, using a distance threshold d • Phuman: percentage of pixels matched to groundtruth human-marked boundaries 27
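An illustrative sketch of the per-edge matching, using the Hungarian algorithm from SciPy as a stand-in for whatever matcher was actually used; the coordinates and threshold d are placeholders.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

# Hedged sketch: match the pixels of one CDT edge to human-marked boundary
# pixels by bipartite matching with a distance threshold d, then report the
# fraction of the edge's pixels that found a match (Phuman for that edge).

def phuman(edge_pixels, gt_pixels, d=2.0):
    """Fraction of this edge's pixels matched (within d) to groundtruth pixels."""
    edge_pixels = np.asarray(edge_pixels, dtype=float)
    gt_pixels = np.asarray(gt_pixels, dtype=float)
    cost = cdist(edge_pixels, gt_pixels)
    cost[cost > d] = 1e6                      # forbid matches beyond the threshold
    rows, cols = linear_sum_assignment(cost)  # one-to-one assignment
    matched = np.sum(cost[rows, cols] <= d)
    return matched / len(edge_pixels)

edge = [(10, 10), (11, 10), (12, 11)]         # made-up edge pixels
gt = [(10, 11), (12, 12), (30, 30)]           # made-up groundtruth pixels
print(phuman(edge, gt))                       # high Phuman -> label the edge "on"
```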
Model for Continuity • Goal: define a continuity-enhanced Pb on CDT edges • Consider a pair of adjacent edges in the CDT, with features (pb0, G0) and (pb1, G1) • Each edge has an associated set of features: • average Pb over the pixels belonging to this edge • indicator G: gradient edge or completed edge? • Continuity: angle of the “bi-gram” (the pair of adjacent edges) 28
Binary Classification • Assuming contours are always closed: each vertex in the CDT graph is adjacent to either zero or two true boundary edges • A binary classification problem: (0,0) or (1,1) “bi-gram” 29
Learning Local Continuity • Binary classification: (0,0) or (1,1) • Transferred groundtruth labels on CDT edges • Features: • average Pb • (G0*G1): are both gradient edges? • angle • Classifier: logistic regression 30
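A minimal sketch of the local classifier as described on the slide: logistic regression over the bi-gram features. The training pairs below are made-up placeholders, not data from the talk.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hedged sketch: each example is a pair of adjacent CDT edges ("bi-gram") with
# features (average Pb of each edge, G0*G1, continuation angle) and a label
# from the transferred groundtruth: 1 for a (1,1) pair, 0 for a (0,0) pair.
def bigram_features(pb0, pb1, g0, g1, angle):
    return [pb0, pb1, g0 * g1, angle]

X = np.array([
    bigram_features(0.80, 0.75, 1, 1, 0.10),   # strong, well-aligned pair
    bigram_features(0.60, 0.05, 1, 0, 0.20),   # one weak completed edge
    bigram_features(0.10, 0.15, 0, 0, 1.50),   # weak and sharply turning
    bigram_features(0.70, 0.65, 1, 1, 0.05),
])
y = np.array([1, 0, 0, 1])                      # placeholder bi-gram labels

clf = LogisticRegression().fit(X, y)
print(clf.predict_proba(X)[:, 1])               # continuity score for each pair
```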
PbL: Pb + Local Continuity • Evidence of continuity comes from both ends of an edge (pb0, G0), with neighboring edges (pb1, G1) at one end and (pb2, G2) at the other • At each end, take the max over all possible pairs of the local continuity score • Combining Pb with the scores from both ends gives the continuity-enhanced PbL 31
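A tiny illustration of combining evidence from both ends. Taking the max over candidate pairs at each end follows the slide; the final mix with Pb below is an assumed placeholder, not the talk's actual formula.

```python
# Hedged sketch: pair scores stand for the logistic-regression outputs from the
# previous sketch.  The combination rule is illustrative only.
def pb_local(pb_edge, left_pair_scores, right_pair_scores):
    left = max(left_pair_scores, default=0.0)    # best continuation at one end
    right = max(right_pair_scores, default=0.0)  # best continuation at the other
    return max(pb_edge, 0.5 * (left + right))    # placeholder combination

print(pb_local(0.15, [0.7, 0.4], [0.6]))  # a weak edge boosted by good continuity
```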
Variants of the Local Model • More variants of the local model • alternative classifiers ( SVM, HME, … ) • 4-way classification • additional features • learning a 3-edge (tri-gram) model • learning how to combine evidence from both ends • No significant improvement in performance 32
A Global Model of Continuity? • Local inference on pairs of adjacent edges (Xi, Xi+1) • X = {X1, X2, …, Xm} • Global inference incorporating all local continuity information? 33
Conditional Random Fields [Pietra, Pietra & Lafferty 97] [Lafferty, McCallum & Pereira 01] • X = {X1, X2, …, Xm} • For each edge i, define a set of features {g1, g2, …, gh} and a potential function exp( Σc λc gc(Xi, i) ) at edge i • For each junction j, define a set of features {f1, f2, …, fk} and a potential function exp( Σd μd fd(XV(j), j) ) at junction j 34
Conditional Random Fields • Potential functions on edges { exp( Σc λc gc(Xi, i) ) } and on junctions { exp( Σd μd fd(XV(j), j) ) } • This defines a probability distribution over X = {X1, X2, …, Xm}: P( X | I, Θ ) = (1/Z(Θ)) Πi exp( Σc λc gc(Xi, i) ) Πj exp( Σd μd fd(XV(j), j) ), where Z(Θ) is the normalizing constant • Goal: estimate the marginals P( Xi | I, Θ ) 35
Building a CRF Model • What are the features? • edge features are easy: Pb, G • junction features: type and continuity • How to make inference? • How to learn the parameters? • X = {X1, X2, …, Xm}; estimate P( Xi | I, Θ ) 36
Junction Features in CRF • Junction types (degg, degc): the number of “on” G-edges and C-edges at a junction, e.g. (1,0), (0,2), (1,2) • Continuity term for degree-2 junctions 37
Inference w/ Belief Propagation • Belief Propagation on the factor graph: • Xi: state of the node (edge) i • Fq: state of the factor (junction) q, e.g. Fq = {Xi, Xj, Xk}, with potentials on Xi, Xj, Xk • want to compute PbG = P(Xi) • mqi: message (“belief”) about Xi from factor Fq • The CDT graph has many loops in it 38
Inference w/ Loopy Belief Propagation • Loopy Belief Propagation • just like belief propagation • iterates message passing until convergence • lacks strong theoretical guarantees and is known to have convergence issues • nevertheless becoming popular in practice • typically applied on the pixel-grid • Works well on CDT graphs • converges fast • produces empirically sound results [Berrou 93], [Freeman 98], [Murphy 99], [Weiss 97,99,01] 39
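A compact sum-product loopy-BP sketch on a toy factor graph, just to make the message passing concrete. The factor tables, graph, and schedule are made up and unrelated to the learned CDT potentials; the printed beliefs play the role of the approximate marginals PbG = P(Xi).

```python
import numpy as np
from itertools import product

# Hedged sketch: sum-product loopy belief propagation with binary variables
# (edge labels Xi) and small factors standing in for junction potentials.

# factors: list of (variable indices, potential table over those variables)
factors = [
    ((0, 1), np.array([[2.0, 0.5], [0.5, 2.0]])),   # prefer X0 == X1
    ((1, 2), np.array([[2.0, 0.5], [0.5, 2.0]])),   # prefer X1 == X2
    ((2, 0), np.array([[2.0, 0.5], [0.5, 2.0]])),   # prefer X2 == X0 (a loop)
    ((0,),   np.array([0.3, 0.7])),                  # unary evidence on X0 (like Pb)
]
n_vars = 3

# messages: ('f', q, i) factor q -> variable i;  ('v', i, q) variable i -> factor q
msg = {}
for q, (scope, _) in enumerate(factors):
    for i in scope:
        msg[('f', q, i)] = np.ones(2)
        msg[('v', i, q)] = np.ones(2)

for _ in range(30):                                   # iterate, hoping for convergence
    # variable -> factor: product of incoming factor messages, excluding the target
    for q, (scope, _) in enumerate(factors):
        for i in scope:
            m = np.ones(2)
            for r, (scope_r, _) in enumerate(factors):
                if r != q and i in scope_r:
                    m *= msg[('f', r, i)]
            msg[('v', i, q)] = m / m.sum()
    # factor -> variable: sum over the other variables in the factor's scope
    for q, (scope, table) in enumerate(factors):
        for i in scope:
            m = np.zeros(2)
            for assignment in product([0, 1], repeat=len(scope)):
                w = table[assignment]
                for j, xj in zip(scope, assignment):
                    if j != i:
                        w *= msg[('v', j, q)][xj]
                m[assignment[scope.index(i)]] += w
            msg[('f', q, i)] = m / m.sum()

# beliefs (approximate marginals) = product of incoming factor messages
for i in range(n_vars):
    b = np.ones(2)
    for q, (scope, _) in enumerate(factors):
        if i in scope:
            b *= msg[('f', q, i)]
    print(f"P(X{i}=1) ~ {b[1] / b.sum():.3f}")
```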
Learning the Parameters • Maximum-likelihood estimation in the CRF: let X* denote the groundtruth labeling on the CDT graph • Many possible optimization techniques • gradient descent, iterative scaling, conjugate gradient, … • Gradient descent works well 40
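For reference, the standard gradient used in maximum-likelihood training of a log-linear CRF of the form above, written with the notation assumed earlier (λ for edge weights, μ for junction weights, X* for the groundtruth labeling); this derivation is generic, not taken from the talk, and the expectations are what loopy BP approximates.

```latex
% Log-likelihood of the groundtruth labeling X* and its gradient w.r.t. an
% edge-feature weight \lambda_c (the junction weights \mu_d are analogous):
\ell(\Theta) = \sum_i \sum_c \lambda_c\, g_c(X^*_i, i)
             + \sum_j \sum_d \mu_d\, f_d(X^*_{V(j)}, j) - \log Z(\Theta)
\qquad
\frac{\partial \ell}{\partial \lambda_c}
  = \sum_i \Big( g_c(X^*_i, i) - \mathbb{E}_{P(X \mid I, \Theta)}\big[ g_c(X_i, i) \big] \Big)
```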
Interpreting the Parameters • The junction parameters (degg, degc) learned on the horse dataset: • (0,0) = 2.8318: there are more non-boundary edges than boundary edges • (1,0) = 1.1279 vs (2,0) = 1.3774: a continuation is better than a line-ending • (3,0) = 0.0342: junctions are rare • (2,0) = 1.3774 vs (1,1) = -0.6106 vs (0,2) = -0.9773: G-edges are better for continuation than C-edges 41
Outline • Ecological Statistics of Contours • A Scale-Invariant Representation • Learning Models of Curvilinear Continuity • Quantitative Evaluation • The precision-recall framework • Experimental results on three datasets • Discussion and Future Work 42
Datasets • Baseball player dataset [Mori et al 04] • 30 news photos of baseball players in various poses, 15 training and 15 testing • Horse dataset [Borenstein & Ullman 02] • 350 images of standing horses facing left, 175 training and 175 testing • Berkeley Segmentation Dataset [Martin et al 01] • 300 Corel images of various natural scenes and ~2500 segmentations, 200 training and 100 testing 43
Evaluating Boundary Operators • Precision-Recall curves [Martin, Fowlkes & Malik 02]: • threshold the output boundary map • bipartite matching with the groundtruth • m: pixels on human-marked boundaries; n: detected pixels above a given threshold; k: matched pairs • Precision = k/n, percentage of detections that are true positives • Recall = k/m, percentage of groundtruth being detected • Project CDT edges back to the pixel-grid for evaluation 44
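Given the counts, the precision/recall computation is immediate; the F-measure is included below because it is commonly reported alongside these curves (the counts themselves are placeholders).

```python
# Hedged sketch: precision, recall, and F-measure from the matching counts
# defined on the slide (k matched pairs, n detections, m groundtruth pixels).
def precision_recall(k_matched, n_detected, m_groundtruth):
    precision = k_matched / n_detected        # fraction of detections that are correct
    recall = k_matched / m_groundtruth        # fraction of groundtruth recovered
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f

print(precision_recall(k_matched=800, n_detected=1000, m_groundtruth=1200))
```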
No Loss of Structure in CDT • Using Phuman, the soft groundtruth label defined on CDT graphs: precision close to 100% • Pb averaged over CDT edges: no worse than the original Pb 45
• Continuity improves boundary detection in both low-recall and high-recall ranges • Global inference helps, mostly in the low-recall/high-precision range • Roughly speaking, CRF > Local > CDT only > Pb 46
[figure panels: Image | Pb | Local | Global] 49
[figure panels: Image | Pb | Local | Global] 50