
OBJ CUT & Pose Cut (CVPR 05, ECCV 06)

Presentation Transcript


  1. UNIVERSITY OF OXFORD. OBJ CUT & Pose Cut (CVPR 05, ECCV 06). Philip Torr, M. Pawan Kumar, Pushmeet Kohli and Andrew Zisserman

  2. Conclusion • Combining pose inference and segmentation is worth investigating. (tomorrow) • Tracking = Detection • Detection = Segmentation • Tracking (pose estimation) = Segmentation.

  3. First segmentation problem • Segmentation • To distinguish a cow from a horse?

  4. Aim • Given an image, to segment the object (Figure: Object Category Model + Cow Image, giving a Segmented Cow) • Segmentation should (ideally) be • shaped like the object, e.g. cow-like • obtained efficiently in an unsupervised manner • able to handle self-occlusion

  5. Challenges Intra-Class Shape Variability Intra-Class Appearance Variability Self Occlusion

  6. Motivation Magic Wand • Current methods require user intervention • Object and background seed pixels (Boykov and Jolly, ICCV 01) • Bounding Box of object (Rother et al. SIGGRAPH 04) Object Seed Pixels Cow Image

  7. Motivation Magic Wand • Current methods require user intervention • Object and background seed pixels (Boykov and Jolly, ICCV 01) • Bounding Box of object (Rother et al. SIGGRAPH 04) Object Seed Pixels Background Seed Pixels Cow Image

  8. Motivation Magic Wand • Current methods require user intervention • Object and background seed pixels (Boykov and Jolly, ICCV 01) • Bounding Box of object (Rother et al. SIGGRAPH 04) Segmented Image

  9. Motivation Magic Wand • Current methods require user intervention • Object and background seed pixels (Boykov and Jolly, ICCV 01) • Bounding Box of object (Rother et al. SIGGRAPH 04) Object Seed Pixels Background Seed Pixels Cow Image

  10. Motivation Magic Wand • Current methods require user intervention • Object and background seed pixels (Boykov and Jolly, ICCV 01) • Bounding Box of object (Rother et al. SIGGRAPH 04) Segmented Image

  11. Motivation • Problem • Manually intensive • Segmentation is not guaranteed to be ‘object-like’ Non Object-like Segmentation

  12. Our Method • Combine object detection with segmentation • Borenstein and Ullman, ECCV ’02 • Leibe and Schiele, BMVC ’03 • Incorporate global shape priors in MRF • Detection provides • Object localization • Global shape priors • Automatically segments the object • Note: our method is completely generic • Applicable to any object category model

  13. Outline • Problem Formulation • Form of Shape Prior • Optimization • Results

  14. Problem • Labelling m over the set of pixels D • Shape prior provided by parameter Θ • Energy E(m,Θ) = ∑x [Φx(D|mx) + Φx(mx|Θ)] + ∑x,y [Ψxy(mx,my) + Φ(D|mx,my)] • Unary terms • Likelihood based on colour • Unary potential based on distance from Θ • Pairwise terms • Prior • Contrast term • Find best labelling m* = arg minm ∑i wi E(m,Θi) • wi is the weight for sample Θi
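As a rough illustration of how this energy could be evaluated (a minimal Python sketch, not the authors' implementation; the unary cost tables and the contrast weighting are placeholders), consider a binary labelling on a 4-connected pixel grid:

    import numpy as np

    def segmentation_energy(labels, unary_colour, unary_shape, image, lam=1.0, sigma=10.0):
        """Evaluate E(m, Theta) for a binary labelling m on a 4-connected grid (sketch).

        labels       : (H, W) int array of {0, 1}, the labelling m
        unary_colour : (H, W, 2) array, Phi_x(D | m_x) for background / object
        unary_shape  : (H, W, 2) array, Phi_x(m_x | Theta) from the shape prior
        image        : (H, W) float grey-level image used by the contrast term
        """
        H, W = labels.shape
        ys, xs = np.mgrid[0:H, 0:W]
        # Unary terms: colour likelihood plus distance-from-shape potential.
        energy = np.sum(unary_colour[ys, xs, labels]) + np.sum(unary_shape[ys, xs, labels])
        # Pairwise terms over horizontal and vertical neighbours: Potts prior
        # Psi_xy plus a contrast term Phi(D|m_x,m_y) that is cheaper across strong edges.
        for dy, dx in [(0, 1), (1, 0)]:
            different = labels[:H - dy, :W - dx] != labels[dy:, dx:]
            grad = image[:H - dy, :W - dx] - image[dy:, dx:]
            contrast = np.exp(-grad ** 2 / (2 * sigma ** 2))
            energy += np.sum(different * (lam + contrast))
        return float(energy)

The best labelling on the slide then minimises the weighted sum ∑i wi E(m,Θi) over m, one term per shape sample Θi.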

  15. MRF • Probability for a labelling consists of • Likelihood • Unary potential based on colour of pixel • Prior which favours same labels for neighbours (pairwise potentials) (Figure: graphical model with labels m over image pixels D; unary potential Φx(D|mx) at each pixel, pairwise prior Ψxy(mx,my) between neighbours)

  16. Example (Figure: cow image with object and background seed pixels; per-pixel likelihoods Φx(D|obj) and Φx(D|bkg), pairwise prior Ψxy(mx,my); panels show the Prior and the Likelihood Ratio (Colour))

  17. Example Cow Image Object Seed Pixels Background Seed Pixels Prior Likelihood Ratio (Colour)

  18. Contrast-Dependent MRF • The probability of a labelling additionally includes a • Contrast term which favours boundaries that lie on image edges (Figure: graphical model with a contrast term Φ(D|mx,my) between neighbouring pixels x and y in the image plane)

  19. Example (Figure: cow image with object and background seed pixels; per-pixel likelihoods Φx(D|obj) and Φx(D|bkg), pairwise term Ψxy(mx,my) + Φ(D|mx,my); panels show the Prior + Contrast and the Likelihood Ratio (Colour))

  20. Example Cow Image Object Seed Pixels Background Seed Pixels Prior + Contrast Likelihood Ratio (Colour)

  21. Our Model • The probability of a labelling additionally includes a • Unary potential which depends on the distance from Θ (the shape parameter) (Figure: Object Category Specific MRF with shape parameter Θ and unary potential Φx(mx|Θ) linking each label mx to the shape)
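One plausible way to realise the distance-based unary term (an illustrative sketch, not necessarily the exact potential from the paper; the scale mu is made up): render the detected shape Θ as a binary mask, then penalise the object label in proportion to how far outside the mask a pixel lies, and the background label in proportion to how far inside it lies.

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def shape_prior_unary(shape_mask, mu=5.0):
        """Build Phi_x(m_x | Theta) from a binary rendering of the shape Theta (sketch).

        shape_mask : (H, W) bool array, True inside the detected object outline
        Returns an (H, W, 2) array of costs for m_x = background (0) / object (1).
        """
        # Signed distance to the shape boundary: positive outside, negative inside.
        dist_out = distance_transform_edt(~shape_mask)   # distance to the shape, for outside pixels
        dist_in = distance_transform_edt(shape_mask)     # distance to the background, for inside pixels
        signed = dist_out - dist_in
        unary = np.zeros(shape_mask.shape + (2,))
        # Object label gets costly far outside Theta; background label far inside it.
        unary[..., 1] = np.maximum(signed, 0) / mu
        unary[..., 0] = np.maximum(-signed, 0) / mu
        return unary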

  22. Example Cow Image Object Seed Pixels Background Seed Pixels Shape Prior Θ Prior + Contrast Distance from Θ

  23. Example Cow Image Object Seed Pixels Background Seed Pixels Shape Prior Θ Prior + Contrast Likelihood + Distance from Θ

  24. Example Cow Image Object Seed Pixels Background Seed Pixels Shape Prior Θ Prior + Contrast Likelihood + Distance from Θ

  25. Outline • Problem Formulation • E(m,Θ) = ∑x [Φx(D|mx) + Φx(mx|Θ)] + ∑x,y [Ψxy(mx,my) + Φ(D|mx,my)] • Form of Shape Prior • Optimization • Results

  26. Detection • BMVC 2004

  27. Layered Pictorial Structures (LPS) • Generative model • Composition of parts + spatial layout Layer 2 Spatial Layout (Pairwise Configuration) Layer 1 Parts in Layer 2 can occlude parts in Layer 1

  28. Layered Pictorial Structures (LPS) Cow Instance Layer 2 Transformations Θ1 P(Θ1) = 0.9 Layer 1

  29. Layered Pictorial Structures (LPS) Cow Instance Layer 2 Transformations Θ2 P(Θ2) = 0.8 Layer 1

  30. Layered Pictorial Structures (LPS) Unlikely Instance Layer 2 Transformations Θ3 P(Θ3) = 0.01 Layer 1

  31. How to learn LPS • From video via motion segmentation; see Kumar, Torr and Zisserman, ICCV 2005.

  32. LPS for Detection • Learning • Learnt automatically using a set of examples • Detection • Matches LPS to image using Loopy Belief Propagation • Localizes object parts

  33. Detection • Like a proposal process.

  34. Pictorial Structures (PS) Fischler and Elschlager, 1973 PS = 2D Parts + Configuration Aim: Learn pictorial structures in an unsupervised manner Layered Pictorial Structures (LPS) Parts + Configuration + Relative depth • Identify parts • Learn configuration • Learn relative depth of parts

  35. Pictorial Structures • Each part is a variable • States are image locations • AND affine deformation Affine warp of parts

  36. Pictorial Structures • Each part is a variable • States are image locations • MRF favours certain configurations

  37. Bayesian Formulation (MRF) • D = image. • Di = pixels Є pi, given li • (PDF Projection Theorem.) z = sufficient statistics • ψ(li,lj) = const if valid configuration, = 0 otherwise. Potts model
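Read literally, the compatibility on this slide is a Potts-style indicator over valid relative placements of two parts. A toy Python version (the set of valid offsets is a stand-in for however validity is actually defined):

    def potts_compatibility(l_i, l_j, valid_offsets, const=1.0):
        """psi(l_i, l_j): constant for a valid relative configuration, zero otherwise.

        l_i, l_j      : part locations, e.g. (x, y) tuples
        valid_offsets : set of allowed (dx, dy) offsets between the two parts (assumed)
        """
        offset = (l_j[0] - l_i[0], l_j[1] - l_i[1])
        return const if offset in valid_offsets else 0.0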

  38. Defining the likelihood • We want a likelihood that can combine both the outline and the interior appearance of a part. • Define features which will be sufficient statistics to discriminate foreground and background:

  39. Features • Outline: z1 Chamfer distance • Interior: z2 Textons • Model joint distribution of z1 z2 as a 2D Gaussian.

  40. Chamfer Match Score • Outline (z1): minimum chamfer distances over multiple outline exemplars • d_cham = (1/n) Σi min{ minj ||ui - vj||, τ } (Figure: image, edge image, distance transform)
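A minimal chamfer-matching sketch along the lines of the formula above, assuming the outline exemplar is given as a list of edge-pixel coordinates and using a precomputed distance transform (scipy); the truncation tau caps the contribution of badly matched template points:

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def chamfer_score(template_points, edge_map, tau=20.0):
        """d_cham = (1/n) * sum_i min( min_j ||u_i - v_j||, tau ).

        template_points : (n, 2) int array of (row, col) outline points u_i of the exemplar
        edge_map        : (H, W) bool array, True at detected image edges v_j
        """
        # Distance from every pixel to the nearest image edge.
        dt = distance_transform_edt(~edge_map)
        rows, cols = template_points[:, 0], template_points[:, 1]
        dists = dt[rows, cols]          # min_j ||u_i - v_j|| read off the distance transform
        return float(np.mean(np.minimum(dists, tau)))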

  41. Texton Match Score • Texture (z2): MRF classifier • (Varma and Zisserman, CVPR ’03) • Multiple texture exemplars x of class t • Textons: 3 × 3 square neighbourhood • VQ in texton space • Descriptor: histogram of texton labelling • χ² distance
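A rough sketch of the texton side, assuming pixels have already been assigned vector-quantised texton labels: build a normalised histogram of texton labels inside a part window and compare it with an exemplar histogram using the χ² distance.

    import numpy as np

    def texton_histogram(texton_labels, n_textons):
        """Normalised histogram of integer texton labels inside a part window."""
        hist = np.bincount(texton_labels.ravel(), minlength=n_textons).astype(float)
        return hist / max(hist.sum(), 1.0)

    def chi_square_distance(h1, h2, eps=1e-10):
        """Chi-squared distance between two normalised histograms."""
        return 0.5 * float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))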

  42. Bag of Words/Histogram of Textons • Having slagged off BoW’s I reveal we used it all along, no big deal. • So this is like a spatially aware bag of words model… • Using a spatially flexible set of templates to work out our bag of words.

  43. 2. Fitting the Model • Cascades of classifiers • Efficient likelihood evaluation • Solving the MRF • LBP, use fast algorithm • GBP if LBP doesn’t converge • Could use Semi-Definite Programming (2003) • Recent work: a second-order cone programming method does best (CVPR 2006).
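The slide only names the solvers; as a generic sketch (not the fast algorithm referred to above), min-sum loopy belief propagation on a discrete pairwise MRF looks roughly like this:

    import numpy as np

    def loopy_min_sum(unary, pairwise, n_iters=20):
        """Approximate MAP of a discrete pairwise MRF by min-sum loopy BP (sketch).

        unary    : (V, K) array of unary costs for V variables with K states each
        pairwise : dict mapping an ordered pair (i, j) to a (K, K) cost matrix
        """
        V, K = unary.shape
        # One message per directed edge, initialised to zero.
        msgs = {(i, j): np.zeros(K) for (a, b) in pairwise for (i, j) in [(a, b), (b, a)]}
        for _ in range(n_iters):
            new_msgs = {}
            for (i, j) in msgs:
                # Collect all messages into i except the one coming back from j.
                incoming = sum((msgs[(k, l)] for (k, l) in msgs if l == i and k != j),
                               np.zeros(K))
                cost = pairwise[(i, j)] if (i, j) in pairwise else pairwise[(j, i)].T
                # For each state of j, minimise over the state of i.
                m = np.min(unary[i][:, None] + incoming[:, None] + cost, axis=0)
                new_msgs[(i, j)] = m - m.min()        # normalise for numerical stability
            msgs = new_msgs
        beliefs = unary + np.array([sum((msgs[(k, l)] for (k, l) in msgs if l == i),
                                        np.zeros(K)) for i in range(V)])
        return np.argmin(beliefs, axis=1)             # approximate MAP state per variable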

  44. Efficient Detection of parts • Cascade of classifiers • Top level uses chamfer matching and the distance transform for efficient pre-filtering • Lower levels use the full texture model for verification, with efficient nearest-neighbour speed-ups.
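A sketch of the two-level idea (window size, threshold and function names are hypothetical; it reuses the texton_histogram and chi_square_distance helpers sketched earlier): the cheap outline test on the distance transform throws away most candidate locations, and only the survivors pay for the full texture comparison.

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def detect_part(candidates, template_points, edge_map,
                    texton_labels, exemplar_hist, n_textons,
                    chamfer_thresh=5.0, win=40):
        """Two-stage cascade for one part: chamfer pre-filter, then texture verification."""
        dt = distance_transform_edt(~edge_map)            # distance to the nearest edge
        survivors = []
        for (r, c) in candidates:                         # assumed to keep the template in-image
            pts = template_points + np.array([r, c])      # place the outline exemplar
            score = np.mean(dt[pts[:, 0], pts[:, 1]])     # cheap chamfer test
            if score < chamfer_thresh:
                survivors.append((r, c, score))
        verified = []
        for (r, c, score) in survivors:                   # expensive texture model on survivors only
            hist = texton_histogram(texton_labels[r:r + win, c:c + win], n_textons)
            verified.append((r, c, score + chi_square_distance(hist, exemplar_hist)))
        return sorted(verified, key=lambda t: t[2])       # best-scoring locations first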

  45. Cascade of Classifiers (for each part) • Y. Amit and D. Geman, 97?; S. Baker, S. Nayar 95

  46. High Levels based on Outline (x,y)

  47. Side Note • Chamfer is like a linear classifier on the distance-transform image (Felzenszwalb). • A tree is a set of linear classifiers. • A pictorial structure is a parameterized family of linear classifiers.
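The "linear classifier" remark in code form (a minimal sketch): write the outline exemplar as a binary indicator image w; the untruncated chamfer cost at a candidate window is then just a dot product between w and that window of the distance-transform image, i.e. a linear function of the DT.

    import numpy as np

    def chamfer_as_dot_product(template_mask, dt_window):
        """Untruncated chamfer cost as a linear function of the distance transform.

        template_mask : (h, w) binary image, 1 on the exemplar outline pixels
        dt_window     : (h, w) window of the distance-transform image at a candidate location
        """
        # Each outline pixel contributes the DT value beneath it: <w, DT> / n.
        return float(np.sum(template_mask * dt_window)) / template_mask.sum()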

  48. Low levels on Texture • The top levels of the tree use the outline to eliminate patches of the image. • Efficiency: using the chamfer distance and a precomputed distance map. • Remaining candidates are evaluated using the full texture model.

  49. Efficient Nearest Neighbour • Goldstein, Platt and Burges (MSR Tech Report, 2003): conversion from a fixed-distance search to a rectangle search • bitvector_ij(Rk) = 1 if Rk Є Ii in dimension j, = 0 otherwise • Nearest neighbour of x: • Find the intervals containing x in all dimensions • ‘AND’ the appropriate bitvectors • Nearest neighbour search on the pruned exemplars
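A toy sketch of the bitvector pruning (interval boundaries and the fallback behaviour are made up, and unlike the real rectangle search this simplified version only keeps exemplars that share the query's interval in every dimension): each exemplar sets a bit in the interval it falls into per dimension; a query ANDs the bitvectors of its own intervals and computes exact distances only over the surviving exemplars.

    import numpy as np

    def build_bitvectors(exemplars, bin_edges):
        """For each dimension j and interval i, mark which exemplars fall in that interval."""
        n, d = exemplars.shape
        bitvectors = []   # bitvectors[j][i] is a boolean mask over the exemplars
        for j in range(d):
            idx = np.digitize(exemplars[:, j], bin_edges[j])
            bitvectors.append([idx == i for i in range(len(bin_edges[j]) + 1)])
        return bitvectors

    def nearest_neighbour(x, exemplars, bitvectors, bin_edges):
        """AND the per-dimension bitvectors for the query's intervals, then search survivors."""
        n, d = exemplars.shape
        mask = np.ones(n, dtype=bool)
        for j in range(d):
            i = int(np.digitize(x[j], bin_edges[j]))
            mask &= bitvectors[j][i]
        if not mask.any():                      # fall back to the full set if pruning removes everything
            mask = np.ones(n, dtype=bool)
        candidates = np.flatnonzero(mask)
        dists = np.linalg.norm(exemplars[candidates] - x, axis=1)
        return candidates[int(np.argmin(dists))]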

  50. Recently: solve via Integer Programming • SDP formulation (Torr 2001, AI Stats) • SOCP formulation (Kumar, Torr & Zisserman, this conference) • LBP (Huttenlocher, many)
