1 / 49

Semantic Segmentation with Second-Order Pooling

Semantic Segmentation with Second-Order Pooling. João Carreira 1,2 , Rui Caseiro 1 , Jorge Batista 1 , Cristian Sminchisescu 2. 1 Institute of Systems and Robotics , University of Coimbra. 2 Faculty of Mathematics and Natural Science, University of Bonn. Semantic Segmentation. B ottle.

Download Presentation

Semantic Segmentation with Second-Order Pooling

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Semantic Segmentation withSecond-Order Pooling João Carreira1,2, Rui Caseiro1, Jorge Batista1, Cristian Sminchisescu2 1Institute of Systems and Robotics, University of Coimbra 2Faculty of Mathematics and Natural Science, University of Bonn

  2. Semantic Segmentation Bottle Person Chair Example from Pascal VOC segmentation dataset

  3. Semantic Segmentation Our bottom-up pipeline: 1. Sample candidate object regions (figure-ground) 2. Region description and classification 3. Construct full image labeling from regions Li, Carreira, Sminchisescu, CVPR 2010, IJCV2011

  4. Semantic Segmentation Key: generate good object candidates, not superpixels CPMC: Constrained Parametric Min-Cuts for Automatic Object Segmentation, Carreira and Sminchisescu, CVPR 2010, PAMI 2012 Sample candidate object regions (figure-ground) Region description and classification Construct full image labeling from regions

  5. Semantic Segmentation Person Bottle Bottle Chair Sample candidate object regions (figure-ground) Region description and classification Construct full image labeling from regions

  6. Semantic Segmentation Person Bottle Chair Bottle Chair Sample candidate object regions (figure-ground) Region description and classification Construct full image labeling from regions

  7. Semantic Segmentation Person Person Bottle Chair Bottle Chair Sample candidate object regions (figure-ground) Region description and classification Construct full image labeling from regions

  8. Semantic Segmentation Person Person Bottle Bottle Chair Bottle Chair Sample candidate object regions (figure-ground) Region description and classification Construct full image labeling from regions

  9. Semantic Segmentation Person Person Bottle Bottle Chair Bottle Chair Sample candidate object regions Region description and classification Construct full image labeling from regions

  10. Semantic Segmentation Bottom-up formulation: 1. Sample candidate object regions (figure-ground) 2. Region description and classification 3. Construct full image labeling from regions Image Predicted Ground Truth

  11. Semantic Segmentation Bottom-up formulation: 1. Sample candidate object regions (figure-ground) 2. Region description and classification 3. Construct full image labeling from regions This work! Image Predicted Ground Truth

  12. Describing Free-form Regions Currently, most successful approaches use variations of Bag of Words (BOW) and HOG • Require expensive classifiers with non-linear kernels • Used in sliding-window detection and in image classification Are there descriptors better suited for regions (segments)?

  13. Aggregation-based Descriptors Repeat for each region Region descriptor Local Feature Extraction Coding Pooling

  14. Aggregation-based Descriptors Repeat for each region Local Feature Extraction Coding Pooling Dense local feature extraction All local features SIFT

  15. Aggregation-based Descriptors Repeat for each region Local Feature Extraction Coding Pooling Dense local feature extraction Codebook All local features Local feature encodings SIFT

  16. Aggregation-based Descriptors Repeat for each region Local Feature Extraction Coding Pooling Dense local feature extraction Codebook Region descriptor All local features Local feature encodings Summarize coded features inside region SIFT

  17. Aggregation-based Descriptors Repeat for each region Local Feature Extraction Coding Pooling Dense local feature extraction Codebook Region descriptor All local features Local feature encodings Summarize coded features inside region SIFT avg/max

  18. Aggregation-based Descriptors Local Feature Extraction Coding Pooling Descriptor Dense SIFT extraction Hard Vector Quantization Average Pooling Bag of Words Dense SIFT extraction Sparse Coding Max Pooling Yang et al 09

  19. Aggregation-based Descriptors Most research so far focused on coding Hard Vector Quantization, Kernel Codebook encoding, Sparse Coding, Fisher encoding, Locality-constrained Linear Coding... Pooling has received far less attention MaxAverage Sivic03, Csurka04, Philbin08, Gemert08, Yang09, Perronnin10, Wang10, (...) Given N local feature descriptorsextracted inside region

  20. Second-Order Pooling Can we pursue richer statistics for pooling ?

  21. Second-Order Pooling Can we pursue richer statistics for pooling ? =avg/max Given N local feature descriptorsextracted inside region

  22. Second-Order Pooling Can we pursue richer statistics for pooling ? Capture correlations =avg/max =avg/max Given N local feature descriptorsextracted inside region

  23. Second-Order Pooling Can we pursue richer statistics for pooling ? Dimensionality = (local descriptor size)2 Capture correlations Local Feature Extraction Coding Pooling

  24. Second-Order Pooling Can we pursue richer statistics for pooling ? Dimensionality = (local descriptor size)2 Capture correlations Bypass coding Local Feature Extraction Pooling

  25. Second-Order Pooling What can we say about these matrices ?

  26. Second-Order Pooling What can we say about these matrices ? ... so we can simply keep upper triangle Symmetric Symmetric

  27. Second-Order Pooling What can we say about these matrices ? SPD matrices have rich geometry: they form a Riemannian manifold Symmetric Positive Definite (SPD) Symmetric

  28. Second-Order Pooling What can we say about these matrices ? SPD matrices have rich geometry: they form a Riemannian manifold • Linear classifiers ignore this additional geometry Symmetric Positive Definite (SPD) Symmetric

  29. Usual solution is to flatten the manifold by projecting to local tangent spaces Only valid in a local neighborhood, in general Embedding SPD Manifold in Euclidean Space

  30. Usual solution is to flatten the manifold by projecting to local tangent spaces Only valid in a local neighborhood, in general By using special, Log-Euclidean metric, it is possible to directly embed entire manifold (Arsigny et al. 07) Embedding SPD Manifold in an Euclidean Space

  31. Sequence of Operations 1. Second-Order Avg Pooling: Second-Order Max Pooling: 2. Select upper triangle and convert to vector 3. Power normalize (Perronnin et al 2010) • , with

  32. Sequence of Operations 1. Second-Order Avg Pooling: Second-Order Max Pooling: 2. Select upper triangle and convert to vector 3. Power normalize Feed resulting descriptor to linear classifier

  33. Additionally we use better local descriptors with pooling methods SIFT

  34. Local Feature Enrichment (1/2)Relative Position w s h (x,y) SIFT Relative position

  35. Local Feature Enrichment (2/2)Pixel Color Relative position RGB LAB HSV SIFT Concatenate eSIFT

  36. Region Classification VOC 2011 Ground truth regions

  37. Region Classification VOC 2011 Ground truth regions Linear classification accuracy HOG: 41.79%

  38. Region Classification VOC 2011 Ground truth regions Linear classification accuracy =max/avg HOG: 41.79%

  39. Region Classification VOC2011 Ground truth regions Linear classification accuracy = max/avg HOG: 41.79%

  40. Region Classification VOC 2011 Ground truth regions Linear classification accuracy HOG: 41.79%

  41. Semantic Segmentation in the Wild Pascal VOC 2011 CPMC Thresholding + reverse score overlaying Local Descriptors: - eSIFT - Masked eSIFT - eLBP Region Descriptors: - Second-Order Average Pooling Learning: - LIBLINEAR

  42. Semantic Segmentation in the Wild Pascal VOC 2011 This work!! comp6 comp5 O2P best on 13 out of 21 categories: background, aeroplane, boat, bus, motorbike, car, train, cat, dog, horse, potted plant, sofa, person

  43. Semantic Segmentation in the Wild Pascal VOC 2011 comp6 comp5 Linear Exp-Chi2 kernels

  44. Semantic Segmentation in the Wild Pascal VOC 2011 comp6 comp5 Linear Exp-Chi2 kernels 130x faster 20,000x faster

  45. Caltech 101 Important testbed for coding and pooling techniques

  46. Caltech 101 Important testbed for coding and pooling techniques • No segments, spatial pyramid instead • Linear classification

  47. Caltech 101 Important testbed for coding and pooling techniques Lazebnik et al. ‘06 Wang et al. ‘10 Bo & Sminchisescu ’10 Boureau et al. ‘11

  48. Conclusions Second-order pooling with Log-Euclidean tangent space mappings Practical aggregation-based descriptors without unsupervised learning stage (no codebooks) High recognition performance on free-form regions using linear classifiers Semantic Segmentation on VOC 2011 superior to state-of-the-art with models 20,000x faster Code available online

  49. Thank you!

More Related