Efficient Region Search for Object Detection
Sudheendra Vijayanarasimhan and Kristen Grauman
Department of Computer Science, University of Texas at Austin

Motivation
Object detection via exhaustive search is too expensive. Branch-and-bound schemes can limit the search (Lampert et al. ’08, Lehmann et al. ’09, Yeh et al. ’09), but existing methods are restricted to rectangular or simple polygonal candidate windows.
Problem:
1. A rectangle is imprecise.
2. Extra features in a window can mislead the detector.

Results (code available @ http://vision.cs.utexas.edu/projects/ers/ers-code.tar.gz)
• Baselines
  • Efficient Subwindow Search (ESS) [Lampert et al. 2008]
  • Global connectivity CRF [Nowozin et al. 2009]
• Evaluation metrics
  • Pixel-level AP, PASCAL bounding box metric, overlap scores
[Figure panels: Example Detections on ETHZ (point/shape features), PASCAL 2007, and PASCAL 2008 seg; Computation Time; Comparison with CRF; Comparison with ESS]

Efficient Region Search (ERS)
Region-graph
• Given a test image, we construct a region-graph on an oversegmentation.
• Vertex weights are obtained from SVM weights for:
  • Point feature words: SURF within the superpixel
  • Shape feature words: HoG on whole superpixel
• Edges are set by adjacency, and to impose spatial layout.
[Figure: oversegmentation → region-graph; point descriptors and shape descriptors → bag of features → SVM, with negative and positive features marked; example MWCS instance and PCST instance with vertex weights; best-scoring region from the branch-and-cut solution]

Maximum-Weight Connected Subgraph (MWCS) Problem
Main idea: identify the connected subgraph R* whose summed vertex weights are maximal.
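To make the MWCS objective concrete, here is a minimal brute-force sketch in Python. It is not the branch-and-cut method used by ERS: it enumerates every vertex subset, which takes exponential time and is only feasible for toy graphs. The vertex weights, the edge list, and the function names are illustrative assumptions, not taken from the released code.

```python
from itertools import combinations


def is_connected(nodes, adj):
    """Return True if `nodes` induce a connected subgraph of `adj`."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def brute_force_mwcs(weights, edges):
    """Exhaustively find the connected vertex set with maximal summed weight."""
    n = len(weights)
    adj = {i: set() for i in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    best_set, best_score = None, float("-inf")
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            if not is_connected(subset, adj):
                continue
            score = sum(weights[i] for i in subset)
            if score > best_score:
                best_set, best_score = set(subset), score
    return best_set, best_score


# Hypothetical toy region-graph: one vertex per superpixel.
weights = [-0.1, 0.11, 0.49, 0.15, -0.23, 0.07, -0.05]
# Hypothetical adjacency between superpixels (assumed for illustration).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (3, 5)]

best_region, best_score = brute_force_mwcs(weights, edges)
print(sorted(best_region), round(best_score, 2))
```

On this toy graph the positively weighted superpixels happen to be mutually connected, so the best region is exactly that set. In general the optimum may need to absorb negatively weighted vertices to stay connected, which is what makes the search hard.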
• Prize-collecting Steiner tree (PCST) problem: find the connected subgraph that maximizes the sum of vertex weights minus (positive) edge costs.
• Convert MWCS → PCST: subtract the smallest vertex weight from all vertex and edge weights.

Branch-and-Cut Solution
Branch-and-cut algorithm for PCST [Ljubic et al. ’06] to obtain the best-scoring region:
• Optimal solutions
• Efficient in practice (100s of nodes)
Applicable to classifiers whose total score is a sum of localized feature scores (e.g., linear SVM, Naïve Bayes NN, boosting).

Our Approach
Goal: Identify the best-scoring region, i.e., the subset of spatially contiguous subregions whose features will maximize a classifier’s score.
• While windows over/underestimate the object, ERS allows precise, arbitrarily shaped detections.
• A naïve approach would require exponential time.
Main contribution: We show how to obtain the best-scoring region efficiently with a branch-and-cut solution.
1. Divide image into superpixels and construct region-graph
2. Weight each superpixel vertex by classifier output on its features
3. Maximum-weight connected subgraph → Prize-collecting Steiner tree problem
4. Branch-and-cut to find best connected subgraph

Background: Linear SVM with BoW
As noted by Lampert et al. ’08, for a linear SVM and bag-of-words, the classifier response for a region R can be written as the sum of its N features’ word weights:
$f(R) = \sum_{j} w_j \, h_j(R) = \sum_{i=1}^{N} w_{c_i}$
where $w_j$ is the SVM weight for the j-th word, $h_j(R)$ is the number of occurrences of the j-th word in R, and $w_{c_i}$ is the SVM weight for the i-th feature’s word.
Our goal is to determine the arbitrarily shaped region within a novel image that maximizes the score:
$R^* = \arg\max_{R} f(R)$

Efficient Region Search with Contours (ERS-C)
A variant of ERS to help exclude background regions.
• Class-specific edge weights via bag-of-contour strengths
[Figure: contour strengths]

Training: Learning the Weights
• Visual word histogram weights: linear SVM on segmented examples
• Bag-of-contours histogram weights: structured SVM

Pixel-level precision-recall curves on PASCAL 2007 (cat, dog) and ETHZ for our approach and ESS:
• ERS is more accurate than ESS, even under the bounding box metric (19-70% better).
• Shape features excel on ETHZ; region detection is crucial for “non-boxy” objects.
• ERS search times are similar to ESS, and orders of magnitude faster than sliding windows.
• Unlike ESS, ERS permits pixel-level detections of any shape.

Detection overlap accuracy on PASCAL 2008 compared to the global connectivity CRF [Nowozin et al. CVPR 2009]:
• Our optimal solution leads to significantly more accurate results on this challenging dataset.

Datasets
• ETHZ Shapes: 5 classes
• PASCAL 2007: cat, dog
• PASCAL 2008 seg: 20 classes

Conclusions
• An efficient branch-and-cut method for region-based detection
• Demonstrated its advantages over both window-based detection and a CRF model
• In future work, we will examine the alternate classifiers accepted by our model.
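Two of the steps described above can be sketched briefly in Python: the additive bag-of-words score that yields per-superpixel vertex weights, and the weight shift that turns an MWCS instance into a PCST instance. This is a minimal sketch under stated assumptions, not the released implementation: all function names and toy values are hypothetical, the SVM bias term is ignored, edges are assumed to start with weight 0, and at least one vertex weight is assumed negative so that the resulting edge costs are positive.

```python
from collections import Counter


def vertex_weight(word_ids, svm_weights):
    """Sum the SVM word weights of the features that fall in one superpixel."""
    return sum(svm_weights[c] for c in word_ids)


def region_score_from_histogram(region_word_ids, svm_weights):
    """Score a region as (SVM weights) . (word histogram of the region)."""
    hist = Counter(region_word_ids)
    return sum(svm_weights[j] * count for j, count in hist.items())


def mwcs_to_pcst(vertex_weights, edges):
    """Subtract the smallest vertex weight from all vertex and edge weights.

    Assumes edges initially carry weight 0, so each edge cost becomes -w_min.
    """
    w_min = min(vertex_weights)
    prizes = [w - w_min for w in vertex_weights]
    edge_costs = {e: 0.0 - w_min for e in edges}
    return prizes, edge_costs, w_min


# Hypothetical SVM weights for a 3-word vocabulary.
svm_weights = [0.3, -0.2, 0.5]
# Hypothetical visual-word index of each feature, grouped by superpixel.
superpixel_words = [[0, 0, 2], [1], [2, 1, 1]]
# Hypothetical adjacency: a chain of three superpixels.
edges = [(0, 1), (1, 2)]

vertex_weights = [vertex_weight(ws, svm_weights) for ws in superpixel_words]

# The score of the whole-image region is the same either way.
all_words = [c for ws in superpixel_words for c in ws]
score_from_vertices = sum(vertex_weights)
score_from_histogram = region_score_from_histogram(all_words, svm_weights)

prizes, edge_costs, w_min = mwcs_to_pcst(vertex_weights, edges)

# For a tree with k vertices and k - 1 edges, the PCST objective equals the
# MWCS score minus w_min, so both problems share the same maximizer.
pcst_objective = sum(prizes) - sum(edge_costs.values())
print(score_from_vertices, score_from_histogram, pcst_objective)
```

Because a region's score decomposes into a sum over its superpixels, the detector only needs the per-vertex weights once the region-graph is built. The constant offset between the two objectives is what lets an exact PCST solver return the best-scoring region.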