Unsupervised Category Modeling, Recognition and Segmentation • Sinisa Todorovic and Narendra Ahuja
What is Common in a Set of Images?
• Images possibly contain an object of interest
• Which objects appear frequently in the set?
• What properties are shared by similar objects in the set?
• Where are the objects?
Objective: Car Example
[Figure: learn car model; segment all cars in an unseen image; RESULT. Image labels: occlusion, no car, occlusion, multiple cars.]
Problem Definition
Training
• GIVEN: Images possibly containing frequent occurrences of similar objects
• DETERMINE: If similar objects are present
• AND IF YES, LEARN: The model of similar objects
Testing
• GIVEN: An unseen image
• DETECT, RECOGNIZE AND SEGMENT: All occurrences of the learned category
Prior Work Dominated By:
• Statistical modeling of local features: patches or curve fragments
• Trend: Object detection = Image classification
• Trend: Object segmentation = Object localization
• Trend: Object segmentation = Binary thresholding of a probabilistic map
• Hypothesize the number of objects and their parts
• Hypothesize the topology of object parts
• Each training image must contain a category of interest
• Modeling background
• Typically require hundreds of training images
Category Modeling is Very Difficult
• Explicit modeling of recursive embedding of object subparts
• Regions vs. local features, open questions:
  • More informative?
  • More stable and robust to noise?
• Regions allow:
  • simultaneous object detection and segmentation
  • explicit representation of the recursive embedding property
Our Approach
SIMILAR OBJECTS PRESENT IN THE SET → MANY SUBIMAGES WITH SIMILAR REGION PROPERTIES → ABUNDANT DATA → ROBUST LEARNING IS FEASIBLE
• How to find them? Image matching
• What to do with them? Structural learning
Region Properties
• Geometric
  • Region area
  • Boundary shape
• Photometric
  • Gray-level contrast with the surround
• Topology
  • Recursive containment of regions
  • Layout: relative region locations
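The property list above can be pictured as one record per region. The following is a minimal sketch only: the class name `Region`, its field names, and the helper `count_regions` are hypothetical and do not come from the slides, which do not specify a data layout.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical record for one segmented region (names are assumptions)."""
    area: float                 # geometric: region area in pixels
    contrast: float             # photometric: gray-level contrast with the surround
    centroid: tuple             # layout: (x, y) location, used for relative positions
    boundary: list = field(default_factory=list)   # geometric: boundary points
    children: list = field(default_factory=list)   # topology: regions contained in this one


def count_regions(region):
    """Count a region plus everything recursively contained in it."""
    return 1 + sum(count_regions(child) for child in region.children)


# Usage: a wheel region embedded in a car region.
wheel = Region(area=40.0, contrast=30.0, centroid=(12.0, 8.0))
car = Region(area=400.0, contrast=15.0, centroid=(20.0, 10.0), children=[wheel])
```

Storing contained regions in `children` makes the recursive containment explicit, so the same record doubles as a tree node.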
Feature Extraction = Image Segmentation
• Segmentation: homogeneous regions at ALL contrasts and sizes
• [Figure: image and example segmentations for several contrasts]
[N. Ahuja TPAMI ’96, Tabb & Ahuja TIP ’97, Arora & Ahuja ICPR ’06]
Multiscale Segmentation to Segmentation Tree
• [Figure: example segmentations, sample cutsets, and the resulting segmentation tree]
• Contrast level ≠ Tree level
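One way to read "Contrast level ≠ Tree level" is that a region's place in the tree follows from containment rather than from the contrast at which it was found. The sketch below illustrates that reading under stated assumptions: regions are given as sets of pixel indices, and the function name `build_segmentation_tree` is hypothetical. It is not the segmentation algorithm cited in the slides.

```python
def build_segmentation_tree(regions):
    """Return a parent map: each region's parent is the smallest region
    that strictly contains it (None for a root).

    regions: dict mapping a region name to a frozenset of pixel indices.
    """
    parent = {}
    for name, pixels in regions.items():
        # All regions that properly contain this one, keyed by size.
        containers = [
            (len(other_pixels), other)
            for other, other_pixels in regions.items()
            if other != name and pixels < other_pixels
        ]
        parent[name] = min(containers)[1] if containers else None
    return parent


# Usage: regions found at different contrasts still nest by containment.
regions = {
    "image": frozenset(range(16)),
    "car": frozenset(range(8)),
    "wheel": frozenset({0, 1}),
    "window": frozenset({4, 5}),
}
tree = build_segmentation_tree(regions)
```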
Outline of Our Approach
• Images = Trees
• Category present = Many similar subtrees
• Extracting similar subtrees = Tree matching
• Category model = Union of similar subtrees
• Simultaneous detection, recognition and segmentation of ALL category instances by matching the model with an image
Tree Matching: Structural Noise
Edit-distance tree matching [Pelillo et al. PAMI ’99, Torsello & Hancock ECCV ’02, PRL ’03]
Matching Algorithm
[Figure: input trees and matched subtrees]
Matching Algorithm
• GIVEN two trees
• FIND a bijection which
• MAXIMIZES their similarity measure (terms: node saliency, cost of node matching)
• while PRESERVING ancestor-descendant relationships
Matching Algorithm: Recursive Solution
• Maximum clique over all descendant pairs
• SOLUTION: Select all pairs with > threshold.
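The idea of matching under an ancestor-descendant constraint can be illustrated with a brute-force maximum-weight clique search over candidate node pairs. This is a sketch under assumptions, not the recursive solution from the slides: trees are given as child-to-parent maps, the similarity function `sim` is a stand-in (the slides' similarity measure is not reproduced in the text), and the search is exponential, so it only suits tiny trees. All function names are hypothetical.

```python
from itertools import product


def ancestors(parent, v):
    """Set of all ancestors of node v in a child -> parent map."""
    out = set()
    while parent[v] is not None:
        v = parent[v]
        out.add(v)
    return out


def relation(parent, a, b):
    """How a relates to b: 'anc', 'desc', or 'none'."""
    if a in ancestors(parent, b):
        return "anc"
    if b in ancestors(parent, a):
        return "desc"
    return "none"


def match_trees(parent1, parent2, sim, threshold):
    """Best one-to-one node matching that preserves ancestor-descendant
    relations, found as a maximum-weight clique by exhaustive search."""
    # Candidate pairs: only those whose similarity exceeds the threshold.
    pairs = [(u, v) for u, v in product(parent1, parent2) if sim(u, v) > threshold]

    def compatible(p, q):
        (u1, v1), (u2, v2) = p, q
        return (u1 != u2 and v1 != v2
                and relation(parent1, u1, u2) == relation(parent2, v1, v2))

    best = [0.0, []]

    def extend(clique, weight, candidates):
        if weight > best[0]:
            best[0], best[1] = weight, list(clique)
        for i, p in enumerate(candidates):
            rest = [q for q in candidates[i + 1:] if compatible(p, q)]
            extend(clique + [p], weight + sim(*p), rest)

    extend([], 0.0, pairs)
    return best[0], best[1]


# Usage with a made-up similarity based on one scalar attribute per node.
parent1 = {"a": None, "b": "a", "c": "a"}
parent2 = {"x": None, "y": "x", "z": "x"}
attr = {"a": 1.0, "b": 0.5, "c": 0.2, "x": 1.0, "y": 0.5, "z": 0.2}
score, matching = match_trees(
    parent1, parent2, lambda u, v: 1.0 - abs(attr[u] - attr[v]), threshold=0.6
)
```

Two pairs are compatible only if they use distinct nodes and relate to each other the same way in both trees, which is what keeps the matching a bijection that preserves ancestor-descendant relationships.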
Outline LEARNING
Category Model = Tree Union
Learning algorithm estimates:
1) Model structure
2) Model parameters
[Figure: tree intersection and tree union]
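As a rough illustration of how a union and an intersection of two matched trees might be formed, the sketch below merges two child-to-parent maps given a node matching. The slides do not describe the construction, so everything here is an assumption: the function names, the rule that unmatched nodes of the second tree are attached under the image of their parent, and the requirement that the two roots are matched to each other.

```python
def tree_intersection(match):
    """Nodes of tree 1 that have a counterpart in tree 2.

    match: dict mapping nodes of tree 2 to nodes of tree 1.
    """
    return set(match.values())


def tree_union(parent1, parent2, match):
    """Tree 1 plus the unmatched nodes of tree 2 (assumed merge rule).

    Assumes the root of tree 2 is matched, so the result stays one tree.
    """
    def depth(v):
        d = 0
        while parent2[v] is not None:
            v = parent2[v]
            d += 1
        return d

    union = dict(parent1)
    image = dict(match)  # where each tree-2 node lives in the union
    # Visit parents before children so every parent already has an image.
    for v in sorted(parent2, key=depth):
        if v in image:
            continue
        new_node = ("t2", v)  # tag to avoid name clashes with tree 1
        union[new_node] = image[parent2[v]]
        image[v] = new_node
    return union


# Usage: z has no counterpart in tree 1, so it is added under a.
parent1 = {"a": None, "b": "a"}
parent2 = {"x": None, "y": "x", "z": "x"}
match = {"x": "a", "y": "b"}
union = tree_union(parent1, parent2, match)
```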
Performance Evaluation Criteria
DETECTION ERROR
• Compared areas: Matched Subtrees (MST) and Ground Truth (GT)
• False positive: “intersection of MST and GT” < 0.5 × “union of MST and GT”
SEGMENTATION ERROR
• Compared areas: Matched Subtrees (MST) and Ground Truth (GT)
• Error: “XOR of MST and GT”
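The two criteria above can be written down directly if MST and GT are treated as sets of pixel coordinates. This is a minimal sketch: the function names are hypothetical, and the segmentation error is returned as a raw pixel count because the slides do not say whether the XOR area is normalized.

```python
def is_false_positive(mst, gt):
    """True when the overlap is less than half of the union area."""
    return len(mst & gt) < 0.5 * len(mst | gt)


def segmentation_error(mst, gt):
    """Number of pixels in exactly one of the two areas (XOR)."""
    return len(mst ^ gt)


# Usage: 4x4 ground-truth box; a detection shifted by one column.
gt = {(x, y) for x in range(4) for y in range(4)}
good = {(x, y) for x in range(1, 5) for y in range(4)}
far = {(x, y) for x in range(3, 7) for y in range(4)}
```

With these inputs, `good` overlaps 12 of 20 union pixels and counts as a detection, while `far` overlaps only 4 of 28 and counts as a false positive.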
Results: UIUC Cars Side View
• Training sets: 10 positive out of 20 training images; 5 positive out of 10 training images
• [Figure: results on test images]
Results: Faces -- Caltech 101 Database
• Training sets: 3 positive out of 6 training images; 6 positive out of 12 training images
• [Figure: results on test images]
Results: Caltech Cars Rear View
• Training set: 10 positive out of 20 training images
Recall-Precision
• Training from a small-size dataset
• Varying tradeoff recall vs. precision
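The recall vs. precision tradeoff can be traced by sweeping a decision threshold over scored detections. The sketch below uses the standard definitions (precision = TP / (TP + FP), recall = TP / number of ground-truth objects). The function name, the scored-detection format, and the sample numbers are illustrative assumptions, not data from the slides.

```python
def precision_recall(detections, num_ground_truth, threshold):
    """Precision and recall of the detections scoring at least `threshold`.

    detections: list of (score, is_correct) pairs.
    """
    kept = [is_correct for score, is_correct in detections if score >= threshold]
    true_pos = sum(kept)
    # With nothing kept there are no false positives; report precision 1.0.
    precision = true_pos / len(kept) if kept else 1.0
    recall = true_pos / num_ground_truth
    return precision, recall


# Usage: lowering the threshold raises recall here; precision may move either way.
detections = [(0.9, True), (0.8, True), (0.6, False), (0.4, True)]
curve = [precision_recall(detections, 4, t) for t in (0.85, 0.5, 0.3)]
```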
Complexity and Runtime on a 2.4 GHz, 2 GB RAM PC
• Extracting similar subtrees (per image pair): depends on the # of tree nodes
  • Training on 20 images of UIUC CARS: < 2 hours
• Learning: depends on the # of subtree nodes
  • Learning on 32 subtrees extracted for UIUC CARS: < 1 hour
• Detection, recognition and segmentation: depends on the # of model nodes
  • Processing time for UIUC CARS: < 10 sec, regardless of the total number of target objects
Summary and Conclusion
• Unsupervised learning of an unknown category frequently occurring in a given set of images
• Region-based, structural approach
• Simultaneous detection, recognition, and segmentation of all category instances in unseen images
• NO multiple detections on the same object
• NO hypotheses on the number of objects and their parts
• NO hypotheses on the topology of object parts
• Small number of training images
• Complexity comparable with standard methods
Acknowledgment
THANK YOU!
{sintod, n-ahuja}@uiuc.edu
http://vision.ai.uiuc.edu