Unsupervised Category Modeling, Recognition and Segmentation

Sinisa Todorovic and Narendra Ahuja

What is Common in a Set of Images?
• Images possibly contain an object of interest
• Which objects appear frequently in the set?
• What properties are shared by similar objects in the set?
• Where are the objects?

Objective: Car Example
[Figure: learn a car model from a set of images, then segment all cars in an unseen image to produce the RESULT. Image labels: occlusion, no car, occlusion, multiple cars.]

Problem Definition
Training
• GIVEN: Images possibly containing frequent occurrences of similar objects
• DETERMINE: If similar objects are present
• AND IF YES, LEARN: The model of similar objects
Testing
• GIVEN: An unseen image
• DETECT, RECOGNIZE AND SEGMENT: All occurrences of the learned category

Prior Work Dominated By:
• Statistical modeling of local features: patches or curve fragments
• Trend: Object detection = Image classification
• Trend: Object segmentation = Object localization
• Trend: Object segmentation = Binary thresholding of a probabilistic map
• Hypothesize the number of objects and their parts
• Hypothesize the topology of object parts
• Each training image must contain a category of interest
• Modeling background
• Typically require hundreds of training images

Category Modeling is Very Difficult
• Explicit modeling of recursive embedding of object subparts
• Regions vs. local features, open questions:
    • More informative?
    • More stable and robust to noise?
• Regions allow:
    • simultaneous object detection and segmentation
    • explicit representation of the recursive embedding property

Our Approach
• SIMILAR OBJECTS PRESENT IN THE SET: MANY SUBIMAGES WITH SIMILAR REGION PROPERTIES
    • find? image matching
• ABUNDANT DATA: ROBUST LEARNING IS FEASIBLE
    • do? structural learning

Region Properties
• Geometric
    • Region area
    • Boundary shape
• Photometric
    • Gray-level contrast with the surround
• Topology
    • Recursive containment of regions
    • Layout: relative region locations

Feature Extraction = Image Segmentation
• Segmentation: homogeneous regions at ALL contrasts and sizes
[Figure: an image and example segmentations for several contrasts]
[N. Ahuja TPAMI '96, Tabb & Ahuja TIP '97, Arora & Ahuja ICPR '06]

Multiscale Segmentation to Segmentation Tree
• Contrast level ≠ Tree level
[Figure: example segmentations, the segmentation tree, and sample cutsets]
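The step from nested regions to a tree can be illustrated with a minimal sketch. It assumes each region is given as a set of pixel coordinates and that any two regions are either nested or disjoint. The function name `build_segmentation_tree`, the helper `grid`, and the toy region names are hypothetical and not taken from the slides; the actual multiscale segmentation algorithm is not reproduced here.

```python
def build_segmentation_tree(regions):
    """Return a parent map (region name -> parent name, None for the root).

    Assumption: `regions` maps a name to a frozenset of (row, col) pixels,
    and regions are either nested or disjoint.
    """
    parent = {}
    for name, pixels in regions.items():
        # All regions that strictly contain this one (proper superset test).
        containers = [n for n, p in regions.items() if pixels < p]
        # The parent is the smallest containing region; the root has none.
        parent[name] = min(containers, key=lambda n: len(regions[n]), default=None)
    return parent


def grid(rows, cols):
    """Hypothetical helper: a rectangular block of pixels."""
    return frozenset((r, c) for r in rows for c in cols)


# Toy example: a 4x4 image with a "car" region that contains a "wheel" region.
regions = {
    "image": grid(range(4), range(4)),
    "car": grid(range(1, 3), range(0, 3)),
    "wheel": grid(range(2, 3), range(0, 1)),
    "sky": grid(range(0, 1), range(3, 4)),
}
tree = build_segmentation_tree(regions)
```

In this toy tree the object "car" is a subtree (car, wheel) of the image tree, which is the sense in which an object corresponds to a subtree.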

Image = Tree and Object = Subtree

Outline of Our Approach
• Images = Trees
• Category present = Many similar subtrees
• Extracting similar subtrees = Tree matching
• Category model = Union of similar subtrees
• Simultaneous detection, recognition and segmentation of ALL category instances by matching the model with an image

Tree Matching: Structural Noise
• Edit-distance tree matching [Pelillo et al. PAMI '99, Torsello & Hancock ECCV '02, PRL '03]

Matching Algorithm
[Figure: input trees and their matched subtrees]

Matching Algorithm
• GIVEN two trees
• FIND a bijection which MAXIMIZES their similarity measure (terms: node saliency, cost of node matching)
• while PRESERVING ancestor-descendant relationships

Matching Algorithm: Recursive Solution
• Maximum clique over all descendant pairs
• SOLUTION: Select all pairs with similarity > threshold.
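The matching objective can be illustrated on tiny trees with a brute-force sketch. It is not the recursive algorithm from the slides: it enumerates sets of node pairs that are one-to-one and preserve ancestor-descendant relationships (a clique in the association graph of candidate pairs) and keeps the set with the highest total similarity. The parent-map tree encoding, the function names, and the area-based similarity `sim` are assumptions made for illustration only.

```python
from itertools import product


def is_ancestor(parent, a, b):
    """True if `a` is a strict ancestor of `b` in the tree given by `parent`."""
    p = parent[b]
    while p is not None:
        if p == a:
            return True
        p = parent[p]
    return False


def compatible(parent1, parent2, p, q):
    """Two node pairs can coexist if they are one-to-one and agree on ancestry."""
    (u1, v1), (u2, v2) = p, q
    if u1 == u2 or v1 == v2:
        return False
    return (is_ancestor(parent1, u1, u2) == is_ancestor(parent2, v1, v2)
            and is_ancestor(parent1, u2, u1) == is_ancestor(parent2, v2, v1))


def match_trees(parent1, parent2, sim, threshold=0.0):
    """Exhaustive search for the best consistent set of pairs (exponential time)."""
    # Candidate pairs: only those whose similarity exceeds the threshold.
    pairs = [(u, v) for u, v in product(parent1, parent2) if sim(u, v) > threshold]
    best = (0.0, [])

    def extend(chosen, score, start):
        nonlocal best
        if score > best[0]:
            best = (score, list(chosen))
        for i in range(start, len(pairs)):
            if all(compatible(parent1, parent2, pairs[i], c) for c in chosen):
                chosen.append(pairs[i])
                extend(chosen, score + sim(*pairs[i]), i + 1)
                chosen.pop()

    extend([], 0.0, 0)
    return best


# Toy trees as parent maps, with a hypothetical similarity based on region area.
tree1 = {"a": None, "b": "a", "c": "a"}
tree2 = {"x": None, "y": "x"}
area = {"a": 10, "b": 4, "c": 1, "x": 10, "y": 4}


def sim(u, v):
    return 1 - abs(area[u] - area[v]) / max(area[u], area[v])


score, matched = match_trees(tree1, tree2, sim, threshold=0.5)
```

The exhaustive search is only practical for a handful of nodes; it is meant to make the constraints concrete, not to suggest how the reported runtimes were obtained.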

Outline: LEARNING

Category Model = Tree Union
Learning algorithm estimates:
1) Model structure
2) Model parameters
[Figure: tree intersection and tree union]
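One way to picture tree intersection and tree union is the following sketch, which covers model structure only (no parameter estimation). It assumes trees are parent maps, that a node matching between the two trees is already available, and that the two roots are matched to each other. The function names and the `("t2", node)` labels for unmatched nodes are hypothetical, and the actual learning procedure may differ.

```python
def tree_intersection(parent1, match):
    """Keep only matched nodes of tree 1, linked to their nearest matched ancestor."""
    def nearest_matched_ancestor(node):
        p = parent1[node]
        while p is not None and p not in match:
            p = parent1[p]
        return p

    return {node: nearest_matched_ancestor(node) for node in match}


def tree_union(parent1, parent2, match):
    """Merge matched nodes and add the unmatched nodes of tree 2 to tree 1.

    Assumption: the roots of the two trees are matched to each other,
    so the result is a single tree rather than a forest.
    """
    # Name of each tree-2 node inside the union: its matched tree-1 node,
    # or a tagged copy of itself when it has no match.
    to_union = {n2: n1 for n1, n2 in match.items()}
    for n2 in parent2:
        to_union.setdefault(n2, ("t2", n2))

    union = dict(parent1)
    for n2, p2 in parent2.items():
        u = to_union[n2]
        if u not in union:
            union[u] = to_union[p2] if p2 is not None else None
    return union


# Toy example: two subtrees that share a root and one child.
tree1 = {"a": None, "b": "a", "c": "a"}
tree2 = {"x": None, "y": "x", "z": "x"}
match = {"a": "x", "b": "y"}  # tree-1 node -> tree-2 node

common = tree_intersection(tree1, match)
model = tree_union(tree1, tree2, match)
```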

Simultaneous Detection and Segmentation: MATCHING

Performance Evaluation Criteria
DETECTION ERROR
• Matched Subtrees (MST) vs. Ground Truth (GT)
• False positive: “intersection of MST and GT” < 0.5 × “union of MST and GT”
SEGMENTATION ERROR
• Matched Subtrees (MST) vs. Ground Truth (GT)
• “XOR of MST and GT”
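The two criteria can be written down directly if the matched subtrees and the ground truth are both treated as sets of pixel coordinates. The sketch below makes that assumption; the function names are hypothetical, and the segmentation error is returned as a raw pixel count because the slide does not state a normalization.

```python
def overlap_ratio(mst, gt):
    """Area of the intersection divided by the area of the union."""
    union = mst | gt
    return len(mst & gt) / len(union) if union else 0.0


def is_false_positive(mst, gt):
    """Detection criterion: intersection smaller than half of the union."""
    return overlap_ratio(mst, gt) < 0.5


def segmentation_error(mst, gt):
    """Number of pixels in exactly one of the two masks (XOR of MST and GT)."""
    return len(mst ^ gt)


# Toy example: two 4x4 masks, one shifted right by a single column.
gt = {(r, c) for r in range(4) for c in range(4)}
mst = {(r, c) for r in range(4) for c in range(1, 5)}

ratio = overlap_ratio(mst, gt)
false_positive = is_false_positive(mst, gt)
xor_pixels = segmentation_error(mst, gt)
```

Here the masks share 12 of the 20 pixels in their union, so the detection counts as correct, and 8 pixels are labeled differently.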

Results: UIUC Cars Side View
Results on test images:
[Figure: results for training with 10 positive out of 20 training images, and with 5 positive out of 10 training images]

Results: Faces (Caltech 101 Database)
Results on test images:
[Figure: results for training with 3 positive out of 6 training images, and with 6 positive out of 12 training images]

Results: Caltech Cars Rear View
[Figure: results for training with 10 positive out of 20 training images]

Recall-Precision
• Training from a small-size dataset
• Varying tradeoff recall vs. precision
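A recall-precision tradeoff of this kind can be traced by sweeping a decision threshold. The sketch below assumes each candidate detection carries a match score and a ground-truth label, and that a candidate is accepted when its score reaches the threshold; the function name, the scores, and the labels are made up for illustration and are not results from the slides.

```python
def precision_recall(scored, threshold):
    """Precision and recall when accepting candidates with score >= threshold.

    `scored` is a list of (score, is_true_object) pairs.
    """
    tp = sum(1 for s, is_obj in scored if s >= threshold and is_obj)
    fp = sum(1 for s, is_obj in scored if s >= threshold and not is_obj)
    fn = sum(1 for s, is_obj in scored if s < threshold and is_obj)
    precision = tp / (tp + fp) if tp + fp else 1.0  # no detections: no errors
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical candidates: (match score, whether it is a true object).
candidates = [(0.9, True), (0.8, True), (0.6, False), (0.4, True)]

# Raising the threshold removes the false detection but finds no more objects.
curve = [precision_recall(candidates, t) for t in (0.5, 0.7)]
```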

Complexity and Runtime on 2.4 GHz, 2 GB RAM PC
• Extracting similar subtrees: per image pair, in the # of tree nodes
    • Training on 20 images of UIUC CARS: < 2 hours
• Learning: in the # of subtree nodes
    • Learning on 32 subtrees extracted for UIUC CARS: < 1 hour
• Detection, recognition and segmentation: in the # of model nodes
    • Processing time for UIUC CARS: < 10 sec, regardless of the total number of target objects

Summary and Conclusion
• Unsupervised learning of an unknown category frequently occurring in a given set of images
• Region-based, structural approach
• Simultaneous detection, recognition, and segmentation of all category instances in unseen images
• NO multiple detections on the same object
• NO hypotheses on the number of objects and their parts
• NO hypotheses on the topology of object parts
• Small number of training images
• Complexity comparable with standard methods

Acknowledgment
THANK YOU!
{sintod, n-ahuja}@uiuc.edu
http://vision.ai.uiuc.edu
