Representing, Learning, and Recognizing Non-Rigid Textures and Texture Categories
Svetlana Lazebnik (Beckman Institute, UIUC, USA), Cordelia Schmid (Gravir Laboratory, INRIA, France), Jean Ponce (Beckman Institute, UIUC, USA)
Supported in part by the UIUC Campus Research Board, the UIUC/CNRS Collaborative Research Agreement, and the National Science Foundation under grant IRI-990709.
LeCun’03
• Affine-invariant patches.
• 3D objects are never planar in the large, but they are always planar in the small.
• Representation: local invariants and their spatial layout.
(Mikolajczyk & Schmid’02) (Lindeberg & Gårding’97)
• Spatial selection
• Shape selection
• Affine adaptation
Schaffalitzky & Zisserman (2001); Tuytelaars & Van Gool (2003)
Affine adaptation/rectification process
[Figure labels: Image 1, Image 2, rectified patch; coordinate marks (0,0), (1,0), (0,1)]
Lindeberg & Gårding (1997); Mikolajczyk & Schmid (2002)
Intensity-Domain Spin Images [Range spin images: Johnson & Hebert (1998)]
System architecture (Lazebnik, Schmid, & Ponce, CVPR’03)
• Signature: $S = \{ (m_1, w_1), \ldots, (m_k, w_k) \}$
• Earth Mover’s Distance: $D(S, S') = \dfrac{\sum_{i,j} f_{ij}\, d(m_i, m'_j)}{\sum_{i,j} f_{ij}}$
[Signatures and EMD for image retrieval: Rubner, Tomasi, & Guibas (1998)]
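The slide gives the EMD ratio but not how the flow f_ij is chosen. In the standard formulation of Rubner, Tomasi, & Guibas (1998), the flow solves a transportation problem. The statement below is a sketch of that formulation; the symbol w'_j for the weights of S' is an assumed notation that does not appear on the slide.

```latex
\begin{aligned}
\min_{f}\quad & \sum_{i,j} f_{ij}\, d(m_i, m'_j) \\
\text{s.t.}\quad & f_{ij} \ge 0, \\
& \sum_{j} f_{ij} \le w_i, \qquad \sum_{i} f_{ij} \le w'_j, \\
& \sum_{i,j} f_{ij} = \min\Bigl(\sum_i w_i,\ \sum_j w'_j\Bigr).
\end{aligned}
```

Dividing the optimal cost by the total flow, as in the ratio on the slide, keeps the distance comparable when the two signatures have different total weight.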
Texture retrieval/classification experiments
10 texture classes, with 20 samples per class.
[Plots: average recognition rate; NN classification]
Schmid (2001); Varma & Zisserman (2002)
More retrieval/classification experiments: Brodatz database
111 images divided into 9 windows: 111 classes with 9 samples per class.
[Plots: average recognition rate]
• Picard et al. (1993, 1996)
• Xu et al. (2000)
[NOTE: we do NOT use color information.]
Texture Classes: T1 (brick), T2 (carpet), T3 (chair), T4 (floor 1), T5 (floor 2), T6 (marble), T7 (wood)
Multi-texture Samples
A Two-Layer Architecture (Lazebnik, Schmid, & Ponce, ICCV’03)
• Modeling:
  • Use EM to learn a mixture-of-Gaussians model of each texture class.
  • Compute co-occurrence statistics of sub-class labels over affinely adapted neighborhoods.
• Recognition:
  • Use the generative model to obtain initial class membership probabilities.
  • Use relaxation (Rosenfeld et al., 1976) to refine these probabilities.
Malik, Belongie, Leung, & Shi (2001); Schmid (2001); Kumar & Hebert (2003)
Neighborhood Statistics
• Estimate:
  • probability $p(c, c')$,
  • correlation $r(c, c')$.
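One way to estimate these two quantities from labeled (region, neighbor) pairs is sketched below. It assumes that p(c,c') is the relative frequency of the label pair and that r(c,c') is the Pearson correlation of the two label indicator variables; the slide does not spell out either definition. The function name `neighborhood_statistics` and the input layout are hypothetical.

```python
from collections import Counter
from math import sqrt


def neighborhood_statistics(pairs):
    """Estimate p(c, c') and r(c, c') from (label, neighbor_label) pairs.

    Assumption: r is the correlation of the indicator variables
    "region has label c" and "neighbor has label c'".
    """
    pairs = list(pairs)
    n = len(pairs)
    joint = Counter(pairs)                      # counts of (c, c')
    first = Counter(c for c, _ in pairs)        # counts of c (region side)
    second = Counter(cp for _, cp in pairs)     # counts of c' (neighbor side)

    p, r = {}, {}
    for c in first:
        for cp in second:
            p_joint = joint[(c, cp)] / n
            p_c, p_cp = first[c] / n, second[cp] / n
            # Variance of an indicator with mean p is p - p^2.
            var = (p_c - p_c * p_c) * (p_cp - p_cp * p_cp)
            p[(c, cp)] = p_joint
            r[(c, cp)] = (p_joint - p_c * p_cp) / sqrt(var) if var > 0 else 0.0
    return p, r


if __name__ == "__main__":
    demo = [("a", "a"), ("a", "a"), ("b", "b"), ("b", "b")]
    p, r = neighborhood_statistics(demo)
    print(p[("a", "a")], r[("a", "a")], r[("a", "b")])
```

Because neighborhoods need not be symmetric, the two marginals are computed separately from the first and second element of each pair.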
Relaxation (Rosenfeld et al., 1976)
Iterate, for all regions i: [update equations not recovered from the slide], where $w_{ij} = 0$ if region j is not in the neighborhood of i, and $\sum_j w_{ij} = 1$.
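The update equations on this slide did not survive extraction. As an assumption, the sketch below uses the classical relaxation-labeling update of Rosenfeld, Hummel, & Zucker (1976): each probability p_i(c) is multiplied by 1 + q_i(c), with support q_i(c) = Σ_j w_ij Σ_c' r(c,c') p_j(c'), and the result is renormalized over labels. The function name `relaxation_step` and the dictionary-based data layout are hypothetical.

```python
def relaxation_step(probs, weights, r):
    """Perform one relaxation sweep over all regions.

    probs[i][c]   : current probability of label c at region i
    weights[i][j] : w_ij for neighbors j of region i (absent means 0);
                    each row is assumed to sum to 1
    r[(c, c2)]    : compatibility of labels c and c2, assumed in [-1, 1]
    """
    labels = list(probs[0].keys())
    new_probs = []
    for i, p_i in enumerate(probs):
        unnormalized = {}
        for c in labels:
            # Support for label c at region i from its neighbors.
            q = 0.0
            for j, w in weights[i].items():
                q += w * sum(r[(c, c2)] * probs[j][c2] for c2 in labels)
            unnormalized[c] = p_i[c] * (1.0 + q)
        z = sum(unnormalized.values())
        if z > 0:
            new_probs.append({c: v / z for c, v in unnormalized.items()})
        else:
            # Degenerate case: keep the previous estimate.
            new_probs.append(dict(p_i))
    return new_probs


if __name__ == "__main__":
    r = {("a", "a"): 1.0, ("b", "b"): 1.0, ("a", "b"): -1.0, ("b", "a"): -1.0}
    probs = [{"a": 0.6, "b": 0.4}, {"a": 0.9, "b": 0.1}]
    weights = [{1: 1.0}, {0: 1.0}]
    print(relaxation_step(probs, weights, r))
```

All new probabilities are computed from the previous iterate, so the order in which regions are visited does not affect the result.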
Classification rates for single-texture images 10 training images per class, 10 test images per class.
Weakly-Supervised Modeling
Idea: Replace L mixture models with M components each by a single mixture model with L × M components.
• Annotate each image with the set C of labels associated with classes occurring in it.
• Run EM:
  • E step: update class membership probabilities: $p(c_{lm} \mid x, C) \propto p(x \mid c_{lm})\, p(c_{lm} \mid C)$.
  • M step: update model parameters.
Nigam, McCallum, Thrun & Mitchell (2000)
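The E step above can be illustrated with a minimal sketch. It assumes one-dimensional Gaussian components for brevity (the actual descriptors are higher-dimensional) and assumes that p(c_lm | C) is the mixing weight of the component when class l is in C and zero otherwise, renormalized over the allowed components. The names `gaussian_pdf`, `e_step`, and the component dictionary layout are hypothetical.

```python
from math import exp, pi, sqrt


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian (an illustrative stand-in for p(x | c_lm))."""
    return exp(-((x - mean) ** 2) / (2.0 * var)) / sqrt(2.0 * pi * var)


def e_step(x, components, allowed_classes):
    """Return p(c_lm | x, C) for every component, given the image label set C.

    components      : list of dicts with keys 'cls', 'weight', 'mean', 'var'
    allowed_classes : the set C of class labels annotated for the image
    """
    scores = []
    for comp in components:
        if comp["cls"] in allowed_classes:
            # p(x | c_lm) * p(c_lm | C), up to a constant factor.
            scores.append(comp["weight"] * gaussian_pdf(x, comp["mean"], comp["var"]))
        else:
            # Components of classes absent from the image get zero probability.
            scores.append(0.0)
    total = sum(scores)
    if total == 0.0:
        raise ValueError("no allowed component explains x")
    return [s / total for s in scores]


if __name__ == "__main__":
    comps = [
        {"cls": "brick", "weight": 0.5, "mean": 0.0, "var": 1.0},
        {"cls": "wood", "weight": 0.5, "mean": 0.1, "var": 1.0},
    ]
    print(e_step(0.1, comps, {"brick"}))
    print(e_step(0.1, comps, {"brick", "wood"}))
```

For a single-texture image, C contains one label and the step reduces to the usual E step of that class's own mixture.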
ROC Curves
• Single-texture training images only [panels: T1 (brick), T2 (carpet), T3 (chair), T4 (floor 1), T5 (floor 2), T6 (marble), T7 (wood)]
• Single- and multi-texture training images [panels: T1 (brick), T2 (carpet), T3 (chair), T4 (floor 1), T5 (floor 2), T6 (marble), T7 (wood)]
10 single-texture images per class, 13 two-texture training images, 45 multi-texture test images.
Effect of relaxation on labeling
[Figure: original image; top: before relaxation, bottom: after relaxation]
Animal Dataset
• No manual segmentation.
• 10 training images for each animal + background, 20 test images per class.
Bradshaw, Schölkopf, & Platt (2001); Schmid (2001); Kumar & Hebert (2003)
3D Objects without distinctive texture
• Category-level recognition of 3D objects
• Please join us in trying to solve the 3D object recognition problem.