Visual Object Recognition Rob Fergus Courant Institute, New York University http://cs.nyu.edu/~fergus/icml_tutorial/
Agenda • Introduction • Bag-of-words models • Visual words with spatial location • Part-based models • Discriminative methods • Segmentation and recognition • Recognition-based image retrieval • Datasets & Conclusions
Recognizing and Learning Object Categories: Year 2007 Li Fei-Fei, Princeton Rob Fergus, NYU Antonio Torralba, MIT http://people.csail.mit.edu/torralba/shortCourseRLOC
Object categorization mountain tree building banner street lamp vendor people
Scene and context categorization • outdoor • city • …
Application: Assisted driving [Figure: pedestrian (Ped) and car detections plotted on axes in meters] Pedestrian and car detection Lane detection • Collision warning systems with adaptive cruise control • Lane departure warning systems • Rear object detection systems
Application: Improving online search Query: STREET Organizing photo collections
Challenges 1: view point variation Michelangelo 1475-1564
Challenges 3: illumination slide credit: S. Ullman
Challenges 4: background clutter Bruegel, 1564
Challenges 5: occlusion http://lh5.ggpht.com/_wJc6t2hDl2M/RrL7Gh6sS7I/AAAAAAAAAYY/n3xaHc2opls/DSC00633.JPG
Challenges 6: deformation http://img.timeinc.net/time/asia/magazine/2007/1112/racehorse_1112.jpg Xu, Beihong 1943
History: single object recognition Object 1 Object 2 Object 3
Single object recognition history: Geometric methods David Lowe [1985] Rothwell et al. [1992]
Single object recognition history: Appearance-based methods • Murase & Nayar 1995 • Schmid & Mohr 1997 • Lowe, et al. 1999, 2003 • Mahamud and Hebert, 2000 • Ferrari et al. 2004 • Rothganger et al. 2004 • Moreels and Perona, 2005 • …
Challenges 7: intra-class variation Shoe class Instance 1 Instance 2 Instance 3
Fischler, Elschlager, 1973 • Turk and Pentland, 1991 • Belhumeur, Hespanha, & Kriegman, 1997 • Rowley & Kanade, 1998 • Schneiderman & Kanade, 2004 • Viola and Jones, 2000 • Heisele et al., 2001 • Amit and Geman, 1999 • LeCun et al. 1998 • Belongie and Malik, 2002 • DeCoste and Scholkopf, 2002 • Simard et al. 2003 • Poggio et al. 1993 • Agarwal and Roth, 2002 • …
Three main issues • Representation • How to represent an object category • Learning • How to form the classifier, given training data • Recognition • How the classifier is to be used on novel data
Representation • Generative / discriminative / hybrid
Representation • Generative / discriminative / hybrid • Appearance only or location and appearance
Representation • Generative / discriminative / hybrid • Appearance only or location and appearance • Invariances • View point • Illumination • Occlusion • Scale • Deformation • Clutter • etc.
Representation • Generative / discriminative / hybrid • Appearance only or location and appearance • Invariances • Part-based or global with sub-window
Representation • Generative / discriminative / hybrid • Appearance only or location and appearance • Invariances • Parts or global w/sub-window • Use set of features or each pixel in image
Learning • Unclear how to model categories, so learn rather than manually specify
Learning • Unclear how to model categories, so learn rather than manually specify • Methods of training: generative vs. discriminative
Learning • Unclear how to model categories, so learn rather than manually specify • Methods of training: generative vs. discriminative • Level of supervision • Manual segmentation; bounding box; image labels; noisy labels (example of an image label: "Contains a motorbike")
Learning • Unclear how to model categories, so learn rather than manually specify • Methods of training: generative vs. discriminative • Level of supervision • Manual segmentation; bounding box; image labels; noisy labels • Training images: • Issue of over-fitting (typically limited training data) • Negative images for discriminative methods
Learning • Unclear how to model categories, so learn rather than manually specify • Methods of training: generative vs. discriminative • Level of supervision • Manual segmentation; bounding box; image labels; noisy labels • Training images: • Issue of over-fitting (typically limited training data) • Negative images for discriminative methods • Priors
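The generative vs. discriminative distinction on the Learning slides can be made concrete with a toy one-dimensional example. The sketch below is an illustration only, not a method from the tutorial: the generative side fits a Gaussian p(x|c) per class plus a class prior and classifies with Bayes' rule, while the discriminative side fits p(c=1|x) directly as a logistic function and needs negative examples to do so. All function names, the learning rate, and the step count are assumptions chosen for the example.

```python
import math
from statistics import mean, pvariance


def fit_generative(xs, ys):
    """Fit a 1-D Gaussian p(x|c) per class and a class prior p(c)."""
    model = {}
    for c in set(ys):
        pts = [x for x, y in zip(xs, ys) if y == c]
        # Floor the variance so a degenerate class does not divide by zero.
        model[c] = (mean(pts), max(pvariance(pts), 1e-6), len(pts) / len(xs))
    return model


def predict_generative(model, x):
    """Pick the class maximizing log p(c) + log p(x|c) (Bayes' rule)."""
    def log_joint(c):
        mu, var, prior = model[c]
        return (math.log(prior)
                - 0.5 * math.log(2 * math.pi * var)
                - (x - mu) ** 2 / (2 * var))
    return max(model, key=log_joint)


def fit_discriminative(xs, ys, lr=0.1, steps=2000):
    """Fit p(c=1|x) = sigmoid(w*x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += p - y
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def predict_discriminative(params, x):
    """Classify by the sign of the learned linear score."""
    w, b = params
    return 1 if w * x + b > 0 else 0


# Toy data: class 0 (negatives) near 0.5, class 1 (positives) near 3.5.
xs = [0.0, 0.5, 1.0, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
gen_model = fit_generative(xs, ys)
disc_params = fit_discriminative(xs, ys)
```

The generative model can be fit per class in isolation, and its prior term is one place where the "Priors" bullet enters; the discriminative fit only models the boundary between the classes.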
Recognition • Scale / orientation range to search over • Speed • Context
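To illustrate why the scale range and speed are linked, here is a minimal sketch of enumerating candidate sub-windows over several scales. It is an assumed, generic sliding-window scheme rather than a procedure given in the tutorial; the base window size, scale list, and stride fraction are hypothetical parameters.

```python
from typing import Iterator, Tuple


def sliding_windows(
    img_w: int,
    img_h: int,
    base: Tuple[int, int] = (64, 128),
    scales: Tuple[float, ...] = (1.0, 1.5, 2.0),
    stride_frac: float = 0.25,
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (x, y, w, h) candidate windows for each scale in `scales`."""
    for s in scales:
        w, h = int(round(base[0] * s)), int(round(base[1] * s))
        if w > img_w or h > img_h:
            continue  # window at this scale does not fit in the image
        # Stride grows with the window so overlap stays roughly constant.
        step_x = max(1, int(w * stride_frac))
        step_y = max(1, int(h * stride_frac))
        for y in range(0, img_h - h + 1, step_y):
            for x in range(0, img_w - w + 1, step_x):
                yield (x, y, w, h)
```

Every window yielded is one classifier evaluation, so widening the scale range (or adding an orientation loop) multiplies the recognition cost; counting `sum(1 for _ in sliding_windows(640, 480))` for different settings gives a quick feel for the trade-off.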
Recognition • Context enables pruning of detector output Hoiem, Efros, Hebert, 2006
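One way context can prune detector output is through scene geometry. The sketch below assumes a simplified ground-plane approximation: for an object standing on the ground, its real-world height is roughly camera_height × pixel_height / (bottom_row − horizon_row). Detections whose implied height is implausible for their class are dropped. This is a hedged illustration in the spirit of the cited work, not the authors' probabilistic model; the `Detection` fields, the hard threshold, and the plausible height ranges are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Detection:
    label: str
    v_bottom: float  # image row of the box's bottom edge (rows grow downward)
    h_px: float      # box height in pixels
    score: float     # detector confidence (unused by the pruning rule)


def implied_height_m(det: Detection, v_horizon: float, camera_height_m: float) -> float:
    """Real-world height implied by the ground-plane approximation."""
    dv = det.v_bottom - v_horizon
    if dv <= 0:
        # Bottom edge at or above the horizon: not resting on the ground plane.
        return float("inf")
    return camera_height_m * det.h_px / dv


def prune_by_context(
    dets: List[Detection],
    v_horizon: float,
    camera_height_m: float,
    plausible: Dict[str, Tuple[float, float]],
) -> List[Detection]:
    """Keep detections whose implied height lies in the class's plausible range."""
    kept = []
    for d in dets:
        lo, hi = plausible[d.label]
        if lo <= implied_height_m(d, v_horizon, camera_height_m) <= hi:
            kept.append(d)
    return kept


# Hypothetical scene: horizon at row 200, camera 1.5 m above the ground.
dets = [
    Detection("pedestrian", v_bottom=400, h_px=220, score=0.9),  # implies 1.65 m
    Detection("pedestrian", v_bottom=250, h_px=300, score=0.8),  # implies 9.0 m
    Detection("pedestrian", v_bottom=150, h_px=80, score=0.7),   # above the horizon
]
kept = prune_by_context(dets, v_horizon=200, camera_height_m=1.5,
                        plausible={"pedestrian": (1.2, 2.2)})
```

A hard accept/reject rule is the simplest choice; a softer variant would rescale each detection's score by how plausible its implied height is.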