Review: Intro to recognition

Review: Intro to recognition • Recognition tasks • Machine learning approach: training, testing, generalization • Example classifiers • Nearest neighbor • Linear classifiers

Image features • Spatial support: • Pixel or local patch • Segmentation region • Bounding box • Whole image

Image features • We will focus mainly on global image features for whole-image classification tasks • GIST descriptors • Bags of features • Spatial pyramids

GIST descriptors • Oliva & Torralba (2001) http://people.csail.mit.edu/torralba/code/spatialenvelope/

Bags of features

Origin 1: Texture recognition • Texture is characterized by the repetition of basic elements, or textons • For stochastic textures, it is the identity of the textons, not their spatial arrangement, that matters Julesz, 1981; Cula & Dana, 2001; Leung & Malik, 2001; Mori, Belongie & Malik, 2001; Schmid, 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

Origin 1: Texture recognition • Each texture is represented by a histogram over a universal texton dictionary

Origin 2: Bag-of-words models • Orderless document representation: frequencies of words from a dictionary Salton & McGill (1983)

Origin 2: Bag-of-words models • Orderless document representation: frequencies of words from a dictionary Salton & McGill (1983) • Example: US Presidential Speeches Tag Cloud http://chir.ag/projects/preztags/
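To make the orderless representation concrete, here is a minimal Python sketch that counts how often each dictionary word occurs in a document. The function name `bag_of_words`, the four-word dictionary, and the two sample sentences are illustrative assumptions, not material from the slides.

```python
from collections import Counter
import re

def bag_of_words(text, dictionary):
    """Return word frequencies over a fixed dictionary, ignoring word order."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t in dictionary)
    # One count per dictionary entry, in dictionary order.
    return [counts[w] for w in dictionary]

dictionary = ["freedom", "economy", "war", "peace"]  # illustrative vocabulary
a = bag_of_words("Peace and freedom; freedom and peace.", dictionary)
b = bag_of_words("Freedom and peace, peace and freedom!", dictionary)
print(a, b)
```

The two sentences use the same words in a different order, so they map to the same count vector. This is the property that bags of features carry over to images.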

Bag-of-features steps • Extract local features • Learn “visual vocabulary” • Quantize local features using visual vocabulary • Represent images by frequencies of “visual words”

1. Local feature extraction • Regular grid or interest regions

1. Local feature extraction • Detect patches • Normalize patch • Compute descriptor Slide credit: Josef Sivic
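The regular-grid option can be sketched as follows. This is a simplified illustration under stated assumptions: the descriptor is just the mean-subtracted, unit-normalized raw pixel vector (practical systems typically use a descriptor such as SIFT), and the function name `dense_patches`, the patch size, and the stride are hypothetical choices.

```python
import numpy as np

def dense_patches(image, patch_size=8, stride=4):
    """Extract square patches on a regular grid and flatten each to a vector."""
    h, w = image.shape
    feats, centers = [], []
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            patch = image[y:y + patch_size, x:x + patch_size]
            patch = patch.astype(np.float64).ravel()
            # Normalize the patch: zero mean, unit norm (skip flat patches).
            patch = patch - patch.mean()
            norm = np.linalg.norm(patch)
            if norm > 1e-8:
                patch = patch / norm
            feats.append(patch)
            centers.append((y + patch_size / 2, x + patch_size / 2))
    return np.array(feats), np.array(centers)

rng = np.random.default_rng(0)
image = rng.random((32, 32))  # stand-in for a grayscale image
features, positions = dense_patches(image)
print(features.shape, positions.shape)
```

Keeping the patch centers alongside the descriptors costs little and is what a spatially aware representation needs later.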

2. Learning the visual vocabulary • Cluster the extracted local descriptors • The resulting clusters form the visual vocabulary Slide credit: Josef Sivic

Review: K-means clustering • Want to minimize the sum of squared Euclidean distances between features x_i and their nearest cluster centers m_k • Algorithm: • Randomly initialize K cluster centers • Iterate until convergence: • Assign each feature to the nearest center • Recompute each cluster center as the mean of all features assigned to it
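The algorithm above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the function name `kmeans`, the choice to initialize centers from randomly selected features, the stopping rule (assignments stop changing), and the toy two-blob data are all assumptions made for the example.

```python
import numpy as np

def kmeans(features, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Randomly initialize K centers by picking K distinct features.
    idx = rng.choice(len(features), size=k, replace=False)
    centers = features[idx].astype(np.float64)
    labels = None
    for _ in range(n_iters):
        # Assign each feature to the nearest center (squared Euclidean distance).
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing: converged
        labels = new_labels
        # Recompute each center as the mean of the features assigned to it.
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers, labels

rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
                  rng.normal(5.0, 0.1, (50, 2))])
centers, labels = kmeans(data, k=2)
print(centers)
```

K-means only finds a local minimum of the objective, so in practice it is common to run it from several random initializations and keep the best result.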

Example codebook • Appearance codebook Source: B. Leibe

Another codebook • Appearance codebook Source: B. Leibe

Bag-of-features steps • Extract local features • Learn “visual vocabulary” • Quantize local features using visual vocabulary • Represent images by frequencies of “visual words”
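Steps 3 and 4 can be sketched as follows, assuming a vocabulary of cluster centers is already available. The function names `quantize` and `bof_histogram` and the tiny 2-D vocabulary are hypothetical and chosen only to keep the example readable.

```python
import numpy as np

def quantize(features, vocabulary):
    """Map each local feature to the index of its nearest visual word."""
    d2 = ((features[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def bof_histogram(features, vocabulary):
    """Represent an image by normalized visual-word frequencies."""
    words = quantize(features, vocabulary)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(np.float64)
    return hist / max(hist.sum(), 1.0)

vocabulary = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])  # toy visual words
features = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.8, 5.1]])
words = quantize(features, vocabulary)
hist = bof_histogram(features, vocabulary)
print(words, hist)  # [0 1 1 2] [0.25 0.5 0.25]
```

Normalizing by the number of features makes histograms comparable across images that yield different numbers of patches.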

Visual vocabularies: Details • How to choose vocabulary size? • Too small: visual words not representative of all patches • Too large: quantization artifacts, overfitting • Right size is application-dependent • Improving efficiency of quantization • Vocabulary trees (Nister and Stewenius, 2005) • Improving vocabulary quality • Discriminative/supervised training of codebooks • Sparse coding, non-exclusive assignment to codewords • More discriminative bag-of-words representations • Fisher Vectors (Perronnin et al., 2007), VLAD (Jegou et al., 2010) • Incorporating spatial information
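As an illustration of a more discriminative representation, here is a simplified sketch of the core VLAD idea: instead of counting the features assigned to each visual word, accumulate their residuals from that word and concatenate the results. The function name `vlad` and the toy data are assumptions, and refinements used in practice are omitted.

```python
import numpy as np

def vlad(features, vocabulary):
    """Sum of residuals to the nearest visual word, concatenated and L2-normalized."""
    k, d = vocabulary.shape
    d2 = ((features[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    v = np.zeros((k, d))
    for j in range(k):
        members = features[words == j]
        if len(members) > 0:
            # Accumulate differences between features and their visual word.
            v[j] = (members - vocabulary[j]).sum(axis=0)
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

vocabulary = np.array([[0.0, 0.0], [4.0, 4.0]])  # toy visual words
features = np.array([[1.0, 0.0], [0.0, 1.0], [4.0, 5.0]])
descriptor = vlad(features, vocabulary)
print(descriptor)
```

The descriptor has K x D entries rather than K, so a small vocabulary can still give a rich representation.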

Bags of features for action recognition Space-time interest points Juan Carlos Niebles, Hongcheng Wang and Li Fei-Fei, Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words, IJCV 2008.


Spatial pyramids • Level 0 • Level 1 • Level 2 Lazebnik, Schmid & Ponce (CVPR 2006)
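A spatial pyramid can be sketched as a concatenation of visual-word histograms computed over increasingly fine grids: the whole image at level 0, a 2x2 grid at level 1, and a 4x4 grid at level 2. The sketch below is a simplified illustration, assuming each feature already has a visual-word index and a (row, column) position. The function name `spatial_pyramid` is hypothetical, and the per-level weighting of the pyramid match kernel is omitted.

```python
import numpy as np

def spatial_pyramid(words, positions, image_size, vocab_size, levels=2):
    """Concatenate visual-word histograms over 1x1, 2x2, 4x4, ... grids."""
    h, w = image_size
    ys, xs = positions[:, 0], positions[:, 1]
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level
        # Grid cell of each feature along each axis, clipped to the last cell.
        cy = np.minimum((ys * cells / h).astype(int), cells - 1)
        cx = np.minimum((xs * cells / w).astype(int), cells - 1)
        # One joint index per (cell, word) pair, then count occurrences.
        joint = (cy * cells + cx) * vocab_size + words
        parts.append(np.bincount(joint, minlength=cells * cells * vocab_size))
    pyramid = np.concatenate(parts).astype(np.float64)
    return pyramid / max(len(words), 1)

words = np.array([0, 1, 1, 2])  # toy visual-word indices
positions = np.array([[10, 10], [10, 90], [90, 10], [90, 90]])  # (row, col)
pyramid = spatial_pyramid(words, positions, image_size=(100, 100), vocab_size=3)
print(pyramid.shape)
```

With a vocabulary of size K and levels 0 to 2, the representation has K x (1 + 4 + 16) = 21K entries. The first K entries are the ordinary bag-of-features histogram.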

Results: Scene category dataset • Multi-class classification results (100 training images per class)

Results: Caltech101 dataset • Multi-class classification results (30 training images per class)
