NTT Visit: Image Database Retrieval Variable Viewpoint Reality

Professor Paul Viola • Collaborators: Professor Eric Grimson, Jeremy De Bonet, John Winn, Owen Ozier, Chris Stauffer, John Fisher, Kinh Tieu, Dan Snow, Tom Rikert, Lily Lee, Raquel Romano, Janey Hshieh, Mike Ross, Nick Matsakis, Jeff Norris, Todd Atkins, Mark Pipes

Overview of Visit • Morning: Image Database Retrieval • Gatekeeper: Face detection and recognition • Complex Feature Image Database Retrieval (Tieu) • Flexible Template Retrieval (Yu) • Interlude • Video/Audio Source Separation (Fisher) • Mathematical Expression Recognition (Matsakis) • Lunch • Visit Prof. Brooks's lab

Overview of Visit - 2 • Afternoon: Variable Viewpoint Reality • Real-time 3D reconstruction of people (Snow) • Automatic camera calibration (Snow + Lee) • Tracking of articulated human models (Lee + Winn) • Modeling of human dynamics (Viola + Fisher)

Gatekeeper: Receptionist & Security • Greet guests • Direct people to their destinations • Recognize employees • Turn back unauthorized visitors

Gatekeeper in action … [Gatekeeper movie]

Gatekeeper is a constant observer… Professor Paul Viola

Detecting faces is very difficult

Detecting and Recognizing Faces • Key Difficulty: Variation in Pose • State of the art: generalized templates • Neural Networks / Deformable Templates / etc. • Templates have difficulty with pose variation… • Rotation, scale, complex deformation • Must reduce the dependence on relative pose. • Approach: Detecting people as a statistical distribution of multi-scale features

Statistical Distribution of Multi-scale Features • The distribution of multi-scale features determines appearance • [Figure: Wavelet Pyramid]

Multi-scale Wavelet Features • A multi-scale feature associates many values with each pixel in the image
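To make "many values per pixel" concrete, the following is a minimal sketch that builds a small image pyramid and reads one value per level for a given pixel. It is an illustration under stated assumptions, not the presenters' implementation: the 2x2 block average stands in for the wavelet or oriented filters the slides mention, and the names `downsample`, `build_pyramid`, and `multiscale_feature` are hypothetical.

```python
from typing import List

Image = List[List[float]]


def downsample(img: Image) -> Image:
    """Halve resolution by averaging each 2x2 block (assumed stand-in for a real filter)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w)
        ]
        for r in range(h)
    ]


def build_pyramid(img: Image, levels: int) -> List[Image]:
    """Return [finest, ..., coarsest]; sides are assumed divisible by 2**(levels - 1)."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def multiscale_feature(pyramid: List[Image], r: int, c: int) -> List[float]:
    """Collect one value per level for the pixel at (r, c) of the finest level."""
    # Shifting by k maps a fine-level coordinate to its ancestor at level k.
    return [level[r >> k][c >> k] for k, level in enumerate(pyramid)]


# A 4x4 ramp image: pixel (r, c) has value 4*r + c.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
pyr = build_pyramid(image, levels=3)
feature = multiscale_feature(pyr, 3, 2)
print(feature)  # one value per pyramid level for pixel (3, 2)
```

With an oriented or wavelet pyramid, each level would contribute several filter responses rather than a single average, so the per-pixel vector would be correspondingly longer.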

Discrimination via Cross Entropy • [Figure: cross entropy between the feature distributions of I_MODEL and I_TEST]
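A minimal sketch of how a cross-entropy score between two feature distributions might be computed. Several things here are assumptions not stated on the slide: the direction of the comparison (test distribution scored under the model), the additive smoothing that avoids log(0), and the toy feature labels. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def smoothed_distribution(samples: Sequence[str], bins: Sequence[str],
                          alpha: float = 1.0) -> Dict[str, float]:
    """Histogram over `bins` with additive smoothing so no bin has zero mass."""
    counts = Counter(samples)
    total = len(samples) + alpha * len(bins)
    return {b: (counts[b] + alpha) / total for b in bins}


def cross_entropy(p: Dict[str, float], q: Dict[str, float]) -> float:
    """H(p, q) = -sum_x p(x) log q(x); lower means q explains p better."""
    return -sum(p[b] * math.log(q[b]) for b in p)


# Toy quantized feature labels (hypothetical; real bins would come from the pyramid).
bins = ["edge", "flat", "corner"]
model = smoothed_distribution(["edge", "edge", "flat", "corner"], bins)
similar = smoothed_distribution(["edge", "flat", "edge", "corner"], bins)
different = smoothed_distribution(["flat", "flat", "flat", "flat"], bins)

score_similar = cross_entropy(similar, model)
score_different = cross_entropy(different, model)
print(score_similar < score_different)
```

A test image whose feature statistics resemble the model's yields a lower score than one whose statistics differ, which is what makes the measure usable for discrimination.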

Motivation: Finding vehicles in clutter • [Image labels: BTR70-C71, T72-132] • Supported by DARPA: IU/ATR initially, MSTAR extension

Can also be used for segmentation…

The multi-scale statistical model can be used to generate new example textures • [Figure panels: Original Texture, Synthesis Results]

Synthesis Procedure Step 1: Build analysis pyramid • [Figure labels: Input Image, 64x64, 2x2] • Note: We are using only the Gaussian pyramid here! Normally we use an oriented pyramid...

Synthesis Procedure Step 2: Build synthesis pyramid

Synthesis Procedure Step 2a: Fill in the top... Pixels are generated by sampling from the analysis pyramid.

Synthesis Procedure Step 2b: Fill in subsequent levels Pixels are generated by conditional sampling (dependent on the parent).

Synthesis Procedure Finish the pyramid Decisions made at low resolutions generate discrete features in the final image.
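The synthesis steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the presenters' code: a 2x2 block average replaces the Gaussian pyramid, each pixel is conditioned only on its immediate parent value (within a hypothetical tolerance `tol`), and all function names are made up for the example.

```python
import random
from typing import List

Image = List[List[float]]


def downsample(img: Image) -> Image:
    """Halve resolution by 2x2 block averaging (assumed stand-in for Gaussian filtering)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)] for r in range(h)]


def build_analysis_pyramid(img: Image, levels: int) -> List[Image]:
    """Step 1: analysis pyramid, ordered coarsest first."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid[::-1]


def synthesize(analysis: List[Image], rng: random.Random, tol: float = 0.1) -> List[Image]:
    """Steps 2a and 2b: sample the top level, then fill each finer level
    by sampling children whose analysis parent resembles the synthesized parent."""
    top_values = [v for row in analysis[0] for v in row]
    synth = [[[rng.choice(top_values) for _ in row] for row in analysis[0]]]
    for k in range(1, len(analysis)):
        level, parent_level = analysis[k], analysis[k - 1]
        # (parent value, child value) pairs observed in the analysis pyramid.
        pairs = [(parent_level[r // 2][c // 2], level[r][c])
                 for r in range(len(level)) for c in range(len(level[0]))]
        new_level = []
        for r in range(len(level)):
            row = []
            for c in range(len(level[0])):
                parent = synth[k - 1][r // 2][c // 2]
                # Synthesized parents are always copies of analysis values,
                # so at least one candidate matches exactly.
                candidates = [child for p, child in pairs if abs(p - parent) <= tol]
                row.append(rng.choice(candidates))
            new_level.append(row)
        synth.append(new_level)
    return synth


# 8x8 texture made of 2x2 blocks in a checkerboard arrangement.
texture = [[float((r // 2 + c // 2) % 2) for c in range(8)] for r in range(8)]
analysis = build_analysis_pyramid(texture, levels=3)
result = synthesize(analysis, random.Random(0))
for row in result[-1]:
    print("".join("#" if v else "." for v in row))
```

Because each fine pixel is drawn conditionally on its parent, a choice made at a coarse level constrains everything beneath it. The method described in the slides conditions on richer multi-scale features; a single parent value is used here only to keep the sketch short.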

Detection Results • [Figure panels: Non-face test images, Web face test images]

Pruning the density estimator • Reduce the number of bins through clustering • 1000 bins or fewer! • Result: Detection/Classification is faster than template correlation
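One way to reduce the number of bins through clustering is to cluster the feature vectors and treat each cluster center as a bin. The slide does not say which clustering algorithm is used; the sketch below assumes plain k-means with a simple evenly spaced initialization, and every name in it is hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(v: Vector, centers: List[List[float]]) -> int:
    """Index of the center closest to v."""
    return min(range(len(centers)), key=lambda i: sq_dist(v, centers[i]))


def kmeans(data: List[Vector], k: int, iters: int = 20) -> List[List[float]]:
    """Lloyd's algorithm; initial centers are evenly spaced picks from the data."""
    centers = [list(data[(i * len(data)) // k]) for i in range(k)]
    for _ in range(iters):
        groups: List[List[Vector]] = [[] for _ in range(k)]
        for v in data:
            groups[nearest(v, centers)].append(v)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def binned_histogram(data: List[Vector], centers: List[List[float]]) -> List[float]:
    """Normalized histogram with one bin per cluster center."""
    counts = [0] * len(centers)
    for v in data:
        counts[nearest(v, centers)] += 1
    return [c / len(data) for c in counts]


# Toy 2-D feature vectors forming two well-separated groups.
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers = kmeans(features, k=2)
hist = binned_histogram(features, centers)
print(hist)
```

With the density represented by a fixed, small set of cluster bins, scoring a test image reduces to nearest-center lookups and a histogram comparison.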

Key facial features • Determined automatically • Located automatically • Multi-scale features that come from the face model can be automatically detected for many individuals

Another key feature

New Face Recognition Algorithm • Measure the occurrence and location of “key” facial features. • Facial identity depends both on the types of features and their location. • Relation to Active Search… • Match measure is a histogram of multi-scale features • As with a color histogram, Active Search can be used...
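A minimal sketch of a histogram match measure that accounts for both feature type and location. It assumes histogram intersection as the similarity, as is common for color histograms; the slides do not specify the measure. The feature labels ("eye", "mouth"), the grid size, and the function names are hypothetical, since the actual key features are determined automatically.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Detection = Tuple[str, float, float]  # (feature type, x, y), with x and y in [0, 1)
Bin = Tuple[str, int, int]


def location_histogram(detections: Iterable[Detection], grid: int = 4) -> Dict[Bin, float]:
    """Normalized histogram keyed by feature type and coarse grid cell."""
    counts = Counter(
        (kind, min(int(x * grid), grid - 1), min(int(y * grid), grid - 1))
        for kind, x, y in detections
    )
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def intersection(h1: Dict[Bin, float], h2: Dict[Bin, float]) -> float:
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(v, h2.get(key, 0.0)) for key, v in h1.items())


enrolled = location_histogram([("eye", 0.30, 0.35), ("eye", 0.70, 0.35), ("mouth", 0.50, 0.80)])
probe_same = location_histogram([("eye", 0.32, 0.36), ("eye", 0.68, 0.34), ("mouth", 0.52, 0.78)])
probe_other = location_histogram([("eye", 0.30, 0.35), ("eye", 0.70, 0.35), ("mouth", 0.50, 0.55)])

same_score = intersection(enrolled, probe_same)
other_score = intersection(enrolled, probe_other)
print(same_score > other_score)
```

Small shifts in feature position fall into the same grid cell and leave the score unchanged, while a feature of the right type in the wrong place lowers it.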

Presentation on Image Database Technology
