This presentation explores the significance of face analysis for intelligence, including face recognition and categorization. It discusses the challenges of face recognition, compares the performance of humans and machines, highlights applications of face recognition in security, forensics, access control, and more, and covers aspects of face processing such as detection, identification, and facial expression.
9.523 Aspects of a Computational Theory of Intelligence: The Analysis of Faces in Brains and Machines. Rafael Reif: stay tuned...
Why is face analysis important for intelligence?
• Remember/recognize people we’ve seen before
• Categorization – e.g. gender, race, age, kinship
• Social communication – emotions/mood, intentions, trustworthiness, competence or intelligence, attractiveness
• Scene understanding, e.g. direction of gaze suggests focus of attention
Why is face recognition hard?
• changing pose
• changing illumination
• aging
• clutter
• occlusion
• changing expression
How good are we at face recognition? Jenkins, White, Van Montfort & Burton, Cognition, 2011
Face recognition performance in humans [figure: distribution of scores on testmybrain.org tests, with chance performance marked] (Wilmer et al., 2012; Duchaine & Nakayama, 2006)
Face recognition performance in humans: which of the 10 photos on the bottom depicts the target face?
• Viewers are ~70% correct
• Performance degrades with changes in pose, expression
• Only slight improvement with a short video clip of the target
• Importance of familiar vs. unfamiliar face recognition!
Bruce et al., 1999
How good are the best machines? Public databases of face images serve as benchmarks:
• Labeled Faces in the Wild (LFW, http://vis-www.cs.umass.edu/lfw): > 13,000 images of celebrities, 5,749 different identities
• YouTube Faces Database (YTF, http://www.cs.tau.ac.il/~wolf/ytfaces): 3,425 videos, 1,595 different identities
Private face image datasets:
• (Facebook) Social Face Classification dataset: 4.4 million face photos, 4,030 different identities
• (Google) 100-200 million face images, ~8 million different identities
Machine vision applications of face recognition
• security, forensics
• access control
• surveillance
More applications of face recognition
• content-based image retrieval
• social media
• graphics, HCI
• humanoid robots
Aspects of face processing
• Face detection – find image regions that contain faces
• Face identification – who is the person?
• Categorization – gender, age, race
• Facial expression – mood, emotion
• Non-verbal social perception and communication
It all began with Takeo Kanade (1973)… PhD thesis, Picture Processing System by Computer Complex and Recognition of Human Faces
• Special-purpose algorithms to locate eyes, nose, mouth, boundaries of the face
• ~40 geometric features, e.g. ratios of distances and angles between features
Eigenfaces for recognition (Turk & Pentland): Principal Components Analysis (PCA)
• Goal: reduce the dimensionality of the data while retaining as much information as possible from the original dataset
• PCA allows us to compute a linear transformation that maps data from a high-dimensional space to a lower-dimensional subspace (a minimal sketch follows below)
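To make the PCA step concrete, here is a minimal Python/NumPy sketch (not from the original slides) of mapping vectorized face images onto a lower-dimensional subspace; the function name, the SVD route to the principal directions, and the choice of k are illustrative assumptions.

```python
import numpy as np

def pca_project(X, k):
    """Map rows of X (n_faces x n_pixels) onto the top-k principal components.

    Returns the mean vector, the k principal directions, and each face's
    k-dimensional coordinates in that subspace.
    """
    mean = X.mean(axis=0)                              # average across all faces
    Xc = X - mean                                      # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)  # SVD of the centered data
    components = Vt[:k]                                # directions of largest variance
    weights = Xc @ components.T                        # low-dimensional coordinates
    return mean, components, weights
```

For face images, the rows of `components` reshaped back to image size correspond to the eigenfaces discussed on the next slides.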
Typical sample training set…
• One or more images per person
• Aligned & cropped to common pose, size
• Simple background
Sample images from the Yale face database; results from C. deCoro, http://www.cs.princeton.edu/~cdecoro/eigenfaces/
Eigenfaces for recognition (Turk & Pentland)
• Perform PCA on a large set of training images to create a set of eigenfaces, Eᵢ(x,y), that span the data set
• First components capture most of the variation across the data set; later components capture subtle variations
• Ψ(x,y): average face (across all faces)
• Each face image F(x,y) can be expressed as a weighted combination of the eigenfaces Eᵢ(x,y): F(x,y) = Ψ(x,y) + Σᵢ wᵢ·Eᵢ(x,y)
http://vismod.media.mit.edu/vismod/demos/facerec/basic.html
Representing individual faces
Each face image F(x,y) can be expressed as a weighted combination of the eigenfaces Eᵢ(x,y): F(x,y) = Ψ(x,y) + Σᵢ wᵢ·Eᵢ(x,y)
Recognition process:
• Compute weights wᵢ for the novel face image
• Find the image m in the face database with the most similar weights, e.g. the smallest Euclidean distance between weight vectors (a sketch follows below)
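A sketch of that recognition step, assuming the mean face and eigenfaces come from a PCA routine like the one above; names such as `gallery_weights` and the plain Euclidean nearest-neighbor rule are assumptions for illustration.

```python
import numpy as np

def eigenface_weights(face, mean, eigenfaces):
    """Weights w_i such that face ≈ mean + Σ_i w_i · eigenfaces[i]."""
    return eigenfaces @ (face - mean)       # project the centered face onto each eigenface

def recognize(novel_face, mean, eigenfaces, gallery_weights, labels):
    """Return the label of the gallery face with the most similar weight vector."""
    w = eigenface_weights(novel_face, mean, eigenfaces)
    dists = np.linalg.norm(gallery_weights - w, axis=1)   # Euclidean distance in weight space
    return labels[int(np.argmin(dists))]
```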
Changing expressions & lighting: the Eigenfaces approach handles changes in facial expression OK… but not changes in lighting (results from C. deCoro)
Face detection: Viola & Jones
• Multiple view-based classifiers based on simple features that best discriminate faces vs. non-faces
• Most discriminating features learned from thousands of samples of face and non-face image windows
• Attentional mechanism: cascade of increasingly discriminating classifiers improves performance
Viola & Jones use simple features
• Simple rectangle features: ΣI(x,y) in gray area – ΣI(x,y) in white area, within 24 x 24 image sub-windows
• Initially consider 160,000 potential features per sub-window!
• Features computed very efficiently (see the sketch below)
• Which features best distinguish face vs. non-face? Learn the most discriminating features from thousands of samples of face and non-face image windows
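Viola & Jones compute rectangle sums in constant time via the integral image (summed-area table); below is a small sketch of that idea and of one two-rectangle feature. The function names are illustrative, and the full 24 x 24 feature set is not enumerated here.

```python
import numpy as np

def integral_image(img):
    """Padded summed-area table: ii[y, x] = sum of img[:y, :x]."""
    ii = img.cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))                # leading zeros so border rectangles work

def rect_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] using only 4 table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(ii, top, left, height, width):
    """Left-half sum minus right-half sum: one of the Viola-Jones feature types."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))
```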
Learning the best features (AdaBoost)
Weak classifier using one feature: h(x) = 1 if p·f(x) < p·θ, else 0, where x = image window, f = feature, p = polarity (+1 or -1), θ = threshold
• Start with n training samples with known classes and equal weights: (x₁, w₁, 1) … (xₙ, wₙ, 0)
• Repeat: normalize weights; find the next best weak classifier; use classification errors to update the weights
• Final classifier combines the selected weak classifiers (a sketch of one round follows below)
• ~200 features yields good results for a “monolithic” classifier
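A minimal sketch of one boosting round in this scheme, assuming the candidate features have already been evaluated on every training window and that a threshold/polarity has been chosen per feature; the variable names are illustrative, and the real Viola-Jones training loops over ~160,000 features.

```python
import numpy as np

def weak_classify(f_values, theta, p):
    """One-feature weak classifier: h(x) = 1 if p * f(x) < p * theta, else 0."""
    return (p * f_values < p * theta).astype(int)

def adaboost_round(feature_matrix, labels, weights, thetas, polarities):
    """Pick the weak classifier with the lowest weighted error, then re-weight samples.

    feature_matrix: n_features x n_samples array of feature values
    labels: 1 for face windows, 0 for non-face windows
    """
    weights = weights / weights.sum()                         # normalize weights
    best = None
    for j in range(feature_matrix.shape[0]):
        preds = weak_classify(feature_matrix[j], thetas[j], polarities[j])
        err = np.sum(weights * (preds != labels))             # weighted classification error
        if best is None or err < best[1]:
            best = (j, err, preds)
    j, err, preds = best
    beta = max(err, 1e-12) / (1.0 - err)                      # small epsilon guards err == 0
    weights = weights * np.where(preds == labels, beta, 1.0)  # shrink weights of correct samples
    alpha = np.log(1.0 / beta)                                # this classifier's vote in the final sum
    return j, alpha, weights
```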
“Attentional cascade” of increasingly discriminating classifiers
• Early classifiers use a few highly discriminating features and a low threshold
• 1st classifier uses two features, removes 50% of non-face windows
• Later classifiers distinguish harder examples
• Increases efficiency; allows use of many more features
• Cascade of 38 classifiers, using ~6,000 features (a sketch of the cascade logic follows below)
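The cascade logic itself is simple; the sketch below (an illustrative interface, not the original code) shows how a window is rejected as soon as any stage's boosted score falls below that stage's threshold, so later, more expensive stages run only on promising windows.

```python
def cascade_detect(window_features, stages):
    """Run one image window through a cascade of boosted classifiers.

    stages: list of (score_fn, threshold) pairs, ordered from cheapest/most
    permissive to most discriminating. A window is reported as a face only
    if it passes every stage; most non-face windows exit after a stage or two.
    """
    for score_fn, threshold in stages:
        if score_fn(window_features) < threshold:
            return False        # rejected early; later stages are never evaluated
    return True                 # survived all stages
```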
Training with normalized faces
• 5,000 faces, many more non-face patches
• Faces are normalized for scale, rotation
• Small variation in pose
Viola & Jones results: with additional diagonal features, classifiers were created to handle image rotations and profile views
Feature-based vs. holistic processing
Composite Face Effect:
• identical top halves seen as different when aligned with different bottom halves
• when misaligned, top halves perceived as identical
Face Inversion Effect:
• inversion disrupts recognition of faces more than other objects
• prosopagnosics do not show the inversion effect
Feature-based vs. holistic processing
• Which features are more diagnostic? Eyebrows are important!
• Whole-Part Effect: identification of the “studied” face is significantly better in the whole test condition than in the part condition
View generalization mediated by motion? ✔
Hypothesis: temporal association is used to link multiple views of a person’s face
• 12 female faces scanned for 3D shape and visual texture
• Image sequences were created that morph between two different faces
• Observers viewed morph sequences, back and forth
• Test: same or different person? (shown separated in time)
• Performance within morph groups was compromised by temporal association
Wallis & Bulthoff, PNAS, 2001
The power of averages (Burton & colleagues)
Averaging a person’s photos (average “shape” and average “texture”) improves accuracy in the recognition of famous faces, for:
• PCA
• a commercial system
• human experiments
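As a rough illustration only: a pixel-wise average of aligned photos of the same person can be computed as below. Burton and colleagues separately average face shape and texture after warping each photo to a common shape, which this simple sketch omits.

```python
import numpy as np

def average_face(aligned_faces):
    """Pixel-wise mean of a stack of aligned photos (n_photos x height x width)."""
    return np.mean(np.asarray(aligned_faces, dtype=float), axis=0)
```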