SVMs for (x) Recognition



  1. SVMs for (x) Recognition (From Moghaddam / Yang’s “Gender Classification with SVMs”) Brian Whitman

  2. “Commodity Intelligence” • ‘Wow factor’ important • Collaborative filtering • ‘Simple’ tasks sometimes the most useful • An SVM embedded evaluator… • Cameras with ‘common sense’

  3. Why SVM for feature detection? • Quick evaluation model • Machines (SVs) are easily stored and small

  4. Experiment: Gender ID • Using MITFaces dataset • ~7500 faces with varying genders, races, ages, expressions, ‘extras’ • All aligned 160x160 with left eye at 80,80 • Face content is usually only 80x40

  5. MITFaces examples

  6. Representation? • Simple pixel values • Why?

  7. Sample size • Maintain the ‘ground rule’ of ML • Dimensions < 2 × examples • At 3200 dims (80x40), this is hard • Training parameters (maximum Lagrange multipliers, kernel width) help • We use 80x40 and 40x20 in our examples
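
The rule of thumb on this slide can be checked with simple arithmetic. The following is a minimal sketch, assuming the 3200 training faces quoted on the training slide; the helper name `satisfies_ground_rule` is hypothetical and not from the talk.

```python
def satisfies_ground_rule(width, height, n_examples):
    """Return (dims, ok), where ok means dims < 2 * n_examples."""
    dims = width * height
    return dims, dims < 2 * n_examples


N_TRAIN = 3200  # training-set size quoted in the talk

for w, h in [(80, 40), (40, 20)]:
    dims, ok = satisfies_ground_rule(w, h, N_TRAIN)
    print(f"{w}x{h}: {dims} dims, limit {2 * N_TRAIN}, ok={ok}")
```

Under this reading both window sizes pass with 3200 examples, but the 80x40 window leaves only one example per dimension, which may be why the slide calls it hard.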

  8. Training stage • Choose 3200 random adult faces for training and 3200 random faces for testing • Extract the 80x40 ‘face window’ from each face and treat the resulting 3200 doubles (0..1) as one training example • Train the SVM on the pixel values of the training set (about 30 minutes on a dual P4 Xeon, 2 GHz, Linux)
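
The data preparation described on this slide might look roughly like the sketch below. It is illustrative only: the window offsets `top` and `left` are hypothetical (the talk does not say where the window sits relative to the eye position), reading 80x40 as 80 rows by 40 columns is an assumption, and the function names are invented for the example.

```python
import random


def face_window(image, top, left, n_rows=80, n_cols=40):
    """Crop an n_rows x n_cols window and flatten it to floats in 0..1.

    `image` is a list of rows of 8-bit grey values (0..255).
    `top` and `left` are hypothetical offsets, not values from the talk.
    """
    rows = image[top:top + n_rows]
    return [px / 255.0 for row in rows for px in row[left:left + n_cols]]


def split_train_test(items, n_train, n_test, seed=0):
    """Shuffle a copy of `items` and return disjoint train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:n_train + n_test]
```

Each call to `face_window` yields one 3200-element vector, and `split_train_test(faces, 3200, 3200)` gives the two non-overlapping sets used for training and testing.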

  9. Testing Stage • Take the other 3200 face vectors and present them to the learned SVM • If the SVM output is > 0, classify as male; if < 0, female • Confidence: some linear combination of the number of support vectors and the magnitude of the output • Had no problem doing this at 10 Hz on a PIII 800 with lots of other processes running
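
Evaluating a trained SVM on one face vector amounts to a weighted sum of kernel values over the stored support vectors, followed by a sign test. The sketch below assumes a Gaussian (RBF) kernel, which the mention of "kernel width" suggests but the talk does not confirm; the function names and the toy model in the usage note are hypothetical. The confidence formula is not specified in the talk, so only the raw output is returned.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian kernel exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def decision_value(x, support_vectors, coefs, bias, gamma):
    """SVM output: sum_i coef_i * K(sv_i, x) + bias, with coef_i = alpha_i * y_i."""
    total = sum(c * rbf_kernel(sv, x, gamma)
                for sv, c in zip(support_vectors, coefs))
    return total + bias


def classify(x, support_vectors, coefs, bias, gamma):
    """Map the sign of the SVM output to the labels used in the talk."""
    value = decision_value(x, support_vectors, coefs, bias, gamma)
    if value > 0:
        return "male", value
    if value < 0:
        return "female", value
    return "undecided", value
```

For example, with a toy two-vector model `svs = [[0.0, 0.0], [1.0, 1.0]]`, `coefs = [-1.0, 1.0]`, `bias = 0.0` and `gamma = 1.0`, the point `[0.9, 0.9]` lands on the positive side and `[0.1, 0.1]` on the negative side. The cost per frame grows with the number of support vectors times the vector length, which is consistent with the claim that evaluation is cheap.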

  10. In-class face gender results • 80x40; C=100, aux=100: 93% of faces classified correctly (95% male, 90% female) • 40x20; C=100, aux=10: 97% of faces classified correctly (98% male, 95% female)

  11. Next step: Realtime • Media Lab is where webcams go to die • Webcam at 160x120; ‘face region’ cropped to 80x40, then downsampled to 40x20 • Webcam delivers frames at 10 Hz; we convert each frame to greyscale and present it to the previously trained SVM • Results… mixed
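
The per-frame preprocessing on this slide (greyscale conversion, then halving each dimension) could be sketched as follows. The talk does not say which greyscale formula or downsampling filter was used, so plain channel averaging and 2x2 block averaging are assumptions here, and both function names are hypothetical.

```python
def to_grey(rgb_frame):
    """Average the R, G, B channels of each pixel (one simple greyscale choice)."""
    return [[(r + g + b) / 3.0 for (r, g, b) in row] for row in rgb_frame]


def downsample_2x(grey):
    """Halve both dimensions by averaging each 2x2 block of pixels."""
    out = []
    for r in range(0, len(grey) - 1, 2):
        row = []
        for c in range(0, len(grey[r]) - 1, 2):
            block = (grey[r][c] + grey[r][c + 1]
                     + grey[r + 1][c] + grey[r + 1][c + 1])
            row.append(block / 4.0)
        out.append(row)
    return out
```

Applied to a cropped face region, `downsample_2x(to_grey(region))` reduces 3200 pixels to 800, matching the 40x20 input size of the smaller model.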

  12. Realtime examples • (If the demo crashes)

  13. ‘Creepybot’ • With better control over alignment • Monitors Windows clipboard • Same architecture as the Creepycam

  14. Creepybot Examples • (If the demo crashes)

  15. Other parameters • MITFaces has a great data label set • Train an SVM for appearance of each descriptor: • Race • Age • Gender • Expression • Moustache

  16. Per-class results (40x20, etc…) • “Adult or not” • Overall: 94% • (Not adult: 403/516) (78%) • (Adult: 2605/2684) (97%)

  17. Per-class results… • “Smiling or not” • Overall: 88% • (Not smiling: 1354/1520) (89%) • (Smiling: 1450/1672) (87%)

  18. Per-class results • “Serious or not” • Overall: 88% • (Not serious: 1517/1712) (89%) • (Serious: 1311/1484) (88%)
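
The percentages on the three per-class slides follow directly from the quoted counts. A small sketch that recomputes them, using only the numbers from the slides (the dictionary layout and the `summarize` helper are illustrative):

```python
# (correct, total) counts copied from the per-class result slides
RESULTS = {
    "adult":   {"negative": (403, 516),   "positive": (2605, 2684)},
    "smiling": {"negative": (1354, 1520), "positive": (1450, 1672)},
    "serious": {"negative": (1517, 1712), "positive": (1311, 1484)},
}


def percent(correct, total):
    """Accuracy as a whole-number percentage."""
    return round(100.0 * correct / total)


def summarize(counts):
    """Return (overall, negative, positive) accuracy in percent."""
    neg_c, neg_t = counts["negative"]
    pos_c, pos_t = counts["positive"]
    overall = percent(neg_c + pos_c, neg_t + pos_t)
    return overall, percent(neg_c, neg_t), percent(pos_c, pos_t)


for name, counts in RESULTS.items():
    print(name, summarize(counts))
```

The recomputed values agree with the slides. The counts sum to 3200, 3192 and 3196 test faces respectively, so a few faces appear to be missing from the smiling and serious tallies; the talk does not say why.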

  19. Could we do better? • Representation is lacking • But results are surprisingly good • For realtime, need auto-alignment / rescaling, or a better representation • Could this lead to an invasion of cheap intelligent cameras, each with tacky switches for feature detection and marketing?
