Extreme, Non-parametric Object Recognition

Extreme, Non-parametric Object Recognition 80 million tiny images (Torralba et al)

Our World is Boring… Slide by Antonio Torralba

Lots Of Images A. Torralba, R. Fergus, W.T.Freeman. PAMI 2008

Lots Of Images

Automatic Colorization Result Grayscale input High resolution Colorization of input using average A. Torralba, R. Fergus, W.T.Freeman. 2008

Automatic Orientation Many images have ambiguous orientation Look at top 25% by confidence: Examples of high and low confidence images: Slide by Antonio Torralba

Automatic Orientation Examples A. Torralba, R. Fergus, W.T.Freeman. 2008

What If we have Labels…

Are 32x32 images enough?

LabelMe Tiny images Caltech 101 10% of the objects account for 90% of the data ~Zipf’s law Slide by Antonio Torralba

Do people do this?

What’s the Capacity of Visual Long Term Memory? What we don’t know… What we know… … what people are remembering for each item? Standing (1973) 10,000 images 83% Recognition According to Standing “Basically, my recollection is that we just separated the pictures into distinct thematic categories: e.g. cars, animals, single-person, 2-people, plants, etc.) Only a few slides were selected which fell into each category, and they were visually distinct.” … people can remember thousands of images Dogs Playing Cards High Fidelity Visual Memory is possible (Hollingworth 2004) “Gist” Only Sparse Details Highly Detailed Slide by Aude Oliva

Massive Memory I: Methods 1024-back 1-back ... ... ... Showed 14 observers 2500 categorically unique objects 1 at a time, 3 seconds each 800 ms blank between items Study session lasted about 5.5 hours Repeat Detection task to maintain focus Followed by 300 2-alternative forced choice tests Slide by Aude Oliva

Slide by Aude Oliva

how far can we push the fidelity of visual LTM representation ? Same object, different states Slide by Aude Oliva

Massive Memory I: Recognition Memory Results Replication of Standing (1973) 92% Visual Cognition Expert Predictions Slide by Aude Oliva

Massive Memory I: Recognition Memory Results 92% 88% 87% Slide by Aude Oliva

Extrapolation of Repeat Detection Data Human performances for n = 1024 Quadratic (r2=.988) Power law (r2=.988) Slide by Aude Oliva Brady, Konkle, Alvarez, Oliva (submitted)

Past and future of image datasets in computer vision Human Click Limit (all humanity takingone picture/secondduring 100 years) COREL Lena a dataset in one picture 2 billion 40.000 2020? 1972 1996 2007 Number of pictures 1020 1015 1010 105 100 Time Slide by Antonio Torralba

Extreme, Non-parametric Object Recognition