An opposition to Window-Scanning Approaches in Computer Vision. Presented by Tomasz Malisiewicz, March 6, 2006, Advanced Perception @ The Robotics Institute. 2 Problems: Does scanning windows across an image work? What types of objects does it work for? Context, aka Top-Down Processing.
An opposition to Window-Scanning Approaches in Computer Vision Presented by Tomasz Malisiewicz March 6, 2006 Advanced Perception @ The Robotics Institute
2 Problems • Does scanning windows across an image work? • What types of objects does it work for?
What are window-scanning approaches missing? • Context, aka Top-Down Processing *Following slides borrowed from Derek Hoiem’s “Putting Context Into Vision” presentation
What is context? • Any data or meta-data not directly produced by the presence of an object • Nearby image data
What is context? • Any data or meta-data not directly produced by the presence of an object • Nearby image data • Scene information
What is context? • Any data or meta-data not directly produced by the presence of an object • Nearby image data • Scene information • Presence, locations of other objects
Clues for Function • What is this?
Clues for Function • What is this? • Now can you tell?
Low-Res Scenes • What is this?
Low-Res Scenes • What is this? • Now can you tell?
More Low-Res • What are these blobs?
More Low-Res • The same pixels! (a car)
Why is context useful? • Objects defined at least partially by function • Trees grow in ground • Birds can fly (usually) • Door knobs help open doors
Why is context useful? • Objects defined at least partially by function • Context gives clues about function • Not rooted in the ground, so not a tree • Object in the sky, so one of {cloud, bird, UFO, plane, Superman} • Door knobs always on doors
Why is context useful? • Objects defined at least partially by function • Context gives clues about function • Objects like some scenes better than others • Toilets like bathrooms • Fish like water
Why is context useful? • Objects defined at least partially by function • Context gives clues about function • Objects like some scenes better than others • Many objects are used together and, thus, often appear together • Kettle and stove • Keyboard and monitor
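One simple way to picture how scene context could constrain detection is to weight a window classifier's score by a scene-dependent prior. The sketch below is a hypothetical illustration, not a method from the talk: the prior table, the scores, and the function name `rescore` are all invented for the example, and it assumes the appearance scores can be treated as likelihoods.

```python
# Hypothetical priors P(object | scene); the numbers are invented for illustration.
SCENE_PRIOR = {
    "bathroom": {"toilet": 0.60, "fish": 0.01, "kettle": 0.02},
    "kitchen": {"toilet": 0.01, "fish": 0.05, "kettle": 0.50},
}


def rescore(appearance, scene):
    """Combine appearance likelihoods with a scene prior via Bayes' rule.

    appearance: dict mapping label -> P(window pixels | label)
    Returns normalized posteriors P(label | pixels, scene) over the
    candidate labels in `appearance`.
    """
    prior = SCENE_PRIOR[scene]
    # Labels missing from the prior table get zero prior mass.
    unnorm = {label: appearance[label] * prior.get(label, 0.0) for label in appearance}
    z = sum(unnorm.values())
    if z == 0:
        return {label: 0.0 for label in appearance}
    return {label: value / z for label, value in unnorm.items()}


# An ambiguous blob: appearance alone cannot separate the two labels.
blob = {"toilet": 0.5, "kettle": 0.5}
print(rescore(blob, "bathroom"))
print(rescore(blob, "kitchen"))
```

With identical appearance scores, the ranking of the two labels is decided entirely by the scene, which is the point of the "objects like some scenes better than others" slide.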
The other* problem • What types of objects does it work for? *Assuming we can just directly avoid the first problem
“However, such approaches seem unlikely to scale up to the detection of hundreds or thousands of different object classes because each classifier is trained and run independently.” – Torralba, Murphy, and Freeman, from “Sharing features: efficient boosting procedures for multiclass object detection” • “Our goal is to develop a system that detects and recognizes many kinds of objects in photographs and video including everyday office objects, text captions in video, and various structures in biomedical imagery.” – Schneiderman and Kanade, from “Object Detection Using the Statistics of Parts” • How many different classifiers must one construct? • A different classifier for each object? • A different classifier for each pose of an object? • How many poses do we need per object?
Too many windows • Now imagine scanning a window and applying 100K independent classifiers at each window
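To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. The image size, window size, stride, and scale step are illustrative assumptions, not values from the talk; only the 100K classifier count comes from the slide.

```python
def count_windows(img_w, img_h, win=24, stride=4, scale_step=1.25):
    """Count square sliding-window positions over an image pyramid.

    The window size stays fixed and the image is shrunk by `scale_step`
    at each level until the window no longer fits.
    """
    total = 0
    w, h = float(img_w), float(img_h)
    while w >= win and h >= win:
        nx = (int(w) - win) // stride + 1  # horizontal positions at this level
        ny = (int(h) - win) // stride + 1  # vertical positions at this level
        total += nx * ny
        w /= scale_step
        h /= scale_step
    return total


NUM_CLASSIFIERS = 100_000  # the figure quoted on the slide

windows = count_windows(640, 480)  # assumed 640x480 image
evaluations = windows * NUM_CLASSIFIERS
print(f"{windows:,} windows x {NUM_CLASSIFIERS:,} classifiers = {evaluations:,} evaluations")
```

Because every classifier runs at every window, the total cost is the product of the two counts, so adding object classes or poses multiplies the work per image.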
Conclusion • Without context, we can’t find all the things we want to find. We need context to help constrain the search for objects. • With independent classifiers per object (and per pose), we can’t detect a large number of objects. Should a cow detector and a horse detector be built independently? Consider that horses and cows are both animals that often occur in similar contexts. • Remember that complex and deformable objects would require many poses if we are to adhere to the window-based classifier paradigm.
Thank you. *Pascal 2006 Visual Challenge Image