Computer Vision: 3D Shape Reconstruction
• Use images to build a 3D model of an object or site
[Figure: 3D site model built from laser range scans collected by the CMU autonomous helicopter]

Computer Vision: Guiding Motion
• Visually guided manipulation
• Hand-eye coordination
• Visually guided locomotion
• Robotic vehicles
[Figure: CMU NavLab II]

Challenges in Object Recognition
[Figure: an image shown as a grid of raw pixel intensity values]

Object Recognition Research
[Diagram labels]
• Object Detection Issues
• Object Detection Quality/Quantity Issues
• Large Quantity of Data
• Low Image Quality
• Advanced Image Enhancement
• Automated Learning
• Segmentation and Hierarchical Analysis
• Robust Algorithms
• Intra-class Object Variation
• Large Number of Object Classes
• Object classes shown: Lips, Face, Hand Gesture, Text, License Plate, Clock, Vehicle, Building

Simpler Problem: Classification
• Fixed size input
• Fixed object size, orientation, and alignment
• Decision: "Object is present" or "Object is NOT present" (at fixed size and alignment)

Detection: Apply Classifier Exhaustively
• Search in position
• Search in scale

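The exhaustive search over position and scale can be sketched as a sliding window run over progressively downscaled copies of the image. This is only an illustration under stated assumptions: the names `detect`, `downscale`, `window`, `stride`, and `scale_step` are hypothetical, the resize is a crude nearest-neighbour one, and `classify` stands in for any fixed-size classifier of the kind described on the previous slide.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def downscale(image: Image, factor: float) -> Image:
    """Nearest-neighbour resize by `factor` (0 < factor <= 1)."""
    h, w = len(image), len(image[0])
    new_h, new_w = max(1, int(h * factor)), max(1, int(w * factor))
    return [[image[min(h - 1, int(y / factor))][min(w - 1, int(x / factor))]
             for x in range(new_w)]
            for y in range(new_h)]


def detect(image: Image,
           classify: Callable[[Image], bool],
           window: int = 4,
           stride: int = 1,
           scale_step: float = 0.5) -> List[Tuple[int, int, int]]:
    """Apply a fixed-size classifier at every position and scale.

    Returns (row, col, size) triples in original-image coordinates.
    """
    detections = []
    factor = 1.0
    scaled = image
    # Search in scale: keep shrinking until the window no longer fits.
    while len(scaled) >= window and len(scaled[0]) >= window:
        # Search in position: slide the window over the scaled image.
        for r in range(0, len(scaled) - window + 1, stride):
            for c in range(0, len(scaled[0]) - window + 1, stride):
                patch = [row[c:c + window] for row in scaled[r:r + window]]
                if classify(patch):
                    detections.append((round(r / factor),
                                       round(c / factor),
                                       round(window / factor)))
        factor *= scale_step
        scaled = downscale(image, factor)
    return detections
```

Shrinking the image while keeping the window fixed lets one fixed-size classifier find objects larger than the window; a smaller `scale_step` reduction per level (for example 0.8) would search scale more finely at higher cost.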
View-based Classifiers
• Face Classifier #1
• Face Classifier #2
• Face Classifier #3

1) Apply Local Operators
f1(0, 0) = #5710
f1(0, 1) = #3214
fk(n, m) = #723

2) Look Up Probabilities
f1(0, 0) = #5710: P1(#5710, 0, 0 | obj) = 0.53, P1(#5710, 0, 0 | non-obj) = 0.56
f1(0, 1) = #3214: P1(#3214, 0, 1 | obj) = 0.57, P1(#3214, 0, 1 | non-obj) = 0.48
fk(n, m) = #723: Pk(#723, n, m | obj) = 0.83, Pk(#723, n, m | non-obj) = 0.19

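3) Make Decision
P1(#5710, 0, 0 | obj) = 0.53, P1(#3214, 0, 1 | obj) = 0.57, . . . , Pk(#723, n, m | obj) = 0.83
P1(#5710, 0, 0 | non-obj) = 0.56, P1(#3214, 0, 1 | non-obj) = 0.48, . . . , Pk(#723, n, m | non-obj) = 0.19
Declare the object present when the ratio of the two products exceeds a threshold λ:
$$\frac{0.53 \times 0.57 \times \cdots \times 0.83}{0.56 \times 0.48 \times \cdots \times 0.19} > \lambda$$
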
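The three steps (apply operators, look up probabilities, compare the ratio of products with λ) amount to a likelihood-ratio test. The sketch below is one way to write it, not the slides' actual implementation: the function names and the dictionary layout keyed by (operator index, pattern code, n, m) are assumptions, and sums of logarithms replace the raw products so that many small factors do not underflow. Only the three probability pairs shown on the slide are used in the usage example; the elided terms are left out.

```python
import math
from typing import Dict, Sequence, Tuple

# Hypothetical key layout: (operator index k, pattern code, n, m)
Key = Tuple[int, int, int, int]


def log_likelihood_ratio(features: Sequence[Key],
                         p_obj: Dict[Key, float],
                         p_non: Dict[Key, float]) -> float:
    """Sum of log P(feature | obj) - log P(feature | non-obj)."""
    return sum(math.log(p_obj[f]) - math.log(p_non[f]) for f in features)


def is_object(features: Sequence[Key],
              p_obj: Dict[Key, float],
              p_non: Dict[Key, float],
              threshold: float) -> bool:
    """Likelihood-ratio test: product ratio > threshold, done in log space."""
    return log_likelihood_ratio(features, p_obj, p_non) > math.log(threshold)


# Usage with the values shown on the slide (positions n, m set to 0 here
# purely as placeholders for the last operator).
observed = [(1, 5710, 0, 0), (1, 3214, 0, 1), (2, 723, 0, 0)]
p_obj = {observed[0]: 0.53, observed[1]: 0.57, observed[2]: 0.83}
p_non = {observed[0]: 0.56, observed[1]: 0.48, observed[2]: 0.19}
print(is_object(observed, p_obj, p_non, threshold=1.0))
```

Raising the threshold trades missed detections for fewer false alarms, which is why λ is left as a tunable parameter.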
Probabilities Estimated Off-Line
f1(0, 0) = #567: H1(#567, 0, 0) = H1(#567, 0, 0) + 1
fk(n, m) = #350: Hk(#350, 0, 0) = Hk(#350, 0, 0) + 1
$$P_1(\#567, 0, 0) = \frac{H_1(\#567, 0, 0)}{\sum_i H_1(\#i, 0, 0)}, \qquad P_k(\#350, 0, 0) = \frac{H_k(\#350, 0, 0)}{\sum_i H_k(\#i, 0, 0)}$$

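The off-line estimate is a histogram count followed by normalization: each training example increments the bin for the pattern code it produces at a position, and the probability is that bin divided by the sum over all codes at the same position. A minimal sketch follows, assuming a hypothetical `PatternHistogram` class with one table per class (object or non-object); it is not taken from the slides.

```python
from collections import defaultdict


class PatternHistogram:
    """Counts pattern codes per (operator k, position n, m) for one class."""

    def __init__(self) -> None:
        self.counts = defaultdict(int)  # (k, code, n, m) -> H_k(#code, n, m)
        self.totals = defaultdict(int)  # (k, n, m) -> sum over codes i

    def add(self, k: int, code: int, n: int, m: int) -> None:
        """H_k(#code, n, m) = H_k(#code, n, m) + 1."""
        self.counts[(k, code, n, m)] += 1
        self.totals[(k, n, m)] += 1

    def probability(self, k: int, code: int, n: int, m: int) -> float:
        """P_k(#code, n, m) = H_k(#code, n, m) / sum_i H_k(#i, n, m)."""
        total = self.totals.get((k, n, m), 0)
        if total == 0:
            return 0.0
        return self.counts.get((k, code, n, m), 0) / total


# Usage: one histogram would be filled from object images and a second
# one from non-object images.
hist = PatternHistogram()
for code in (567, 567, 567, 12):
    hist.add(k=1, code=code, n=0, m=0)
print(hist.probability(1, 567, 0, 0))
```

Codes never seen in training get probability zero in this sketch, so a practical system would need some way to handle unseen codes before taking ratios or logarithms.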
Training Classifiers
• Cars: 300-500 images per viewpoint
• Faces: 2,000 images per viewpoint
• ~1,000 synthetic variations of each original image
• Variations in background scenery, orientation, position, frequency
• 2,000 non-object images
• Samples selected by bootstrapping
• Minimization of classification error on training set
• AdaBoost algorithm (Freund & Schapire '97, Schapire & Singer '99)
• Iterative method
• Determines weights for samples

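The last bullets describe AdaBoost as an iterative method that determines sample weights. The sketch below shows the textbook discrete form of that idea on one-dimensional data with threshold "stumps" as weak classifiers. It is a generic illustration, not the variant used to train the detectors in these slides; all function names and the toy data are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Stump = Tuple[float, int]  # (threshold, polarity)


def stump_predict(stump: Stump, x: float) -> int:
    """Predict `polarity` if x > threshold, otherwise -polarity."""
    threshold, polarity = stump
    return polarity if x > threshold else -polarity


def weighted_error(stump: Stump, xs: Sequence[float], ys: Sequence[int],
                   weights: Sequence[float]) -> float:
    """Total weight of the samples the stump misclassifies."""
    return sum(w for w, x, y in zip(weights, xs, ys)
               if stump_predict(stump, x) != y)


def train_adaboost(xs: Sequence[float], ys: Sequence[int],
                   rounds: int) -> List[Tuple[float, Stump]]:
    """Return a list of (alpha, stump) pairs; labels must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform sample weights
    cuts = [min(xs) - 1.0] + sorted(set(xs))
    candidates = [(t, p) for t in cuts for p in (1, -1)]
    ensemble = []
    for _ in range(rounds):
        # Pick the weak classifier with the lowest weighted error.
        best = min(candidates, key=lambda s: weighted_error(s, xs, ys, weights))
        err = weighted_error(best, xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, best))
        # Raise the weight of misclassified samples, lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(best, x))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble: Sequence[Tuple[float, Stump]], x: float) -> int:
    """Sign of the alpha-weighted vote of all weak classifiers."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single threshold can separate.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1, 1, -1, -1, 1, 1]
model = train_adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

Each round concentrates weight on the samples the previous weak classifiers got wrong, so later rounds are forced to fix those mistakes.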
Web-based Demo of Face Detector http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi