250 likes | 323 Views
Computer Vision. Michael Isard and Dimitris Metaxas. Some Leading Questions. Shape and Motion Analysis based on Images Scene based and Object based approaches Coupling of low level shape/motion parameters with their semantic interpretation. Scene-based Approaches.
E N D
Computer Vision Michael Isard and Dimitris Metaxas
Some Leading Questions • Shape and Motion Analysis based on Images • Scene based and Object based approaches • Coupling of low level shape/motion parameters with their semantic interpretation
Scene-based Approaches • No need for explicit models • Used for camera calibration, pose estimation or image stabilization • Use rigid scene assumptions or optic flow, no use of object shape • A lot of progress has been done already • Limited when multiple motions occur
Object-based analysis • Used for accurate shape/motion analysis and for robustness • Coupling of shape and motion models • Need good shape models • Success with simple shapes, eg. Manmade objects, humans with tight clothes, jointed structures
Object-based analysis • Need good shape models • How do we model humans wearing clothes? • Stability and robustness of shape representation • what is the correct scale for shape representation • 2D or 3D shape?
Shape and Motion • How important is shape for motion? • Facial and Human tracking shape is necessary. • Occlusion and lighting changes
Shading and Motion • One of the still unsolved problems • Most methods still assume optical flow assumption • how to filter out lighting effects? • Relfections? • Multiple lights?
Grouping and Initialization • Most methods assume manual initialization: Where do you place your model? • Based on a single image while it may be best based on multiple images • Related to segmentation and grouping methods as well to higher level knowledge
Grouping and Initialization • 2D information and view-based or appearance based methods mostly • No need necessarily to be based on 3D shape which is very costly • no general algorithms despite some generic methods • Need to address: Light, occlusion, texture, grouping of parts for articulated objects
Segmentation • The grouping and assignment of features to an object or parts of an object • Still the bottleneck in most vision algorithms • Initially based on heuristics • Recently we have realized that statistical methods are superior eg Markov Random Fields
Experimental 1 Original Image (MRI Lung data)
Cycling Iteration 1 Iteration 2 Iteration 3
Visible Human Data Original MRI image of Eyeball and Muscle in Human Head Eye-ball Muscle
Image Acquisition for 3D Motion Information Tag plane (dark) and image plane orientation Possible tag motion in image plane Representative images
Statistics and Learning • Learning methods for statistical methods are more robust than intuitive statistics • Influenced by success in other fields like speech recognition • However the problem is 4D and there is no explicit ordering within the signal
Statistics and Learning • Learning is a limitation since it depends on many examples • Need new methods to approximate the distributions on appearance of natural scenes to reduce the complexity of the problem • Need to still be able to discriminate objects
Statistics and Learning • How do we develop statistical methods for coupling the low level shape/motion parametes with their semantic interpretation?
Multiple Scales • Shape and motion identification is dependent on the scale at which we do the processing • Recognize gross shape of an object • Recognize human intent • Recognition of human motion
Multiple Scales • How do scales interact? • Need some kind of statistical theory that takes into account multiple scales
Ligthing • Still an open problem • Most algorithms use aLambertian model • How do we cancel lighting artifacts, shadows, reflections, color constancy • shape from shading
Multiple Cues • How do we integrate multiple cues? • Optical flow, edges, features? • No principled theory • Need some theory to selectively use the right cues in a local fashion • Need more research on understanding the robustness of each of these cues • Need to get robustness by combining cues
Summary • Have gone a long way compared to the 70s and 80s. • But still the working algorithms deal with simple shapes, lighting conditions and are domain specific • Need research in all dimensions and also especially on theories that will span a wide variety of objects/motions