Video Computing as a New Trend in Computer Vision 2002. 10. 4 MAI LAB RYU MI WYUN
Contents • Introduction : Computer Vision in the Twenty-First Century • Motion-Based Video Representation
Guest Introduction : The Changing Shape of Computer Vision In the Twenty-First Century MUBARAK SHAH Computer Vision Lab, School of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA, shah@cs.ucf.edu International Journal of Computer Vision 50(2), (2002) 103-110
Computer Vision • Computer Vision (Image Understanding) • Started as an AI problem • Understand a single image of a scene: locate and identify objects, determine their structures, spatial arrangements, etc. • High-level vision problems: copy demo (for a robot to build an exact copy of an image) • Low-level vision problems: extract lines from images, etc.
Computer Vision 2 • During the seventies and the next two or three decades, • Algorithms were developed for recovering 3D shape from 2D images using stereo, motion, shading, texture, etc. ex. Laplacian of Gaussian edge detector (Marr and Hildreth, 1980), relaxation-based stereo algorithms (Marr and Poggio, 1979; Ohta and Kanade, 1985) • After the Marr era, • New and complex mathematical techniques entered computer vision; the original problem was almost forgotten
Computer Vision 3 • Currently, • Some shape-from-X problems have been almost solved and are being used in industry • The field is moving away from understanding single images toward analyzing image sequences, i.e. video understanding • Video Understanding • Understanding video sequences, e.g., recognition of gestures, activities, and facial expressions • Mainly focused on analysis of human motion • Can solve several problems (video synthesis, segmentation, etc.) • Shift: recognition of static objects -> motion-based recognition of actions and events
Video Computing • All of this work, in which motion in a sequence of images plays the central role, is treated as one entity, which we call video computing • Motion occurs in 3D but is projected onto 2D in video images • The challenge is to solve these problems using 2D image motion • [Diagram: Video Computing at the center, linked to Video Synthesis, Video Compression, Video Registration, Video Segmentation, and Surveillance & Monitoring]
Video Synthesis & Video Compression • Video Synthesis • The generation of realistic video, which belongs to computer graphics ex. Image-based rendering and modeling approaches using vision techniques, Dynamic view morphing (Manning and Dyer, 1999, etc.) • Applied in several applications ex. Filling a gap in a movie, creating virtual views from static images, switching from one camera view to another • Video Compression • Based on the difference between the real video and the synthesized video, the model parameters are updated and coded for transmission • Compression of face video has received the most attention within the graphics and vision community
Video Registration & Video Segmentation • Video Registration • Deals with the alignment of video frames with each other or with a reference image or model ex. Techniques for estimation of global motion • Increasingly used in several areas ex. MRI data of a particular patient registered with a reference image • Video Segmentation • Probably the oldest vision problem (edge detection, region segmentation, etc.) • Useful for object-based compression, activity recognition, tracking, etc.
Video Surveillance and Monitoring • Video Surveillance • Rapidly growing area of video computing, particularly after the 9/11 events • Detect moving objects in the video • Track objects through the sequence • Classify them into people, vehicles, etc. • Recognize their activities • More research is needed on tracking non-rigid objects (humans, etc.)
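The first surveillance step, detecting moving objects, can be illustrated with plain frame differencing. This is only a minimal sketch of a common baseline, not a method named in the slides; the function names `moving_mask` and `bounding_box`, the threshold of 25, and the toy frames are all assumptions made for illustration.

```python
def moving_mask(prev_frame, curr_frame, threshold=25):
    """Mark pixels whose grayscale value changed by more than `threshold`."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the changed pixels, or None."""
    coords = [(r, c) for r, row in enumerate(mask) for c, on in enumerate(row) if on]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


# Toy 4x5 grayscale frames: a bright 2x2 blob moves one pixel to the right.
frame_a = [
    [10, 10, 10, 10, 10],
    [10, 200, 200, 10, 10],
    [10, 200, 200, 10, 10],
    [10, 10, 10, 10, 10],
]
frame_b = [
    [10, 10, 10, 10, 10],
    [10, 10, 200, 200, 10],
    [10, 10, 200, 200, 10],
    [10, 10, 10, 10, 10],
]
box = bounding_box(moving_mask(frame_a, frame_b))
```

Only the pixels the blob left and the pixels it entered are flagged, so the box covers the region swept by the motion. Tracking, classification, and activity recognition would build on such detections and are not sketched here.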
Motion-Based Video Representation for Scene Change Detection CHONG-WAH NGO*, TING-CHUEN PONG**, HONG-JIANG ZHANG*** *Department of Computer Science, City University of Hong Kong **Department of Computer Science, The Hong Kong University of Science & Technology ***Microsoft Research Asia, 5/F Beijing Sigma Center International Journal of Computer Vision 50(2), (2002) 127-142
Introduction • Goal: decompose a video into scenes, proposing a framework for structuring the content of videos • Shot – An uninterrupted segment of a video frame sequence • Scene – A series of consecutive shots that are coherent • Video – Consists of scenes • Scene Change Detection: must consider shot representation and similarity measure
Framework • Video -> Video Partitioning -> Shots • Motion Characterization: each shot is segmented into motion-coherent sub-units • Multiple motion (more than 1 motion) in a sub-unit? Yes: Back/Foreground Segmentation -> dominant motion layer -> Background reconstruction; No: Adaptive keyframe Selection & Formation • Video representation: keyframes, to represent each shot compactly • Similarity measure: color histogram intersection • Time-constraint grouping: scene changes are detected by grouping shots with similar color content -> Scenes
Processing of Spatio-Temporal Slices (STS) • Used for video partitioning, motion characterization, and Back/Foreground segmentation • Ngo et al., 1999, 2000, 2001 • An STS is composed of color and texture components • Discontinuity – occurrence of a new event • Shot boundary: a slice shows a dramatic change • Camera motion: inferred from the texture pattern • Multiple motion: dissimilar texture patterns appear in a shot
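How a slice exposes a shot boundary can be sketched as follows. This is a simplified illustration, not the procedure of Ngo et al.: it uses a single grayscale row per frame and a mean absolute difference, whereas the slides describe color and texture components. The helper names, the threshold of 50, and the toy video are assumptions.

```python
def horizontal_slice(video, row):
    """Stack one image row from every frame into a (time x width) slice."""
    return [frame[row] for frame in video]


def shot_boundaries(slice_img, threshold=50.0):
    """Return frame indices t where slice row t differs sharply from row t-1."""
    cuts = []
    for t in range(1, len(slice_img)):
        prev_row, curr_row = slice_img[t - 1], slice_img[t]
        mean_abs_diff = sum(abs(a - b) for a, b in zip(prev_row, curr_row)) / len(curr_row)
        if mean_abs_diff > threshold:
            cuts.append(t)
    return cuts


def flat_frame(value, height=3, width=4):
    """Hypothetical helper: a uniform grayscale frame used as toy data."""
    return [[value] * width for _ in range(height)]


# Toy video: three dark frames followed by two bright frames (one cut).
video = [flat_frame(20)] * 3 + [flat_frame(180)] * 2
sts = horizontal_slice(video, row=1)
cuts = shot_boundaries(sts)
```

Each row of `sts` corresponds to one frame, so an abrupt change between consecutive rows marks the frame at which a new shot starts.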
Processing of Spatio-Temporal Slices (STS) 2 • Tensor histogram & Motion Characterization • [Figure: (a) temporal slice and (b) tensor histogram for a static motion + zoom sequence and for a static, pan, and static sequence; further panels: (a) moving object, (b) panning]
Processing of Spatio-Temporal Slices (STS) 3 • Background Segmentation
Video Representation • Keyframe selection & Formation / Background reconstruction : to represent shots compactly and adaptively
Similarity Measure • Representative frames of a shot: • Color histogram: histogram in hue, saturation, and intensity
Similarity Measure • Representative frames of a shot: • A threshold decides whether two shots belong to the same scene • Histogram in hue, saturation, and intensity
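A color histogram intersection with a threshold test might look like the sketch below. It is an illustration under stated assumptions rather than the authors' formula, which did not survive in the slide text: the standard-library `colorsys` HSV conversion stands in for the hue, saturation, intensity space, and the bin counts, the threshold of 0.5, and the function names are hypothetical.

```python
import colorsys
from collections import Counter


def hsv_histogram(pixels, bins=(8, 4, 4)):
    """Normalized HSV histogram of an iterable of (r, g, b) pixels in 0..255."""
    counts = Counter()
    total = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Each channel lies in [0, 1]; clamp the top edge into the last bin.
        key = tuple(min(int(x * n), n - 1) for x, n in zip((h, s, v), bins))
        counts[key] += 1
        total += 1
    return {key: count / total for key, count in counts.items()}


def histogram_intersection(hist_a, hist_b):
    """Sum of bin-wise minima; 1.0 for identical normalized histograms."""
    return sum(min(p, hist_b.get(key, 0.0)) for key, p in hist_a.items())


def same_scene(hist_a, hist_b, threshold=0.5):
    """Assumed decision rule: similar enough color content means same scene."""
    return histogram_intersection(hist_a, hist_b) >= threshold


# Toy keyframes given as flat pixel lists.
mostly_red = hsv_histogram([(255, 0, 0)] * 90 + [(0, 0, 255)] * 10)
also_red = hsv_histogram([(255, 0, 0)] * 80 + [(0, 0, 255)] * 20)
all_blue = hsv_histogram([(0, 0, 255)] * 100)
```

With these toy keyframes the two mostly red frames overlap strongly, while the red and blue frames share only the small blue fraction, so only the first pair passes the threshold.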
Time-Constraint Grouping • Whether two shots belong to the same scene also depends on their distance in time
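One way to combine similarity with a temporal constraint is sketched below. It is a hedged illustration, not the grouping algorithm of the paper: the window is counted in shots rather than in seconds, `similarity` is any user-supplied function, and the label-based toy similarity is purely hypothetical.

```python
def group_shots(num_shots, similarity, window=3, threshold=0.5):
    """Group consecutive shots into scenes under a look-back window."""
    scenes = []  # each scene is a list of consecutive shot indices
    for i in range(num_shots):
        match = None
        for j in range(max(0, i - window), i):
            if similarity(i, j) >= threshold:
                match = j
                break  # earliest similar shot inside the window
        if match is None:
            scenes.append([i])
            continue
        # Merge every scene from the one holding `match` up to the latest one,
        # so the shots between `match` and `i` stay in the same scene.
        while match not in scenes[-1]:
            tail = scenes.pop()
            scenes[-1].extend(tail)
        scenes[-1].append(i)
    return scenes


# Toy data: a label per shot stands in for its color content.
labels = ["A", "B", "A", "B", "C", "C", "A"]


def label_similarity(i, j):
    return 1.0 if labels[i] == labels[j] else 0.0


scenes = group_shots(len(labels), label_similarity, window=2)
```

In this toy run the alternating A/B shots merge into one scene, while the final A shot starts a new scene because its look-alikes lie outside the window; widening the window to cover the whole video merges everything into a single scene.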
Experiments • Tested on the video demo.mpg
Experiments 2 • Tested on the video lgerca_lisa_1.mpg
Experiments 3 • Tested on the video lgerca_lisa_1.mpg
Conclusion • Motion-based video representation: motion characterization + background reconstruction • Scene change detection combines histogram intersection as the similarity measure with the time-constraint grouping algorithm • Background segmentation and reconstruction still need improvement