Outline • Part I: • Introduction • Shot Boundary Detection Methods • Combining Shot Boundary Algorithms • Scene Segmentation • Luminance Scene Segmentation Results • Part II: • MPEG-7 Low-Level Descriptor based Video Segmentation • Conclusions
Introduction (1) • Navigating digital video. • Segmentation needed to replace • Digital video is composed of: • Frames • Shots • Shot boundaries • Scenes • Audio
Introduction (2) • Shot segmentation problems, with examples: • Object motion: a person moves into a camera shot ... • Camera motion: panning, zooming … • Lighting changes: camera flash, lightning ... • Some types of shot boundary: dissolves, fades ... • To reduce false shot changes: • Threshold values: use higher values • Empirical restrictions, for example: a shot must be longer than 100 frames …..
Introduction (3): Dissolve • A dissolve sequence is the mixture of two video sequences, where the first sequence fades out while the second one fades in.
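The definition above can be illustrated with a minimal sketch. It assumes a linear blend between two still grayscale frames given as flat lists of pixel values; the function name `dissolve` and the linear weighting are illustrative assumptions, and in a real dissolve both sequences keep changing while they are mixed.

```python
def dissolve(frame_a, frame_b, num_steps):
    """Blend two equally sized grayscale frames over num_steps frames (num_steps >= 2)."""
    frames = []
    for step in range(num_steps):
        # alpha runs from 0.0 (only the first sequence) to 1.0 (only the second)
        alpha = step / (num_steps - 1)
        frames.append([(1 - alpha) * a + alpha * b
                       for a, b in zip(frame_a, frame_b)])
    return frames


if __name__ == "__main__":
    for frame in dissolve([100, 0], [0, 200], 3):
        print(frame)
```

Because every intermediate frame differs only slightly from its neighbours, a dissolve tends to stay below a cut threshold, which is why it is listed among the problem cases.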
Example (figure): Fade-In, Fade-Out, Wipe, Typical Cut, Dissolve
Shot Boundary Detection Methods (1) • Color Histogram • Colour percentages for a frame are stored. • Results are compared with those of the adjacent frame. • A difference value is calculated. • A difference above a certain value (threshold) is a shot change. (Diagram: histogram values are generated and compared with the previous frame's histogram values; a difference value above a certain value is a shot change.)
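The steps on this slide might be sketched as follows. To keep the example short it assumes a single channel (grayscale pixel values 0 to 255 in a flat list) instead of full colour, and the bin count, the sum of absolute differences and the threshold value are all illustrative assumptions, not values from the slides.

```python
def gray_histogram(frame, bins=8, max_value=256):
    """Normalised histogram: the fraction of pixels that falls into each bin."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return [count / len(frame) for count in counts]


def histogram_difference(hist_a, hist_b):
    """Sum of absolute bin-wise differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


def detect_cuts(frames, threshold):
    """Return every index i where frame i differs from frame i-1 by more than threshold."""
    cuts = []
    for i in range(1, len(frames)):
        diff = histogram_difference(gray_histogram(frames[i - 1]),
                                    gray_histogram(frames[i]))
        if diff > threshold:
            cuts.append(i)
    return cuts


if __name__ == "__main__":
    frames = [[10, 20, 30, 40], [12, 22, 28, 41], [200, 210, 220, 230]]
    print(detect_cuts(frames, threshold=0.5))
```

The first two frames have the same bin occupancy even though the pixel values moved slightly, so only the jump to the bright third frame is reported.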
Shot Boundary Detection Methods (2) • Edge Detection • The frame is turned into a grayscale image. • An edge detection algorithm is then applied to the image. • A difference value is calculated for two adjacent frames. • A difference above a certain value (threshold) is a shot change. (Diagram: edge values are compared with the previous frame's edge values to give a difference value.)
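A rough sketch of this idea is shown below. The slide does not name the edge detection algorithm, so a simple neighbour-difference test stands in for it here; `edge_threshold` and the "fraction of changed edge pixels" measure are assumptions made for illustration only.

```python
def edge_map(image, edge_threshold=50):
    """Mark pixels whose right or lower neighbour differs strongly in brightness."""
    height, width = len(image), len(image[0])
    edges = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx = abs(image[y][x] - image[y][x + 1]) if x + 1 < width else 0
            dy = abs(image[y][x] - image[y + 1][x]) if y + 1 < height else 0
            edges[y][x] = max(dx, dy) > edge_threshold
    return edges


def edge_difference(image_a, image_b, edge_threshold=50):
    """Fraction of pixel positions whose edge status differs between two frames."""
    map_a = edge_map(image_a, edge_threshold)
    map_b = edge_map(image_b, edge_threshold)
    changed = sum(a != b
                  for row_a, row_b in zip(map_a, map_b)
                  for a, b in zip(row_a, row_b))
    return changed / (len(image_a) * len(image_a[0]))


if __name__ == "__main__":
    vertical = [[0, 0, 255, 255] for _ in range(4)]
    horizontal = [[0] * 4, [0] * 4, [255] * 4, [255] * 4]
    print(edge_difference(vertical, vertical))
    print(edge_difference(vertical, horizontal))
```

The resulting difference value would then be compared with a threshold in the same way as for the histogram method.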
Shot Boundary Detection Methods (3) • Macroblock (compressed domain) • Works on compressed MPEG digital video. • Each frame is split into fixed regions called macroblocks. • Three types of macroblock: • I: encoded independently of other macroblocks • P: encodes not the region itself but the motion vector and error block relative to the previous frame • B: same as P, except that the motion vector and error block are encoded relative to the previous or next frame • Detecting shot changes: specific numbers of macroblock types will occur. (Diagram: frame with macroblocks labelled I, P and B.)
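The slide does not state which counts of macroblock types signal a shot change. The sketch below assumes one plausible heuristic: when a predicted frame contains an unusually large share of independently coded (I) macroblocks, prediction from its neighbour has largely failed, which hints at a boundary. The input format, the function names and `ratio_threshold` are hypothetical.

```python
def intra_ratio(macroblock_types):
    """Fraction of macroblocks in one frame that are coded as type 'I'."""
    return macroblock_types.count("I") / len(macroblock_types)


def detect_cuts_from_macroblocks(frames, ratio_threshold=0.6):
    """frames: list of (frame_type, macroblock_types) pairs.

    Only predicted frames (P or B) are tested, because an I frame
    consists of I macroblocks by definition.
    """
    cuts = []
    for index, (frame_type, macroblock_types) in enumerate(frames):
        if frame_type in ("P", "B") and intra_ratio(macroblock_types) > ratio_threshold:
            cuts.append(index)
    return cuts


if __name__ == "__main__":
    frames = [
        ("I", ["I", "I", "I", "I"]),
        ("P", ["P", "P", "P", "I"]),
        ("P", ["I", "I", "I", "P"]),
    ]
    print(detect_cuts_from_macroblocks(frames))
```

No pixel data is decoded in this approach; only the macroblock type labels already present in the compressed stream are counted.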
Shot Boundary Detection Methods (4) • Spatio-Temporal Slice Model (Diagram: video, key-frames, slices computed.)
Evaluation of Methods • Two evaluation measures are: • Recall: $\text{Recall} = \frac{\text{Number of correct shots found}}{\text{Actual number of shots}}$ • Precision: $\text{Precision} = \frac{\text{Number of correct shots found}}{\text{Number of correct shots found} + \text{false shots}}$ • There is a balance between these measures
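The two measures translate directly into code. The counts used in the example are made-up illustration values, not results from the slides.

```python
def recall(correct_found, actual_shots):
    """Share of the real shots that the method found."""
    return correct_found / actual_shots


def precision(correct_found, false_shots):
    """Share of the reported shots that are real."""
    return correct_found / (correct_found + false_shots)


if __name__ == "__main__":
    # Hypothetical counts: 100 real shots, 80 found correctly, 20 false detections.
    print(recall(80, 100))
    print(precision(80, 20))
```

Lowering a detection threshold usually finds more real shots (higher Recall) but also reports more false ones (lower Precision), which is the balance mentioned above.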
Evaluation of Methods • Average Precision values over 8 hours: • Colour Histogram: 90.4 • Edge Detection: 90.0 • Macroblock: 87.4 • Average Recall values over 8 hours: • Colour Histogram: 78.9 • Edge Detection: 70.2 • Macroblock: 75.3 • Programs with the lowest Recall values are: • Home & Away (Australian soap) • Cooking program • Taken from: Paul Browne, Centre for Digital Video Processing, Dublin City University
Combining Shot Boundary Algorithms • Logic of the combining method that selects a shot boundary: • If the difference value(s) are above the threshold value(s), then a shot boundary is selected. (Diagram: method difference values and thresholds for Colour Histogram, Low Histogram, Edge Detection and Macroblock, joined by "or" to give the shot boundary.)
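The "or" logic of the combining method could look like the sketch below. The method names used as dictionary keys and all numeric values are hypothetical; the slides do not give the actual thresholds.

```python
def is_shot_boundary(differences, thresholds):
    """True if any method's difference value exceeds that method's own threshold."""
    return any(differences[method] > thresholds[method] for method in differences)


if __name__ == "__main__":
    thresholds = {"colour_histogram": 0.5, "edge_detection": 0.4, "macroblock": 0.6}
    quiet_frame = {"colour_histogram": 0.1, "edge_detection": 0.2, "macroblock": 0.3}
    edge_only = {"colour_histogram": 0.1, "edge_detection": 0.7, "macroblock": 0.3}
    print(is_shot_boundary(quiet_frame, thresholds))
    print(is_shot_boundary(edge_only, thresholds))
```

An "or" combination can only add detections, so it tends to raise Recall at the possible expense of Precision.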
Scene Segmentation • Approach • Luminance based segmentation • Problems • A scene is a semantic concept • The computer needs wide domain knowledge • A typical scene will contain many large changes in light and colour over its duration
Luminance Scene Segmentation • Method designed to detect location based scenes • Method operation: • Compare adjacent shots using existing shot boundary results • Look for large changes in light to detect scene changes • Those above a threshold are selected as candidates • When all shots have been compared, apply a second, low threshold to all candidate scenes • Finally, apply a minimum gap between scenes
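Part of this procedure can be sketched as follows, assuming each shot has already been reduced to one mean luminance value. Only the candidate selection and the minimum gap are shown; the second, low threshold is left out because the slide does not say what it is compared against. All names and numbers are illustrative.

```python
def scene_candidates(shot_luminances, threshold):
    """Indices of shots whose mean luminance differs strongly from the previous shot."""
    return [i for i in range(1, len(shot_luminances))
            if abs(shot_luminances[i] - shot_luminances[i - 1]) > threshold]


def enforce_min_gap(candidates, min_gap):
    """Keep a candidate only if it lies at least min_gap shots after the last kept one."""
    kept = []
    for candidate in candidates:
        if not kept or candidate - kept[-1] >= min_gap:
            kept.append(candidate)
    return kept


if __name__ == "__main__":
    luminances = [50, 52, 120, 118, 60, 58, 57, 130]
    candidates = scene_candidates(luminances, threshold=40)
    print(candidates)
    print(enforce_min_gap(candidates, min_gap=3))
```

The minimum gap stops a short burst of lighting changes from being split into several very short scenes.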
Part II • Temporal video segmentation using MPEG-7 shot boundary detection • Taken from: MPEG-7: Application of MPEG-7 Descriptors for Temporal Video Segmentation, Michael Höynck, Institute for Communications Engineering, Aachen University of Technology (RWTH), Germany
Basics of the method • MPEG-7 standardizes the description of multimedia content: – reusing information that has already been extracted reduces the complexity of subsequent video processing – multimedia content descriptions can be shared, exchanged and extended by heterogeneous multimedia processing systems – compact descriptors • We assume that MPEG-7 Scalable Color and Edge Histogram information is available as input for the segmentation algorithm • Further processing (cut detection and keyframe selection) then needs only little processing power
SCD: Scalable Color Descriptor (reminder) • Haar-transform based encoding scheme, applied to a 256-bin color histogram in HSV color space • SCD representations can be stored in different resolutions, ranging from 256 down to 16 coefficients per histogram
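The scalability can be illustrated with an unnormalised Haar step that turns each pair of adjacent bins into a sum and a difference. This is only a sketch of the principle: the coefficient ordering, quantisation and bin layout defined by the standard are not reproduced, and the function names are hypothetical.

```python
def haar_step(values):
    """One Haar step: pairwise sums (coarse part) and pairwise differences (detail part)."""
    sums = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
    diffs = [values[i] - values[i + 1] for i in range(0, len(values), 2)]
    return sums, diffs


def coarse_histogram(histogram, target_bins):
    """Halve the resolution repeatedly until only target_bins sum coefficients remain.

    Assumes len(histogram) is target_bins times a power of two.
    """
    current = list(histogram)
    while len(current) > target_bins:
        current, _ = haar_step(current)
    return current


if __name__ == "__main__":
    print(haar_step([4, 2, 5, 1]))
    print(coarse_histogram([1] * 256, 16))
```

The sums of each step form a histogram with half as many bins, so dropping the difference coefficients gives a lower-resolution description of the same frame.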
EHD: Edge Histogram Descriptor • specifies the spatial distribution of five edge types in 16 image regions • global edge feature can be derived
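Five edge types in 16 regions give 80 local bins, and a global edge feature can be derived by adding up each edge type over all regions. The sketch below assumes the bins are stored region by region (five consecutive values per region); that layout, the edge type labels and the function name are assumptions made for illustration.

```python
EDGE_TYPES = ["vertical", "horizontal", "diagonal_45", "diagonal_135", "non_directional"]
NUM_REGIONS = 16


def global_edge_histogram(local_bins):
    """Sum each of the five edge types over all 16 image regions."""
    if len(local_bins) != NUM_REGIONS * len(EDGE_TYPES):
        raise ValueError("expected 80 local bins (16 regions x 5 edge types)")
    return [sum(local_bins[region * len(EDGE_TYPES) + edge]
                for region in range(NUM_REGIONS))
            for edge in range(len(EDGE_TYPES))]


if __name__ == "__main__":
    local_bins = [1, 2, 3, 4, 5] * NUM_REGIONS
    print(dict(zip(EDGE_TYPES, global_edge_histogram(local_bins))))
```

The global histogram discards where in the image the edges occur, so it complements the local bins in a difference measure.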
Shot Detection Method • Apply color and edge histograms for segmentation • Calculate a histogram difference measure D (e.g., an L1-norm) • Color: twin comparison method • If Tb < diff: shot boundary • If Ts < diff < Tb: accumulate differences • If diff < Ts: nothing • If the accumulated value (delta) is greater than Tb, a gradual change is detected.
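The twin comparison rules above might be sketched as follows. `diffs[i]` is taken to be the histogram difference between frames i and i+1, and `t_b` and `t_s` stand for Tb and Ts. The slide does not say when the accumulated value is tested, so this sketch assumes it is checked once a run of medium differences ends; the threshold values in the example are made up.

```python
def twin_comparison(diffs, t_b, t_s):
    """Return a list of (kind, start, end) tuples, where kind is 'cut' or 'gradual'."""
    boundaries = []
    accumulated, start = 0.0, None

    def close_candidate(end):
        # A run of medium differences counts as a gradual change only
        # if the accumulated difference exceeds the high threshold.
        nonlocal accumulated, start
        if start is not None and accumulated > t_b:
            boundaries.append(("gradual", start, end))
        accumulated, start = 0.0, None

    for i, diff in enumerate(diffs):
        if diff > t_b:                 # Tb < diff: abrupt shot boundary
            close_candidate(i - 1)
            boundaries.append(("cut", i, i))
        elif diff > t_s:               # Ts < diff < Tb: accumulate differences
            if start is None:
                start = i
            accumulated += diff
        else:                          # diff < Ts: nothing, any open run ends
            close_candidate(i - 1)
    close_candidate(len(diffs) - 1)
    return boundaries


if __name__ == "__main__":
    diffs = [0.1, 0.9, 0.1, 0.3, 0.3, 0.3, 0.1, 0.3, 0.1]
    print(twin_comparison(diffs, t_b=0.8, t_s=0.2))
```

In the example the single large difference is reported as a cut, the run of three medium differences accumulates past Tb and is reported as a gradual change, and the isolated medium difference near the end is discarded.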
Performance Evaluation of Shot Detection • Definition of a test set (natural, synthetic, genres) • Determination of ground truth (1170 shots) • Performance evaluated with respect to recall and precision • Overall performance results: 97% recall and 80% precision on the test set
Some Conclusions • It is sometimes, but not always, possible to improve the overall Recall performance of shot boundary methods by combining them. • Precision and Recall performance will depend on the threshold levels used. • Scene segmentation is feasible on highly structured content like news and on location based scenes.