Scenario based Dynamic Video Abstractions using Graph Matching Jeongkyu Lee University of Bridgeport
Outline • Introduction • Graph-based Video Segmentation • Multi-level Video Scenarios • Dynamic Video Abstractions • Experiments • Conclusion and Future work
Introduction • Introduction • Video Abstractions • Limitations • Automatic Video Analysis System • Graph-based Video Segmentation • Multi-level Video Scenarios • Dynamic Video Abstractions • Experiments • Conclusion and Future work
Video Abstractions • Static video summary • A collection of static images of video segments • Nonlinear browsing of video • Dynamic video skimming • A shorter version of video arranged in time • Preserving the time-evolving element of video
Limitations • Static video summary • Loses semantic content of the video, since it sacrifices the time-evolving element • Dynamic video skimming • Highly subjective • Gap between human cognition and automated results
Automatic Video Analysis System • Provides both static video summary and dynamic video skimming • Video segmentation by representing frames as graphs and matching them • Constructing a scene tree to illustrate hierarchical content of video • Generating multi-level scenarios using the scene tree • Multi-level highlights and multi-length summaries using the scenarios
Contributions • We propose a graph based shot boundary detection (SBD) and a graph similarity measure (GSM) to capture both the spatial and temporal relationships among video frames. • We generate multi-level scenarios of a video by accessing a scene tree on different levels to provide various levels of video abstraction. • We propose dynamic video abstractions which are able to generate both static video summary (i.e., multi-length summarizations) and dynamic video skimming (i.e., multi-level video highlights).
Graph-based Video Segmentation • Introduction • Graph-based Video Segmentation • Region Adjacency Graph (RAG) • Graph Similarity Measure (GSM) • Shot Boundary Detection • Multi-level Video Scenarios • Dynamic Video Abstractions • Experiments • Conclusion and Future work
RAG • Region segmentation using EDISON (Edge Detection and Image Segmentation System) • Region Adjacency Graph (RAG)
Neighborhood Graphs • Compare two RAGs to find the similarity between them • Decompose each RAG into neighborhood graphs • Compute the similarity between two neighborhood graphs
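The decomposition step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the RAG is modeled as an adjacency dict, and the neighborhood graph of a region is the subgraph induced by that region and its adjacent regions. The region names are invented for the example.

```python
# Decompose a Region Adjacency Graph (RAG) into neighborhood graphs.
# A RAG is modeled here as {node: set(adjacent nodes)}; the
# neighborhood graph of v keeps v, its neighbors, and only the
# edges among those nodes.

def neighborhood_graph(rag, v):
    """Return the subgraph induced by v and its adjacent regions."""
    nodes = {v} | rag[v]
    return {u: rag[u] & nodes for u in nodes}

# Toy RAG with illustrative region labels
rag = {
    "sky":   {"tree", "roof"},
    "tree":  {"sky", "grass"},
    "roof":  {"sky", "wall"},
    "grass": {"tree"},
    "wall":  {"roof"},
}

ng = neighborhood_graph(rag, "sky")
# ng contains sky, tree, roof; the edges tree-grass and roof-wall
# fall outside the neighborhood and are dropped
```

Comparing two RAGs then reduces to comparing their (much smaller) neighborhood graphs pairwise.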
Maximal Common Subgraph • In order to find the largest common subgraph (GC), we first construct the association graph, whose nodes are the compatible pairs of nodes from the two graphs • We obtain GC by finding the maximal clique in the association graph • We call this algorithm Maximal Common Subgraph; it is based on recursion
Maximal Common Subgraph • To find GC, use Maximal Common Subgraph
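A sketch of the association-graph approach, under assumptions the slides do not spell out: node compatibility is taken to be equal region labels, and a pair of association nodes is connected when the mapping preserves (non-)adjacency. The simple recursive clique search below is an illustration of the recursive step, not the paper's exact algorithm.

```python
from itertools import combinations

def max_clique(adj):
    """Find one maximum clique by simple recursion over candidates."""
    best = []
    def expand(clique, candidates):
        nonlocal best
        if not candidates and len(clique) > len(best):
            best = clique
        for i, v in enumerate(candidates):
            # Keep only later candidates adjacent to v, then recurse
            expand(clique + [v], [u for u in candidates[i + 1:] if u in adj[v]])
    expand([], list(adj))
    return best

def association_graph(g1, g2, labels1, labels2):
    """Nodes: label-compatible pairs; edges: pairs preserving (non)adjacency."""
    pairs = [(u, v) for u in g1 for v in g2 if labels1[u] == labels2[v]]
    adj = {p: set() for p in pairs}
    for (u1, v1), (u2, v2) in combinations(pairs, 2):
        if u1 != u2 and v1 != v2 and ((u2 in g1[u1]) == (v2 in g2[v1])):
            adj[(u1, v1)].add((u2, v2))
            adj[(u2, v2)].add((u1, v1))
    return adj

# Toy RAGs with illustrative labels: the common subgraph is sky-tree
g1 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
g2 = {"x": {"y"}, "y": {"x"}}
l1 = {"a": "sky", "b": "tree", "c": "road"}
l2 = {"x": "sky", "y": "tree"}
common = max_clique(association_graph(g1, g2, l1, l2))
# common maps a<->x and b<->y: a 2-node common subgraph
```

Each clique node is a node-to-node correspondence, so the clique as a whole is a consistent mapping between subgraphs of the two RAGs.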
GSM • Graph Similarity Measure (GSM) • Simplified GSM by reducing search area
Shot Boundary Detection • Abrupt change: If GSMsim is above a certain threshold (Tcut), the two frames corresponding to the two RAGs are considered to be in the same shot; otherwise, a cut is declared between them • Gradual change: The starting and ending frames can be found by tracking runs of low but continuous values of GSMsim
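The two detection rules can be sketched over a sequence of GSM similarity scores between consecutive frames. The threshold values and the band used for gradual changes below are illustrative, not the values used in the paper.

```python
# Threshold-based shot boundary detection over GSM similarity scores.
# sims[i] is the similarity between frame i and frame i+1.

def detect_cuts(sims, t_cut=0.3):
    """Abrupt change: similarity below t_cut means a new shot starts."""
    return [i + 1 for i, s in enumerate(sims) if s < t_cut]

def detect_gradual(sims, low=0.3, high=0.6, min_len=3):
    """Gradual change: a sufficiently long run of moderately low scores."""
    runs, start = [], None
    for i, s in enumerate(sims):
        if low <= s < high:
            start = i if start is None else start
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(sims) - start >= min_len:
        runs.append((start, len(sims) - 1))
    return runs

sims = [0.9, 0.88, 0.1, 0.92, 0.5, 0.45, 0.4, 0.9]
cuts = detect_cuts(sims)       # one abrupt cut at the 0.1 score
fades = detect_gradual(sims)   # one gradual run over the 0.5/0.45/0.4 scores
```

An abrupt cut shows as a single deep drop in similarity, while a dissolve or fade shows as a sustained plateau of low-but-nonzero similarity, which is why the two cases need different detectors.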
Multi-level Video Scenarios • Introduction • Graph-based Video Segmentation • Multi-level Video Scenarios • Scene Tree Construction • Multi-Level Scenario Selection • Dynamic Video Abstractions • Experiments • Conclusion and Future work
Scene Tree Construction • Create a scene node for each shot • Check whether the current shot is related to the previous shots using Corr • For two correlated scene nodes SNi and SNj: • If SNi and SNj-1 do not have parent nodes -> create a new parent node • If SNi and SNj-1 share the same ancestor node -> connect SNi to the ancestor node • If SNi and SNj-1 do not share any ancestor node -> connect SNi to the oldest ancestor node of SNi-1 • If there are more shots, go to step 2 • Determine the key RAG for each node
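The construction loop can be condensed into a sketch. The Corr test is passed in as a predicate, and the three attachment cases are simplified to two (create a parent when none exists, otherwise attach to the oldest ancestor); the node layout is an assumption for illustration, not the paper's data structure.

```python
# Condensed sketch of the scene-tree construction loop.

class SceneNode:
    def __init__(self, shot):
        self.shot, self.parent = shot, None

    def root(self):
        """Oldest ancestor of this node (itself if it has no parent)."""
        n = self
        while n.parent is not None:
            n = n.parent
        return n

def build_scene_tree(shots, corr):
    nodes = []
    for shot in shots:                    # step 1: one scene node per shot
        sn = SceneNode(shot)
        for prev in reversed(nodes):      # step 2: find a correlated earlier shot
            if corr(shot, prev.shot):
                if prev.root() is prev:   # no ancestor yet -> new parent node
                    parent = SceneNode(None)
                    prev.parent = parent
                    sn.parent = parent
                else:                     # otherwise attach to oldest ancestor
                    sn.parent = prev.root()
                break
        nodes.append(sn)
    return nodes

# Toy example: shots labeled by content, Corr as simple equality
shots = ["A", "B", "A", "B"]
nodes = build_scene_tree(shots, lambda a, b: a == b)
# shots 0 and 2 end up under one scene node, shots 1 and 3 under another
```

Alternating shots (e.g., a dialogue cutting between two speakers) thus collapse into two scene subtrees, which is what lets higher tree levels express coarser scenarios.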
Dynamic Video Abstractions • Introduction • Graph-based Video Segmentation • Multi-level Video Scenarios • Dynamic Video Abstractions • Multi-level Video Highlights • Multi-length Video Summarization • Experiments • Conclusion and Future work
Multi-Level Video Highlights • Let L be the summary level of V selected by the user • Pick the shots corresponding to the scene nodes in the scenario at level L • Concatenate the selected shots to produce the highlight video
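The three steps above amount to a selection plus concatenation, sketched below with an assumed data layout (a scenario as a list of shot ids, shots as frame lists); the names are illustrative.

```python
# Highlight generation: concatenate the shots referenced by the
# scenario chosen at the user's summary level L.

def make_highlight(scenario, shots):
    """Concatenate the frames of every shot the scenario references."""
    highlight = []
    for shot_id in scenario:
        highlight.extend(shots[shot_id])
    return highlight

# Toy shots (frame lists) and a coarse-level scenario keeping 2 of 3 shots
shots = {0: ["f0", "f1"], 1: ["f2", "f3", "f4"], 2: ["f5"]}
coarse_scenario = [0, 2]
hl = make_highlight(coarse_scenario, shots)
```

Because whole shots are kept, the highlight preserves the time-evolving element that a static summary loses.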
Multi-Length Video Summarization • Let L and T be the summary level and the target length • For each scene node in the scenario at level L, select a key RAG • For each selected key RAG, find its relevant frames using the GSM • Concatenate the selected frames
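A sketch of the length-bounded selection: for each key RAG, keep the frames whose GSM similarity to it exceeds a threshold, stopping once the target length T is reached. The `gsm_sim` function, the threshold, and the toy frame representation are stand-ins for the paper's GSM, purely for illustration.

```python
# Multi-length summarization: gather frames relevant to each key RAG
# (by similarity) and truncate at the target length t_len.

def summarize(key_rags, frames, gsm_sim, t_len, threshold=0.7):
    summary = []
    for key in key_rags:
        relevant = [f for f in frames if gsm_sim(key, f) >= threshold]
        summary.extend(relevant)
        if len(summary) >= t_len:
            return summary[:t_len]
    return summary

# Toy setup: frames are ints, and similarity is 1.0 when the frame's
# parity matches the key, else 0.0
frames = [0, 1, 2, 3, 4, 5]
sim = lambda k, f: 1.0 if f % 2 == k else 0.0
summ = summarize([0, 1], frames, sim, t_len=4)
# keeps the even frames for key 0, then odd frames until length 4
```

Varying T at a fixed level L yields summaries of different lengths from the same scenario, which is what makes the summarization "multi-length".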
Experimental results • Introduction • Graph-based Video Segmentation • Multi-level Video Scenarios • Dynamic Video Abstractions • Experiments • Data set • Efficiency of GSM • Performance of SBD • Evaluation of Video Abstraction • Conclusion and Future work
Data set • AVI format with 15 frames per second and 160 x 120 pixel resolution.
Evaluation of Video Abstractions • To assess the performance of the video summarization, we ask five reviewers two questions. • They assign two scores, each ranging from 0 to 10, to each video. • A score of 0 means `Poor' and a score of 10 means `Excellent'. • To be fair, the reviewers may modify their scores at any time during the review process. • The following table shows the results of the performance evaluation of the video summarization by the five reviewers. • Each value in parentheses is the standard deviation of the scores assigned at that level.
Evaluation of Video Abstractions • Informativeness: How much of the content of the video clip do you understand? • Satisfaction: How satisfactory is the summarized video compared to the original?
Evaluation of Video Abstractions • For example, the summarization at the bottom level reduces the original video by 80%. • However, the information about the video content drops only around 15%, and user satisfaction stays around 85% at that level. • Even the summary at the medium level keeps around 70% of the information and 70% of user satisfaction, while the original video is compressed to 17.6%.
Conclusions • Introduction • Graph-based Video Segmentation • Multi-level Video Scenarios • Dynamic Video Abstractions • Experiments • Conclusion and Future work
Conclusions • We propose a graph-based shot boundary detection (SBD) and a graph similarity measure (GSM) • We generate multi-level scenarios of a video by accessing a scene tree at different levels • We propose dynamic video abstractions that are able to generate both static video summaries and dynamic video skimming.