Efficient Video Browsing Using Multiple Synchronized Views. Presenter: Teklu Urgessa. Authors: Arnon Amir, Savitha Srinivasan and Dulce Ponceleon, IBM Almaden Research Center. Key Words for Publication: Video Retrieval, Multimedia Browsing, Video Browsing, Synchronized Views
Efficient Video Browsing Using Multiple Synchronized Views Presenter: Teklu Urgessa
Authors • Arnon Amir, Savitha Srinivasan and Dulce Ponceleon • IBM Almaden Research Center
Key Words for Publication • Video Retrieval • Multimedia Browsing • Video Browsing • Synchronized Views • Audio Time Scale Modification (TSM) • Fast Playback • Video Browser • Anima-visualization Techniques • Slide Show • Adaptive Accelerating ……
Table of contents • Introduction • Traditional Methods • Problems with Traditional Methods • Advanced Technology/Methods • Technology for Visual • Technology for Audio • Summary • Reference
Text Browsing vs. Multimedia Browsing (MMB) • Browsing text documents is simple and fast • Browsing multimedia documents is not as easy as browsing text • It is complex and time consuming • The production and use of video content keep increasing over time • An efficient way of browsing video is therefore crucial • The paper deals with different methods of efficient video browsing.
Factors for the Fast Growth of Digital Content (DC) • Digital video has become common • Sources: • Smartphones • Notebooks • Webcams • Digital cameras and camcorders • Security and monitoring cameras • Advanced streaming technology • Fast Internet access • MPEG-4 format • Diversity in the application areas of video
Application Areas: Where Videos are Important • Entertainment • Education and Training • Distance Learning: Online Distance Learning • Medical and Technical Manuals • Advertisements . . .
Problem/Challenges • As the amount of video-rich (multimedia) data grows: • Finding and accessing content in large video repositories becomes a critical problem • What users need: quick and efficient retrieval
Need for Research in the Area of Efficient Video Retrieval Major research efforts over the last decade have aimed to find the best and most efficient methods of video indexing, searching and retrieval.
Nature of Video Retrieval Research: Multidisciplinary • Areas of research: • Computer Vision • Pattern Recognition • Speech Recognition • Information Retrieval …
Basic Concepts: Searching and Browsing • Both activities are tightly coupled • Searching: needs specific entries, i.e. you search for a specific company or person • Browsing: a generic approach, e.g. Korean foods or houses • A combination of both can also happen: first search for the broader concept and then browse to reach the specific concept, and vice versa.
2. Traditional Methods: Finding Video Data • Search through categories • Similar to an Internet shopping mall • We search the big categories first • Then smaller categories • …and so on… • The user has to choose which to browse • The user has to check whether the selected data matches the need • Manual categorization and annotation • One by one? • Time consuming!
Problems with Traditional Video Search and Browsing Technologies • The authors state that they are: • Too complicated • Lacking efficient algorithms • Time consuming • Multimedia computations are complex and demanding • Inaccurate • Video data is increasing exponentially • Manual cataloging is a big limitation • Manual cataloging is error prone: it lacks accuracy due to subjectivity
3. Advanced Technologies for Image and Video Retrieval • MPEG-7 Standards • Speech Indexing • Shot Boundary Detection • Time Scale Modification of Audio Signals • Storyboards, Moving Storyboards and Animation • Adaptive Accelerating Fast Playback • Streaming Synchronized Views
MPEG-7: Multimedia Description Standard • Standardized by: • International Organization for Standardization (ISO) • International Electrotechnical Commission (IEC) • ISO/IEC 15938 (Multimedia content description interface) • Not an encoding format for moving pictures like MPEG-1, MPEG-2 and MPEG-4 • MPEG-7 uses XML to store metadata/descriptions • A description can be attached to a timecode in the multimedia content in order to tag a particular event • With these tags: • Content can be indexed and searched efficiently • Yet, improvement is still needed
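The idea of attaching XML descriptions to timecodes can be sketched roughly as follows. This is only an illustration: the element names (`VideoDescription`, `Segment`, `Time`, `Start`, `Duration`, `Label`), the file name `lecture.mp4` and both helper functions are hypothetical and do not follow the actual ISO/IEC 15938 schema.

```python
import xml.etree.ElementTree as ET

def describe_event(root, label, start_seconds, duration_seconds):
    """Attach a free-text label to a time range of the video.

    All element names are hypothetical, not the real MPEG-7 schema.
    """
    segment = ET.SubElement(root, "Segment")
    time = ET.SubElement(segment, "Time")
    ET.SubElement(time, "Start").text = f"{start_seconds:.1f}"
    ET.SubElement(time, "Duration").text = f"{duration_seconds:.1f}"
    ET.SubElement(segment, "Label").text = label
    return segment

def find_events(root, keyword):
    """Return (start_seconds, label) pairs whose label contains the keyword."""
    hits = []
    for segment in root.iter("Segment"):
        label = segment.findtext("Label", default="")
        if keyword.lower() in label.lower():
            hits.append((float(segment.findtext("Time/Start")), label))
    return hits

# The description lives next to the content; the video file itself is untouched.
description = ET.Element("VideoDescription", {"source": "lecture.mp4"})
describe_event(description, "Speaker introduces shot boundary detection", 125.0, 40.0)
describe_event(description, "Demo of moving storyboard", 310.5, 25.0)

xml_text = ET.tostring(description, encoding="unicode")
matches = find_events(description, "storyboard")
```

Because the tags carry their own timecodes, a search over the description can return a position to seek to without decoding any video.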
Illustration: Independence between Description and Content (Source: http://en.wikipedia.org/wiki/File:Mpeg7image1.svg)
How it works (Source: http://en.wikipedia.org/wiki/File:Mpeg7image1.svg)
Speech Indexing • Search through speech transcripts • Uses the familiar metaphor of free-text search • Automatic speech recognition (ASR) • Indexed transcript → semantic information • Main advantage: representation • Speech is built of words
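A minimal sketch of searching through a speech transcript, assuming the recognizer delivers a list of (start time in seconds, word) pairs. The transcript below is invented for illustration, and `build_index` and `search` are hypothetical helper names.

```python
from collections import defaultdict

def build_index(transcript):
    """Map each normalized word to the times (seconds) at which it is spoken."""
    index = defaultdict(list)
    for start_time, word in transcript:
        key = word.lower().strip(".,!?")
        index[key].append(start_time)
    return index

def search(index, word):
    """Return the playback positions where the word occurs (empty if never spoken)."""
    return index.get(word.lower(), [])

# Hypothetical ASR output: (start time in seconds, recognized word).
transcript = [
    (0.0, "Welcome"), (0.4, "to"), (0.6, "video"), (1.0, "browsing"),
    (12.3, "Video"), (12.7, "retrieval."),
]

index = build_index(transcript)
positions = search(index, "Video")
```

Each hit is a time offset, so a browser can jump straight to the spoken word in the video.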
Shot Boundary Detection • Shot boundary detection (SBD) algorithm • Completely automatic • Key frames are selected and extracted • Saved as JPEG files • High accuracy and efficiency • Still, the problem of false detections is unsolved
Definitions: Basic Concepts • Frame: a single image composed of picture elements (pixels), arranged like the squares of a chess board • Key frame: a frame that represents a shot • Shot: a group of similar frames • Start key frame • End key frame • Animation
3 Levels of Video Browsing • Browsing a large collection of videos • Browsing a ranked list of videos • Browsing a single video to find relevant segments • The concern here is the second and third levels: extracting the most important segments from the video content.
SBD • The key to efficient video visualization is accurate detection of shot boundaries • A shot is a continuous sequence of frames as captured by the camera • Often represented by a single key frame in the storyboard • Shot boundaries: changes between shots • Created during the editing phase (hard cut, fade, dissolve) • Can be gradual or abrupt
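The relation between shot boundaries, shots and key frames can be sketched as a small data structure. Picking the middle frame as the key frame is an assumption made for this example only; the slides do not say how the key frame is chosen, and the names `Shot` and `shots_from_boundaries` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Shot:
    start: int  # index of the first frame of the shot
    end: int    # index of the last frame of the shot (inclusive)

    @property
    def key_frame(self):
        # Assumption: the middle frame represents the shot.
        return (self.start + self.end) // 2

def shots_from_boundaries(boundaries, total_frames):
    """Split a video into shots.

    boundaries: sorted frame indices at which a new shot starts (frame 0 excluded).
    """
    starts = [0] + list(boundaries)
    ends = [b - 1 for b in boundaries] + [total_frames - 1]
    return [Shot(s, e) for s, e in zip(starts, ends)]

# A 450-frame video with new shots starting at frames 120 and 300.
shots = shots_from_boundaries([120, 300], total_frames=450)
storyboard = [shot.key_frame for shot in shots]
```

Three shots yield three key frames, which is all a storyboard needs to display.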
SBD Algorithms • Four shot boundary detection algorithms • 1. Color histogram differences: the best and most balanced “older” algorithm; used for hard cuts • 2. Edge change ratio: the more recently proposed algorithm; used for hard cuts, fades and dissolves • 3. Standard deviation of pixel intensities: used for fades • 4. Contrast: used for dissolves
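A minimal sketch of the histogram-difference idea for hard cuts. To stay short it uses grayscale intensities instead of color, four histogram bins and a fixed threshold of 0.5; all three are simplifying assumptions, and the tiny four-pixel "frames" are invented test data.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalized intensity histogram of a frame given as a flat list of pixel values."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return [c / len(frame) for c in counts]

def histogram_difference(frame_a, frame_b, bins=4):
    """Half the L1 distance between two histograms: 0.0 (identical) to 1.0 (disjoint)."""
    ha, hb = histogram(frame_a, bins), histogram(frame_b, bins)
    return 0.5 * sum(abs(a - b) for a, b in zip(ha, hb))

def detect_hard_cuts(frames, threshold=0.5, bins=4):
    """Return indices i where the change from frame i-1 to frame i exceeds the threshold."""
    return [
        i for i in range(1, len(frames))
        if histogram_difference(frames[i - 1], frames[i], bins) > threshold
    ]

# Invented data: two dark frames, two bright frames, then a dark frame again.
dark = [10, 20, 30, 40]
bright = [200, 210, 220, 230]
frames = [dark, dark, bright, bright, dark]

cuts = detect_hard_cuts(frames)
```

A fixed threshold like this reacts to abrupt changes only; gradual transitions such as fades and dissolves spread the change over many frames, which is why the slide lists separate algorithms for them.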
Time Scale Modification of Audio Signals • Efficient video browsing needs efficient audio browsing • Apart from images, most digital content includes audio • Faster audio browsing is necessary • TSM: allows speeding up or slowing down audio without noticeable distortion • Works by skipping pitch periods to speed up and duplicating them to slow down • Human speech signals are quasi-periodic • Changing the total play time: deleting or inserting small audio segments
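The skip-or-duplicate idea can be illustrated on a perfectly periodic test tone. This sketch assumes the pitch period is known and constant, which real speech does not satisfy; practical algorithms search for the best-matching segment and cross-fade at the joins. The function names are hypothetical.

```python
import math

def split_periods(signal, period):
    """Cut the signal into consecutive chunks of one pitch period each."""
    return [signal[i:i + period] for i in range(0, len(signal), period)]

def speed_up(signal, period, drop_every):
    """Shorten the signal by deleting every `drop_every`-th pitch period."""
    kept = [
        chunk for n, chunk in enumerate(split_periods(signal, period), start=1)
        if n % drop_every != 0
    ]
    return [sample for chunk in kept for sample in chunk]

def slow_down(signal, period, repeat_every):
    """Lengthen the signal by duplicating every `repeat_every`-th pitch period."""
    out = []
    for n, chunk in enumerate(split_periods(signal, period), start=1):
        out.extend(chunk)
        if n % repeat_every == 0:
            out.extend(chunk)
    return out

# Test tone: 6 periods of 8 samples each.
period = 8
tone = [math.sin(2 * math.pi * n / period) for n in range(period * 6)]

faster = speed_up(tone, period, drop_every=3)     # 6 periods -> 4 periods
slower = slow_down(tone, period, repeat_every=2)  # 6 periods -> 9 periods
```

Because whole periods are removed or repeated, the waveform shape, and with it the pitch, stays the same while the duration changes.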
Improvement of TSM • Time-Domain Harmonic Scaling (TDHS) technique • Foundation and general formulation • Time-Domain Pitch Synchronous Overlap Add (TD-PSOLA) • Simple time-domain time scale modification (TSM) algorithm • Pointer Interval Controlled Overlap Add (PICOLA) • Modern speech TSM algorithm • Optional and applicable to all MPEG-4 audio coding schemes • Waveform Synchronous Overlap Add (WSOLA) • Used in the paper
Storyboards, Moving Storyboards and Animation • Storyboard • A set of one or more pages, each consisting of a two-dimensional array of key frames, sorted in chronological order • Animation • A quick slide show, where each of the key frames is shown for a fixed short period (e.g., 0.6 seconds) • Moving Storyboard (MSB) • The animated key frames, fully synchronized with the original audio track. Each key frame is shown for the entire duration of the associated shot. • Example: http://www.youtube.com/watch?v=-l4Xzak9LpM
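The difference between the animation and the moving storyboard comes down to timing, which can be sketched as two display schedules. The shot durations below are invented, and the function names are hypothetical.

```python
def animation_schedule(num_key_frames, seconds_per_frame=0.6):
    """Fixed-rate slide show: (key_frame_index, show_at_seconds) pairs."""
    return [(i, round(i * seconds_per_frame, 3)) for i in range(num_key_frames)]

def moving_storyboard_schedule(shot_durations):
    """Audio-synchronized schedule: (key_frame_index, show_at_seconds, duration).

    Each key frame stays on screen for the whole duration of its shot,
    so the pictures remain aligned with the original audio track.
    """
    schedule, clock = [], 0.0
    for i, duration in enumerate(shot_durations):
        schedule.append((i, clock, duration))
        clock += duration
    return schedule

# Invented example: three shots lasting 4.0 s, 1.5 s and 10.0 s.
shot_durations = [4.0, 1.5, 10.0]

animation = animation_schedule(len(shot_durations))
msb = moving_storyboard_schedule(shot_durations)
```

The animation finishes in under two seconds regardless of content, while the moving storyboard takes as long as the original video because it follows the audio.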
Adaptive Accelerating Fast Playback • Very fast video playback (without audio) • Ordinary fast forward depends only on the playback speed • So there is a chance of missing an important scene • Adaptive playback accelerates until a new scene is met • Requires less computational load
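One way to picture "accelerate until a new scene is met" is a frame-skipping loop whose step grows inside a shot and resets at each shot start. The doubling rule and the `max_step` cap are assumptions made for this sketch, not the acceleration law from the paper.

```python
def adaptive_fast_playback(total_frames, shot_starts, max_step=8):
    """Return the indices of the frames that get displayed.

    Assumed rule: the skip step doubles after every displayed frame (capped at
    max_step), never jumps over the first frame of a shot, and resets to 1 there.
    """
    boundaries = sorted(set(shot_starts))
    shown, frame, step = [], 0, 1
    while frame < total_frames:
        shown.append(frame)
        next_frame = frame + step
        upcoming = [b for b in boundaries if frame < b <= next_frame]
        if upcoming:
            # A new shot begins within reach: land on it and restart slowly.
            next_frame = upcoming[0]
            step = 1
        else:
            step = min(step * 2, max_step)
        frame = next_frame
    return shown

# A 40-frame clip with a second shot starting at frame 20.
shown = adaptive_fast_playback(total_frames=40, shot_starts=[20])
```

Only 10 of the 40 frames are displayed, yet the first frame of the second shot is always among them, which is the property that plain fast forward cannot guarantee.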
Conclusion • Multimedia browsing is not as simple as text browsing • Studies on efficient video browsing are still underway • Adaptive accelerating fast playback • Most useful for analyzing surveillance videos • SBD: useful for visual content • TSM: useful for audio content • Efficient video retrieval combines the above technologies
Questions • Explain the different levels of the MPEG-7 description method for visual content • What method is appropriate for efficient audio retrieval? • Is MPEG-7 a content compression tool? If not, why? Who standardized it, and what is the name of its standard? • What method is an efficient way of retrieving visual content? • Explain the differences among a shot, a key frame and a shot boundary
References • Shot Boundary Detection: http://muvis.cs.tut.fi/sbd.html and http://link.springer.com/chapter/10.1007%2F11795131_95?LI=true#page-1 • Key frame: http://en.wikipedia.org/wiki/Key_frame • Synchronous Overlap-Add: http://www.surina.net/article/time-and-pitch-scaling.html • Growth of Digital Information Created and Replicated: http://www.emc.com/leadership/digital-universe • MPEG-7 standard: http://www.en.wikipedia.org/wiki/File:Mpeg7image1.svg • PSOLA (Pitch Synchronous Overlap and Add): http://en.wikipedia.org/wiki/PSOLA