Real-time and Retrospective Analysis of Video Streams using MPEG-7

Real-time and Retrospective Analysis of Video Streams and Still Image Collections using MPEG-7 Ganesh Gopalan, College of Oceanic and Atmospheric Sciences, Oregon State University

Introduction • HD video streams have the potential to improve understanding of deep-sea ecosystems • However, the volume and complexity of HD streams and formats can be overwhelming • Our approach: use industry standards to turn video into a data type rather than treating it only as viewing material

MPEG-7 Overview • Multimedia content description interface • Consists of low-level descriptors and high-level description schemes • Low-level descriptors provide statistical information about the pixel values in content • Description Schemes are used to represent semantic information

Low Level Descriptors • Structures that describe content in terms of the distribution of edges, colors, textures, shapes and motion • Descriptors extracted using MPEG-7 Experimental Model (XM) software • The input is a still image or a frame from video • The output is an XML description of the statistical information
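To make the "image in, XML out" idea concrete, here is a minimal sketch in Python. It does not use the MPEG-7 XM software: the gray-level histogram is a simplified stand-in for a real descriptor, and the element names `Descriptor` and `Bin` are hypothetical, not taken from the MPEG-7 schema.

```python
import xml.etree.ElementTree as ET

def gray_histogram(pixels, bins=4, max_value=256):
    """Count how many pixel values fall into each of `bins` equal-width ranges."""
    counts = [0] * bins
    width = max_value // bins
    for row in pixels:
        for value in row:
            counts[min(value // width, bins - 1)] += 1
    return counts

def descriptor_to_xml(name, values):
    """Serialize descriptor values as XML (hypothetical element names)."""
    root = ET.Element("Descriptor", {"name": name})
    for index, value in enumerate(values):
        bin_element = ET.SubElement(root, "Bin", {"index": str(index)})
        bin_element.text = str(value)
    return ET.tostring(root, encoding="unicode")

# A tiny 2x4 grayscale "frame" with values in 0..255.
frame = [[0, 10, 200, 255],
         [64, 70, 128, 130]]
hist = gray_histogram(frame)
xml_text = descriptor_to_xml("GrayHistogram", hist)
print(xml_text)
```

The point is only the shape of the pipeline: pixel statistics are computed once per frame and stored as a small XML document that can be compared later without touching the pixels again.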

Examples of Low Level Descriptors • Edge Histogram • Homogeneous Texture • Color Layout • Color Structure • Motion Activity • Descriptors are rotation and scaling invariant

Descriptor Extraction and Search • Phase 1: descriptor XML for a collection of frames or still images is generated and cached • Phase 2: the difference between the query image's descriptor and the values cached in Phase 1 is computed • The cache can be augmented with descriptors from a new video or still image collection
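The two phases can be sketched as follows. This is an illustration under assumptions, not the implementation behind the slides: descriptors are treated as plain lists of numbers, a simple L1 (sum of absolute differences) distance stands in for whatever matching function each descriptor actually uses, and the class name `DescriptorCache` and the frame identifiers are hypothetical.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equal-length descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))

class DescriptorCache:
    """Phase 1 store: maps a frame or image identifier to its descriptor values."""

    def __init__(self):
        self._entries = {}

    def add_collection(self, descriptors):
        """Add (or augment the cache with) descriptors from one collection."""
        self._entries.update(descriptors)

    def differences(self, query):
        """Phase 2: distance from the query descriptor to every cached one."""
        return {frame_id: l1_distance(query, values)
                for frame_id, values in self._entries.items()}

cache = DescriptorCache()
cache.add_collection({"video_a/0001": [3, 1, 4], "video_a/0002": [2, 7, 1]})
# Later, a new still image collection is added without recomputing the old entries.
cache.add_collection({"stills/eddy": [3, 1, 5]})
diffs = cache.differences([3, 1, 4])
print(diffs)
```

Because extraction is the expensive step, caching it means each query costs only one descriptor extraction plus a pass over the cached values.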

Description Schemes • Description Schemes attempt to model the reality behind the content • Low level descriptors can be used to tag objects of interest; the tags are then used to construct a high level description • A search can then be performed against the higher level description schemes
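One way to picture the tagging step is shown below. This is a hedged sketch: the `Segment` record and its fields are hypothetical and far simpler than an actual MPEG-7 description scheme, and the labels are made-up examples.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    """A tagged stretch of video (hypothetical, simplified high-level description)."""
    video: str
    start_frame: int
    end_frame: int
    labels: set = field(default_factory=set)

def tag_segment(segment, label):
    """Attach a semantic label, e.g. after a low-level descriptor match."""
    segment.labels.add(label)
    return segment

def search(segments, label):
    """Search the high-level descriptions rather than the raw descriptors."""
    return [s for s in segments if label in s.labels]

segments = [Segment("dive_01", 0, 120), Segment("dive_01", 121, 400)]
tag_segment(segments[1], "hydrothermal vent")
hits = search(segments, "hydrothermal vent")
print(hits)
```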

High Definition Video Search Engine • Applied MPEG-7 to the development of an HD search engine • Extracted descriptors for approximately 10,000 frames from 2.5 hours of high definition content • Content provided by the University of Washington from “Visions 05 Cruise” • Also applied to searching for eddies in satellite image collections and supercells in radar images

Application Architecture • .NET Windows Forms front end with an embedded Windows Media Player • SQL Server back end • Common Language Runtime (CLR) integration for developing stored procedures that manage MPEG-7 XML • Procedures can be written in .NET languages rather than SQL

Creating a CLR Stored Procedure
CREATE FUNCTION FindUsingVisualDescriptor (
    @uid int,
    @token uniqueidentifier,
    @queryImage varbinary(MAX),
    @descriptorName nvarchar(256)
)
RETURNS nvarchar(MAX)
AS EXTERNAL NAME MPEG7Document.StoredProcedures.FindUsingVisualDescriptor;
GO

Creating an HTTP Endpoint
CREATE ENDPOINT MPEG7
    STATE = STARTED
AS HTTP (
    SITE = 'XXX.XXX.XX.XXX',
    PATH = '/MPEG7Endpoint',
    AUTHENTICATION = (BASIC),
    PORTS = (SSL),
    SSL_PORT = 444
)
FOR SOAP (
    WEBMETHOD 'FindUsingVisualDescriptor' (
        NAME = 'looking.dbo.FindUsingVisualDescriptor',
        FORMAT = ALL_RESULTS
    ),
    …
)

User Interface • UI allows conversion of video into frames using ffmpeg • Descriptors of choice are then generated for all frames • Descriptors are persisted to the server
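The frame conversion step might look like the sketch below, which only builds an ffmpeg command line using the standard `-i` input option and the `fps` video filter. The file names, output pattern, and sampling rate are assumptions; the slides do not state the rate used, although roughly 10,000 frames from 2.5 hours (9,000 seconds) works out to about 1.1 frames per second.

```python
import shlex

def ffmpeg_frame_command(video_path, out_dir, frames_per_second=1.0):
    """Build an ffmpeg command that samples still frames from a video.

    The paths and the frame-name pattern are illustrative assumptions.
    """
    return [
        "ffmpeg",
        "-i", video_path,                  # input video
        "-vf", f"fps={frames_per_second}", # keep this many frames per second
        f"{out_dir}/frame_%06d.png",       # numbered output images
    ]

cmd = ffmpeg_frame_command("dive.mp4", "frames", 1.0)
print(shlex.join(cmd))
```

To actually run the conversion, the list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed and the output directory exists.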

Retrospective Search • A query image initiates the search • The descriptor value for the given image is compared with those cached from the video frames or still images • The top 100 frames that are closest to the query image are returned
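The ranking step can be sketched with a bounded selection instead of a full sort. As before, the L1 distance and the frame identifiers are assumptions for illustration; only the "return the closest 100" behaviour comes from the slide.

```python
import heapq

def top_matches(query, cached, k=100):
    """Return the k cached frames closest to the query as (distance, frame_id) pairs.

    `cached` maps frame_id -> descriptor values; distance is a simple L1 sum
    (an assumption, standing in for the descriptor's real matching function).
    """
    scored = (
        (sum(abs(q - v) for q, v in zip(query, values)), frame_id)
        for frame_id, values in cached.items()
    )
    # nsmallest keeps only k candidates in memory and returns them in ascending order.
    return heapq.nsmallest(k, scored)

cached = {"f1": [0, 0], "f2": [5, 5], "f3": [1, 0]}
best = top_matches([1, 1], cached, k=2)
print(best)
```

With around 10,000 cached frames a full sort would also be fine; the bounded selection matters more as the cache grows.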

Retrospective Search Example

Real-time Event Detection • In this case, we have a set of known images that have objects of interest • Descriptors of frames from a real-time stream are compared on a continuous basis with those in the “event library” • When the difference in descriptor values is below a threshold, an event has been detected
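The detection loop described above can be sketched as a generator over incoming frame descriptors. The event names, descriptor values, threshold, and L1 distance are all illustrative assumptions; a real threshold would have to be tuned per descriptor.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equal-length descriptors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def detect_events(stream, event_library, threshold):
    """Yield (frame_index, event_name, distance) when a frame matches a known event.

    `stream` is any iterable of frame descriptors, so it can be a live source.
    `event_library` maps an event name to its reference descriptor.
    """
    if not event_library:
        return
    for index, descriptor in enumerate(stream):
        # Find the closest reference image in the event library.
        name, distance = min(
            ((n, l1_distance(descriptor, ref)) for n, ref in event_library.items()),
            key=lambda pair: pair[1],
        )
        if distance < threshold:
            yield index, name, distance

library = {"vent": [10, 10], "fish": [0, 5]}
frames = [[9, 10], [50, 50], [0, 4]]
events = list(detect_events(frames, library, threshold=3))
print(events)
```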

Example of an Event

Reference Event

Use of Multi-Core Systems • The descriptor extraction process can be made faster by taking advantage of multiple processors or cores • The total number of frames can be divided up amongst the available processors • Threads extract the descriptors concurrently to generate chunks of XML • The threads then signal each other to combine the chunks into a single file with the descriptor XML
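A minimal sketch of this divide, extract, and combine pattern follows. It is not the original .NET code: the extraction function is a placeholder that only emits a hypothetical `<Frame>` element per frame, and instead of threads signalling each other, a thread pool returns the chunks in order so the caller can concatenate them.

```python
from concurrent.futures import ThreadPoolExecutor

def split_evenly(items, parts):
    """Divide items into `parts` contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def extract_chunk(frame_ids):
    """Placeholder for descriptor extraction: one XML element per frame."""
    return "".join(f'<Frame id="{frame_id}"/>' for frame_id in frame_ids)

def extract_all(frame_ids, workers=4):
    """Extract chunks concurrently, then combine them into one XML document."""
    chunks = split_evenly(frame_ids, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input chunks.
        xml_chunks = list(pool.map(extract_chunk, chunks))
    return "<Frames>" + "".join(xml_chunks) + "</Frames>"

combined = extract_all(["a", "b", "c"], workers=2)
print(combined)
```

In CPython, threads do not speed up pure-Python CPU-bound work; if the extraction itself ran in Python, `ProcessPoolExecutor` would be the closer analogue to using multiple cores.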

Challenges • Shadows and other lighting issues can create false positives • May be necessary to use multiple descriptors for classification • Processing high definition video at 30fps is computationally intensive • Scaling to a large number of images such as on the web presents a challenge

Conclusion • MPEG-7 supports a rich framework for content-based searches through its low level descriptors • Detected content can be tagged effectively using the high level description schemes that can be used to locate, search through and distribute content

Future Directions • Explore ways to speed up descriptor extraction using GPUs or hybrid GPGPUs • Explore cloud services to implement video services – transcoding video on the fly for different devices, descriptor extraction using HPC clusters, streaming services • Explore the Surface Computer as a UI

Acknowledgements • We are thankful to Professor John Delaney from the University of Washington for providing the HD footage • We are also thankful to the NSF funded LOOKING team for supporting this effort
