The MPEG Standard • MPEG-1 (1992) is essentially a video player • plays out audio/video streams • same type of access as a home VCR • MPEG-2 (1995) was introduced for compression and transmission of digital TV signals • still limited interactivity • MPEG-4 (1999) is completely different • high level of interactivity • MPEG-7 (2002) is for the description of metadata only
MPEG-4 • MPEG-4 addresses the need for • Mixing of natural and synthetic audiovisual information • High interactivity in the presentation of multimedia content • Deployment of communication systems for real-time or broadcast delivery of coded data streams • A new approach for describing, coding and presenting a scene • MPEG-4 combines different coding tools for • Audio/video • Synthetic objects and graphics
MPEG-4 Objects • The audio/video components of MPEG-4 • Objects are coded and transmitted separately, then composed at the decoder site • They can exist independently • Multiple objects can be grouped together to form complex objects • Video and audio can be easily manipulated • Permits choosing appropriate coding tools for audio, video and graphics objects
MPEG-4 Object-Based Coding
MPEG-4 Coding • The scene is composed and rendered at the sender site • video frames and audio are coded, multiplexed and transmitted • tools for coding arbitrarily shaped objects • At the receiver the stream is demultiplexed • video and audio are decoded, composed, synchronized and presented as defined at the sender's site
Object Coding • Objects are described mathematically (e.g. by their positions) • similarly for audio and graphics objects • an object need only be defined once • the viewer can change their position • transmit calculations to update the scene at the receiver • this is a critical feature when the response has to be fast and bit-rate is limited
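The idea of defining an object once and then transmitting only position updates can be sketched as follows. This is a minimal illustration, not MPEG-4 syntax: the message layout (a 2-byte object id plus two 4-byte floats), the `scene` dictionary and the function names are all assumptions made for the example.

```python
import struct

# Hypothetical scene: each object is defined once, then only moved.
scene = {1: {"name": "weatherman", "x": 0.0, "y": 0.0}}

def encode_move(object_id: int, x: float, y: float) -> bytes:
    """Pack a position update: 2-byte object id + two 4-byte floats (assumed layout)."""
    return struct.pack("<Hff", object_id, x, y)

def apply_move(scene: dict, message: bytes) -> None:
    """Receiver side: decode the update and move the object in place."""
    object_id, x, y = struct.unpack("<Hff", message)
    scene[object_id]["x"] = x
    scene[object_id]["y"] = y

update = encode_move(1, 12.5, 40.0)
apply_move(scene, update)

# For comparison: one uncompressed 352x288 frame at 12 bits per pixel (4:2:0).
raw_frame_bytes = 352 * 288 * 12 // 8
print(len(update), raw_frame_bytes)
```

The update message is a handful of bytes, while resending the picture costs on the order of a frame, which is why this matters when bit-rate is limited.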
Binary Format for Scenes (BIFS) • MPEG-4’s language for describing and dynamically changing a scene • Borrows concepts from VRML • Both define representations of the same data • VRML defines objects and actions in text • BIFS code is binary (10-15 times shorter) • Unlike VRML, MPEG-4 uses BIFS for real-time streaming: a scene can be built up and played on the fly • VRML and BIFS evolve consistently
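Why a binary scene description is shorter than a textual one can be seen with a toy comparison. The text node below is only loosely in the style of VRML, and the binary layout (a 1-byte node code plus three 4-byte floats) and the code `NODE_TRANSFORM` are made up for illustration; this is not the actual BIFS bit syntax, and the toy ratio is not meant to reproduce the 10-15x figure.

```python
import struct

# Text form, loosely in the style of a VRML node.
text_node = "Transform { translation 1.0 2.0 3.0 }"

# Hypothetical binary form: 1-byte node type code + three 4-byte floats.
NODE_TRANSFORM = 0x01  # made-up code, not a real BIFS node id
binary_node = struct.pack("<Bfff", NODE_TRANSFORM, 1.0, 2.0, 3.0)

# The receiver can recover exactly the same values from the binary form.
code, tx, ty, tz = struct.unpack("<Bfff", binary_node)

print(len(text_node.encode("ascii")), len(binary_node))
```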
Scene Graph
The Scene Graph • Represents a scene as independent or compound objects, e.g., • father and child • the audio track of his voice • floor and walls (sprites: for backgrounds) • the web site • the synthetic image of the furniture • a synthetic HDTV set playing a movie from the family's DVD library
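A scene graph of this kind is a tree whose inner nodes group objects and whose leaves are the individual media objects. The sketch below is a plain tree in Python; the `Node` class, its `kind` labels and the example scene are hypothetical and do not correspond to BIFS node types.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    kind: str = "group"  # e.g. "video", "audio", "sprite", "synthetic"
    children: list = field(default_factory=list)

    def leaves(self):
        """Yield the media objects (nodes without children), depth first."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.leaves()

# A compound object ("person") next to two independent objects.
scene = Node("scene", children=[
    Node("person", children=[
        Node("father video", "video"),
        Node("voice", "audio"),
    ]),
    Node("background", "sprite"),
    Node("furniture", "synthetic"),
])

leaf_names = [n.name for n in scene.leaves()]
print(leaf_names)
```

Moving or removing the "person" node affects its video and audio together, which is the practical benefit of grouping objects into compound ones.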
Elementary Streams (ES) • The scheme for preparing content for transmission, storage and decoding • Objects are placed in ESs • Probably two or more ESs per object • A sound track or a video may have a single ES • Scalable objects may have one ES for basic quality information + one or more enhancement layers for improved quality (e.g., finer detail, faster motion) • ESs are split into packets and sent along with timing information for proper synchronization
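Splitting elementary streams into timestamped packets and merging them for delivery might look roughly like this. The `Packet` fields, the fixed packet size and the millisecond timestamps are simplifying assumptions for the sketch; the real MPEG-4 packet format is more involved.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    es_id: int          # which elementary stream the payload belongs to
    timestamp_ms: int   # timing information used for synchronization
    payload: bytes

def packetize(es_id, data, packet_size, start_ms, step_ms):
    """Split one elementary stream into fixed-size packets with timestamps."""
    packets = []
    for i, offset in enumerate(range(0, len(data), packet_size)):
        chunk = data[offset:offset + packet_size]
        packets.append(Packet(es_id, start_ms + i * step_ms, chunk))
    return packets

def interleave(*streams):
    """Merge several packetized streams into one, ordered by timestamp."""
    merged = [p for stream in streams for p in stream]
    return sorted(merged, key=lambda p: p.timestamp_ms)

video = packetize(es_id=1, data=b"V" * 10, packet_size=4, start_ms=0, step_ms=40)
audio = packetize(es_id=2, data=b"A" * 6, packet_size=3, start_ms=0, step_ms=20)
multiplexed = interleave(video, audio)
print([(p.es_id, p.timestamp_ms) for p in multiplexed])
```

At the receiver, the `es_id` lets the demultiplexer route each packet to the right decoder and the timestamp lets audio and video be presented in sync.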
Object Descriptors (OD) • MPEG-4's mechanism that informs the system which ESs belong to a certain object • ODs contain Elementary Stream Descriptors (ESD), which tell the system which decoders to use • ODs are sent in their own stream, which allows them to be added or deleted as the scene changes
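The relationship between objects, their elementary streams and the decoders can be pictured as a small lookup table. The dictionary layout, the ids and the decoder labels below are hypothetical; they only mirror the roles of ODs and ESDs described above.

```python
# Hypothetical table: object descriptor id -> list of ES descriptors.
object_descriptors = {
    10: [{"es_id": 1, "decoder": "video"},
         {"es_id": 2, "decoder": "video-enhancement"}],
    11: [{"es_id": 3, "decoder": "audio"}],
}

def streams_for(od_id):
    """Return the ids of the elementary streams that belong to one object."""
    return [esd["es_id"] for esd in object_descriptors[od_id]]

def decoder_for(es_id):
    """Look up which decoder handles a given elementary stream."""
    for esds in object_descriptors.values():
        for esd in esds:
            if esd["es_id"] == es_id:
                return esd["decoder"]
    return None

# Descriptors can be added or removed while the scene is playing.
object_descriptors[12] = [{"es_id": 4, "decoder": "graphics"}]
del object_descriptors[11]

print(streams_for(10), decoder_for(4), decoder_for(3))
```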
Profiles and Levels • MPEG-4 provides a set of tools for coding multimedia content • an application may use only subsets of these tools • Profiles: MPEG-4's definitions of these subsets for audio, visual and graphics information • Levels: define the computational complexity of the profile’s tool subset • Certain combinations of profiles fit well together
MPEG-4 Profiles
MPEG-4 Visual Objects • Arbitrarily shaped objects are coded apart from their background • Binary shape coding: a pixel is or is not part of an object • simple, crude technique, suitable for low bit-rates, suffers from aliasing • Alpha shape (gray scale) coding: each pixel is assigned a value for its transparency • objects can be smoothly blended into a background or with other objects
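The difference between a binary shape mask and a gray-scale alpha mask shows up when an object is composited over a background. The sketch below uses the standard blending rule, out = alpha * foreground + (1 - alpha) * background, on 8-bit values; the function names and sample rows are illustrative only.

```python
def blend(fg, bg, alpha):
    """Composite one 8-bit sample; alpha runs from 0 (transparent) to 255 (opaque)."""
    return (alpha * fg + (255 - alpha) * bg + 127) // 255  # rounded integer division

def composite_row(fg_row, bg_row, mask_row):
    """Blend a row of object samples over a row of background samples."""
    return [blend(f, b, a) for f, b, a in zip(fg_row, bg_row, mask_row)]

fg_row = [200, 200, 200]   # object samples
bg_row = [100, 100, 100]   # background samples

# Binary shape: each pixel is either outside (0) or inside (255) the object.
binary_result = composite_row(fg_row, bg_row, [0, 255, 255])

# Alpha shape: the edge pixel is half transparent, giving a smooth transition.
alpha_result = composite_row(fg_row, bg_row, [0, 128, 255])

print(binary_result, alpha_result)
```

With the binary mask the edge jumps straight from background to object, which is the source of the aliasing mentioned above; the alpha mask produces an intermediate value at the boundary.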
Visual Objects • Rectangular natural images and scenes are coded as in MPEG-1 and MPEG-2 • Texture is coded separately by a DCT, block-based coding scheme or by wavelets • E.g., weather reports: the weatherman appears to be standing in front of a map which is actually generated elsewhere
Object Segmentation • MPEG does not specify how objects are extracted • video object segmentation is difficult • e.g., record weatherman’s image in front of a color background • MPEG-4 specifies decoding • implementation of encoding is left to the industry to decide
MPEG-4 Applications • MPEG-4 makes video possible even at very low bit-rates (e.g., 10 kb/s) • mobile devices, internet • Scalable objects for low bit-rates • a base layer conveys all the information in some basic quality • one or more enhancement layers can be sent to get better quality • send only the most important objects
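Choosing how many layers of a scalable object to send for a given channel can be sketched as a simple budget check. The layer bit-rates and the function name are assumptions for the example; the only rule taken from the text is that the base layer comes first and enhancement layers are added on top of it.

```python
def select_layers(layer_rates_kbps, budget_kbps):
    """Return how many layers (base layer first) fit within the bit-rate budget."""
    total = 0
    chosen = 0
    for rate in layer_rates_kbps:
        if total + rate > budget_kbps:
            break  # an enhancement layer is useless without the ones below it
        total += rate
        chosen += 1
    return chosen

# Hypothetical object: 10 kb/s base layer plus two enhancement layers.
layers = [10, 20, 50]

for budget in (5, 10, 35, 100):
    print(budget, select_layers(layers, budget))
```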
Sprites • For coding unchanged backgrounds • The background is defined and coded only once • The view must be updated for each change (e.g., when the viewing angle changes) • The sprite is sent only once • New views are created by sending the new positions
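Creating new views from a sprite that was sent once amounts to cutting a window out of a larger stored background. The sketch below treats the sprite as a 2D list of sample values and a view as a rectangular crop; real sprites can also be warped, which is left out here, and all names are illustrative.

```python
def view(sprite, left, top, width, height):
    """Cut the visible window out of the stored background sprite."""
    return [row[left:left + width] for row in sprite[top:top + height]]

# Hypothetical 4x8 background, sent to the receiver only once.
sprite = [[10 * r + c for c in range(8)] for r in range(4)]

# A camera pan is then just a new (left, top) position per frame.
first_view = view(sprite, left=0, top=0, width=3, height=2)
panned_view = view(sprite, left=2, top=1, width=3, height=2)

print(first_view, panned_view)
```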
Advanced Features • Map images into computer generated shapes • a 2D or 3D mesh may have an image mapped onto it • a few parameters to deform the mesh generate the impression of a moving picture • rather than sending new images for each change, send commands and parameters to the viewer • pre-defined faces are particularly interesting meshes • the appearance of a face may be left to the decoder (e.g., custom facial models can be downloaded)
MPEG-4 Faces • Images laid over a wire-frame face • Send wire-frame plus parameters • Image reconstruction at the receiver's site • Speech is generated from text in step with motions of the mouth, eyes and lips
MPEG-7 • MPEG-7 (2002) focuses on description of multimedia content • modalities: image, speech, video, graphics and their combinations • MPEG-7 complements existing MPEG standards and is applicable even to non-MPEG formats (compressed or uncompressed) • MPEG-7 is driven by trends in technology, market and user needs • Applications: VideoOnDemand, NewsOnDemand, InteractiveTV, multimedia information systems etc.
Scope of the Standard • Provides the means for indexing, searching, filtering and managing audio-visual content • broadcast media selection (e.g., personalized TV) • multimedia editing (e.g., personalized news service) • MPEG-7 interoperable interface defines syntax and semantics • tools may be designed for specific modalities, aspects or applications
Interoperable Services and Applications
MPEG-7 Main Tasks • Multimedia: generate customized program guides or summaries of broadcast audio-visual content • Archive: generate descriptions of audio-visual content (or elements) • Adaptation: filter and transform multimedia streams in low bit-rate environments (e.g., mobile users)
MPEG-7 Specific Tasks • Music/audio: play a few notes and return music with similar audio content • Images/graphics: draw a sketch and return images with similar graphics • Movement: describe movements and return video clips with the specified temporal and spatial relations • Scenario: describe actions and return scenarios where similar actions take place
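Query-by-example tasks like these come down to comparing a description of the query with the stored descriptions. The sketch below uses a coarse gray-level histogram and histogram intersection as the similarity measure; this is one common technique chosen for illustration, not something MPEG-7 prescribes, and the sample data and names are made up.

```python
def histogram(pixels, bins=4):
    """Normalized histogram of 8-bit sample values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // 256] += 1
    return [c / len(pixels) for c in counts]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical database of stored images, reduced to a few samples each.
database = {
    "dark": [5, 15, 25, 35],
    "mixed": [12, 30, 190, 250],
}
query = [10, 20, 200, 210]

scores = {name: intersection(histogram(query), histogram(pixels))
          for name, pixels in database.items()}
best = max(scores, key=scores.get)
print(scores, best)
```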
MPEG-7 Elements • Descriptors (D): define syntax and semantics of features of audio-visual content • Application independent • Low level: shape, motion, color, camera motion, harmonicity, timbre for audio ... • Semantic level: events, concepts ...
MPEG-7 Elements (cont'd) • Description Schemes (DS): specify the structure and semantics of the relationships among the constituent Ds or DSs, e.g., • Video DSs specify syntax and semantics for segment decomposition, attributes and their relationships • DSs related to creation, production, and access of content (e.g., property rights, parental rating, etc.)
MPEG-7 Elements (cont'd) • Description Definition Language (DDL): allows flexible definition of Ds and DSs based on XML Schema • Ds and DSs are application independent • DDL to define specialized tools
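Because descriptions are XML-based, a description of a video segment can be built and read back with any XML library. The sketch below uses Python's standard `xml.etree.ElementTree`; the element and attribute names are illustrative and have not been checked against the actual MPEG-7 schema.

```python
import xml.etree.ElementTree as ET

# Build a small description: one segment with a time span and a color feature.
# Element and attribute names are illustrative, not validated MPEG-7.
segment = ET.Element("VideoSegment", id="shot1")
ET.SubElement(segment, "MediaTime", start="0.0", duration="4.2")
color = ET.SubElement(segment, "DominantColor")
color.text = "200 30 30"

# Serialize for storage or transport, then parse it back on the consumer side.
xml_text = ET.tostring(segment, encoding="unicode")
parsed = ET.fromstring(xml_text)

rgb = [int(v) for v in parsed.find("DominantColor").text.split()]
print(xml_text)
print(parsed.get("id"), parsed.find("MediaTime").get("duration"), rgb)
```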
MPEG-7 Descriptions • MPEG-7 allows descriptions at different levels of abstraction • low level features are extracted automatically • semantic features require human interaction or textual annotation • MPEG-7 does not specify how features are extracted or used (e.g., filtering, retrieval) • their representation must conform to the MPEG-7 standard
MPEG-7 Parts • Systems: specifies functionality at system level • preparation of descriptions for efficient transport and storage • synchronization of content and descriptors • development of decoders • Description Definition Language (DDL): language for specifying new Ds and DSs • extension of XML Schema
MPEG-7 Visual • Specifies a set of standardized visual Ds and DSs • Color descriptors: color space, quantization • Texture descriptors: homogeneous texture, texture browsing, edge histogram ... • Shape descriptors: for regions or contours • Motion descriptors: camera motion, trajectories, motion activity ... • Face recognition
MPEG-7 Audio • Specifies standardized audio descriptors and description schemes for pure music, pure speech, sound effects, soundtracks • silence descriptor • spoken content descriptors • sound effects descriptors • melody contour descriptors
Multimedia Description Schemes • Specify a framework that allows generic description of all kinds of multimedia data • basic elements: data types, structures, Ds • content management: content from several viewpoints (creation, usage etc.) • organization of content by collections, classification • navigation and access • user interaction
Multimedia Description Schemes
MPEG-7 Reference Software • Reference implementation of the relevant parts of the MPEG-7 standard • The focus is on creating bit-streams of descriptors and description schemes (DDL parser, DDL validation, multimedia description schemes) • Some software for extracting descriptors is also included (visual, audio descriptors)
References • “MPEG-4 Multimedia for our Time”, R. Koenen, IEEE Spectrum, Feb. 1999, pp. 26-33 • “Applying and Implementing the MPEG-4 Multimedia Standard”, J. Kneip et al., IEEE Micro, Nov-Dec 1999, pp. 64-74 • “Overview of the MPEG-7 Standard”, S.-Fu Chang, T. Sikora and A. Puri, IEEE Transactions on Circuits and Systems for Video Technology, special issue on MPEG-7, June 2001 • “Everything You Wanted to Know about MPEG-7”, F. Nack and A.T. Lindsay, Part I, II, IEEE Multimedia, Aug-Dec 1999