Learn the essence of video formats & standards, from digitization to MPEG-2 delivery. Explore encoding principles, compression techniques, and MPEG-4 applications for multimedia. Discover MPEG-2 layered coding and profiles for professional content management.
Professional Content Management Systems • 3rd Lecture: Essence Formats and Standards • Dr. Andreas Mauthe • SCC – Lancaster University
Essence Encoding Principles • Digitisation • Transformation from the continuous to the digital domain • Information loss • Quality depending on number of bits used for representation (i.e. sampling rate and quantisation interval) • Compression • Reduction of bit rate • Exploiting redundancies & properties of human senses • Lossless vs. lossy compression • Basic compression techniques • Entropy coding, source coding, hybrid coding
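The digitisation step above can be illustrated with a minimal sketch. It assumes a signal with values in [-1, 1] and a uniform quantiser; the function names and the 440 Hz test tone are hypothetical and chosen only for illustration. It shows that more bits per sample give a smaller quantisation interval and therefore less information loss.

```python
import math


def quantise(value, bits):
    """Map a value in [-1.0, 1.0] to one of 2**bits integer codes."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)              # quantisation interval
    code = round((value + 1.0) / step)
    return min(max(code, 0), levels - 1), step


def digitise(signal, duration_s, sampling_rate_hz, bits):
    """Sample signal(t) at a fixed rate and quantise every sample."""
    count = int(duration_s * sampling_rate_hz)
    return [quantise(signal(i / sampling_rate_hz), bits)[0] for i in range(count)]


def max_quantisation_error(signal, duration_s, sampling_rate_hz, bits):
    """Largest difference between original and reconstructed sample values."""
    step = 2.0 / (2 ** bits - 1)
    codes = digitise(signal, duration_s, sampling_rate_hz, bits)
    worst = 0.0
    for i, code in enumerate(codes):
        original = signal(i / sampling_rate_hz)
        reconstructed = code * step - 1.0
        worst = max(worst, abs(original - reconstructed))
    return worst


def tone(t):
    """Hypothetical test signal: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440 * t)
```

Calling `max_quantisation_error(tone, 1.0, 8000, 4)` and then the same with 8 bits shows the error shrinking as the number of bits grows; the error never exceeds half a quantisation interval.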
Video Encoding • Principles • Basic Image Elements • Pixels, aspect ratio (4:3), colour representation (RGB vs. YUV) • Motion • Presentation of images faster than 15 frames/sec • PAL 25 frames/sec, NTSC 29.97 frames/sec • Digitisation • Basic steps • Sampling: into an array of MxN points • Quantisation: commonly into 256 values • Coding: composite vs. component • Component coding standards • Different component representation, sampling frequencies & samples per line • The MPEG-1 Standard • Video & audio standard • Does not standardise the encoder, only the syntax and semantics of the MPEG-1 bit stream • JPEG-like picture coding and compression • Different picture types • I, P, B & D frames • Group of Pictures (GoP)
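The RGB vs. YUV bullet can be made concrete with a small conversion sketch. It assumes the commonly quoted analogue coefficients (luma weights 0.299, 0.587, 0.114; chroma scale factors 0.492 and 0.877) and component values in [0, 1]; the slides themselves do not state which variant is meant.

```python
def rgb_to_yuv(r, g, b):
    """Convert RGB (each in 0..1) to luminance Y and colour differences U, V."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # scaled blue difference
    v = 0.877 * (r - y)                     # scaled red difference
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Invert rgb_to_yuv."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b
```

For any grey value (r = g = b) both colour-difference components come out as zero, which is why a component signal can carry the chrominance at lower resolution than the luminance.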
MPEG-2 • Target Application Area • Video & media production • Television, HDTV • Interactive Television • Interactive multimedia services • Participating Standardisation Bodies • ISO/ IEC • ITU-TS (telecommunication), ITU-RS (radio/ broadcast) • EBU • SMPTE • Encoding Principles • Similar to MPEG-1 • Specifies the video bit stream syntax • Similar image coding • 8x8 blocks • DCT transformation • Variable run-length coding • I,P,B & D frames • Compression of interlaced video for standard TV • Field pictures • Encoding of fields as independent entities • Frame pictures • Each interlaced field pair is interleaved into a frame • Divided into macro blocks and encoded
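The image coding steps named above (8x8 blocks, DCT, run-length coding) can be sketched as follows. This is a simplified illustration, not the MPEG-2 algorithm itself: it assumes a single uniform quantiser step and a row-major scan, where the standard uses quantisation matrices, a zigzag scan and variable-length codes. All function names are hypothetical.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Forward 2-D DCT-II of an 8x8 block (orthonormal scaling)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs


def quantise(coeffs, step):
    """Uniform quantisation (assumption: one step size for all coefficients)."""
    return [[round(value / step) for value in row] for row in coeffs]


def run_length_encode(values):
    """Encode as (zero_run, value) pairs; trailing zeros are implied."""
    pairs, run = [], 0
    for value in values:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs
```

A uniform block carries all its energy in the DC coefficient, so after quantisation the whole block collapses to a single (run, value) pair.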
MPEG-2 Layered Coding • Objectives • Different layers for better scalability • Build on top of each other starting with the base layer • Layered Coding Modes • Spatial scalability • Different resolutions that build up to the full resolution • For transmitting video at a range of different quality levels, e.g. standard TV & HDTV • Temporal scalability • Contains sequences at a lower frame rate • Different layers build up to the full frame rate • Distribution of I, P & B frames between layers is important • Data partitioning • Data encoded and layered according to priorities • High priority streams contain DC coefficients and motion vector headers • For progressive image build up • Signal-to-noise ratio scalability (SNR) • Basic quality version • Enhancement layers carry information to build up to full quality
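Temporal scalability can be sketched with one possible distribution of frame types between layers. The assumption here (not stated on the slide) is that B frames are not used as references, so a base layer holding only I and P frames stays decodable on its own; the function names are hypothetical.

```python
def split_temporal_layers(frame_types):
    """Put I and P frames in the base layer, B frames in the enhancement layer."""
    base, enhancement = [], []
    for index, frame_type in enumerate(frame_types):
        if frame_type in ("I", "P"):
            base.append((index, frame_type))
        else:
            enhancement.append((index, frame_type))
    return base, enhancement


def merge_layers(base, enhancement):
    """Rebuild the full frame-rate sequence from both layers."""
    return [frame_type for _, frame_type in sorted(base + enhancement)]
```

For the sequence `IBBPBBPBB` the base layer holds three of nine frames, i.e. one third of the full frame rate, and merging both layers restores the original order.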
MPEG-2 Profiles & Levels • Objectives • Profiles • To support different applications • Professional vs. consumer formats • Different chrominance sampling modes • 4:4:4, 4:2:2 & 4:2:0 • Successive profiles include preceding profiles • Levels • Specifying a sub-set of spatial and temporal resolutions per profile • Supporting a large set of image formats
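The effect of the chrominance sampling modes on data volume can be worked out directly. The sketch below counts samples per frame for each mode; the 720x576 frame size used in the usage note is an assumption for illustration and does not come from the slide.

```python
from fractions import Fraction

# Chrominance samples (both colour-difference components together) per luminance sample.
CHROMA_SAMPLES_PER_LUMA = {
    "4:4:4": Fraction(2),     # full-resolution chrominance
    "4:2:2": Fraction(1),     # halved horizontally
    "4:2:0": Fraction(1, 2),  # halved horizontally and vertically
}


def samples_per_frame(width, height, mode):
    """Total number of luminance plus chrominance samples in one frame."""
    luma = width * height
    return int(luma * (1 + CHROMA_SAMPLES_PER_LUMA[mode]))
```

For a 720x576 frame this gives 1,244,160 samples in 4:4:4, 829,440 in 4:2:2 and 622,080 in 4:2:0, so 4:2:0 halves the raw data volume compared with 4:4:4 before any compression is applied.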
MPEG-2 Delivery • MPEG-2 Stream Types • MPEG-2 Program Streams • Elementary stream • Specified for multimedia applications • MPEG-1 compatible • MPEG-2 Transport Stream • For the transmission of multiple programmes and program streams • Fixed sized packets identified by packet ID • Multiple Program Streams can be mapped into one Transport stream • E.g. different video and audio streams • Applications • Video telephony to digital TV • Networks • Lossy and high bandwidth, e.g. Fibre, satellite, cable, etc.
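The "fixed sized packets identified by packet ID" can be illustrated with a short header parser. It relies on facts from the MPEG-2 Systems specification that the slide does not spell out: transport packets are 188 bytes long, start with the sync byte 0x47, and carry a 13-bit packet ID (PID) in the second and third byte. The helper `make_packet` is hypothetical and only builds test data.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def parse_ts_header(packet):
    """Extract the main header fields of one transport stream packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("transport packets are 188 bytes long")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")
    return {
        "payload_unit_start": bool(packet[1] & 0x40),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "continuity_counter": packet[3] & 0x0F,
    }


def demultiplex(stream):
    """Group the packets of a byte stream by their PID."""
    by_pid = {}
    for offset in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        by_pid.setdefault(parse_ts_header(packet)["pid"], []).append(packet)
    return by_pid


def make_packet(pid, counter):
    """Hypothetical helper: a packet with an empty payload for testing."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF,
                    0x10 | (counter & 0x0F)])  # 0x10: payload only, no adaptation field
    return header + bytes(TS_PACKET_SIZE - 4)
```

Because every packet has the same size, a receiver can step through the multiplex at a fixed stride and pick out the video or audio stream it needs by PID alone.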
MPEG-4 Motivation: To support the predicted convergence of media and technology areas, viz. communications, computing, television and entertainment • Target Application Area • Mobile telephones & devices • Interactive multimedia applications • Media production • Broadcasting • Functionality Areas: • Content based interactivity • Multimedia access tools, content based manipulation, hybrid, natural and synthetic data coding, etc. • Optimised compression • Through improved coding efficiency • Universal access • Supporting different networks ranging from high-speed to wireless • F. Pereira, T. Ebrahimi (Editors): “The MPEG-4 Book”, IMSC Press Multimedia Series, Prentice Hall, 2002.
MPEG-4 (Object Oriented) Coding • Basics • Specifies bit stream as MPEG-1 & MPEG-2 • Motion-compensated hybrid DCT • Motion Decoding and Compensation Tool block based video coding • Shape Decoding Tool object based video coding • Basic Components • Audio Visual Objects (AVO) • Arbitrary shape • Spatial & temporal extent • Described by object descriptors (OD) • AVO carried in separate elementary streams • Scene Composition • Scene Descriptor Information • Composition information using pointers to AVO • Specifies relationships between AVO • Spatial position • Temporal position • Also specifies • dynamic behaviour • Allowed interactivity pattern • Scene description language • Binary Format for Scenes (BIFS)
MPEG-4 Profiles, Levels & Object Types • Profiles & Levels • Profiles • Specify the tool set to be supported • Levels • Set complexity bounds • Required memory, number of objects, bit rate • Conformance • Bitstream • Contains syntactic elements of profile • Stays within boundaries given by level • Decoder • Ability to interpret values of all allowed syntactic elements • Provides all required resources • Complexity bounds on all objects within a scene • Object Types • Specify syntax and semantics of an object • Tools required to code an object • Specifies restrictions on the combination of object types
Digital Video (DV) • Background • Consumer DV • International Electrotechnical Commission: “Helical-scan digital video cassette recording system using 6.35 mm magnetic tape for consumer use (525-60, 625-50, 1125-60 and 1250-50 systems)”, IEC 61834 • Professional DV • Society of Motion Picture and Television Engineers: “6.35 mm type D-7 component format – video compression at 25 Mb/s – 525/60 and 625/50”, SMPTE Standard for Television Digital Recording SMPTE 306M, 1998 • Encoding Basics • Coding • Consumer: 4:2:0 • Professional: 4:1:1, 4:2:2 • 13.5 MHz sampling rate, 8-bit coding • Intra-frame compression only • DCT • Adaptive intra-frame spatial compression • Increased motion results in increased compression • Variable bit-rate to constant bit-rate • Compression ratio 5:1 • Error correction • Rank (inner) • File (outer)
Professional DV • Supported TV Formats • NTSC • 525 lines (480 active lines) • 29.97 frames/sec • PAL • 625 lines (576 active lines) • 25 frames/sec • 6.35 mm tapes • Helical track • DV 25 • Coding • 4:1:1 • 1 video & 2 independent audio channels • Bit rate: 25 Mb/s • DV 50 • Coding • 4:2:2 • 1 video & 4 audio channels • Bit rate: 50 Mb/s
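The 5:1 compression ratio quoted for DV can be checked against the figures on this slide. The sketch assumes 720 active luminance samples per line (the usual value for 13.5 MHz sampling, not stated on the slide) and 8 bits per sample; 4:1:1 contributes 1.5 samples per pixel and 4:2:2 contributes 2.

```python
def uncompressed_bit_rate(width, active_lines, frames_per_sec,
                          samples_per_pixel, bits_per_sample=8):
    """Bit rate of the active picture before compression, in bit/s."""
    return width * active_lines * samples_per_pixel * bits_per_sample * frames_per_sec


def compression_ratio(source_bit_rate, coded_bit_rate):
    """How many times smaller the coded stream is than the source."""
    return source_bit_rate / coded_bit_rate


WIDTH = 720  # assumption: active samples per line

# PAL, 576 active lines, 25 frames/sec
dv25_source = uncompressed_bit_rate(WIDTH, 576, 25, samples_per_pixel=1.5)  # 4:1:1
dv50_source = uncompressed_bit_rate(WIDTH, 576, 25, samples_per_pixel=2.0)  # 4:2:2

dv25_ratio = compression_ratio(dv25_source, 25_000_000)
dv50_ratio = compression_ratio(dv50_source, 50_000_000)
```

Under these assumptions DV 25 works out at roughly 124.4 Mb/s before compression, which matches the 5:1 figure, while DV 50 doubles the coded bit rate for only a third more source data and therefore compresses more gently, at roughly 3.3:1.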
CMS and Video Formats • Non-Standard Formats (tape based) • Analog component formats • Betacam, Betacam SP, M-2 • Digital formats • Composite • D-2, D-3 • Component • DigiBeta (proprietary compression) • D-5 (no compression) • Standard Digital Formats • DV based • DVCPro 25, DVCPro 50 • D-9 (50 Mb/s) • DVCAM (4:2:0, 25 Mb/s) • MPEG based • Beta SX (4:2:2, IB frames, 21 Mb/s, tape based) • D-10 (4:2:2, MPEG-2 4:2:2P@ML, I frame only, 50 Mb/s) • Requirements • Handle multiple formats • Multiple data rates • Browse <= 1.5 Mb/s • Broadcast 2Mb/s – 8 Mb/s • Production 18 Mb/s – 50 Mb/s
Typical Video Essence Related Issues • Heterogeneous Formats • Generation loss • Lossy encoding leads to quality deterioration at every: • Decoding • Editing • Re-encoding • Playout and subsequent archival • Format change results in quality loss • Long term archiving on obsolete formats • Different formats in the production and transmission chain • Not yet a standardised file format
Infrastructure Related Essence Issues • Heterogeneous Communication Infrastructure • Broadcast communication networks • Analog point-to-point • Digital point-to-point via Serial Digital Interface (SDI, ITU-R BT.601-5, 270 Mb/s streaming) • Digital point-to-point via Serial Data Transport Interface (SDTI, DVCPRO/MPEG) • Computer networks • Ethernet based LANs • ATM based LANs & WANs • Internet • Machine connections • Fibre Channel, SCSI • Machine Control Networks (e.g. RS 422)
Audio • Encoding Basics • Digitisation • Analog-to-Digital Conversion (ADC) • Sampling of waveform • CD sampling rate 44.1 kHz (i.e. 44100 samples/sec) • Telephony 8 kHz • Quantisation • Pulse Code Modulation (PCM) • 16 bit for CD • Differential Pulse Code Modulation • Example: CD encoding • 2 * 44100 1/s * 16 bit = 1,411,200 bit/s • Uncompressed Digital Audio Formats • WAVE • Reference format • File format • 44.1 kHz, PCM dual stereo audio • Digital Audio Tape (DAT) • 48 kHz, PCM encoded
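The CD example and the difference between PCM and differential PCM can be sketched in a few lines. The DPCM functions are a simplification: they store exact differences between successive samples, whereas a practical coder would also quantise the differences. The function names are hypothetical.

```python
def pcm_bit_rate(channels, sampling_rate_hz, bits_per_sample):
    """Bit rate of uncompressed PCM audio in bit/s."""
    return channels * sampling_rate_hz * bits_per_sample


def dpcm_encode(samples):
    """Store each sample as the difference to its predecessor."""
    previous, differences = 0, []
    for sample in samples:
        differences.append(sample - previous)
        previous = sample
    return differences


def dpcm_decode(differences):
    """Rebuild the samples by accumulating the differences."""
    value, samples = 0, []
    for difference in differences:
        value += difference
        samples.append(value)
    return samples
```

`pcm_bit_rate(2, 44100, 16)` reproduces the 1,411,200 bit/s of the CD example. For a slowly changing waveform such as `[100, 102, 105, 103]` the differences after the first sample are small (`2, 3, -2`), which is what lets DPCM spend fewer bits per value.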
MPEG Based Audio Formats • MPEG-1 Audio • Coding formats • Sampling rate • 32 kHz, 44.1 kHz & 48 kHz • Quantisation: 16 bits per sampling value • Compression • Split into 32 non-interleaved sub-bands • Frequency transformation • Fast Fourier Transformation (FFT) • Quantisation • Psycho-acoustic model to determine noise level per sub-band • Higher noise level equals bigger quantisation steps • Entropy encoding • Channels • Two independent, two channel stereo, joint stereo • Layers • Downward compatible • Maximum bit rates • Layer I 448 Kb/s, Layer II 384 Kb/s, Layer III 320 Kb/s • MPEG-2 Audio • Based on MPEG-1 • Additionally supports the halved sampling rates (16 kHz, 22.05 kHz & 24 kHz) • Multiple channels (e.g. 5 channels surround, 7 different language channels)
MPEG-4 Audio • Target Application Area • Speech coding • General audio coding • Synthetic audio • Audio composition • Basics • Object oriented coding • Natural audio objects based on MPEG-2 • Improved coding efficiency and error resilience • Very low bit rates and very low delays • Bit rate scalability • Streams composed into audio scene • Through improved coding efficiency • Audio object types • Profiles • Conformance criteria for streams and decoder • Levels • Complexity units /processor and RAM complexity • F. Pereira, T. Ebrahimi (Editors): “The MPEG-4 Book”, IMSC Press Multimedia Series, Prentice Hall, 2002.
CMS and Audio Formats • Main Formats • Uncompressed formats • Based on • 44.1 kHz & 48 kHz PCM encoded audio • MPEG formats • MPEG-1 • MPEG-2 • Other (proprietary) formats • RealAudio • LiquidAudio • Requirements • Handle multiple formats • Multiple data rates • From a few Kb/s to around 1.5 Mb/s • New formats of up to 96 kHz • Multiple tools
Image Formats - JPEG • Basics • ISO/IEC JTC1/SC2/WG10 Joint Photographic Experts Group: “Information Technology – Digital Compression and Coding of Continuous-tone Still Images”, International Standard ISO/IEC IS 10918, 1993 • Colour & monochrome images • Exchange format including • Image data • Coding tables & parameters • Modes • Lossy Sequential DCT Base Mode • Expanded Lossy DCT Base Mode • Lossless Mode • Hierarchical Mode • Images in different resolutions
Further Image Formats • GIF (Graphics Interchange Format) • Basics • Platform independent exchange • Lossless compression scheme • Multiple interleaved images • GIF Sectors • Header (GIF ID, algorithm ID) • Application (creation information) • Trailer • Control (controls presentation of a subsequent image block) • Image (image header, optional colour table and pixel data) • Comment • Plain text (to appear in an image) • TIFF (Tagged Image File Format) • Basics • Baseline part • To be supported by every decoding and presentation application • Extensions part • Binary, monochrome, colour (different colour palettes), RGB, etc. • Includes multiple coding schemes (e.g. JPEG) • Fields • Header Directory (byte order, version number), Structure (coding techniques), Fields (defines the image coding blocks), Data Fields (graphical objects not specified in advance)
Structured Documents • SGML (Standard Generalised Markup Language) • Basics • Framework that defines the syntax of tags • Document Type Definition (DTD) required to define semantics • Tags to mark text elements • <start-tag> document element </end-tag> • Facilitates automatic processing • Processing instructions can be specified • SGML tag categories • Descriptive Markup (determines structure of document) • Entity References (placeholder) • Markup Declarations (determine Entity Reference types) • Processing Instructions (including audio & video types) • Web Page & HTML • Hypertext Markup Language (HTML) • Structure • DTD • Document header • Document body • Style sheets • Specification of appearance of reoccurring elements • Links to other Web documents
Essence Processing • Essence Processing Tools • Content segmentation • Temporal • Shots, cuts • Reconstructing an Edit Decision List • Spatial • Regions or objects • Metadata generation • Relying on analysable properties • E.g. motion detection • Automatic content description • Speech recognition • Transcripts • Keywords • Indexing • Face recognition • Programme classification • Content based retrieval • Image similarity • Fast browsing • Keyframes, skims, storyboards
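Temporal segmentation into shots can be sketched with a very simple cut detector. It assumes frames are given as equal-length lists of grey values and that a cut shows up as a large mean difference between consecutive frames; the threshold is a tuning parameter chosen by hand, and real tools use more robust measures. The resulting shot list is a crude stand-in for a reconstructed Edit Decision List.

```python
def mean_abs_difference(frame_a, frame_b):
    """Average absolute grey-value difference between two frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def detect_cuts(frames, threshold):
    """Indices i where a cut is assumed between frame i-1 and frame i."""
    return [i for i in range(1, len(frames))
            if mean_abs_difference(frames[i - 1], frames[i]) > threshold]


def shots_from_cuts(frame_count, cuts):
    """Turn cut positions into (first_frame, last_frame) shot ranges."""
    boundaries = [0] + cuts + [frame_count]
    return [(boundaries[i], boundaries[i + 1] - 1)
            for i in range(len(boundaries) - 1)]
```

The first frame of each detected shot is also a natural candidate for a keyframe when building a storyboard for fast browsing.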
Essence Processing: Basic Principles • Feature Extraction • Low level features • Colour histograms, dominant motion vectors, spectrum • Feature interpretation • Matching of low level features with logical concepts • Similarity retrieval • Based on low-level concepts • Example: Speech and sound analysis • Acoustic & phonetic analysis • Syntactical analysis • Semantic analysis
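Low-level feature extraction and similarity retrieval can be illustrated with colour histograms. The sketch assumes images are given as lists of (r, g, b) tuples with values 0..255 and uses histogram intersection as the similarity measure; this is one common choice among many, and the function names are hypothetical.

```python
def colour_histogram(pixels, bins_per_channel=4):
    """Normalised histogram over coarsely quantised (r, g, b) values."""
    width = 256 // bins_per_channel
    histogram = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        index = ((r // width) * bins_per_channel + g // width) * bins_per_channel + b // width
        histogram[index] += 1.0
    return [count / len(pixels) for count in histogram]


def histogram_intersection(h1, h2):
    """Similarity in 0..1; 1 means the two colour distributions are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_by_similarity(query_pixels, collection):
    """Order the named images in `collection` from most to least similar."""
    query = colour_histogram(query_pixels)
    scored = [(histogram_intersection(query, colour_histogram(pixels)), name)
              for name, pixels in collection.items()]
    return [name for _, name in sorted(scored, reverse=True)]
```

The ranking works purely on low-level features: it can find images with a similar colour distribution, but mapping a result to a logical concept such as "sunset" is the separate feature interpretation step.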