Explore the advanced technology behind MPEG-4 video coding standards. Learn about motion and texture coding, scalable and sprite coding, error resilience, applications like digital terrestrial TV, interactive video games, and more. Discover existing coding standards like MPEG-1, MPEG-2, H.261, H.263, JPEG, and JBIG, as well as functionalities such as content-based manipulation, object-based coding, and flexible access editing. Benefit from low-complexity, robust error resilience, and object-based hyperlink manipulation. Dive into video object definitions, coding structures, decoding, texture tools, and motion compensation techniques.
MPEG-4 Video Xuemin Chen and Bob Eifrig Advanced Technology Department Instructor: L. J. Wang, Dept. of Information Technology, National Pingtung Institute of Commerce X. Chen, B. Eifrig
Outline • Introduction to MPEG-4 video • Basic definition and concepts • Motion and texture coding for video • Scalable coding • Shape coding • Sprite coding • Still texture coding • Error-resilience
Video Applications • MPEG-2 • BSS (broadcast satellite service) • CATV (cable TV) • DTTB (digital terrestrial TV broadcast) • EC (electronic cinema) • HTT (home TV theatre) • IPC (video phone...) • MMM (multimedia mailing) • NDB (network database service, e.g. VOD) • RVS (remote video surveillance) • SSM (serial storage media, e.g. digital VTR) • MPEG-4 • IMM (internet multimedia) • IVG (interactive video game) • IPC (video phone...) • ISM • MMM • NDB • RVS • WMM (wireless multimedia)
MPEG-4 video applications
Requirements on Functionalities • Content-based manipulation and bitstream editing • Object-based coding (e.g. chroma-keying) • Viewing video content at different resolutions, refresh rates, and quality levels • Transmission of video over error-prone environments (e.g. wireless communication networks) • Coding "sprites" for game applications, and more...
Existing Coding Standards • MPEG-1 (progressive, frame-based, bit rate ~1.5 Mbps...) • MPEG-2 (MPEG-1 + frame-based interlaced coding, frame-based scalabilities, error resilience, fixed frame rate, ITU-R 601 resolution, bit rate ~4 Mbps...) • H.261 (CIF and QCIF resolution, frame-based, no B-frames, no quant matrix, bit rate ~p×64 kbps...) • H.263 (H.261 + PB-frames, OBMC, bit rate > 10 kbps...)
Existing Coding Standards • JPEG (frame-based, limited scalabilities...) • JBIG (frame-based, still binary images only...) MPEG-4 Functionalities • Syntax and techniques to support content-based manipulation and bitstream editing without the need for transcoding.
MPEG-4 Functionalities • Flexible levels of access: editing and manipulation are performed on arbitrarily shaped video objects, with fine granularity in content, spatial resolution, temporal resolution, quality, and complexity. • Video object-based scalability • More powerful error-resilience tools • Scalable still texture coding tools
Benefits • Low complexity, reduced memory and bandwidth requirements, low delay, and no quality loss • Object-based hyperlinking, manipulation, and editing • Addresses video game applications • More robust in error-prone environments
Definitions • Video Object (VO) and Video Object Plane (VOP) • Video object definition and format: (1) YUV 4:2:0 (8-bit); (2) segmentation masks; (3) alpha plane • [Figure: video objects VO1, VO2, VO3 and their VOPs]
Example
Video Coding Structure • Video Syntax Hierarchy • Visual Object Sequence (VOS) • Video Object (VO) • Video Object Layer (VOL) or Texture Object Layer (TOL) • Group of Video Object Planes (GOV) • Video Object Plane (VOP) • Special case: single rectangular VO • VOS "=" VO <==> Video Sequence • VOL or TOL <==> Sequence Scalable Extension • GOV <==> GOP • VOP <==> Frame
Video Syntax Hierarchy
Basic Visual Decoding + • Scalable coding • Sprite coding • Error resilience coding
Video Decoding
Texture Coding Tools • Hybrid DCT coding tools similar to MPEG-2 • 4:2:0 chroma format (4:2:2 & 4:4:4 are not part of MPEG-4) • I, P & B pictures (VOPs) • DCT • Quantization • 16x16 prediction • field prediction & frame/field DCT for interlace • MPEG-4 visual texture coding is about 10-20% more efficient than MPEG-2 • Comparing bits at 4 Mbps for 601-size interlaced video at constant quantizer; frame structure; N=15, M=3
New Hybrid DCT Texture Coding Tools • Unrestricted Motion Compensation • Overlapped block motion compensation • Median based motion vector predictors in P-VOPs • Intra MBs • AC/DC prediction • non-linear DC quantization • multiple scan paths • MPEG-2 or H.263 quantization • 8x8 mode in P-VOPs • Direct mode in B-VOPs • 3D (run, level, last) VLC • MV round toward half-pel • Variable picture rate • Limited quantizer change
Unrestricted Motion Compensation • The reference block can be partially outside the picture (more generally, partially outside the object) • A convex image is extended with constant values • For the non-convex case, a 2- or 4-point average defines each extended pixel
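A minimal sketch of the rectangular-picture case, assuming that "constant extension" means repeating the nearest edge pixel. The function name `fetch_reference_block` and the list-of-lists picture layout are hypothetical, chosen only for illustration; the averaging rule for non-convex objects is not shown.

```python
def fetch_reference_block(ref, top, left, size=8):
    """Return a size x size block whose top-left corner is (top, left).

    Coordinates that fall outside the picture are clamped to the nearest
    edge pixel, which behaves like extending the picture with constant
    values beyond each border (assumption for this sketch).
    """
    height, width = len(ref), len(ref[0])
    block = []
    for y in range(top, top + size):
        cy = min(max(y, 0), height - 1)      # clamp row index
        row = []
        for x in range(left, left + size):
            cx = min(max(x, 0), width - 1)   # clamp column index
            row.append(ref[cy][cx])
        block.append(row)
    return block


# A 4x4 toy picture; the requested 3x3 block hangs over the top and right edges.
ref = [[10 * r + c for c in range(4)] for r in range(4)]
block = fetch_reference_block(ref, top=-1, left=2, size=3)
```

With clamping in place, a motion vector may point beyond the picture border without any special handling in the prediction loop.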
Overlapped Block Motion Compensation • Applied on an 8x8 block basis • Weighted average of 5 prediction blocks formed by MVs from • Current block • left, right, top neighbor blocks • Current block's MV is used in place of the bottom neighbor • MV=0 if neighbor is intra • [Figure: current block with top, bottom, left, and right weights]
MV Predictors for P-VOPs PMV[MV] = median(MV1, MV2, MV3)
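The median predictor can be sketched as follows, assuming the median is taken separately on the horizontal and vertical components and that MV1, MV2, and MV3 come from already-decoded neighbouring blocks. The helper names are hypothetical.

```python
def median_mv_predictor(mv1, mv2, mv3):
    """Component-wise median of three candidate motion vectors (x, y)."""
    px = sorted((mv1[0], mv2[0], mv3[0]))[1]
    py = sorted((mv1[1], mv2[1], mv3[1]))[1]
    return (px, py)


def mv_difference(mv, predictor):
    """Difference that would be entropy coded instead of the raw vector."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])


pmv = median_mv_predictor((4, -2), (1, 3), (6, 0))
mvd = mv_difference((5, 1), pmv)
```

Because each component is handled on its own, the predictor need not equal any single one of the three candidates.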
Intra DC prediction (Graham’s Rule)
Intra AC Prediction • Controlled by a flag at the macroblock layer • Uses Graham’s Rule to determine the prediction direction (horizontal vs. vertical)
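The slides do not spell the rule out, so the following is a sketch of the gradient comparison as it is usually described for MPEG-4 intra prediction: with A the block to the left, B the block above-left, and C the block above, the direction with the smaller DC gradient is chosen. The function name and the "vertical"/"horizontal" labels are hypothetical.

```python
def dc_prediction(dc_left, dc_above_left, dc_above):
    """Choose the DC predictor by comparing two gradients.

    A = left block, B = above-left block, C = above block.
    If |A - B| < |B - C|, predict from C (vertical direction);
    otherwise predict from A (horizontal direction).
    """
    if abs(dc_left - dc_above_left) < abs(dc_above_left - dc_above):
        return "vertical", dc_above
    return "horizontal", dc_left


direction_1, predictor_1 = dc_prediction(100, 102, 140)
direction_2, predictor_2 = dc_prediction(100, 60, 62)
```

The same direction would then select whether the first row or the first column of AC coefficients is predicted, when the macroblock-level flag enables AC prediction.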
H.263 (~ MPEG-1) Quantization
Multiple Scanning Paths • Alternate horizontal • Alternate vertical • Zigzag
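Of the three paths, the zigzag scan is easy to generate programmatically; a sketch is given below. The two alternate scans are defined by fixed tables and are not reproduced here. The function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in zigzag order for an n x n block."""
    order = []
    for s in range(2 * n - 1):               # s = row + col of an anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diagonal.reverse()               # even diagonals run bottom-left to top-right
        order.extend(diagonal)
    return order


def scan_block(block, order):
    """Read a 2-D coefficient block into a 1-D list along the given path."""
    return [block[r][c] for r, c in order]


order = zigzag_order()
```

Any scan path, including a table-driven one, can be passed to `scan_block` as a list of positions.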
Direct Mode in B-VOPs (no intra MBs in B-VOPs!) • Only mode that allows 8x8 in B-VOPs • Progressive
Interlaced Direct Mode
3D Variable Length Coding
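A sketch of how scanned coefficients could be turned into the (run, level, last) events that a 3D VLC table would then map to codewords. The VLC table itself is not reproduced, and the function names are hypothetical.

```python
def to_events(scanned):
    """Convert scanned coefficients into (run, level, last) events.

    run   = number of zeros preceding the coefficient
    level = the non-zero coefficient value
    last  = 1 for the final non-zero coefficient of the block, else 0
    """
    nonzero = [i for i, c in enumerate(scanned) if c != 0]
    events = []
    previous = -1
    for i in nonzero:
        run = i - previous - 1
        last = 1 if i == nonzero[-1] else 0
        events.append((run, scanned[i], last))
        previous = i
    return events


def from_events(events, length=64):
    """Rebuild the scanned coefficient list from the events."""
    out = [0] * length
    position = -1
    for run, level, _last in events:
        position += run + 1
        out[position] = level
    return out


scanned = [12, 0, 0, -3, 1] + [0] * 59
events = to_events(scanned)
```

Folding the end-of-block signal into each event as `last` removes the need for a separate end-of-block codeword.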
Dquant VLCs & MV division • Dquant VLCs (P-VOPs, B-VOPs) • MV division tables (from chroma & fld->frm prediction)
Generalized Scalability • Block diagram • Spatial scalability
Temporal Scalability • Type 1 with I- & P-VOPs • Type 1 with B-VOPs
Enhancement Types
Shape Coding • Binary alpha shape • Gray-level alpha shape • Temporal Scalability + Shape coding • Interlaced tools + shape coding • Sprite + shape coding • Spatial scalability + shape coding
Binary Shape Sequences • Binary alpha block (BAB) • 7 types of BABs • [Figure: shape sequence over time, I-VOP followed by P-/B-VOPs]
Binary Shape Decoding
Context-based Arithmetic Coding • Arithmetic coding bypasses the idea of replacing an input symbol with a specific code. It replaces a stream of input symbols with a single floating-point output number. • A context number is computed based on a template. • The context number is used to access the probability table. • Using the accessed probability value, the next bits of binary_arithmetic_code are decoded to give the pixel value.
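The context computation can be sketched as below. The 10-pixel causal template used here is an assumption for illustration (the normative templates are defined in the standard), as is treating pixels outside the mask as 0. The resulting number would index a probability table, which is not reproduced.

```python
# Offsets (dx, dy) from the current pixel, listed from c0 to c9.
# Illustrative causal template: two pixels to the left on the current row,
# five on the row above, three on the row above that.
TEMPLATE = [(-1, 0), (-2, 0),
            (2, -1), (1, -1), (0, -1), (-1, -1), (-2, -1),
            (1, -2), (0, -2), (-1, -2)]


def context_number(mask, x, y):
    """Pack the already-decoded neighbours of (x, y) into a 10-bit context."""
    ctx = 0
    for k, (dx, dy) in enumerate(TEMPLATE):
        xx, yy = x + dx, y + dy
        inside = 0 <= yy < len(mask) and 0 <= xx < len(mask[0])
        bit = mask[yy][xx] if inside else 0   # assumption: outside pixels count as 0
        ctx |= bit << k
    return ctx


mask = [[0] * 5 for _ in range(5)]
mask[2][1] = 1   # pixel immediately left of (2, 2): contributes bit c0
mask[0][1] = 1   # pixel at (x-1, y-2): contributes bit c9
ctx = context_number(mask, 2, 2)
```

Only pixels that precede the current one in raster order appear in the template, so the decoder can form the same context before decoding the pixel.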
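Some details on arithmetic coding • LPS: less probable symbol; MPS: more probable symbol • The context selects P(0) from the probability table • The current interval A is split into an LPS interval of width A × P_LPS and an MPS interval of width A × (1 − P_LPS) • The arithmetic code (ac) determines which sub-interval is selected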
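The interval subdivision can be illustrated with a toy binary arithmetic coder. This sketch uses exact fractions and a single fixed probability `p0` for symbol 0 rather than a context-dependent table, and it is not the integer coder used in practice; all names are hypothetical.

```python
from fractions import Fraction


def encode(bits, p0):
    """Narrow the interval [low, low + width) once per symbol."""
    low, width = Fraction(0), Fraction(1)
    for b in bits:
        if b == 0:
            width = width * p0                 # keep the lower sub-interval
        else:
            low = low + width * p0             # skip past the sub-interval of symbol 0
            width = width * (1 - p0)
    return low + width / 2                     # any number inside the final interval works


def decode(code, count, p0):
    """Replay the same subdivisions and see where the code falls."""
    low, width = Fraction(0), Fraction(1)
    bits = []
    for _ in range(count):
        split = low + width * p0
        if code < split:
            bits.append(0)
            width = width * p0
        else:
            bits.append(1)
            low = split
            width = width * (1 - p0)
    return bits


bits = [0, 0, 1, 0, 1, 0, 0, 0]
p0 = Fraction(3, 4)                            # symbol 0 is the MPS, symbol 1 the LPS
code = encode(bits, p0)
decoded = decode(code, len(bits), p0)
```

The whole sequence is represented by one number, and the more probable symbol shrinks the interval less, so it costs fewer bits.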
Sprite Coding • A sprite is an image composed of pixels belonging to a video object that are visible throughout an entire video segment. • Basic sprite coding • Low-latency sprite coding • Scalable sprite coding
Example of Sprite Coding • Coding the background of a video conference room (much larger than a single frame of video) • coded as the I-VOP • sprite pieces use INTRA quant • update pieces use INTER quant • shape information is sent in the VOL as part of I-VOP • Shape and texture coding • Warping
Sprite Decoding Process
Warping and Sample Reconstruction • For any pixel (i,j) inside the VOP, the "warping positions" (F(i,j), G(i,j)) and (Fc(i,j), Gc(i,j)) are computed on the basis of no_of_sprite_warping_point. • Reconstructed sample values are computed from the sample values at the locations (F(i,j), G(i,j)) and (Fc(i,j), Gc(i,j)).
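A rough sketch of the two steps, assuming an affine warp and bilinear interpolation in floating point. The actual transform depends on no_of_sprite_warping_point and the standard specifies fixed-point arithmetic, so the parameter layout `affine`, the convention that i is horizontal and j is vertical, and both function names are illustrative assumptions.

```python
import math


def warp_position(i, j, affine):
    """Map VOP pixel (i, j) to a sprite position (F, G) with an affine model."""
    a, b, c, d, e, f = affine
    return (a * i + b * j + c, d * i + e * j + f)


def bilinear_sample(image, x, y):
    """Interpolate image (indexed [row][col]) at fractional position (x, y)."""
    height, width = len(image), len(image[0])
    x = min(max(x, 0.0), width - 1.0)          # keep the position inside the sprite
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


sprite = [[0, 10], [20, 30]]
shift_half_pixel = (1, 0, 0.5, 0, 1, 0.5)      # identity plus a half-pixel translation
position = warp_position(0, 0, shift_half_pixel)
sample = bilinear_sample(sprite, *position)
```

Here the warped position lands midway between four sprite samples, so the reconstructed value is their average.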
Still Texture Coding • Applications • internet images • medical images • Requirements • Scalable (e.g. 4k×4k down to 64×64) • Support lossless, almost lossless, and lossy coding • Support arbitrary shape • and more...
Still Texture Coding • Wavelet-based texture coding • DWT and subband decomposition • quantization of the wavelet coefficients • coding the LL band (PCM coding) • zero-tree scanning of higher order bands • entropy coding
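The subband decomposition step can be sketched with one level of a 2-D Haar transform. Haar is used here only because it is the simplest wavelet; the slides do not name the filter, and the filter used by MPEG-4 still texture coding is a different one. The function names and subband labels are assumptions, and even image dimensions are assumed.

```python
def haar_1d(values):
    """Split a sequence of even length into low-pass and high-pass halves."""
    half = len(values) // 2
    low = [(values[2 * k] + values[2 * k + 1]) / 2 for k in range(half)]
    high = [(values[2 * k] - values[2 * k + 1]) / 2 for k in range(half)]
    return low, high


def filter_columns(matrix):
    """Apply haar_1d down every column; return (low, high) matrices."""
    low_cols, high_cols = [], []
    for column in zip(*matrix):
        low, high = haar_1d(list(column))
        low_cols.append(low)
        high_cols.append(high)
    # Transpose back so results are indexed [row][col].
    return ([list(r) for r in zip(*low_cols)],
            [list(r) for r in zip(*high_cols)])


def haar_2d(image):
    """One decomposition level: returns the LL, LH, HL, HH subbands.

    First letter = horizontal filter, second letter = vertical filter.
    """
    row_low, row_high = [], []
    for row in image:
        low, high = haar_1d(row)
        row_low.append(low)
        row_high.append(high)
    ll, lh = filter_columns(row_low)
    hl, hh = filter_columns(row_high)
    return ll, lh, hl, hh


ll, lh, hl, hh = haar_2d([[1, 3], [5, 7]])
```

Applying `haar_2d` again to the LL band gives the next, coarser level, which is where the resolution scalability comes from.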
Still Texture Coding
Error Resilience • Applications: wireless communication networks, e.g. GSM and CDMA • Physical (layer) channel → logical channels, e.g. signaling channels, data channels, and speech channels. Each logical channel often uses different error protection. Inside the speech channel, data are partitioned into classes, and each class uses different protection. • Video channel in the future • Requirements • Resynchronization • Data partitioning • etc.
Error Resilience Coding • Resynchronization • Video packet approach (similar to GOB in H.263): the length of a video packet is based not on the number of MBs but on the number of bits contained in the packet. • Data recovery • Reversible VLC • Error concealment • Data partitioning • [Figure: data-partitioned video packet with resync marker, MVs, motion marker, and texture info]
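A sketch of bit-based packetization, assuming an encoder that closes a packet once the accumulated bits reach a target and starts the next packet (with a new resync marker) at the following macroblock. The function name, the threshold rule, and the bit counts are illustrative assumptions.

```python
def packetize(mb_bits, target_bits):
    """Group consecutive macroblocks into packets of roughly target_bits.

    mb_bits[i] is the number of coded bits of macroblock i. A packet is
    closed as soon as its accumulated size reaches target_bits.
    """
    packets, current, used = [], [], 0
    for index, bits in enumerate(mb_bits):
        current.append(index)
        used += bits
        if used >= target_bits:
            packets.append(current)
            current, used = [], 0
    if current:
        packets.append(current)                # final, possibly short, packet
    return packets


mb_bits = [300, 150, 40, 700, 20, 30, 500]
packets = packetize(mb_bits, target_bits=400)
```

Packets covering busy areas hold fewer macroblocks, so resync markers end up roughly evenly spaced in the bitstream rather than in the picture.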
THE END • It is only with understanding of our history, our neighbors, and our prospects for the future that we can alter and appreciate our own times and circumstances.