
Introduction to H.264 Video Standard



Presentation Transcript


  1. Introduction to H.264 Video Standard. Anurag Jain, Texas Instruments

  2. H.264 Background • Jointly developed by ITU-T and MPEG. • Up to 50% more efficient at the same visual quality compared to MPEG-4 ASP. • Supports a wide range of applications (interlaced, progressive, low bit-rate, studio quality, digital cinema, etc.). • Multiple profiles (Baseline, Main, Extended, and the High profiles added by FRExt). • Good results from interoperability tests make it suitable for wide deployment in a short span of time.

  3. H.264 Encoder Block Diagram [Figure: encoder block diagram. Blocks: Video Source, Coding Control, Intra Prediction, Transform, Quantization, Inverse Quantization, Inverse Transform, Loop Filter, Frame Store, Motion Estimation, Motion Compensation, Entropy Coding. Signals: Intra/Inter switch, Predicted Frame, Motion Vectors, Quantized Transform Coefficients, Bit Stream Out.] • Quantization: finer step-size resolution for finer control of bit rate • Intra prediction: 9 4x4 modes + 4 16x16 modes = 13 modes • Entropy coding: [Single Universal VLC and Context Adaptive VLC] or [Context-Based Adaptive Binary Arithmetic Coding] • Motion estimation and compensation: seven block sizes and shapes, multiple reference picture selection, 1/4-pel motion estimation accuracy, referenced B-frames • Transform: integer 16-bit fixed-point transform with no mismatch

  4. Common Elements Common elements with other standards • Macroblocks: 16x16 luma + 2 x 8x8 chroma samples • Input: association of luma and chroma and conventional block motion displacement • Motion vectors over picture boundaries • Block Transform • Variable block-size motion • I, P and B picture coding types

  5. High Level Coding Tools • Sequence and Picture Parameter Sets (SPS & PPS) • Picture Order Count (POC) • Decoded Picture Buffer (DPB) • Slice group map (FMO) • Multiple slices and arbitrary arrangements (ASO) • Supplemental Enhancement Information (SEI) • Hypothetical Reference Decoder (HRD) • Video Usability Information (VUI)

  6. High Level Tools: Coding Hierarchy • A coded sequence contains one or more access units • An access unit is a set of NAL units that contains all necessary information for decoding exactly one (primary) coded picture • A coded picture is divided into slices (VCL NAL units) • A slice contains a slice header and a set of macroblocks • A macroblock contains a 16x16 luma block and two chroma blocks • An I-slice contains a set of INTRA-coded macroblocks • A P-slice contains a set of INTRA- and INTER-coded macroblocks • An IDR (instantaneous decoding refresh) picture contains only I-slices (SI-slices too in Extended profile)
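
The hierarchy above starts at the NAL unit, whose first byte identifies what the unit carries. The following is a minimal Python sketch of reading that byte; the function name and the small type table are illustrative choices, not part of the slides, and only a few common nal_unit_type values are listed.

```python
# Sketch: decode the one-byte NAL unit header.
# The dictionary below is a deliberately small subset of nal_unit_type values.
NAL_TYPE_NAMES = {1: "non-IDR slice", 5: "IDR slice", 6: "SEI", 7: "SPS", 8: "PPS"}

def parse_nal_header(first_byte):
    """Split the first byte of a NAL unit into its three fields."""
    forbidden_zero_bit = (first_byte >> 7) & 0x1   # 1 bit, must be 0
    nal_ref_idc = (first_byte >> 5) & 0x3          # 2 bits, 0 means not used for reference
    nal_unit_type = first_byte & 0x1F              # 5 bits
    return {
        "forbidden_zero_bit": forbidden_zero_bit,
        "nal_ref_idc": nal_ref_idc,
        "nal_unit_type": nal_unit_type,
        "name": NAL_TYPE_NAMES.get(nal_unit_type, "other"),
        "is_vcl": 1 <= nal_unit_type <= 5,         # types 1..5 carry coded slice data
    }
```

For example, a byte of 0x67 would be read as an SPS and 0x65 as an IDR slice, both with nal_ref_idc equal to 3.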

  7. Sequence Parameter Set • Profile @ Level indicator • Profile constraint indicator • Sequence parameter set ID (0..31) • Picture order count type and information • DPB (Decoded Picture Buffer) info • Picture size • Frame/field coding flag • Method for vector derivation of B-direct mode • Frame cropping parameters • VUI_parameters (Annex E, Video usability information)
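
The picture size, frame/field flag and cropping parameters listed above combine into the displayed frame size. The sketch below shows that arithmetic in Python under the assumption of 4:2:0 chroma sampling; the function name and keyword arguments are illustrative, though they mirror the SPS syntax element names.

```python
# Sketch: derive the cropped frame size from SPS fields, assuming 4:2:0 video.
def sps_frame_size(pic_width_in_mbs_minus1, pic_height_in_map_units_minus1,
                   frame_mbs_only_flag, crop_left=0, crop_right=0,
                   crop_top=0, crop_bottom=0):
    """Return (width, height) in luma samples after frame cropping."""
    width = (pic_width_in_mbs_minus1 + 1) * 16
    # With field coding allowed, a map unit covers two macroblock rows of the frame.
    height = (2 - frame_mbs_only_flag) * (pic_height_in_map_units_minus1 + 1) * 16
    crop_unit_x = 2                               # 4:2:0 assumption
    crop_unit_y = 2 * (2 - frame_mbs_only_flag)   # 4:2:0 assumption
    width -= crop_unit_x * (crop_left + crop_right)
    height -= crop_unit_y * (crop_top + crop_bottom)
    return width, height
```

A 1080p stream is typically coded as 120 x 68 macroblocks (1920 x 1088) with a bottom crop of 4 units, which this sketch reduces to 1920 x 1080.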

  8. Picture Parameter Set • Picture parameter set ID (0..255) • Sequence parameter set ID (0..31) • Entropy coding mode flag (CABAC/CAVLC) • Slice POC info presence flag • Slice group map parameters • Max. number (1..16) of ref. frames used for decoding slices • Weighted prediction flags • Quantization scales (QP minus 26, range -26..+25) • Chroma QP offset for loop-filter (-12..+12) • Slice loop-filter control flag (Alpha/Beta table offsets) • INTRA prediction using pixels of INTER neighboring MBs? • Slice redundant pic. parameters presence flag

  9. Slice Header • Starting macroblock address • Slice type (I, P, B, SI, SP) • Temporal reference (frame_num) • Picture parameter set ID (0..255) • Interlaced frame/field coding, top/bottom field indicators • IDR picture ID (0..65535) • Slice POC parameters • Redundant picture count (0..127, 0 for baseline) • B-slice temporal or spatial direct mode indicator • Max. number (1..32) of ref. pictures for decoding current slice • Reference picture reordering parameters (DPB) • Weighted prediction parameters • DPB marking parameters (e.g. short-term, long-term pred. pics) • Slice delta QP (-26..+25) • SP switch flag and SP/SI slice QP • Loop-filter indicator (disable_deblocking_filter_idc; 0: enabled, 1: disabled, 2: enabled, but filtering across slice boundaries disabled) • Loop-filter alpha/beta table access offset (-6..+6) • Slice group change cycle (derives the no. of MBs in slice group 0)
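
The slice delta QP in the slice header is applied on top of the initial QP signalled in the picture parameter set (as "QP minus 26"). A minimal Python sketch of that combination follows, assuming 8-bit video where the luma QP must stay in 0..51; the function name is illustrative.

```python
# Sketch: combine the PPS initial QP and the slice delta into the slice QP.
def slice_qp(pic_init_qp_minus26, slice_qp_delta):
    """Return the initial luma QP for a slice (8-bit video assumed)."""
    qp = 26 + pic_init_qp_minus26 + slice_qp_delta
    if not 0 <= qp <= 51:
        raise ValueError("slice QP %d is outside 0..51" % qp)
    return qp
```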

  10. Slice Group Maps For error resilience
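
One of the slice group map types is the dispersed map, which scatters neighbouring macroblocks across slice groups so that a lost slice group can be concealed from its surviving neighbours. The Python sketch below follows the dispersed-map formula as I recall it from the standard; the function name and the list-of-rows return type are illustrative choices.

```python
# Sketch: dispersed slice group map (slice group map type 1).
def dispersed_slice_group_map(width_in_mbs, height_in_mbs, num_slice_groups):
    """Return rows of slice group indices, one entry per macroblock."""
    rows = []
    for y in range(height_in_mbs):
        row = []
        for x in range(width_in_mbs):
            # Each row is shifted relative to the previous one.
            row.append((x + (y * num_slice_groups) // 2) % num_slice_groups)
        rows.append(row)
    return rows
```

With two slice groups this produces a checkerboard pattern.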

  11. Ordering of Slices within Slice Groups

  12. Low Level Coding Tools • Motion compensated prediction • Additional intra modes for spatial compensation • Transform: 4x4 Integer transform (Baseline, Main Profiles) • Transform: 8x8 Integer transform (High Profile) • Quantization: Scalar quantization • Entropy Coding : CABAC / CAVLC • In-loop deblocking filter
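
For the scalar quantization step listed above, the quantizer step size doubles for every increase of 6 in QP, which gives the fine bit-rate control. The following Python sketch tabulates the commonly cited nominal step sizes; the actual codec works in integer arithmetic, so these floating-point values are only an illustration, and the names are hypothetical.

```python
# Sketch: nominal quantizer step size as a function of QP (0..51).
QSTEP_BASE = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]  # step sizes for QP 0..5

def qstep(qp):
    """Return the nominal step size; it doubles every 6 QP values."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in 0..51")
    return QSTEP_BASE[qp % 6] * (1 << (qp // 6))
```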

  13. Enhanced MC (Inter Prediction) • Every macroblock can be split in one of 7 ways for improved motion estimation • Accuracy of motion compensation = 1/4 pixel • Up to 5 reference frames for SDTV size @ L3 • Weighted predictions • Reference B pictures • Trade off between accuracy and side information

  14. B Slice - Direct Mode • Forward / backward pair of bi-directional prediction • The prediction signal is calculated as a linear combination of two blocks that are determined by the forward and backward motion vectors pointing to two reference pictures. • Spatial Direct mode • Temporal Direct mode: mvL0 = tb × mvCol / td and mvL1 = -(td - tb) × mvCol / td, where mvCol is the MV used in the co-located MB of the subsequent picture, tb is the temporal distance from the forward reference picture to the current picture, and td is the temporal distance between the two reference pictures
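
The temporal direct mode formulas can be sketched in Python as follows. Here tb is taken as the temporal distance from the forward reference to the current picture and td as the distance between the two reference pictures. The sketch uses plain division for clarity; a real decoder uses a fixed-point scale factor with rounding, so the function below is an illustration rather than a bit-exact derivation.

```python
# Sketch: temporal direct mode motion vector scaling.
def temporal_direct_mvs(mv_col, tb, td):
    """Scale the co-located MV (x, y) into forward (L0) and backward (L1) MVs."""
    mv_l0 = tuple(tb * c / td for c in mv_col)
    mv_l1 = tuple(-(td - tb) * c / td for c in mv_col)
    return mv_l0, mv_l1
```

Note that mvL1 equals mvL0 minus mvCol, so the two vectors point in opposite temporal directions along the same motion trajectory.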

  15. B Slice : Multi-picture Reference Mode Generalized Bidirectional prediction • Multiple reference pictures mode • Two forward references : proper for a region just before scene change • Two backward references : proper for a region just after scene change traditional Bidirectional

  16. H.264 Intra Prediction • 9 modes for 4x4 blocks • 4 modes for 16x16 intra prediction
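
Three of the nine 4x4 modes (vertical, horizontal and DC) are simple enough to sketch in a few lines of Python. The function name and string mode labels are illustrative, and the sketch assumes both the top and left neighbours are available; the six directional modes are omitted.

```python
# Sketch: three of the nine 4x4 intra prediction modes.
def predict_4x4(mode, top, left):
    """Predict a 4x4 block from 4 reconstructed samples above and 4 to the left."""
    if mode == "vertical":      # copy the row above downwards
        return [list(top) for _ in range(4)]
    if mode == "horizontal":    # copy the left column across
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":            # rounded mean of the 8 neighbours
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("unsupported mode: %s" % mode)
```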

  17. Luma Sub-Pixel Interpolation
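
Luma half-sample positions are interpolated with a 6-tap filter with taps (1, -5, 20, 20, -5, 1), and quarter-sample positions are the rounded average of two neighbouring integer or half-sample values. The Python sketch below shows the one-dimensional case for 8-bit samples; the function names are illustrative.

```python
# Sketch: 1-D luma half-pel and quarter-pel interpolation for 8-bit samples.
TAPS = (1, -5, 20, 20, -5, 1)

def half_pel(samples):
    """Interpolate the half-sample position at the centre of 6 integer samples."""
    acc = sum(t * s for t, s in zip(TAPS, samples))
    return min(255, max(0, (acc + 16) >> 5))   # round, divide by 32, clip

def quarter_pel(a, b):
    """Average two neighbouring samples with upward rounding."""
    return (a + b + 1) >> 1
```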

  18. Chroma Sub-pel Calculation If (vx, vy) is luma vector, then xFracc = vx&0x7, yFracc = vy&0x7
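
With 4:2:0 sampling the luma vector has 1/8-sample precision in the chroma plane, so the low three bits give the fraction and the remaining bits the integer offset. The chroma prediction is then a bilinear blend of the four surrounding chroma samples. The Python sketch below illustrates both steps; the function names and the sample labels a, b, c, d (top-left, top-right, bottom-left, bottom-right) are illustrative.

```python
# Sketch: chroma sub-pel position and bilinear interpolation (4:2:0 assumed).
def split_chroma_mv(v):
    """Return (integer offset, 1/8 fraction) for one vector component."""
    return v >> 3, v & 0x7

def chroma_sample(a, b, c, d, x_frac, y_frac):
    """Bilinear blend of four neighbours, weights sum to 64, rounded."""
    return ((8 - x_frac) * (8 - y_frac) * a + x_frac * (8 - y_frac) * b
            + (8 - x_frac) * y_frac * c + x_frac * y_frac * d + 32) >> 6
```

Python's shift and mask operators behave like two's complement arithmetic for negative integers, so negative vector components split correctly.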

  19. Block Scanning Order in a MB One more extraction of correlation among sub-blocks

  20. Transform & Quant • Integer 4x4 DCT approximation (8x8 in High profile). • For INTRA_16x16 coded macroblocks, the DC coefficients of the 4x4 luma blocks are additionally transformed with a 4x4 Hadamard transform. • Scalar quantization. All integers! [Figure labels: 4x4 Luma/Chroma AC, 8x8 Luma-Chroma, Hadamard]
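
The 4x4 integer transform can be written as Y = Cf · X · Cf^T with a small integer matrix Cf, so it needs only additions and shifts and has no encoder/decoder mismatch. The Python sketch below computes this core transform only; the scaling that is normally folded into quantization is left out, and the helper names are illustrative.

```python
# Sketch: 4x4 forward core transform (scaling/quantization not included).
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

def matmul4(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward_core_transform(block):
    """Return Cf * block * Cf^T for a 4x4 residual block."""
    return matmul4(matmul4(CF, block), transpose(CF))
```

A flat residual block puts all of its energy into the single DC coefficient at position (0, 0).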

  21. Interlaced Coding • Deblocking filter • Frame / Field Adaptation • Picture Adaptive Frame Field (PicAFF). • Macroblock Adaptive Frame Field (MBAFF) • Field scan and zig-zag scan options Field Scan Zig-zag Frame Scan

  22. Entropy Coding • Universal Variable Length Coding (UVLC) using Exp-Golomb codes. • Context Adaptive VLC (CAVLC) • Context Adaptive Binary Arithmetic Coding (CABAC)

  23. CAVLC Zigzag order: 50 33 27 20 0 5 0 0 1 -1 0 0 0 0 0 0 • TotalCoeff = 7 : # of non-zeros • Trailing 1s = 2 : 1, -1 • Sign Trail = 1 0 (reverse order) : minus, plus • Levels = 5 20 27 33 50 (reverse order) : 7 - 2 = 5 • TotalZeros = 3 (# of zeros before the last non-zero coefficient) • RunsBefore = 0 2 1 : 0 before -1, 2 before 1, and 1 before 5
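
The symbols in this example can be derived mechanically from the zigzag-ordered coefficients. The Python sketch below extracts them but does not produce the actual variable-length codewords, which depend on context-selected tables; the function name and the dictionary keys are illustrative.

```python
# Sketch: derive the CAVLC syntax element values for one 4x4 block.
def cavlc_symbols(coeffs):
    nonzero = [(i, c) for i, c in enumerate(coeffs) if c != 0]
    total_coeff = len(nonzero)
    rev = nonzero[::-1]                       # CAVLC scans from high to low frequency
    trailing_ones = 0
    for _, c in rev:                          # at most three trailing +/-1 values
        if abs(c) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break
    signs = [1 if c < 0 else 0 for _, c in rev[:trailing_ones]]
    levels = [c for _, c in rev[trailing_ones:]]
    total_zeros = nonzero[-1][0] + 1 - total_coeff if nonzero else 0
    runs, zeros_left = [], total_zeros
    for k in range(total_coeff - 1):          # no run is sent for the first coefficient
        if zeros_left == 0:                   # remaining runs are all zero
            break
        run = rev[k][0] - rev[k + 1][0] - 1
        runs.append(run)
        zeros_left -= run
    return {"total_coeff": total_coeff, "trailing_ones": trailing_ones,
            "signs": signs, "levels": levels,
            "total_zeros": total_zeros, "run_before": runs}
```

Applied to the coefficient sequence on this slide, the sketch yields the same seven values listed in the bullets.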

  24. Exp Golomb Coding
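
An Exp-Golomb codeword for an unsigned value k is the binary form of k + 1 preceded by one fewer zeros than it has bits; signed values are first mapped to unsigned code numbers. The Python sketch below works on strings of '0' and '1' for readability rather than on packed bits, and the function names are illustrative.

```python
# Sketch: Exp-Golomb encoding and decoding on bit strings.
def ue(code_num):
    """Unsigned Exp-Golomb codeword for code_num >= 0."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    bits = bin(code_num + 1)[2:]
    return "0" * (len(bits) - 1) + bits

def se(value):
    """Signed Exp-Golomb: positive k maps to 2k - 1, non-positive k to -2k."""
    return ue(2 * value - 1 if value > 0 else -2 * value)

def decode_ue(bitstring, pos=0):
    """Return (code_num, next position) for the codeword starting at pos."""
    zeros = 0
    while bitstring[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bitstring[pos + zeros:end], 2) - 1, end
```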

  25. Loop filter • Checks whether each block boundary is a real edge in the picture or a blocking artifact
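
The real-edge versus blocking-artifact decision can be sketched as a threshold test on the samples either side of the boundary: small steps are treated as artifacts and smoothed, large steps are assumed to be picture content and left alone. In the Python sketch below, alpha and beta are passed in directly as assumptions; in the codec they come from QP-indexed tables, and the boundary strength must also be non-zero.

```python
# Sketch: per-sample decision of the deblocking filter.
def should_filter(p1, p0, q0, q1, alpha, beta):
    """p1, p0 lie on one side of the block edge; q0, q1 on the other."""
    return (abs(p0 - q0) < alpha      # step across the edge is small
            and abs(p1 - p0) < beta   # both sides are locally smooth
            and abs(q1 - q0) < beta)
```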

  26. Profiles and Tools

  27. H.264 Profiles and Tools: Graphical Representation

  28. FRExt: Fidelity Range Extensions • Lossless representation • Allows more than 8 bits per sample (up to 12 bits) • Higher resolution for color representation (4:2:2, 4:4:4) • Source editing functions such as alpha blending • Very high bit-rates (often with constant quality) • Very high resolution • Color space transformation (YCgCo, YCbCr, RGB) • RGB color representation • Adaptive block transform sizes • Quantization matrices
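
To illustrate the YCgCo color space named above, the Python sketch below uses the reversible lifting form usually called YCgCo-R, which maps integer RGB to integer YCgCo and back without loss. Whether a given encoder uses this exact variant is an assumption here; the function names are illustrative.

```python
# Sketch: reversible RGB <-> YCgCo-R conversion using integer lifting steps.
def rgb_to_ycgco_r(r, g, b):
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co

def ycgco_r_to_rgb(y, cg, co):
    t = y - (cg >> 1)      # undo the lifting steps in reverse order
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b
```

Because each lifting step is undone exactly, the round trip reproduces the input for any integer samples, at the cost of one extra bit of range in the two chroma components.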

  29. Coding Efficiency

  30. Comparison of Standards

  31. Comparison of Standards (cont’d)

  32. References • Related group • MPEG website http://www.mpeg.org • JVT website: ftp://ftp.imtc-files.org/jvt-experts • www.mpegif.org • Test software • H.264/AVC JM Software: http://bs.hhi.de/~suehring/tml/download • Test sequences • http://ise.stanford.edu/video.html • http://kbs.cs.tu-berlin.de/~stewe/vceg/sequences.htm • http://www.its.bldrdoc.gov/vqeg • ftp.tnt.uni-hannover.de/pub/jvt/sequences/ • http://trace.eas.asu.edu/yuv/yuv.html

  33. THANKS
