Overview of the Scalable Video Coding Extension of the H.264/AVC Standard

Kianoosh Mokhtarian, School of Computing Science, Simon Fraser University, 6/24/2007

Motivation • High heterogeneity among receivers • Connection quality • Display resolution • Processing power • Possible approaches • Simulcasting • Transcoding • Scalability • Scalable modes already exist in H.262|MPEG-2, H.263, MPEG-4 Visual

Overview • Background • Temporal scalability • Spatial scalability • Quality scalability • Conclusion

Background • Scalability • Temporal • Spatial • Quality (fidelity or SNR) • Object-based and region-of-interest • Hybrid • Applications • Encode once, decode many ways • Unequal importance + unequal error protection • Archiving in surveillance applications

Background • Requirements for a scalable video coding technique • Similar coding efficiency to single-layer coding • Little increase in decoding complexity • Support of temporal, spatial, quality scalability • Backward compatibility of the base layer • Support of simple bitstream adaptations after encoding
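The last requirement, simple bitstream adaptation after encoding, can be illustrated with a minimal sketch. It assumes each packet carries three layer identifiers; the `Packet` class, its field names, and `extract_substream` are hypothetical names chosen for illustration, not part of any codec API. A real extractor would also need to respect the dependencies between layers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet tagged with the layer it belongs to."""
    temporal_id: int
    spatial_id: int
    quality_id: int
    payload: bytes = b""


def extract_substream(packets, max_temporal, max_spatial, max_quality):
    """Keep only the packets at or below the requested operating point.

    Adaptation is a pure filtering step: nothing is re-encoded.
    """
    return [
        p for p in packets
        if p.temporal_id <= max_temporal
        and p.spatial_id <= max_spatial
        and p.quality_id <= max_quality
    ]


if __name__ == "__main__":
    stream = [Packet(0, 0, 0), Packet(1, 0, 0), Packet(0, 1, 0), Packet(2, 1, 1)]
    # A low-end receiver asks for the base spatial and quality layers only.
    print(extract_substream(stream, max_temporal=1, max_spatial=0, max_quality=0))
```

The point of the sketch is that a network node can serve different receivers from one encoded stream by discarding packets.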

Temporal Scalability • Enabled by restricting motion-compensated prediction • Already provided by H.264/AVC • Hierarchical prediction structure • Pictures of temporal enhancement layers: typically B-pictures • Group of Pictures (GoP)

Temporal Scalability: Hierarchical Prediction Structure • Dyadic temporal enhancement layers
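As a rough illustration of the dyadic case, the temporal layer of a picture can be derived from its position inside the GoP. The sketch below assumes a power-of-two GoP size and display-order picture indices; the function name `temporal_layer` is hypothetical.

```python
def temporal_layer(picture_index, gop_size):
    """Return the temporal layer T of a picture in a dyadic hierarchy.

    Assumes gop_size is a power of two. Pictures at multiples of gop_size
    form the temporal base layer (T = 0); each further layer doubles the
    frame rate.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    levels = gop_size.bit_length() - 1        # log2(gop_size)
    pos = picture_index % gop_size
    if pos == 0:
        return 0
    trailing_zeros = (pos & -pos).bit_length() - 1
    return levels - trailing_zeros


if __name__ == "__main__":
    # Layers of nine consecutive pictures for a GoP size of 8.
    print([temporal_layer(i, 8) for i in range(9)])
```

Dropping every picture with a layer above T leaves a stream whose frame rate is the full rate divided by 2^(levels - T).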

Temporal Scalability: Hierarchical Prediction Structure • Non-dyadic case

Temporal Scalability: Hierarchical Prediction Structure • Other flexibilities • Multiple reference picture concept of H.264/AVC • Reference picture can be in the same layer as the target frame • Hierarchical prediction structure can be modified over time

Temporal Scalability: Hierarchical Prediction Structure • Adjusting the structural delay

Temporal Scalability: Coding Efficiency • Highly dependent on quantization parameters • Intuitively, higher fidelity for the temporal base layer pictures • How to choose QPs • Expensive rate-distortion analysis, or • Simple cascading rule: QP_T = QP_0 + 3 + T for temporal level T > 0 • Causes high PSNR fluctuations inside a GoP • Reconstructed video is nevertheless subjectively shown to be temporally smooth
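The QP rule on this slide can be written out as a small sketch. It assumes that the base layer (T = 0) keeps QP_0 unchanged and that the result is clipped to 51, the largest H.264/AVC QP for 8-bit video; the function name `cascaded_qp` and the clipping are assumptions added for illustration.

```python
def cascaded_qp(qp0, temporal_level, qp_max=51):
    """Quantization parameter for a picture at the given temporal level.

    Base-layer pictures use qp0; enhancement levels T > 0 use qp0 + 3 + T,
    so pictures that fewer other pictures depend on are coded more coarsely.
    """
    if temporal_level < 0:
        raise ValueError("temporal_level must be non-negative")
    if temporal_level == 0:
        return qp0
    return min(qp0 + 3 + temporal_level, qp_max)


if __name__ == "__main__":
    # QPs for temporal levels 0..3 with a base-layer QP of 26.
    print([cascaded_qp(26, t) for t in range(4)])
```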

Temporal Scalability: Coding Efficiency • Dyadic hierarchical B-pictures, no delay constraint

Temporal Scalability: Coding Efficiency • High-delay test set, CIF 30Hz, 34dB, compared to IPPP

Temporal Scalability: Coding Efficiency • Low-delay test set, 365x288, 25-30Hz, 38dB, structural delay constrained to zero, compared to IPPP

Temporal Scalability: Conclusion • Typically no negative impact on coding efficiency • Often a significant improvement, especially when higher delays are tolerable • Minor losses in coding efficiency are possible when low delay is required

Spatial Scalability • Motion-compensated prediction and intra-prediction in each spatial layer, as for single-layer coding • Inter-layer prediction • Same coding order for all layers • Access units

Spatial Scalability: Inter-Layer Prediction • Previous standards • Inter-layer prediction by upsampling the reconstructed samples of the lower layer signal • Prediction signal formed by: • Upsampled lower layer signal • Temporal prediction inside the enhancement layer • Averaging both • Lower layer samples not necessarily the most suitable data for inter-layer prediction • Prediction of macroblock modes and associated motion parameters • Prediction of the residual signal

Spatial Scalability: Inter-Layer Prediction • A new macroblock type signalled by base mode flag • Only a residual signal is transmitted • No intra-prediction mode or motion parameter • If the corresponding block in the reference layer is: • Intra-coded → inter-layer intra prediction • The reconstructed intra-signal of the reference layer is upsampled as a predictor • Inter-coded → inter-layer motion prediction • Partitioning data are upsampled, reference indexes are copied, and motion vectors are scaled up

Spatial Scalability: Inter-Layer Prediction • Inter-layer motion prediction (for a 16x16, 16x8, 8x16, or 8x8 macroblock partition) • Reference indexes are copied • Scaled motion vectors are used as motion vector predictors • Inter-layer residual prediction • Can be used for any inter-coded macroblock, regardless of its base mode flag or inter-layer motion prediction • The residual signal of the reference layer is upsampled as a predictor
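The use of scaled motion vectors as predictors can be sketched as follows. This is a simplified illustration that scales each component by the ratio of the layer dimensions; it is not the normative derivation, and the function names and rounding rule are assumptions.

```python
def scale_motion_vector(mv, ref_size, enh_size):
    """Scale a reference-layer motion vector to enhancement-layer resolution.

    mv is (x, y); ref_size and enh_size are (width, height). Integer
    arithmetic with rounding; for dyadic scalability this doubles the vector.
    """
    (mvx, mvy), (ref_w, ref_h), (enh_w, enh_h) = mv, ref_size, enh_size
    return (
        (mvx * enh_w + ref_w // 2) // ref_w,
        (mvy * enh_h + ref_h // 2) // ref_h,
    )


def motion_vector_difference(enh_mv, ref_mv, ref_size, enh_size):
    """Difference left to transmit when the scaled vector is only a predictor."""
    px, py = scale_motion_vector(ref_mv, ref_size, enh_size)
    return (enh_mv[0] - px, enh_mv[1] - py)


if __name__ == "__main__":
    base, enh = (352, 288), (704, 576)
    print(scale_motion_vector((3, -5), base, enh))
    print(motion_vector_difference((7, -9), (3, -5), base, enh))
```

When the enhancement-layer motion closely follows the reference layer, the transmitted difference stays small, which is where the bit-rate saving comes from.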

Spatial Scalability: Inter-Layer Prediction • For a 16x16 macroblock in an enhancement layer: • base mode flag = 1 • Inter-layer intra prediction (sample values are predicted), or • Inter-layer motion prediction (partitioning data, ref. indexes, and motion vectors are derived) • base mode flag = 0 • Inter-layer motion prediction (ref. indexes are derived, motion vectors are predicted), or • No inter-layer motion prediction • For inter-coded macroblocks, independently of the above: • Inter-layer residual prediction, or • No inter-layer residual prediction

Spatial Scalability: Generalizing • Not only dyadic • Enhancement layer may represent only a selected rectangular area of its reference layer picture • Enhancement layer may contain additional parts beyond the borders of its reference layer picture • Tools for spatial scalable coding of interlaced sources

Spatial Scalability: Complexity Constraints • Inter-layer intra-prediction is restricted • Although coding efficiency is improved by generally allowing this prediction mode • Each layer can be decoded by a single motion compensation loop, unlike previous coding standards

Spatial Scalability: Coding Efficiency • Comparison to single-layer coding and simulcast • Base/enhancement layer at 352x288 / 704x576 • Only the first frame is intra-coded • Inter-layer prediction (ILP): • Intra (I), motion (M), residual (R)

Spatial Scalability: Coding Efficiency • Comparison of fully featured SVC “single-loop ILP (I, M, R)” to scalable profiles of previous standards “multi-loop ILP (I)”

Spatial Scalability: Encoder Control • JSVM software encoder control • Base layer coding parameters are optimized for that layer only → performance equal to single-layer H.264/AVC • Not necessarily suitable for an efficient enhancement layer coding • Improved multi-layer encoder control • Optimized for both layers

Spatial Scalability: Encoder Control • QP_enhancement = QP_base + 4 • Hierarchical B-pictures, GoP size = 16 • Bit-rate increase relative to single-layer for the same quality is always less than or equal to 10% for both layers

Quality Scalability • Special case of spatial scalability with identical picture sizes • No upsampling for inter-layer predictions • Inter-layer intra- and residual-prediction are directly performed in transform domain • Different qualities achieved by decreasing quantization step along the layers • Coarse-Grained Scalability (CGS) • A few selected bitrates are supported in the scalable bitstream • Quality scalability becomes less efficient when bitrate difference between CGS layers gets smaller
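The idea of refining quality by decreasing the quantization step from layer to layer can be sketched with a plain uniform quantizer acting on transform coefficients. This is a toy model under stated assumptions, not the SVC quantizer: the step sizes, the function name `encode_two_layers`, and the sample coefficients are all hypothetical.

```python
def encode_two_layers(coeffs, base_step, enh_step):
    """Quantize coefficients coarsely, then refine the remaining error.

    The enhancement layer codes the difference between the original
    coefficient and the base-layer reconstruction with a smaller step.
    """
    base_levels = [round(c / base_step) for c in coeffs]
    base_recon = [level * base_step for level in base_levels]
    enh_levels = [round((c - r) / enh_step) for c, r in zip(coeffs, base_recon)]
    enh_recon = [r + level * enh_step for r, level in zip(base_recon, enh_levels)]
    return base_levels, base_recon, enh_levels, enh_recon


if __name__ == "__main__":
    coeffs = [100.0, -37.0, 12.0, 3.0]
    _, base_recon, _, enh_recon = encode_two_layers(coeffs, base_step=16, enh_step=4)
    print("base reconstruction:    ", base_recon)
    print("enhanced reconstruction:", enh_recon)
```

A decoder that receives only the base levels reconstructs the coarse values; adding the enhancement levels brings the error down to at most half the finer step.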

Quality Scalability: MGS • Medium-Grained Scalability (MGS) improves: • Flexibility of the stream • Packet-level quality scalability • Error robustness • Controlling drift propagation • Coding efficiency • Use of more information for temporal prediction

Quality Scalability: MGS • MGS: flexibility of the stream • Enhancement layer transform coefficients can be distributed among several slices • Packet-level quality scalability
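Distributing the enhancement-layer transform coefficients among several slices can be pictured as splitting the scanned coefficients of a block into index ranges. The sketch below is only an illustration: the split points, function names, and sample values are hypothetical, and it ignores entropy coding entirely.

```python
def split_coefficients(scan_coeffs, boundaries):
    """Split scan-ordered coefficients into fragments by index range.

    Each fragment keeps the coefficients of one range and zeros elsewhere,
    so any prefix of the fragment list can be decoded on its own.
    """
    fragments = []
    start = 0
    for end in list(boundaries) + [len(scan_coeffs)]:
        fragment = [0] * len(scan_coeffs)
        fragment[start:end] = scan_coeffs[start:end]
        fragments.append(fragment)
        start = end
    return fragments


def merge_fragments(fragments):
    """Recombine whichever fragments were received."""
    return [sum(values) for values in zip(*fragments)]


if __name__ == "__main__":
    block = [9, -4, 3, 0, 2, 0, 0, 1, 0, 0, -1, 0, 0, 0, 0, 0]
    parts = split_coefficients(block, boundaries=[3, 8])
    # Dropping the last fragment loses only the highest-index coefficients.
    print(merge_fragments(parts[:2]))
```

Each fragment could travel in its own packet, so discarding packets lowers quality step by step instead of removing a whole layer.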

Quality Scalability: MGS • MGS: error robustness vs. coding efficiency
