This overview explores the scalability features of H.264/AVC, including temporal, spatial, and quality scalability, and its applications in various scenarios. The focus is on achieving coding efficiency and backward compatibility.
Overview of the Scalable Video Coding Extension of the H.264/AVC Standard • Kianoosh Mokhtarian • School of Computing Science, Simon Fraser University • 6/24/2007
Motivation • High heterogeneity among receivers • Connection quality • Display resolution • Processing power • Simulcasting • Transcoding • Scalability • H.262|MPEG-2, H.263, MPEG-4 Visual
Overview • Background • Temporal scalability • Spatial scalability • Quality scalability • Conclusion
Background • Scalability • Temporal • Spatial • Quality (fidelity or SNR) • Object-based and region-of-interest • Hybrid • Applications • Encode once, decode many ways • Unequal importance + unequal error protection • Archiving in surveillance applications
Background • Requirements for a scalable video coding technique • Similar coding efficiency to single-layer coding • Little increase in decoding complexity • Support of temporal, spatial, quality scalability • Backward compatibility of the base layer • Support of simple bitstream adaptations after encoding
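The "simple bitstream adaptations after encoding" requirement can be pictured as plain packet filtering. The sketch below is illustrative only: the `Packet` class and `extract_substream` function are hypothetical names, and the three identifiers per packet (temporal, spatial/dependency, quality) are a simplified model that ignores the dependencies a real extractor must respect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical stand-in for a coded packet tagged with its layer identifiers."""
    temporal_id: int    # temporal layer (0 = temporal base layer)
    dependency_id: int  # spatial layer (0 = base resolution)
    quality_id: int     # quality refinement level (0 = base quality)
    payload: bytes = b""


def extract_substream(packets, max_temporal, max_dependency, max_quality):
    """Keep only the packets at or below the requested operating point."""
    return [
        p for p in packets
        if p.temporal_id <= max_temporal
        and p.dependency_id <= max_dependency
        and p.quality_id <= max_quality
    ]


# Toy stream: 3 temporal x 2 spatial x 2 quality layers, one packet each.
stream = [Packet(t, d, q) for t in range(3) for d in range(2) for q in range(2)]
base_only = extract_substream(stream, 0, 0, 0)
reduced_rate = extract_substream(stream, 1, 1, 1)
print(len(stream), len(base_only), len(reduced_rate))
```

The point of the sketch is that adaptation needs no re-encoding: a network node only inspects the layer tags and drops packets.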
Overview • Background • Temporal scalability • Spatial scalability • Quality scalability • Conclusion
Temporal Scalability • Enabled by restricting motion-compensated prediction • Already provided by H.264/AVC • Hierarchical prediction structure • Pictures of temporal enhancement layers: typically B-pictures • Group of Pictures (GoP)
Temporal Scalability: Hierarchical Pred’ Struct’ • Dyadic temporal enhancement layers
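A dyadic hierarchy can be sketched by assigning each picture a temporal level from its position inside the GoP. This is a minimal illustration, assuming a power-of-two GoP size and display-order picture indices; `temporal_level` is a hypothetical helper, not part of any codec API.

```python
def temporal_level(index, gop_size):
    """Temporal level of a picture in a dyadic hierarchy (0 = temporal base layer).

    index is the picture position in display order; gop_size must be a power of two.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    max_level = gop_size.bit_length() - 1
    pos = index % gop_size
    if pos == 0:
        return 0  # key picture at a GoP boundary
    # The more often pos is divisible by two, the lower its temporal level.
    trailing_zeros = (pos & -pos).bit_length() - 1
    return max_level - trailing_zeros


def pictures_up_to(level, count, gop_size):
    """Indices of the pictures that remain when all higher levels are dropped."""
    return [i for i in range(count) if temporal_level(i, gop_size) <= level]


levels = [temporal_level(i, 8) for i in range(9)]
print(levels)
print(pictures_up_to(1, 9, 8))
```

Dropping the highest remaining level halves the frame rate each time, which is what makes the structure dyadic.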
Temporal Scalability: Hierarchical Pred’ Struct’ • Non-dyadic case
Temporal Scalability: Hierarchical Pred’ Struct’ • Other flexibilities • Multiple reference picture concept of H.264/AVC • Reference picture can be in the same layer as the target frame • Hierarchical prediction structure can be modified over time
Temporal Scalability: Hierarchical Pred’ Struct’ • Adjusting the structural delay
Temporal Scalability: Coding Efficiency • Highly dependent on quantization parameters • Intuitively, higher fidelity for the temporal base layer pictures • How to choose QPs • Expensive rate-distortion analysis • Simple alternative: QP_T = QP_0 + 3 + T for temporal level T > 0 • High PSNR fluctuations inside a GoP • Subjectively, the reconstructed video is nevertheless temporally smooth
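The QP rule on this slide can be written out as a small function. This is a sketch under two assumptions that the slide does not state: the temporal base layer (T = 0) uses QP_0 unchanged, and the result is clipped to 51, the usual upper QP limit for 8-bit H.264/AVC. `cascaded_qp` is a hypothetical name.

```python
MAX_QP = 51  # assumed upper QP limit (8-bit H.264/AVC)


def cascaded_qp(qp0, temporal_level):
    """QP for a picture of the given temporal level, from the base-layer QP qp0."""
    if temporal_level < 0:
        raise ValueError("temporal_level must be non-negative")
    if temporal_level == 0:
        return qp0  # temporal base layer keeps the highest fidelity
    return min(MAX_QP, qp0 + 3 + temporal_level)


# QPs for a GoP with four temporal levels and a base-layer QP of 28.
qps = [cascaded_qp(28, t) for t in range(4)]
print(qps)
```

Coarser quantization at higher temporal levels is what produces the PSNR fluctuations inside a GoP mentioned on the slide.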
Temporal Scalability: Coding Efficiency • Dyadic hierarchical B-pictures, no delay constraint
Temporal Scalability: Coding Efficiency • High-delay test set, CIF 30Hz, 34dB, compared to IPPP
Temporal Scalability: Coding Efficiency • Low-delay test set, 365x288, 25-30Hz, 38dB, structural delay constrained to zero, compared to IPPP
Temporal Scalability: Conclusion • Typically no negative impact on coding efficiency • Can even improve coding efficiency significantly, especially when higher delays are tolerable • Minor losses in coding efficiency are possible when low delay is required
Overview • Background • Temporal scalability • Spatial scalability • Quality scalability • Conclusion
Spatial Scalability • Motion-compensated prediction and intra-prediction in each spatial layer, as for single-layer coding • Inter-layer prediction • Same coding order for all layers • Access units
Spatial Scalability: Inter-Layer Prediction • Previous standards • Inter-layer prediction by upsampling the reconstructed samples of the lower layer signal • Prediction signal formed by: • Upsampled lower layer signal • Temporal prediction inside the enhancement layer • Averaging both • Lower layer samples not necessarily the most suitable data for inter-layer prediction • Prediction of macroblock modes and associated motion parameters • Prediction of the residual signal
Spatial Scalability: Inter-Layer Prediction • A new macroblock type signalled by base mode flag • Only a residual signal is transmitted • No intra-prediction mode or motion parameter • If the corresponding block in the reference layer is: • Intra-coded → inter-layer intra prediction • The reconstructed intra-signal of the reference layer is upsampled as a predictor • Inter-coded → inter-layer motion prediction • Partitioning data are upsampled, reference indexes are copied, and motion vectors are scaled up
Spatial Scalability: Inter-Layer Prediction • Inter-layer motion prediction (for a 16x16, 16x8, 8x16, or 8x8 macroblock partition) • Reference indexes are copied • Scaled motion vectors are used as motion vector predictors • Inter-layer residual prediction • Can be used for any inter-coded macroblock, regardless of its base mode flag or inter-layer motion prediction • The residual signal of the reference layer is upsampled as a predictor
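Scaling a reference-layer motion vector and using it as a predictor can be sketched as follows. The function names are hypothetical, vectors are assumed to be integer pairs in sub-sample units, and the round-half-up rule is an assumption; the standard defines its own exact arithmetic.

```python
import math
from fractions import Fraction


def scale_motion_vector(mv, base_size, enh_size):
    """Scale a reference-layer motion vector to the enhancement-layer resolution.

    mv is (x, y); sizes are (width, height). Rounding is round-half-up (assumed).
    """
    sx = Fraction(enh_size[0], base_size[0])
    sy = Fraction(enh_size[1], base_size[1])
    half = Fraction(1, 2)
    return (math.floor(mv[0] * sx + half), math.floor(mv[1] * sy + half))


def motion_vector_difference(mv_enh, mv_base, base_size, enh_size):
    """Difference left to transmit when the scaled vector serves as the predictor."""
    px, py = scale_motion_vector(mv_base, base_size, enh_size)
    return (mv_enh[0] - px, mv_enh[1] - py)


# Dyadic case: 352x288 reference layer, 704x576 enhancement layer.
print(scale_motion_vector((5, -3), (352, 288), (704, 576)))
# Non-dyadic case: resolution ratio 1.5.
print(scale_motion_vector((4, -2), (352, 288), (528, 432)))
print(motion_vector_difference((11, -6), (5, -3), (352, 288), (704, 576)))
```

When the enhancement-layer motion closely follows the reference layer, the difference is small and cheap to code, which is the motivation for predicting rather than retransmitting motion data.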
Spatial Scalability: Inter-Layer Prediction • For a 16x16 macroblock in an enhancement layer: • base mode flag = 1 • Inter-layer intra prediction (sample values are predicted), or • Inter-layer motion prediction (partitioning data, ref. indexes, and motion vectors are derived) • base mode flag = 0 • Inter-layer motion prediction (ref. indexes are derived, motion vectors are predicted), or • No inter-layer motion prediction • In the inter-coded cases: inter-layer residual prediction or no inter-layer residual prediction
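The combinations on this slide can be read as a small decision procedure. The sketch below is one interpretation of the diagram, not the normative syntax: the function and its flag parameters are hypothetical names modelled on the slide's wording, and a macroblock with base mode flag 0 is assumed to be inter-coded.

```python
def inter_layer_tools(base_mode_flag, ref_block_is_intra,
                      motion_prediction_flag=False,
                      residual_prediction_flag=False):
    """List the inter-layer prediction tools used by one enhancement-layer macroblock."""
    if base_mode_flag and ref_block_is_intra:
        # Sample values are predicted from the upsampled reference-layer intra signal.
        return ["inter-layer intra prediction"]

    tools = []
    if base_mode_flag:
        # Partitioning, reference indexes and motion vectors are all derived.
        tools.append("inter-layer motion prediction (derived)")
    elif motion_prediction_flag:
        # Reference indexes are derived; scaled vectors act as predictors only.
        tools.append("inter-layer motion prediction (predicted)")

    # Residual prediction is independent of how the motion data were obtained.
    if residual_prediction_flag:
        tools.append("inter-layer residual prediction")
    return tools


print(inter_layer_tools(True, True))
print(inter_layer_tools(True, False, residual_prediction_flag=True))
print(inter_layer_tools(False, False, motion_prediction_flag=True))
print(inter_layer_tools(False, False))
```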
Spatial Scalability: Generalizing • Not only dyadic • Enhancement layer may represent only a selected rectangular area of its reference layer picture • Enhancement layer may contain additional parts beyond the borders of its reference layer picture • Tools for spatial scalable coding of interlaced sources
Spatial Scalability: Complexity Constraints • Inter-layer intra-prediction is restricted • Allowing this prediction mode in general would improve coding efficiency, but would increase decoder complexity • With the restriction, each layer can be decoded with a single motion compensation loop, unlike previous coding standards
Spatial Scalability: Coding Efficiency • Comparison to single-layer coding and simulcast • Base/enhancement layer at 352x288 / 704x576 • Only the first frame is intra-coded • Inter-layer prediction (ILP): • Intra (I), motion (M), residual (R)
Spatial Scalability: Coding Efficiency • Comparison of fully featured SVC “single-loop ILP (I, M, R)” to scalable profiles of previous standards “multi-loop ILP (I)”
Spatial Scalability: Encoder Control • JSVM software encoder control • Base layer coding parameters are optimized for that layer only → performance equal to single-layer H.264/AVC • Not necessarily suitable for efficient enhancement layer coding • Improved multi-layer encoder control • Optimized for both layers
Spatial Scalability: Encoder Control • QP_enhancement = QP_base + 4 • Hierarchical B-pictures, GoP size = 16 • Bit-rate increase relative to single-layer coding at the same quality never exceeds 10% for either layer
Overview • Background • Temporal scalability • Spatial scalability • Quality scalability • Conclusion
Quality Scalability • Special case of spatial scalability with identical picture sizes • No upsampling for inter-layer predictions • Inter-layer intra- and residual-prediction are directly performed in transform domain • Different qualities achieved by decreasing quantization step along the layers • Coarse-Grained Scalability (CGS) • A few selected bitrates are supported in the scalable bitstream • Quality scalability becomes less efficient when bitrate difference between CGS layers gets smaller
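How a smaller quantization step in the enhancement layer refines the base-layer quality can be sketched on a single transform coefficient. This is a toy scalar model, not the SVC coding process: `qstep` uses the approximation that the H.264/AVC step size starts at 0.625 for QP 0 and doubles every 6 QP, and the helper names are hypothetical.

```python
def qstep(qp):
    """Approximate quantization step size: 0.625 at QP 0, doubling every 6 QP."""
    return 0.625 * 2 ** (qp / 6)


def encode_two_layers(coeff, qp_base, qp_enh):
    """Quantize a coefficient coarsely, then refine the remaining error with a finer step.

    Returns (base reconstruction, enhancement reconstruction).
    """
    step_base, step_enh = qstep(qp_base), qstep(qp_enh)
    base_level = round(coeff / step_base)
    base_rec = base_level * step_base
    # The enhancement layer codes only what the base layer left over.
    refinement_level = round((coeff - base_rec) / step_enh)
    enh_rec = base_rec + refinement_level * step_enh
    return base_rec, enh_rec


coeff = 100.0
base_rec, enh_rec = encode_two_layers(coeff, qp_base=34, qp_enh=16)
print(abs(coeff - base_rec), abs(coeff - enh_rec))
```

In this model the reconstruction error of each layer is bounded by half its quantization step. When the two QPs are close, the refinement carries little new information, which is one way to picture why closely spaced CGS layers are less efficient.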
Quality Scalability: MGS • Medium-Grained Scalability (MGS) improves: • Flexibility of the stream • Packet-level quality scalability • Error robustness • Controlling drift propagation • Coding efficiency • Use of more information for temporal prediction
Quality Scalability: MGS • MGS: flexibility of the stream • Enhancement layer transform coefficients can be distributed among several slices • Packet-level quality scalability
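Distributing enhancement-layer transform coefficients among several slices can be sketched by splitting one block's coefficients, already in scan order, into ranges of scan positions. Partitioning by scan position is an assumption made here for illustration, and both function names are hypothetical.

```python
def split_by_scan_range(scanned, starts):
    """Split scan-ordered coefficients into fragments, one per slice.

    starts lists the first scan position of each fragment, e.g. [0, 3, 8].
    Every fragment has full length, with zeros outside its own range.
    """
    if not starts or starts[0] != 0 or list(starts) != sorted(set(starts)):
        raise ValueError("starts must begin at 0 and be strictly increasing")
    edges = list(starts) + [len(scanned)]
    fragments = []
    for begin, end in zip(edges, edges[1:]):
        fragment = [0] * len(scanned)
        fragment[begin:end] = scanned[begin:end]
        fragments.append(fragment)
    return fragments


def merge_fragments(fragments, received):
    """Rebuild the block from the first `received` fragments; the rest are dropped."""
    merged = [0] * len(fragments[0])
    for fragment in fragments[:received]:
        merged = [a + b for a, b in zip(merged, fragment)]
    return merged


# One 4x4 block of refinement coefficients in scan order (toy values).
block = [9, -4, 3, 2, 0, -1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
fragments = split_by_scan_range(block, [0, 3, 8])
print(merge_fragments(fragments, 1))
print(merge_fragments(fragments, 3) == block)
```

Each fragment can travel in its own packet, so discarding packets degrades quality gradually rather than dropping a whole quality layer at once.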
Quality Scalability: MGS • MGS: error robustness vs. coding efficiency