Video Coding for Compression . . . and Beyond
Bernd Girod
Information Systems Laboratory, Department of Electrical Engineering, Stanford University
Bit Consumption of US Households: bit equivalent, assuming state-of-the-art compression, year 2000
Desirable Compression Ratios
[Picture formats shown: CIF, QCIF]
Source: ITU-R 601, 166 Mbps
• SDTV broadcasting: ~2 Mbps, ~100 : 1
• DSL: ~200 kbps, ~1,000 : 1
• Dial-up modem, wireless link: ~20 kbps, ~10,000 : 1
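The ratios on this slide follow from dividing the ITU-R 601 source rate by each channel rate. The short sketch below redoes that arithmetic; the dictionary and function names are illustrative, and the channel rates are the approximate figures quoted on the slide.

```python
SOURCE_RATE_BPS = 166e6  # ITU-R 601 source rate quoted on the slide

# Approximate channel rates quoted on the slide, in bits per second.
CHANNEL_RATES_BPS = {
    "SDTV broadcasting": 2e6,
    "DSL": 200e3,
    "Dial-up modem, wireless link": 20e3,
}


def compression_ratio(source_bps, channel_bps):
    """Compression factor needed to fit the source into the channel."""
    return source_bps / channel_bps


ratios = {
    name: compression_ratio(SOURCE_RATE_BPS, rate)
    for name, rate in CHANNEL_RATES_BPS.items()
}

for name, ratio in ratios.items():
    print(f"{name}: about {ratio:,.0f} : 1")
```

The exact quotients are 83, 830, and 8,300, which the slide rounds up to the nearest order of magnitude.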
Outline
• Video compression – state-of-the-art
• Beyond compression
• Rate-scalable video
• Wavelet video coding
• Error-resilient video transmission
• Unequal error protection
• Optimal scheduling for packet networks
• Distributed video coding
“It has been customary in the past to transmit successive complete images of the transmitted picture.” [...] “In accordance with this invention, this difficulty is avoided by transmitting only the difference between successive images of the object.”
Motion-Compensated Hybrid Coding
[Block diagram: video in; coder control; transform/quantizer; embedded decoder with dequantizer/inverse transform; motion-compensated predictor with intra/inter switch; motion estimator; entropy coding of control data, quantized transform coefficients, and motion data]
Standards: H.261, MPEG-1, MPEG-2, H.263, MPEG-4, H.264/AVC
Features highlighted on successive builds of this slide:
• ¼-pixel accuracy
• Adaptive block sizes
• Multiple past reference frames
• Generalized B-frames
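The motion estimator and motion-compensated predictor in the diagram can be illustrated with a minimal full-search block matcher. This is only a sketch under simplifying assumptions: integer-pixel displacements, one fixed block size, one reference frame, and no transform, quantization, or entropy coding. The function names and the synthetic frames are illustrative and are not taken from any standard.

```python
def sad(cur, ref, bx, by, dx, dy, bsize):
    """Sum of absolute differences between a block of `cur` and a displaced block of `ref`."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
        for y in range(bsize)
        for x in range(bsize)
    )


def motion_search(cur, ref, bx, by, bsize=4, search=2):
    """Full search over integer displacements; returns (best_sad, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + bsize <= height
                      and 0 <= bx + dx and bx + dx + bsize <= width)
            if not inside:
                continue  # skip candidates that leave the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, bsize)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


def prediction_residual(cur, ref, bx, by, dx, dy, bsize=4):
    """Difference between the current block and its motion-compensated prediction."""
    return [
        [cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx] for x in range(bsize)]
        for y in range(bsize)
    ]


# Synthetic frames: the current frame is the reference displaced by (2, 1).
SIZE = 16
ref = [[x * x + 3 * y * y for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[(y + 1) % SIZE][(x + 2) % SIZE] for x in range(SIZE)] for y in range(SIZE)]

best_sad, dx, dy = motion_search(cur, ref, bx=4, by=4)
residual = prediction_residual(cur, ref, 4, 4, dx, dy)
```

In a hybrid coder only the motion vector and the (transformed, quantized) residual would be sent for an inter-coded block; here the residual is all zeros because the synthetic motion is a pure translation.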
Rate-Distortion Optimized Coder Control
• Minimize the Lagrangian cost function J = D + λR = Σ_i (D_i + λR_i) = Σ_i J_i, where D is the total distortion, R the total bit-rate, D_i the distortion for block i, R_i the rate for block i, and J_i the Lagrangian cost for block i
• Strategy: minimize J_i for each block i separately, using a common Lagrange multiplier λ
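The per-block strategy can be sketched in a few lines: for each block, evaluate the Lagrangian cost of every candidate mode and keep the cheapest, using one common multiplier (written `lam` here). The mode names and the (distortion, rate) numbers are hypothetical and only serve to show how the choice shifts with the multiplier.

```python
def choose_modes(blocks, lam):
    """Pick, for every block, the mode with the smallest cost D_i + lam * R_i.

    blocks: list of dicts mapping a mode name to a (distortion, rate) pair.
    Returns (chosen_modes, total_distortion, total_rate).
    """
    chosen, total_d, total_r = [], 0.0, 0.0
    for candidates in blocks:
        mode = min(
            candidates,
            key=lambda m: candidates[m][0] + lam * candidates[m][1],
        )
        d, r = candidates[mode]
        chosen.append(mode)
        total_d += d
        total_r += r
    return chosen, total_d, total_r


# Hypothetical (distortion, rate) measurements for two blocks.
blocks = [
    {"skip": (90.0, 1.0), "inter": (30.0, 20.0), "intra": (10.0, 60.0)},
    {"skip": (12.0, 1.0), "inter": (8.0, 18.0), "intra": (6.0, 55.0)},
]

for lam in (0.0, 1.0, 100.0):
    print(lam, choose_modes(blocks, lam))
```

A small multiplier favors low distortion at any rate, a large one favors cheap modes, and sweeping it traces out operating points on the rate-distortion curve.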
Multiple Reference Frames in H.264/AVC
Mobile & Calendar (CIF, 30 fps)
[Plot: PSNR Y [dB], 26 to 38, vs. R [Mbit/s], 0 to 4; curves: PPP... with 1 previous reference; PPP... with 5 previous references; PBB... with classic B pictures; PBB... with generalized B pictures]
Annotations on successive builds of this slide: ~15%, >25%, ~40%
Surprising Success of ITU-T Rec. H.263
• What H.263 was developed for: analog videophone
• . . . and what it was used for: Internet video streaming
Internet Video Streaming
[Diagram: media server connected through the Internet to streaming clients over DSL, dial-up modem, and wireless links]
• How to accommodate heterogeneous bit-rates?
• How to react to network congestion?
• How to mitigate late or lost packets?
Fine Granular Scalability (FGS)
[Diagram: base layer at 20 kbps plus variable bit-rate enhancement layer]
Efficiency gap: H.264 with/without FGS option, Foreman sequence (5 fps), ~2 dB gap
Wavelet Video Coder
[Diagram: original video frames 0 to 7; temporal wavelet transform into subbands H, LH, LLH, LLL; spatial wavelet transform; embedded quantization & entropy coding]
[Taubman & Zakhor, 1994] [Ohm, 1994] [Choi & Woods, 1999] [Hsiang & Woods, VCIP ’99] . . . and others
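Lifting with Motion Compensation
[Diagram, analysis: even and odd frames pass through predict (P) and update (U) steps, producing the low band and the high band]
[Diagram, synthesis: the low band and the high band pass through the P and U steps to recover the even and odd frames]
[Secker & Taubman, 2001] [Popescu & Bottreau, 2001]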
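The predict/update structure of lifting can be sketched with a temporal Haar transform on one pair of frames. This is a simplified illustration: it assumes zero motion (the motion compensation inside P and U is replaced by the identity), represents each frame as a flat list of samples, and uses Haar rather than the 5/3 kernel mentioned elsewhere in the deck. The function names are illustrative.

```python
def analyze(even, odd):
    """One temporal lifting stage: predict the odd frame, then update the even frame."""
    high = [o - e for e, o in zip(even, odd)]        # P: high band = prediction error
    low = [e + h / 2 for e, h in zip(even, high)]    # U: low band = temporal average
    return low, high


def synthesize(low, high):
    """Invert the lifting steps in reverse order with opposite signs."""
    even = [l - h / 2 for l, h in zip(low, high)]    # undo U
    odd = [h + e for e, h in zip(even, high)]        # undo P
    return even, odd


even_frame = [10, 20, 30, 40]
odd_frame = [12, 20, 28, 44]

low_band, high_band = analyze(even_frame, odd_frame)
rec_even, rec_odd = synthesize(low_band, high_band)
```

Because synthesis simply undoes each lifting step, the transform stays invertible whatever operator is used inside P and U, which is why motion compensation can be placed there.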
MC Wavelet Coding vs. H.264/AVC
[Plot: luminance PSNR (dB), 20 to 38, vs. bit-rate (Mbps), 0.2 to 2.0; curves: non-scalable H.264/AVC; scalable MC 5/3 wavelet]
• Sequence: Mobile CIF
• H.264/AVC
• high complexity RD control
• CABAC
• PBBPBBP . . .
• 5 prev/3 future reference frames
• data courtesy of M. Flierl
[Taubman & Secker, VCIP 2003] courtesy D. Taubman
Wavelet Synthesis with Lossy Motion Vector
[Diagram: video in; motion estimator; MC wavelet transform; two embedded encoding/decoder paths, each minimizing J = D + λR; inverse wavelet transform; video out]
[Taubman & Secker, ICIP03]
R-D Performance with Lossy Motion Vector
[Plot: video PSNR (dB), 24 to 40, vs. bit rate (kbps), 0 to 1200, CIF Foreman; curves: non-embedded single-rate; embedded wavelet coefficients with lossless motion; embedded wavelet coefficients with lossy motion]
[Taubman & Secker, VCIP 2003] courtesy D. Taubman
Priority Encoding Transmission (PET)
[Diagram: base layer and enhancement layer spread across a block of packets sent over the packet network; each Reed-Solomon codeword of N symbols holds K information symbols and N-K redundancy symbols]
[Albanese, Blömer, Edmonds, Luby, Sudan, 1996] [Davis & Danskin, 1996] [Horn, Stuhlmuller, Link, Girod, 1999] [Puri, Ramchandran, 1999] [Mohr, Riskin, Ladner, 2000] [Stankovic, Hamzaoui, Xiong, 2002] [Chou, Wang, Padmanabhan, 2003] . . . and many more . . .
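The effect of giving each layer its own K can be sketched numerically. A Reed-Solomon codeword with K information symbols out of N can be recovered from any K received symbols, so if each symbol of a codeword travels in a different packet, a layer is decodable whenever at least K of the N packets arrive. The sketch below assumes independent packet losses, and the (N, K) pairs and the loss rate are hypothetical, not values from the talk.

```python
from math import comb


def decode_probability(n, k, loss):
    """Probability that at least k of n packets arrive, with independent loss rate `loss`."""
    return sum(
        comb(n, received) * (1 - loss) ** received * loss ** (n - received)
        for received in range(k, n + 1)
    )


# Hypothetical unequal protection: stronger code for the base layer.
N = 10
K_BASE, K_ENHANCEMENT = 6, 9
LOSS = 0.1

p_base = decode_probability(N, K_BASE, LOSS)
p_enhancement = decode_probability(N, K_ENHANCEMENT, LOSS)
print(f"base layer decodable: {p_base:.4f}")
print(f"enhancement layer decodable: {p_enhancement:.4f}")
```

Lowering K for the base layer spends more redundancy on it, so it survives loss patterns that already wipe out the enhancement layer.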
Packet Delay Jitter and Loss
[Plot: pdf of packet delay, with loss probability ε, weight (1-ε) on the delay density, and delay offset κ; annotations relate lead-time to loss probability]
Smart Prefetching
Idea: Send more important packets earlier to allow for more retransmissions
[Diagram: messages exchanged between server and client across the Internet: request stream, rate-distortion preamble, packet schedule, video packets, and repeated updated packet schedules]
[Podolsky, McCanne, Vetterli 2000] [Miao, Ortega 2000] [Chou, Miao 2001]
Rate-Distortion Preamble
[Diagram: frame sequence I B P B P B P I B P B P B P …]
• Each media packet n is labeled by
• B_n: size [in bits] of data unit n
• Δd_n: distortion reduction if n is decoded
• t_n: decoding deadline for n
For video: Δd_n must be made “state-dependent” to accurately capture concealment
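The three labels map naturally onto a small record type. The sketch below is only an illustration of the data structure plus one simple ordering heuristic (distortion reduction per bit, then earlier deadline); it is not the rate-distortion optimized scheduling of the cited papers, and it ignores the state dependence noted above. The class name, field names, and packet values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MediaPacket:
    name: str
    size_bits: int               # B_n
    distortion_reduction: float  # Δd_n, treated here as a fixed number
    deadline_ms: float           # t_n


def prefetch_order(packets):
    """Order packets by distortion reduction per bit, most valuable first.

    Ties are broken in favor of the earlier decoding deadline.
    """
    return sorted(
        packets,
        key=lambda p: (-p.distortion_reduction / p.size_bits, p.deadline_ms),
    )


# Hypothetical preamble entries for one I, one B, and one P frame.
preamble = [
    MediaPacket("B1", size_bits=1500, distortion_reduction=15.0, deadline_ms=100.0),
    MediaPacket("P2", size_bits=3000, distortion_reduction=90.0, deadline_ms=200.0),
    MediaPacket("I0", size_bits=8000, distortion_reduction=400.0, deadline_ms=0.0),
]

order = [p.name for p in prefetch_order(preamble)]
print(order)
```

Sending the I frame first and the B frame last leaves the most retransmission opportunities for the packets whose loss would hurt most.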
Markov Decision Tree for One Packet
N transmission opportunities before deadline, at times t_current, t_current + Δt, t_current + 2Δt, ...
[Diagram: tree branching on actions (send: 1 or 0) and observations (ack: 1 or 0)]
“Policy” minimizing J = D + λR
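A toy version of the policy search can be written by brute force. The sketch below assumes a much simpler channel than the talk: every transmission independently reaches the client in time with probability `p_fwd`, and the acknowledgment of a delivered packet is seen before the next opportunity with probability `p_back`. A policy is a tuple of send/don't-send bits that is followed until an acknowledgment arrives. The cost is the probability that the packet is never delivered plus the multiplier times the expected number of transmissions. All names and parameter values are illustrative.

```python
from itertools import product


def evaluate_policy(policy, p_fwd=0.8, p_back=0.8):
    """Return (error_probability, expected_transmissions) for one packet."""
    undelivered = 1.0  # P(no ack seen and packet not yet delivered)
    unacked = 0.0      # P(no ack seen although the packet was delivered)
    expected_tx = 0.0
    p_ack = p_fwd * p_back  # a transmission is delivered and its ack is seen
    for send in policy:
        if not send:
            continue
        # The sender transmits whenever it has not seen an acknowledgment.
        expected_tx += undelivered + unacked
        new_undelivered = undelivered * (1.0 - p_fwd)
        new_unacked = undelivered * p_fwd * (1.0 - p_back) + unacked * (1.0 - p_ack)
        undelivered, unacked = new_undelivered, new_unacked
    return undelivered, expected_tx


def best_policy(n, lam, p_fwd=0.8, p_back=0.8):
    """Exhaustively search all 2**n policies for the smallest error + lam * transmissions."""
    def cost(policy):
        error, tx = evaluate_policy(policy, p_fwd, p_back)
        return error + lam * tx
    return min(product((0, 1), repeat=n), key=cost)


print(best_policy(3, 0.0))    # transmissions are free: always send
print(best_policy(3, 10.0))   # transmissions are very expensive: never send
```

Exhaustive search is only practical for a handful of opportunities; it is used here to keep the trade-off between delivery probability and transmission cost visible.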
R-D Optimized Streaming Performance
[Plot: PSNR [dB] vs. bit-rate [kbps]; annotation: ~50 %]
• Foreman
• 120 frames
• 10 fps, I-P-P-…
• H.263+ 2 Layer SNR scalable
• 20 frame GOP
• Copy Concealment
• 20 % loss forward and back
• Γ-distributed delay
• κ = 10 ms
• μ = 50 ms
• σ = 23 ms
• Pre-roll 400 ms
Naive Coding Questions • To achieve graceful degradation in case of channel error for a digitally encoded signal, is an embedded signal representation (aka layers, aka data partitioning) always needed? • Can one, in general, send refinement information for an analog (i.e. uncoded) signal transmission over a noisy channel?
Digitally Enhanced Analog Transmission
[Diagram: the signal is sent over an analog channel (uncoded), whose output serves as side info for a Wyner-Ziv decoder; a Wyner-Ziv encoder sends additional bits over a digital channel]
• Forward error protection of the signal waveform
• Information-theoretic bounds [Shamai, Verdu, Zamir, 1998]
• “Systematic lossy source-channel coding”
Forward Error Protection of Compressed Video
[Diagram: video S is coded by any old video encoder, sent over an error-prone channel, and decoded by a video decoder with error concealment, giving S’; this path is labeled as the analog channel (uncoded); Wyner-Ziv encoders A and B feed Wyner-Ziv decoders A and B, which output S* and S**]
Graceful degradation without a layered signal representation
[Aaron, Rane, Girod, ICIP 2003]
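Wyner-Ziv MPEG Codec
[Block diagram components: main MPEG encoder with input S; channel; main decoding path (ED, Q⁻¹, T⁻¹, MC) with output S’ used as side information; coarse MPEG encoder operating on the reconstructed frame at the encoder, followed by an R-S encoder (Slepian-Wolf encoder), together forming the Wyner-Ziv encoder; coarse MPEG encoder and R-S decoder at the receiver; coarse decoding path (ED, q⁻¹, T⁻¹, MC) with output S*]
[Rane, Aaron, Girod, VCIP 2004]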
Graceful Degradation with Forward Error Protection
• FEC: main stream @ 1.092 Mbps, FEC (n,k) = (40,36), FEC bit-rate = 120 kbps, total = 1.2 Mbps
• FEP: WZ stream @ 270 kbps, FEP (n,k) = (52,36), WZ bit-rate = 120 kbps, total = 1.2 Mbps
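The two 120 kbps figures are consistent with sending only the parity symbols of a systematic (n,k) code: the parity rate is the protected stream's rate times (n-k)/k. The sketch below checks that arithmetic. It assumes, as the totals on the slide suggest but do not state outright, that the main stream runs at 1.092 Mbps in both configurations and that the 270 kbps coarse description itself is not transmitted. The function name is illustrative.

```python
def parity_rate_kbps(stream_kbps, n, k):
    """Bit-rate of the n - k parity symbols that protect k source symbols."""
    return stream_kbps * (n - k) / k


MAIN_STREAM_KBPS = 1092

# FEC: Reed-Solomon parity computed over the main stream itself.
fec_kbps = parity_rate_kbps(MAIN_STREAM_KBPS, n=40, k=36)

# FEP: parity computed over the coarse 270 kbps Wyner-Ziv description.
fep_kbps = parity_rate_kbps(270, n=52, k=36)

print(f"FEC: {fec_kbps:.1f} kbps parity, total {MAIN_STREAM_KBPS + fec_kbps:.0f} kbps")
print(f"FEP: {fep_kbps:.1f} kbps parity, total {MAIN_STREAM_KBPS + fep_kbps:.0f} kbps")
```

Both schemes spend about the same extra 120 kbps, but the FEP code has far more parity per source symbol (16 of 52 versus 4 of 40) because it protects a much coarser description.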
Visual Comparison of Degradation at Same PSNR: Foreman, 50 CIF frames @ symbol error rate = 4 × 10⁻⁴. With FEC: 1 Mbps + 120 kbps (38.32 dB). With FEP: 1 Mbps + 120 kbps (38.78 dB).
Superior Robustness of FEP: Foreman, 50 CIF frames @ symbol error rate = 10⁻³. With FEC: 1 Mbps + 120 kbps (33.03 dB). With FEP: 1 Mbps + 120 kbps (38.40 dB).
Lossy Compression with Side Information
[Diagrams: two source, encoder, decoder systems that differ in where the side information is available]
[Wyner, Ziv, 1976] For MSE distortion and Gaussian statistics, the rate-distortion functions of the two systems are the same.
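Decoding with side information that the encoder never sees can be illustrated with a scalar toy example. The sketch below uses plain modulo (coset) binning of quantization indices, which stands in for the far more capable Slepian-Wolf codes used in practice; it is not the scheme from the talk. The encoder sends only the coset index of the quantized sample, and the decoder picks the coset member closest to its side information. Function names, step size, and sample values are illustrative.

```python
def wz_encode(x, step, num_cosets):
    """Quantize x and transmit only the coset index of the quantization index."""
    q = round(x / step)
    return q % num_cosets


def wz_decode(coset, y, step, num_cosets):
    """Reconstruct using the coset member whose value is closest to side information y."""
    q_y = round(y / step)
    # Largest index not above q_y that belongs to the received coset.
    base = q_y - ((q_y - coset) % num_cosets)
    candidates = (base, base + num_cosets)
    q = min(candidates, key=lambda c: abs(c * step - y))
    return q * step


STEP, NUM_COSETS = 2, 4

x = 37.3   # source sample, known only to the encoder
y = 36.1   # correlated side information, known only to the decoder

coset_index = wz_encode(x, STEP, NUM_COSETS)   # 2 bits instead of the full index
x_hat = wz_decode(coset_index, y, STEP, NUM_COSETS)
print(coset_index, x_hat)
```

The reconstruction is correct as long as the side information stays within roughly half a coset spacing of the source; beyond that the decoder selects the wrong coset member, which is the toy analogue of a Slepian-Wolf decoding failure.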
Ultra-Low-Complexity Video Coding
[Block diagram: WZ frames X enter the intraframe encoder (scalar quantizer, turbo encoder, buffer); the interframe decoder (turbo decoder, reconstruction) requests bits from the buffer and outputs X’, using side information Y from interpolation/extrapolation of decoded key frames K’; the turbo encoder, buffer, and turbo decoder form the Slepian-Wolf codec; key frames K use conventional intraframe coding and decoding]
[Aaron, Zhang, Girod, Asilomar 2002] [Aaron, Rane, Zhang, Girod, DCC 2003]
Ultra-Low-Complexity Video Coder: R-D Performance
[Plot annotations: 3 dB, 8 dB]
• Sequence: Foreman
• WZ frames: even frames
• Key frames: odd frames
• Side information: motion-compensated interpolation of key frames
Ultra-Low-Complexity Video Coder: Wyner-Ziv Codec (274 kbps, 39.0 dB) vs. H.263+ Intraframe Coding (330 kbps, 32.9 dB)
Ultra-Low-Complexity Video Coder: Wyner-Ziv Codec (274 kbps, 39.0 dB) vs. H.263+ I-B-I-B (276 kbps, 41.8 dB)
Stanford Camera Array Courtesy Marc Levoy, Stanford Computer Graphics Lab
Light Field Compression
[Two images, labeled in slide order: Wyner-Ziv, Pixel-Domain; JPEG-2000]
[Captions, in slide order: Rate: 0.11 bpp, PSNR 39.9 dB; Rate: 0.11 bpp, PSNR 37.4 dB]
Conclusions
• Video compression is very important . . . but there is more to video coding than compression
• Rate-scalable video representations: MC lifting breakthrough
• Robust video transmission
• Virtual priority mechanisms by packet scheduling
• RD gains easily larger than from super-clever compression
• Distributed video coding: radically different approach
• Graceful degradation w/o layers
• Ultra-low-complexity coders
• Ubiquitous J = D + λR
Acknowledgments
Anne M. Aaron, Jacob Chakareski, Philip A. Chou, Markus Flierl, Sang-eun Han, Mark Kalman, Marc Levoy, Yi Liang, Shantanu Rane, David Rebollo-Monedero, Andrew Secker, David Taubman, Thomas Wiegand, Xiaoqing Zhu, Rui Zhang
J = D + λR
“Progress is a wonderful thing, if only it would stop . . .” Robert Musil