Video Coding For Compression . . . and Beyond

Bernd Girod, Information Systems Laboratory, Department of Electrical Engineering, Stanford University

Presentation Transcript


  1. Video Coding For Compression . . . and Beyond. Bernd Girod, Information Systems Laboratory, Department of Electrical Engineering, Stanford University

  2. Bit Consumption of US Households. Bit equivalent, assuming state-of-the-art compression, year 2000

  3. Desirable Compression Ratios. Source: ITU-R 601, 166 Mbps. SDTV broadcasting, ~2 Mbps: ~100 : 1. DSL, ~200 kbps: ~1,000 : 1. Dial-up modem or wireless link, ~20 kbps: ~10,000 : 1. [Image labels: CIF, QCIF]
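
The ratios on slide 3 follow from dividing the source rate by the channel rate. The sketch below is not part of the talk; it only redoes that arithmetic with the figures quoted on the slide, and the variable and function names are chosen here for illustration.

```python
# Raw ITU-R 601 rate quoted on the slide, in bits per second.
SOURCE_BPS = 166e6

# Channel rates quoted on the slide, in bits per second.
channels = {
    "SDTV broadcasting": 2e6,
    "DSL": 200e3,
    "Dial-up modem / wireless link": 20e3,
}


def compression_ratio(source_bps, channel_bps):
    """Return how many source bits must fit into each channel bit."""
    return source_bps / channel_bps


ratios = {name: compression_ratio(SOURCE_BPS, bps) for name, bps in channels.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:,.0f} : 1")
```

The exact quotients are 83 : 1, 830 : 1, and 8,300 : 1, which the slide rounds up to the nearest power of ten.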

  4. Outline • Video compression – state-of-the-art • Beyond compression • Rate-scalable video • Wavelet video coding • Error-resilient video transmission • Unequal error protection • Optimal scheduling for packet networks • Distributed video coding

  5. Outline • Video compression – state-of-the-art • Beyond compression • Rate-scalable video • Wavelet video coding • Error-resilient video transmission • Unequal error protection • Optimal scheduling for packet networks • Distributed video coding

  6. “It has been customary in the past to transmit successive complete images of the transmitted picture.” [...] “In accordance with this invention, this difficulty is avoided by transmitting only the difference between successive images of the object.”

  7. Motion-Compensated Hybrid Coding. [Block diagram: video in; coder control (control data); transform/quantizer (quantized transform coefficients); decoder with dequantizer/inverse transform; motion-compensated predictor with intra/inter switch; motion estimator (motion data); entropy coding] Standards: H.261, MPEG-1, MPEG-2, H.263, MPEG-4, H.264/AVC

  8. Motion-Compensated Hybrid Coding. [Same block diagram, annotated: ¼-pixel accuracy] Standards: H.261, MPEG-1, MPEG-2, H.263, MPEG-4, H.264/AVC

  9. Motion-Compensated Hybrid Coding. [Same block diagram, annotated: adaptive block sizes . . .] Standards: H.261, MPEG-1, MPEG-2, H.263, MPEG-4, H.264/AVC

  10. Motion-Compensated Hybrid Coding. [Same block diagram, annotated: multiple past reference frames] Standards: H.261, MPEG-1, MPEG-2, H.263, MPEG-4, H.264/AVC

  11. Motion-Compensated Hybrid Coding. [Same block diagram, annotated: generalized B-frames] Standards: H.261, MPEG-1, MPEG-2, H.263, MPEG-4, H.264/AVC

  12. Rate-Distortion Optimized Coder Control • Minimize Lagrangian cost function J = D + λR = Σ_i (D_i + λR_i) = Σ_i J_i, where R is the total bit-rate, D the total distortion, D_i the distortion for block i, R_i the rate for block i, and J_i the Lagrangian cost for block i • Strategy: minimize J_i for each block i separately, using a common Lagrange multiplier λ
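
The per-block strategy on slide 12 can be sketched as follows. This is a minimal illustration, not the coder control of any particular standard: the mode names, distortion values, and rates are invented, and `choose_modes` is a hypothetical helper.

```python
def choose_modes(blocks, lam):
    """Pick, for every block, the option with the smallest J_i = D_i + lam * R_i.

    blocks: list of blocks; each block is a list of (mode, distortion, rate)
            candidates. All candidate values here are made up for illustration.
    lam:    the common Lagrange multiplier shared by all blocks.
    """
    decisions = []
    total_d = 0.0
    total_r = 0.0
    for candidates in blocks:
        # Independent per-block minimisation of the Lagrangian cost.
        mode, d, r = min(candidates, key=lambda c: c[1] + lam * c[2])
        decisions.append(mode)
        total_d += d
        total_r += r
    return decisions, total_d, total_r


# Hypothetical candidates: (mode, distortion, rate in bits).
blocks = [
    [("skip", 900.0, 1), ("inter", 200.0, 40), ("intra", 150.0, 120)],
    [("skip", 50.0, 1), ("inter", 40.0, 35), ("intra", 30.0, 110)],
]

for lam in (0.0, 1.0, 50.0):
    modes, d, r = choose_modes(blocks, lam)
    print(f"lambda={lam}: modes={modes}, D={d}, R={r}, J={d + lam * r}")
```

Because every block uses the same multiplier, sweeping the multiplier from small to large moves the whole frame from low distortion at high rate towards low rate at high distortion.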

  13. Multiple Reference Frames in H.264/AVC. Mobile & Calendar (CIF, 30 fps). [Plot: PSNR Y [dB], 26 to 38, versus R [Mbit/s], 0 to 4. Curves: PBB... with generalized B pictures; PBB... with classic B pictures; PPP... with 5 previous references; PPP... with 1 previous reference. Annotation: ~15%]

  14. Multiple Reference Frames in H.264/AVC. Mobile & Calendar (CIF, 30 fps). [Same plot. Annotation: >25%]

  15. Multiple Reference Frames in H.264/AVC. Mobile & Calendar (CIF, 30 fps). [Same plot. Annotation: ~40%]

  16. Outline • Video compression – state-of-the-art • Beyond compression • Rate-scalable video • Wavelet video coding • Error-resilient video transmission • Unequal error protection • Optimal scheduling for packet networks • Distributed video coding

  17. Surprising Success of ITU-T Rec. H.263. What H.263 was developed for: analog videophone . . . and what it was used for: Internet video streaming

  18. Internet Video Streaming • How to accommodate heterogeneous bit-rates? • How to react to network congestion? • How to mitigate late or lost packets? [Diagram: media server, Internet, streaming clients on DSL, dial-up modem, and wireless links]

  19. Fine Granular Scalability (FGS). Base layer: 20 kbps. Enhancement layer: variable bit-rate. [Plot: H.264 with/without FGS option, Foreman sequence (5 fps). Annotation: efficiency gap, ~2 dB]

  20. Wavelet Video Coder. [Block diagram: original video frames (0 to 7); spatial wavelet transform; temporal wavelet transform with bands H, LH, LLH, LLL; embedded quantization & entropy coding] • [Taubman & Zakhor, 1994] [Ohm, 1994] [Choi & Woods, 1999] [Hsiang & Woods, VCIP ’99] . . . and others

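  21. Lifting. [Diagram, analysis: even frames and odd frames; predict (P) and update (U) steps with motion compensation; low band and high band outputs. Diagram, synthesis: low band and high band inputs; U and P steps; even frames and odd frames outputs] [Secker & Taubman, 2001] [Popescu & Bottreau, 2001]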
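
The key property of the lifting structure on slide 21 is that synthesis undoes analysis exactly, whatever the predict and update operators are. The sketch below illustrates this with a Haar-style integer lifting pair on toy one-dimensional "frames". It is an assumption-laden illustration, not the cited authors' scheme: the circular `shift` is only a stand-in for motion compensation, and all names are hypothetical.

```python
def shift(frame, dx):
    """Toy stand-in for motion compensation: circular shift of a 1-D frame."""
    dx %= len(frame)
    return frame[-dx:] + frame[:-dx] if dx else list(frame)


def analyze(even, odd, dx):
    """Split an (even, odd) frame pair into a low band and a high band."""
    # Predict step (P): odd frame minus the motion-compensated even frame.
    high = [o - p for o, p in zip(odd, shift(even, dx))]
    # Update step (U): add half of the inverse-compensated high band.
    low = [e + u // 2 for e, u in zip(even, shift(high, -dx))]
    return low, high


def synthesize(low, high, dx):
    """Invert analyze() by undoing the lifting steps in reverse order."""
    even = [l - u // 2 for l, u in zip(low, shift(high, -dx))]
    odd = [h + p for h, p in zip(high, shift(even, dx))]
    return even, odd


even = [10, 20, 30, 40]
odd = [40, 10, 20, 30]          # the even frame moved right by one sample
low, high = analyze(even, odd, dx=1)
print(low, high)
print(synthesize(low, high, dx=1) == (even, odd))
```

With the correct displacement the high band is all zeros, and reconstruction stays exact even when the displacement is wrong, because synthesis subtracts exactly what analysis added.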

  22. MC Wavelet Coding vs. H.264/AVC • Sequence: Mobile CIF • H.264/AVC • high complexity RD control • CABAC • PBBPBBP . . . • 5 prev/3 future reference frames • data courtesy of M. Flierl [Plot: luminance PSNR (dB), 20 to 38, versus bit-rate (Mbps), 0.2 to 2.0. Curves: non-scalable H.264/AVC; scalable MC 5/3 wavelet] [Taubman & Secker, VCIP 2003] courtesy D. Taubman

  23. Wavelet Synthesis with Lossy Motion Vector. [Block diagram: video in; motion estimator; MC wavelet transform; two embedded encoding/decoder paths, each minimizing J = D + λR; inverse wavelet transform; video out] [Taubman & Secker, ICIP03]

  24. R-D Performance with Lossy Motion Vector. [Plot: video PSNR (dB), 24 to 40, versus bit rate (kbps), 0 to 1200, CIF Foreman. Curves: non-embedded single-rate; embedded wavelet coefficients, lossless motion; embedded wavelet coefficients, lossy motion] [Taubman & Secker, VCIP 2003] courtesy D. Taubman

  25. Outline • Video compression – state-of-the-art • Beyond compression • Rate-scalable video • Wavelet video coding • Error-resilient video transmission • Unequal error protection • Optimal scheduling for packet networks • Distributed video coding

  26. Priority Encoding Transmission (PET). [Diagram: base layer and enhancement layer spread over a block of packets sent through a packet network; each Reed-Solomon codeword of N symbols holds K information symbols and N-K redundancy symbols] [Albanese, Blömer, Edmonds, Luby, Sudan, 1996] [Davis & Danskin, 1996] [Horn, Stuhlmuller, Link, Girod, 1999] [Puri, Ramchandran, 1999] [Mohr, Riskin, Ladner, 2000] [Stankovic, Hamzaoui, Xiong, 2002] [Chou, Wang, Padmanabhan, 2003] . . . and many more . . .
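
An (N, K) Reed-Solomon codeword with one symbol per packet can be decoded from any K of its N symbols, so giving the base layer a smaller K than the enhancement layer protects it more strongly. The sketch below quantifies that effect under an assumption the slide does not state, namely independent packet losses; the block size, K values, and loss rate are hypothetical.

```python
from math import comb


def recovery_probability(n, k, loss):
    """Probability that at least k of n packets arrive, i.e. that an (n, k)
    Reed-Solomon codeword with one symbol per packet can be decoded,
    assuming independent packet losses with probability `loss`."""
    return sum(comb(n, j) * (1 - loss) ** j * loss ** (n - j) for j in range(k, n + 1))


N = 10        # packets per block (hypothetical)
LOSS = 0.1    # packet loss probability (hypothetical)

base = recovery_probability(N, 6, LOSS)          # base layer: K = 6
enhancement = recovery_probability(N, 9, LOSS)   # enhancement layer: K = 9

print(f"base layer decodable:        {base:.4f}")
print(f"enhancement layer decodable: {enhancement:.4f}")
```

Lowering K raises the decoding probability at the cost of more redundancy symbols per codeword, which is the trade-off an unequal error protection scheme tunes per layer.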

  27. Packet Delay Jitter and Loss. [Plot: pdf of packet delay, with loss probability ε, delivery probability (1-ε), and delay shift κ; companion plots of loss probability versus lead-time]

  28. Smart Prefetching. Idea: Send more important packets earlier to allow for more retransmissions. [Diagram: server and client connected over the Internet; request stream; rate-distortion preamble; video packets; packet schedule, repeatedly updated] [Podolsky, McCanne, Vetterli 2000] [Miao, Ortega 2000] [Chou, Miao 2001]

  29. Rate-Distortion Preamble • Each media packet n is labeled by • B_n: size [in bits] of data unit n • Δd_n: distortion reduction if n is decoded • t_n: decoding deadline for n [Diagram: frame sequence I B P B P B P I B P B P B P …]

  30. Rate-Distortion Preamble • Each media packet n is labeled by • B_n: size [in bits] of data unit n • Δd_n: distortion reduction if n is decoded • t_n: decoding deadline for n • For video: Δd_n must be made “state-dependent” to accurately capture concealment [Diagram: frame sequence I B P B P B P I B P B P B P …]

  31. Markov Decision Tree for One Packet. N transmission opportunities before deadline, at t_current, t_current + Δt, t_current + 2Δt, ... [Tree diagram: at each opportunity the action is send (1) or do not send (0), and the observation is ack (1) or no ack (0)] “Policy” minimizing J = D + λR
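
A heavily simplified version of the policy search on slide 31 can be sketched by brute force. The assumptions here go beyond the slide: acknowledgments are taken to be instantaneous and reliable, so the sender stops as soon as one copy gets through, and each opportunity has a fixed, made-up probability that a copy sent then arrives before the deadline. Function names and all numbers are hypothetical.

```python
from itertools import product


def evaluate(policy, success, delta_d, size_bits, lam):
    """Expected Lagrangian cost J = D + lam * R of one policy for one packet.

    policy:  tuple of 0/1, one entry per transmission opportunity
             (1 = send if no acknowledgment has been observed yet).
    success: success[i] is the assumed probability that a copy sent at
             opportunity i arrives before the decoding deadline.
    """
    p_unacked = 1.0       # probability that no earlier copy got through
    expected_sends = 0.0
    for send, q in zip(policy, success):
        if send:
            expected_sends += p_unacked
            p_unacked *= 1.0 - q
    distortion = delta_d * p_unacked      # expected distortion not removed
    rate = size_bits * expected_sends     # expected number of bits sent
    return distortion + lam * rate


def best_policy(success, delta_d, size_bits, lam):
    """Enumerate all 2^N send/no-send policies and return the cheapest."""
    policies = product((0, 1), repeat=len(success))
    return min(policies, key=lambda pi: evaluate(pi, success, delta_d, size_bits, lam))


# Hypothetical packet: later opportunities leave less lead-time, so they
# are less likely to beat the deadline.
success = [0.9, 0.5, 0.3]
for lam in (0.0, 0.04, 1.0):
    print(lam, best_policy(success, delta_d=100.0, size_bits=1000, lam=lam))
```

A small multiplier favours using every opportunity, a large one favours not sending at all, and intermediate values drop the late, low-yield retransmissions first.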

  32. R-D Optimized Streaming Performance • Foreman • 120 frames • 10 fps, I-P-P-… • H.263+ 2 Layer SNR scalable • 20 frame GOP • Copy Concealment • 20 % loss forward and back • Γ-distributed delay • κ = 10 ms • μ = 50 ms • σ = 23 ms • Pre-roll 400 ms [Plot: PSNR [dB] versus bit-rate [kbps]. Annotation: ~50 %]

  33. Naive Coding Questions • To achieve graceful degradation in case of channel error for a digitally encoded signal, is an embedded signal representation (aka layers, aka data partitioning) always needed? • Can one, in general, send refinement information for an analog (i.e. uncoded) signal transmission over a noisy channel?

  34. Digitally Enhanced Analog Transmission • Forward error protection of the signal waveform • Information-theoretic bounds [Shamai, Verdu, Zamir, 1998] • “Systematic lossy source-channel coding” [Diagram: analog channel (uncoded) providing side info; Wyner-Ziv encoder; digital channel; Wyner-Ziv decoder]

  35. Forward Error Protection of Compressed Video. Graceful degradation without a layered signal representation. [Diagram: S; any old video encoder; error-prone channel; video decoder with error concealment; S’; Wyner-Ziv encoder A and Wyner-Ziv decoder A, output S*; Wyner-Ziv encoder B and Wyner-Ziv decoder B, output S**; label: analog channel (uncoded)] [Aaron, Rane, Girod, ICIP 2003]

  36. Wyner-Ziv MPEG Codec. [Block diagram labels: S; main MPEG encoder; channel; ED, Q^-1, T^-1, MC, with output S’; reconstructed frame at encoder; coarse MPEG encoder; R-S encoder (Slepian-Wolf encoder); Wyner-Ziv encoder; coarse MPEG encoder producing side information; R-S decoder; ED, q^-1, T^-1, MC, with output S*] [Rane, Aaron, Girod, VCIP 2004]

  37. Graceful Degradation with Forward Error Protection. Main stream @ 1.092 Mbps. FEC: (n,k) = (40,36), FEC bitrate = 120 kbps, total = 1.2 Mbps. FEP: WZ stream @ 270 kbps, (n,k) = (52,36), WZ bitrate = 120 kbps, total = 1.2 Mbps
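
The bit-rates on slide 37 are consistent with sending only the parity symbols of a systematic (n, k) block code. The sketch below checks that arithmetic. Reading the 120 kbps Wyner-Ziv rate as "parity symbols of the 270 kbps stream only" is an interpretation made here, not a statement from the slide, and the names are hypothetical.

```python
def parity_rate(payload_kbps, n, k):
    """Rate of the n - k parity symbols that a systematic (n, k) block code
    adds for every k payload symbols."""
    return payload_kbps * (n - k) / k


main_kbps = 1092.0   # main stream rate from the slide
wz_kbps = 270.0      # coarse Wyner-Ziv stream rate from the slide

fec_kbps = parity_rate(main_kbps, 40, 36)   # conventional FEC on the main stream
fep_kbps = parity_rate(wz_kbps, 52, 36)     # parity of the Wyner-Ziv stream only

print(f"FEC parity: {fec_kbps:.1f} kbps, total {(main_kbps + fec_kbps) / 1000:.2f} Mbps")
print(f"FEP parity: {fep_kbps:.1f} kbps, total {(main_kbps + fep_kbps) / 1000:.2f} Mbps")
```

Both schemes spend roughly the same 120 kbps on protection, so the comparison on the following slides is made at an equal total rate of about 1.2 Mbps.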

  38. Visual Comparison of Degradation at Same PSNR. Foreman, 50 CIF frames @ symbol error rate = 4 × 10^-4. With FEC: 1 Mbps + 120 kbps (38.32 dB). With FEP: 1 Mbps + 120 kbps (38.78 dB)

  39. Superior Robustness of FEP. Foreman, 50 CIF frames @ symbol error rate = 10^-3. With FEC: 1 Mbps + 120 kbps (33.03 dB). With FEP: 1 Mbps + 120 kbps (38.40 dB)

  40. Lossy Compression with Side Information [Wyner, Ziv, 1976]. [Two diagrams: source, encoder, decoder, with side information available at the decoder only or at both encoder and decoder] For MSE distortion and Gaussian statistics, the rate-distortion functions of the two systems are the same.

  41. Ultra-Low-Complexity Video Coding. Intraframe encoder, interframe decoder. [Block diagram: WZ frames X; scalar quantizer; Slepian-Wolf codec (turbo encoder, buffer, request bits, turbo decoder); reconstruction X’; key frames K; conventional intraframe coding; conventional intraframe decoding K’; interpolation/extrapolation producing side information Y] [Aaron, Zhang, Girod, Asilomar 2002] [Aaron, Rane, Zhang, Girod, DCC 2003]

  42. R-D Performance, Ultra-Low-Complexity Video Coder • Sequence: Foreman • WZ frames - even frames • Key frames - odd frames • Side information - motion compensated interpolation of key frames [Plot annotations: 3 dB, 8 dB]

  43. Ultra-Low-Complexity Video Coder. Wyner-Ziv Codec: 274 kbps, 39.0 dB. H.263+ Intraframe Coding: 330 kbps, 32.9 dB

  44. Ultra-Low-Complexity Video Coder. Wyner-Ziv Codec: 274 kbps, 39.0 dB. H.263+ I-B-I-B: 276 kbps, 41.8 dB

  45. Stanford Camera Array Courtesy Marc Levoy, Stanford Computer Graphics Lab

  46. Stanford Camera Array Courtesy Marc Levoy, Stanford Computer Graphics Lab

  47. Light Field Compression. Wyner-Ziv, Pixel-Domain; JPEG-2000. Rate: 0.11 bpp, PSNR 39.9 dB; Rate: 0.11 bpp, PSNR 37.4 dB

  48. Conclusions • Video compression is very important . . . but there is more to video coding than compression • Rate-scalable video representations: MC lifting breakthrough • Robust video transmission • Virtual priority mechanisms by packet scheduling • RD gains easily larger than from super-clever compression • Distributed video coding: radically different approach • Graceful degradation w/o layers • Ultra-low-complexity coders • Ubiquitous J = D + λR

  49. Acknowledgments: Anne M. Aaron, Jacob Chakareski, Philip A. Chou, Markus Flierl, Sang-eun Han, Mark Kalman, Marc Levoy, Yi Liang, Shantanu Rane, David Rebollo-Monedero, Andrew Secker, David Taubman, Thomas Wiegand, Xiaoqing Zhu, Rui Zhang. [Slide graphic: J = D + λR]

  50. Progress is a wonderful thing, if only it would stop . . . Robert Musil