Coding Formats for Images and Video. Ralf Schäfer, Fraunhofer Heinrich-Hertz-Institut, schaefer@hhi.de, http://ip.hhi.de. Outline: Introduction; Some fundamentals in image coding; JPEG-2000 and Motion JPEG-2000: Compression tools for production; MPEG-4: New functionalities for interactivity
Coding Formats for Images and Video
Ralf Schäfer
Fraunhofer Heinrich-Hertz-Institut
schaefer@hhi.de
http://ip.hhi.de
Outline
• Introduction
• Some fundamentals in image coding
• JPEG-2000 and Motion JPEG-2000: Compression tools for production
• MPEG-4: New functionalities for interactivity
• H.264/AVC: A step forward in compression technology
• Experimental Results and Comparison of MPEG-2 and H.264/AVC
• Scalable coding
• Multiview coding
• Next generation video coding
• Conclusions
Compression as enabling technology
[Diagram: live content, recorded content and computer animation pass through media encoders. Lossless or "quasi" lossless coding serves archive, storage media and post production; lossy coding serves transmission (UNICAST, MULTICAST, BROADCAST).]
Image formats and data rates
Capacity and transmission time for movies (90 min)
[Table columns: Capacity [GB] | Number of DVDs | Transmission time [h] @ 38 Mbit/s]
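The three table columns follow from simple arithmetic on the video data rate. The sketch below shows that arithmetic under stated assumptions: the 90-minute duration and the 38 Mbit/s link come from the slide, while the example rate of 166 Mbit/s and the 4.7 GB DVD capacity are illustrative values that do not appear in the source.

```python
import math

MOVIE_SECONDS = 90 * 60   # 90-minute movie, as on the slide
LINK_MBIT_S = 38.0        # link rate used on the slide
DVD_GB = 4.7              # assumption: single-layer DVD capacity

def movie_budget(rate_mbit_s):
    """Return (capacity in GB, number of DVDs, transmission time in hours)."""
    capacity_gb = rate_mbit_s * 1e6 * MOVIE_SECONDS / 8 / 1e9
    dvds = math.ceil(capacity_gb / DVD_GB)
    hours = rate_mbit_s / LINK_MBIT_S * MOVIE_SECONDS / 3600
    return capacity_gb, dvds, hours

# Assumed example rate (not taken from the slide).
capacity, dvds, hours = movie_budget(166.0)
print(f"{capacity:.2f} GB, {dvds} DVDs, {hours:.2f} h")
```

Any other format from the data-rate table can be checked the same way by passing its rate to `movie_budget`.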
Concept of DCT coding (JPEG, MPEG, H.26x)
Processing chain: block scanning -> DCT -> quantisation -> zig-zag scanning -> VLC -> channel
• Input block: 8 x 8 x 8 bit = 512 bit
• DCT coefficients: 8 x 8 x 10 bit = 640 bit
  10 92 31 12  1  0  0  0
  14 13  5  0  0  0  0  0
  15 36  3  0  0  0  0  0
   4  4  2  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
• Quantised coefficients: 8 x 8 x 4 bit = 256 bit
  10 70 30 10  0  0  0  0
  10 10 10  0  0  0  0  0
  10  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
• Zig-zag scan: 10, 70, 10, 10, 10, 30, 10, 10, 0, 0, 0, 0, ....
• VLC output: 01, 00111, 01, 01, 01, 010, 01, 01, 000001 (000001 = EOB) -> 26 bit
• Compression factor = 512/26 ≈ 20
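The zig-zag scan and bit count on this slide can be reproduced with a few lines of code. The following is a minimal sketch, not a JPEG implementation: the quantised block and the codewords are copied from the slide, and the codeword table is the slide's toy example rather than a real Huffman table.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zig-zag scan order."""
    def key(rc):
        r, c = rc
        d = r + c
        # Odd anti-diagonals run top-right to bottom-left, even ones the other way.
        return (d, r if d % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

# Quantised coefficients from the slide (all other entries are zero).
quantised = [[0] * 8 for _ in range(8)]
quantised[0][:4] = [10, 70, 30, 10]
quantised[1][:3] = [10, 10, 10]
quantised[2][0] = 10

scan = [quantised[r][c] for r, c in zigzag_order()]
last = max(i for i, v in enumerate(scan) if v != 0)
symbols = scan[:last + 1] + ["EOB"]      # trailing zeros collapse into end-of-block

# Toy codeword table as printed on the slide.
VLC = {10: "01", 70: "00111", 30: "010", "EOB": "000001"}
bits = "".join(VLC[s] for s in symbols)

print(symbols)
print(len(bits), "bit, compression factor", round(512 / len(bits), 1))
```

The scan visits the low-frequency coefficients first, so the long run of zeros ends up at the tail and is replaced by a single end-of-block code.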
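Compression using temporal prediction
[Block diagram: the prediction is subtracted from the input; the difference goes to the DCT encoder and the channel; a DCT decoder, adder and frame store rebuild the reference picture; motion estimation supplies motion vectors to the motion compensation.]
• Difference image (= 0 without motion)
• Difference image (with motion)
• Difference image (with motion compensation)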
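The effect of motion compensation on the difference image can be illustrated with a small block-matching search. This is a sketch under simplifying assumptions: full search with a sum-of-absolute-differences (SAD) cost on toy frames; the function names, block size and search range are illustrative and not taken from the slides.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a shifted reference block."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))

def estimate_motion(cur, ref, bx, by, size=4, search=2):
    """Full search over +/- search samples; returns the (dx, dy) with the smallest SAD."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= h and
                      0 <= bx + dx and bx + dx + size <= w)
            if not inside:
                continue  # candidate block would leave the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]

# Toy frames: a bright 4x4 square moves one sample to the right.
ref = [[0] * 12 for _ in range(12)]
cur = [[0] * 12 for _ in range(12)]
for y in range(4, 8):
    for x in range(4, 8):
        ref[y][x] = 200
        cur[y][x + 1] = 200

dx, dy = estimate_motion(cur, ref, bx=5, by=4)
residual_plain = sum(abs(cur[y][x] - ref[y][x]) for y in range(12) for x in range(12))
residual_mc = sad(cur, ref, 5, 4, dx, dy, 4)
print((dx, dy), residual_plain, residual_mc)
```

Without compensation the moving edges leave a non-zero difference image; with the estimated vector the block difference vanishes, which is the case the slide labels "with motion compensation".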
I, P and B frames
• I frames: intra coding (JPEG)
• P frames: uni-directional predictive coding
• B frames: bi-directional predictive coding
Example sequence: ... I B B P B B P ...
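Because a B frame is predicted from a past and a future reference, the encoder has to send that future reference first. The sketch below reorders the example sequence from display order to a possible coding order; the frame labels and the function name are illustrative assumptions.

```python
def coding_order(display):
    """Reorder frames so each B frame follows both of its reference frames."""
    out, pending_b = [], []
    for frame in display:
        if frame[0] == "B":
            pending_b.append(frame)   # wait for the next I or P reference
        else:
            out.append(frame)         # the reference frame is sent first
            out.extend(pending_b)     # then the B frames that depend on it
            pending_b = []
    return out + pending_b            # trailing B frames without a later reference

display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(coding_order(display))
```

The decoder reverses this reordering before display, which is one source of the extra delay that B frames introduce.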
ISO and ITU-T standards for image & video coding
[Chart: bit rate versus year, 1990 to 2002. Bit-rate axis: 8 kbit/s, 64 kbit/s, 1 Mbit/s, 20 Mbit/s, 100 Mbit/s, 500 Mbit/s. Applications: mobile video services, videophone/conference, CD-ROM, SDTV, HDTV, TV/HDTV production, D-Cinema production. Standards: ITU H.261, ITU H.263, MPEG-1, MPEG-2, MPEG-4, JPEG, JPEG-2000, ITU/MPEG (JVT) H.264/AVC. Further labels: Vers. 1, Vers. 2, Vers. 3.]
JPEG
• Lossy coding: DCT -> Q -> VLC -> to channel/storage media; from channel/storage media -> VLD -> Q⁻¹ -> IDCT
• Lossless coding (JPEG/JPEG-LS): (adaptive) spatial prediction -> entropy coding
• Spatial prediction: the current sample on line n is predicted from its neighbour a on line n and the neighbours c, b, d on line n-1
• Disadvantage: Motion JPEG is not standardized!
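Spatial prediction for lossless coding can be sketched with the median edge detector (MED) predictor known from JPEG-LS. The slide does not say which predictor it has in mind, so treat this as one illustrative choice. Here a is the left neighbour, b the neighbour above and c the neighbour above-left; treating missing neighbours as 0 is a simplification made for this example, not the boundary rule of either standard.

```python
def med_predict(a, b, c):
    """Median edge detector: a = left, b = above, c = above-left neighbour."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c

def residuals(image):
    """Prediction residuals for a 2-D list of samples; missing neighbours count as 0."""
    out = []
    for y, row in enumerate(image):
        out_row = []
        for x, value in enumerate(row):
            a = row[x - 1] if x > 0 else 0
            b = image[y - 1][x] if y > 0 else 0
            c = image[y - 1][x - 1] if x > 0 and y > 0 else 0
            out_row.append(value - med_predict(a, b, c))
        out.append(out_row)
    return out

image = [[100, 101, 102, 103],
         [100, 101, 102, 103],
         [100, 101, 102, 103]]
print(residuals(image))
```

After the first sample the residuals are small or zero, so the entropy coder that follows has far less to encode than the raw samples.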
JPEG2000
JPEG 2000 is the successor of the JPEG standard.
• Work started in 1997
• The most important criterion was the "overall environment" in which images would be used in future.
• JPEG 2000 is a wavelet-based compression scheme that delivers better quality than JPEG and allows "scalability" without having to store redundant data.
• JPEG 2000 delivers about 20% "better compression" than JPEG, and at more extreme compression ratios it delivers significantly better quality.
• JPEG 2000 supports both "lossless and lossy" compression in a single codec, a very desirable feature in applications such as medical imaging and post-production.
JPEG 2000
• Encoder: pre-processing -> 2D DWT -> Q -> coefficient bit modelling -> arithmetic encoder -> to channel/storage media
• Decoder: from channel/storage media -> arithmetic decoder -> coefficient bit modelling -> Q⁻¹ -> 2D IDWT -> post-processing
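The DWT stage is what lets one codec cover both lossless and lossy operation. As a sketch, the code below implements one level of the reversible 5/3 integer wavelet in lifting form for a 1-D signal of even length; JPEG 2000 applies such a transform along rows and columns. The function names and the restriction to even lengths are simplifications made for this example.

```python
def forward_53(x):
    """One level of the reversible 5/3 lifting transform; returns (low-pass, high-pass)."""
    n = len(x)
    assert n >= 2 and n % 2 == 0
    half = n // 2
    d = []
    for i in range(half):
        right = x[2 * i + 2] if 2 * i + 2 < n else x[2 * i]   # symmetric extension
        d.append(x[2 * i + 1] - (x[2 * i] + right) // 2)      # predict step
    s = []
    for i in range(half):
        left_d = d[i - 1] if i > 0 else d[0]                  # symmetric extension
        s.append(x[2 * i] + (left_d + d[i] + 2) // 4)         # update step
    return s, d

def inverse_53(s, d):
    """Undo forward_53 exactly by running the lifting steps backwards."""
    half = len(s)
    even = [s[i] - ((d[i - 1] if i > 0 else d[0]) + d[i] + 2) // 4 for i in range(half)]
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x += [even[i], d[i] + (even[i] + right) // 2]
    return x

signal = [10, 12, 14, 200, 202, 90, 7, 7]
low, high = forward_53(signal)
print(low, high, inverse_53(low, high) == signal)
```

The low-pass half is a half-resolution version of the signal, which is the basis of resolution scalability; because every step uses integer arithmetic that can be inverted exactly, the reconstruction is lossless.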
JPEG2000: Scalability
JPEG 2000 is scalable in both SNR and resolution without transcoding.
• Example: Scalability in resolution
Motion JPEG 2000 (Part 3)
• Based on the Part 1 codec of the JPEG2000 standard
• Motion-image-specific additions
• Intraframe-based coding scheme
• MPEG-4 based file format
• Synchronisation of audio and video
• Metadata embedding
• Multi-component, multi-sampling formats, e.g. YUV 4:2:2, RGB 4:4:4
MPEG-4: New Functionalities
• The MPEG-4 scene description allows:
  • hierarchical structuring of scenes
  • combination of natural and synthetic video & audio objects
  • interaction with single scene elements
Immersive Conference Terminal
[Figure labels: cameras, 61" plasma display, audio speakers, semi-circular table]
• Seamless transition between the real and virtual world
• Life-sized upper body images
• Natural reproduction of gestures and body language
• 3D representation of the remote participants including provision of eye contact
"Intelligent scrambling" for Pay-TV
• Broadcast of a tennis match without the players
• Only those paying for admission get the players
H.264/AVC Video Coding
[Block diagram: input video signal split into macroblocks of 16x16 pixels; coder control (control data); transform/scaling/quantisation (quantised transform coefficients); embedded decoder with inverse scaling & transform; intra/inter switch; motion-compensated predictor; motion estimator (motion data); entropy coding.]
Common Elements with other Standards
• Macroblocks: 16x16 luma + 2 x 8x8 chroma samples
• Input: Association of luma and chroma and conventional sub-sampling of chroma (4:2:0)
• Block motion displacement
• Motion vectors over picture boundaries
• Variable block-size motion
• Block transforms
• Scalar quantization
• I, P, and B coding types
Motion Compensation Accuracy
[H.264/AVC encoder block diagram with de-blocking filter, intra-frame prediction, motion compensation and motion estimation.]
• Macroblock (MB) types: 16x16, 16x8, 8x16, 8x8
• 8x8 sub-block types: 8x8, 8x4, 4x8, 4x4
• Motion vector accuracy: 1/4 sample (6-tap filter)
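Quarter-sample accuracy means the predictor must interpolate values between the stored samples. The sketch below shows the idea along a single row, using the H.264/AVC luma half-sample filter taps (1, -5, 20, 20, -5, 1) and deriving a quarter-sample value as the rounded average of the two nearest positions. It is a 1-D simplification; the real process is two-dimensional, and the helper names and the edge replication are assumptions of this example.

```python
TAPS = (1, -5, 20, 20, -5, 1)   # 6-tap filter for half-sample positions

def clip255(v):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, v))

def sample(row, i):
    """Integer-position sample, replicating the edge outside the row."""
    return row[max(0, min(len(row) - 1, i))]

def half_sample(row, i):
    """Interpolated value halfway between row[i] and row[i + 1]."""
    acc = sum(t * sample(row, i - 2 + k) for k, t in enumerate(TAPS))
    return clip255((acc + 16) >> 5)          # taps sum to 32, so divide by 32 with rounding

def quarter_sample(row, i):
    """Interpolated value a quarter of the way from row[i] towards row[i + 1]."""
    return (sample(row, i) + half_sample(row, i) + 1) >> 1

row = [10, 10, 10, 50, 90, 90, 90]
print(half_sample(row, 3), quarter_sample(row, 3))
```

The negative outer taps give the filter a sharper frequency response than plain bilinear averaging, which helps the prediction keep fine detail.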
Multiple Reference Frames
[H.264/AVC encoder block diagram with de-blocking filter, intra-frame prediction, motion compensation and motion estimation.]
• Multiple Reference Frames
• Generalized B Frames
Transform Coding
[H.264/AVC encoder block diagram with de-blocking filter, intra-frame prediction, motion compensation and motion estimation.]
• 4x4 Block Integer Transform
• Main Profile: Adaptive Block Size Transform (8x4, 4x8, 8x8)
• Repeated transform of DC coefficients for 8x8 chroma and 16x16 Intra luma blocks
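The 4x4 integer transform can be written as two small matrix products. The sketch below applies the H.264/AVC forward core transform matrix to two toy blocks; the normalising scale factors, which the codec folds into quantisation, are left out, and the helper names are illustrative.

```python
# H.264/AVC 4x4 forward core transform matrix (integer approximation of the DCT).
C = [[1, 1, 1, 1],
     [2, 1, -1, -2],
     [1, -1, -1, 1],
     [1, -2, 2, -1]]

def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def forward_transform(block):
    """Y = C * X * C^T, using integer arithmetic only."""
    return matmul(matmul(C, block), transpose(C))

flat = [[7] * 4 for _ in range(4)]              # constant block
ramp = [[x for x in range(4)] for _ in range(4)]  # horizontal ramp
print(forward_transform(flat))
print(forward_transform(ramp))
```

A constant block produces a single DC coefficient, and the horizontal ramp produces energy only in the first row. Because all operations are exact integer operations, encoder and decoder cannot drift apart through rounding mismatch.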
Intra Prediction
[H.264/AVC encoder block diagram with de-blocking filter, intra-frame prediction, motion compensation and motion estimation.]
• Directional spatial prediction (9 types for luma, 1 chroma)
• e.g., Mode 3: diagonal down/right prediction; a, f, k, p are predicted by (A + 2Q + I + 2) >> 2
[Figure: prediction directions numbered 0 to 8, and the 4x4 block a to p with its neighbouring samples:]
  Q A B C D E F G H
  I a b c d
  J e f g h
  K i j k l
  L m n o p
  M
  N
  O
  P
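The formula for a, f, k and p is one case of a 3-tap smoothing filter applied along the block edge. The sketch below extends it to the whole 4x4 block by sliding the same filter along the neighbouring samples L, K, J, I, Q, A, B, C, D. This extension reflects the diagonal down/right mode as it is commonly described for H.264/AVC and should be read as an assumption, since the slide only gives the formula for the main diagonal.

```python
def predict_diag_down_right(top, left, corner):
    """top = [A, B, C, D], left = [I, J, K, L], corner = Q; returns the 4x4 prediction."""
    edge = left[::-1] + [corner] + top        # L, K, J, I, Q, A, B, C, D
    pred = [[0] * 4 for _ in range(4)]
    for y in range(4):
        for x in range(4):
            i = 4 + x - y                     # edge sample hit by the 45-degree line
            pred[y][x] = (edge[i - 1] + 2 * edge[i] + edge[i + 1] + 2) >> 2
    return pred

# Toy neighbour values (illustrative only).
A_to_D = [100, 104, 108, 112]
I_to_L = [96, 92, 88, 84]
Q = 98

pred = predict_diag_down_right(A_to_D, I_to_L, Q)
for row in pred:
    print(row)
```

On the main diagonal the filter is centred on Q and reduces to (A + 2Q + I + 2) >> 2, matching the slide; every other diagonal of the block gets its own constant value in the same way.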
Deblocking Filter
• Improves the subjective visual and the objective quality of the decoded picture, and is significantly superior to post filtering.
• Filtering affects the edges of the 4x4 block structure
• The highly content-adaptive filtering procedure mainly removes blocking artifacts and does not unnecessarily blur the visual content
Deblocking Filter: Subjective Result for Inter
[Images: without filter | with H.264/AVC deblocking]
Variable Length Coding
Two schemes depending on profile:
• Context-adaptive VLC (CAVLC)
• Context-based Adaptive Binary Arithmetic Coding (CABAC) -> 10-15% gain over CAVLC
Grouping of Capabilities into Profiles
Four profiles: Baseline, Main, Extended, and High
• Baseline (Videoconferencing & Wireless)
  • I and P picture types (not B)
  • In-loop deblocking filter
  • 1/4-sample motion compensation
  • Tree-structured motion segmentation down to 4x4 block size
  • VLC-based entropy coding
  • Some enhanced error resilience features
    • Flexible macroblock ordering
    • Arbitrary slice ordering
    • Redundant slices
Main and Extended Profiles
• Main Profile (esp. Broadcast/Entertainment)
  • All Baseline features except enhanced error resilience features
  • B pictures
  • CABAC
  • Adaptive block-size transforms
  • MB-level frame/field switching
  • Adaptive weighting for B and P picture prediction
  • Note: Main is not exactly a superset of Baseline
• Extended Profile
  • All Baseline features
  • B pictures
  • More error resilience: Data partitioning
  • SP/SI switching pictures
  • Note: Extended (profile X) is a superset of Baseline
New Features in High Profiles
• Larger transforms
  • 8x8 transform
  • Drop 4x8, 8x4, or larger, 16-point…
• Filtered intra prediction modes for 8x8 block size
• Quantization matrix
  • 4x4, 8x8, intra, inter transform coefficients weighted differently
• Coding in various color spaces
  • 4:4:4, 4:2:2, 4:2:0, Monochrome, with/without Alpha
  • New integer color transform (a VUI-message item)
High Profiles
• The High profile (HP): supports 8-bit video with 4:2:0 sampling, addressing high-end consumer use and other applications using high-resolution video without a need for extended chroma formats or extended sample accuracy.
• The High 10 profile (Hi10P): supports 4:2:0 video with up to 10 bits of representation accuracy per sample.
• The High 4:2:2 profile (H422P): supports up to 4:2:2 chroma sampling and up to 10 bits per sample.
• The High 4:4:4 profile (H444P): supports up to 4:4:4 chroma sampling and up to 12 bits per sample, and additionally supports efficient lossless region coding and an integer residual color transform for coding RGB video while avoiding color-space transformation error.
Test Set Results for Streaming Application
Subjective Comparison MPEG-2 vs. H.264/AVC
Comparison of MPEG-2 and H.264/AVC
[Images: H.264/AVC @ 512 kbit/s | MPEG-2 @ 512 kbit/s]
Test cases: CIF, 30 Hz: 512 kbit/s; CIF, 30 Hz: 340 & 1024 kbit/s
Comparison of MPEG-2 and H.264/AVC
[Images: H.264/AVC @ 340 kbit/s | MPEG-2 @ 1024 kbit/s]
Test cases: CIF, 30 Hz: 512 kbit/s; CIF, 30 Hz: 340 & 1024 kbit/s
H.264/AVC Adoptions and Applications
• Wireless broadcast and mobile networks adoptions and applications
  • Optional codec in 3GPP Release 6
  • Optional codec in DVB (AVC) for DVB-H
  • Mandatory in DMB (DAB application)
  • Mandatory codec in Japanese 1 Segment ISDB-T system
• Broadcast adoptions and applications
  • Optional codec in DVB (DVB-AVC)
  • To be adopted as optional codec by ATSC
  • Optional codec in Japan (ARIB) and Korea
  • HDTV services via satellite (DirecTV, EchoStar, BSkyB, Premiere, …)
  • The only mandatory codec for HDTV services in Europe (EICTA)
  • SDTV services via IPTV (SBC, KPN, Belgacom, France Telecom, ...)
• Storage adoptions and applications
  • Mandatory codec for HD-DVD
  • Mandatory codec for Blu-ray Disc
  • Mandatory codec for UMD in Sony Play Station Portable 3
  • Used in Apple iPod Video
• Internet
  • Used in Adobe Flash Player
HHI's role in video coding and standardization
• Associated Rapporteur of ITU-T/SG 16/VCEG (T. Wiegand) 2000 - …
• Co-chair of Joint Video Team (MPEG/VCEG) (T. Wiegand) 2001 - …
• Co-chair of MPEG Video (T. Wiegand) 2005 - …
• HHI prepared MPEG-4 reference software (RD optimised) and the H.26L proposal for the MPEG tests in 2001 -> foundation of JVT
• HHI is responsible for the integration and maintenance of the official H.264/AVC reference software (K. Sühring) 2002 - ...
• Editor of the H.264/AVC standard (T. Wiegand) 2002 - ...
• Coordinator of video for DVB-H and editor in DVB-CBMS (T. Wiegand) 2005
• Editor of the visual parts of TS 102 005 and TS 101 154 in DVB-AVC (T. Wiegand) 2003 - 2005
• Chairman of ITG-FA 3.2 "Digital Image Coding" (R. Schäfer)
• Editor of the SVC standard (H. Schwarz and T. Wiegand) 2005 - ...
• Chairman of 3DAV Group of MPEG (A. Smolic)
Scalable Video Coding
• Facing the scenario of heterogeneous media delivery:
  • Different users
  • Different needs
  • Different displays
  • Different links
• Flexible source coding, i.e. scalability, is needed
• Simple adaptation to different bit-rates, frame rates or spatial resolutions of the video content on a bit-stream level
Scalable Video Coding
[Diagram: a scalable video encoder turns the scene into a single data stream of 2048 kbit/s; scalable video decoders extract QCIF @ 7.5 Hz, CIF @ 15 Hz, CIF @ 30 Hz and TV @ 60 Hz from it. Bit rates shown: 32 kbit/s, 256 kbit/s, 512 kbit/s, 2048 kbit/s.]
SNR Scalability: Typical Encoding
[Block diagram: the H.264/AVC compatible encoder (H.264/AVC MCP & intra prediction, base layer coding of texture and motion) produces the H.264/AVC-compatible base layer bit-stream; each further layer uses hierarchical MCP & intra prediction with base layer coding of texture and motion; a multiplexer combines all layers into the scalable bit-stream.]
• Inter-layer prediction: intra, motion, residual
Graceful Degradation in Video Transmission
• Mobile ad-hoc networks: time-varying connectivity, throughput, errors, and delay
• Design a robust transmission system for video
• Combine channel coding (Raptor codes) with error resilient source coding (SVC)
• Result: graceful degradation
Single source streaming in mobile ad hoc networks
Multi source streaming in mobile ad hoc networks
Scalability of Video - Modalities
• Fidelity: change of quality (e.g. SNR), from coarse to fine
• Temporal: change of frame rate (30 Hz, 15 Hz, 7.5 Hz)
• Spatial: change of frame size (TV, CIF, QCIF)
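Temporal scalability can be sketched as assigning each frame to a layer so that dropping the highest layer halves the frame rate. The dyadic layering below is an illustrative assumption that matches the 30 Hz, 15 Hz and 7.5 Hz steps on the slide; it models only which frames are kept, not the prediction structure or the bit-stream syntax.

```python
def temporal_level(frame_index, levels=3):
    """Level 0 is the base layer; each higher level doubles the frame rate."""
    step = 1 << (levels - 1)                  # distance between base-layer frames
    for level in range(levels):
        if frame_index % (step >> level) == 0:
            return level

def extract(frames, max_level, levels=3):
    """Keep only the frames whose temporal level does not exceed max_level."""
    return [f for f in frames if temporal_level(f, levels) <= max_level]

frames = list(range(9))                       # frame indices of a 30 Hz sequence
for max_level, rate in enumerate([7.5, 15, 30]):
    print(rate, "Hz:", extract(frames, max_level))
```

Adaptation then amounts to discarding whole layers, which is the bit-stream-level operation that scalable coding aims for; no re-encoding is involved.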
3D-Television (1)
Video + Depth concept adopted by MPEG (under chairmanship of HHI)
• Coding & transmission of 2D video
• Generation of per-pixel depth information & coding of the depth map
• Rendering at the decoder
• Intermediate views can be generated within a certain operating range => head motion parallax viewing
3D-Television (2)
• Backward compatible to DVB
• Can be decoded by any existing STB
• Advanced 3D features can be used depending on the functionality of an advanced STB and the attached display
Multiview Coding (MVC) in MPEG
8 responses to the Call for Proposals on MVC were received:
• 5 from industry (or industry cooperations), 2 from research institutions, 1 from a university
• 2 from Korea, 2 from Japan, 2 from USA, 2 from Germany
[Images: examples of test sequences for the MVC test]
HHI MVC coding results summary
H.265: Next Generation Video Coding
• Objective:
  • Reduction of bit rate by 50% compared with H.264/AVC at equal quality
• Applications:
  • Mobile
  • Internet
  • (Mobile) Broadcast services
  • Immersive entertainment services
  • Digital Cinema and beyond
  • Robust video transmission
  • Interactive services with low delay