Video Compression Techniques (影像壓縮技術)
Course Outline • Introduction • Video Coding • Motion Compensated Prediction & Color Format • JPEG/JPEG2000 • H.261,H.263,H.263+,H.264 • MPEG-1,-2,-4 • Error Concealment • Rate Control
Standards Comparison • JPEG: still image coding; DCT + VLC; simple hardware, low cost • H.261: video-conferencing (64 kb/s - 1.92 Mb/s); DCT + VLC + optional integer-pixel MC; progressive video • H.263: improved H.261 with four optional modes • H.263+: improved H.263 with 12 new optional modes; error coding
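MPEG stands for Moving Picture Experts Group. It was established jointly in 1988 by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) to develop coding standards for digital moving pictures and their associated audio. • Three MPEG standards are in common use for digital video today: MPEG-1, MPEG-2, and MPEG-4. MPEG-7 and MPEG-21 are still under development. Digital audio-video systems such as VCD and DVD use MPEG-1 and MPEG-2 coding. MPEG-4 is mainly applied in low-bandwidth settings such as wireless links and the Internet. MPEG-7 and MPEG-21 focus on audiovisual content databases and content management, and are standards for the future.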
MPEG-1, MPEG-2, AND MPEG-4 (Moving Picture Experts Group) • MPEG-1 (1992): • 1.5 Mb/s • CD-ROM • MPEG-2 (1994): • 4 Mb/s to 80 Mb/s • DVD, Digital TV, HDTV • MPEG-4 (1999): • 5 kb/s - 50 Mb/s • Flexible networked multimedia applications
MPEG Video • MPEG-1 • 1.2 to 1.5 Mbps (for digital storage media) • MPEG-2 • Wider range of bit rates,optimized for 4 to 15 Mbps • Supports interlaced video • Supports scalable coding
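MPEG-1 • An international standard for coding moving pictures and their associated audio on digital storage media at data rates up to about 1.5 Mbps. MPEG-1 is used to store synchronized color video on CD-ROM. It is optimized for medium resolution and, in its optimized mode, uses the so-called Standard Interchange Format (SIF). MPEG-1 subsamples the color-difference components by two both horizontally and vertically (4:2:0, often loosely written as 4:1:1), which matches the 176 x 120 chrominance size of SIF. MPEG-1 targets VCR quality, with a video compression ratio of about 26:1.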
MPEG-1 Video Coding Standard [Figure: system block diagram with a video encoder and an audio encoder, followed by storage, transport, and editing stages, followed by a video decoder and an audio decoder]
Important Features for MPEG-1 Applications • Normal playback • Random access • Reverse playback • Fast forward / reverse searches • Audio-visual synchronization • Robustness to errors • Editability • Format flexibility • Cost tradeoffs
Typical MPEG-1 Video Source Format: Standard Interchange Format (SIF, 525-line system) • Luminance (Y): 352 pixels/line, 240 lines/frame • Chrominance (Cb): 176 pixels/line, 120 lines/frame • Chrominance (Cr): 176 pixels/line, 120 lines/frame • Uncompressed bit-rate for transmitting SIF at 30 fps is 30.4 Mb/s
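The 30.4 Mb/s figure on this slide can be checked with a few lines of arithmetic. The sketch below assumes 8 bits per sample, which the slide does not state; the function name is illustrative only.

```python
def uncompressed_bit_rate(planes, frames_per_second, bits_per_sample=8):
    """Raw bit rate of a video format given its (width, height) planes.

    bits_per_sample=8 is an assumption; the slide does not give the bit depth.
    """
    samples_per_frame = sum(width * height for width, height in planes)
    return samples_per_frame * bits_per_sample * frames_per_second


# SIF (525-line) planes from the slide: Y, Cb, Cr
SIF_525 = [(352, 240), (176, 120), (176, 120)]

rate = uncompressed_bit_rate(SIF_525, frames_per_second=30)
print(f"{rate / 1e6:.1f} Mb/s")  # 30.4 Mb/s
```

Compared with the 1.5 Mb/s MPEG-1 target, this raw rate implies a compression factor of roughly 20.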
Six Hierarchical Layers of MPEG • Sequence: Random access unit for context • GOP (Group of Pictures): Random access unit for video. Smallest independent coding unit in sequence. • Picture: Primary coding unit. Three types: Intra-Frame (I), Predicted-Frame (P), Bidirectional-Predicted-Frame (B) • Slice: Resynchronization unit. • MB (Macroblock)(16x16): Motion compensation unit. • Block(8x8): DCT unit.
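Group of Pictures (GOP) • Contains I, P, and B pictures • N = number of pictures in a GOP • M = prediction distance (spacing between anchor pictures) • Example in display order, pictures 0 to 15: I B B P B B P B B P B B P B B I (here N = 15, M = 3) [Figure: forward prediction arrows from earlier anchors, backward prediction arrows from later anchors]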
Example of Temporal Picture Structure • Display order: I0 B1 B2 P3 B4 B5 P6 B7 B8 P9 B10 B11 I12 • Coding order: I0 P3 B1 B2 P6 B4 B5 P9 B7 B8 I12 B10 B11 • P pictures use forward prediction from the previous I or P picture; B pictures use bidirectional prediction from the surrounding I/P pictures, so both anchors must be coded before the B pictures between them
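The display-to-coding reordering on this slide follows a simple rule: each B picture is held back until the next I or P picture has been emitted. A minimal sketch of that rule is given below; the function name is hypothetical and real encoders also handle GOP boundaries and headers, which are ignored here.

```python
def display_to_coding_order(picture_types):
    """Return display indices in coding order.

    picture_types is a sequence of 'I', 'P', 'B' in display order.
    B pictures wait until the next anchor (I or P) has been emitted.
    """
    coding_order = []
    pending_b = []
    for index, picture_type in enumerate(picture_types):
        if picture_type == "B":
            pending_b.append(index)
        else:
            coding_order.append(index)      # anchor goes first
            coding_order.extend(pending_b)  # then the B pictures it closes
            pending_b = []
    coding_order.extend(pending_b)  # trailing B pictures with no later anchor
    return coding_order


order = display_to_coding_order("IBBPBBPBBPBBI")
print(order)  # [0, 3, 1, 2, 6, 4, 5, 9, 7, 8, 12, 10, 11]
```

The decoder applies the inverse mapping, which is why it needs two picture stores (previous and future anchor) before it can display a B picture.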
Coding Structure of GOP I = Intra-Picture Coding, allow random access, for reference P = Predictive coding, causal prediction only, can be referenced B = Bi-directional coding, never referenced
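Frame Reordering • Encoder input (display order): 1I 2B 3B 4P 5B 6B 7P 8B 9B 10I 11B 12B 13P 14B 15B 16P • Decoder input (coding order): 1I 4P 2B 3B 7P 5B 6B 10I 8B 9B 13P 11B 12B 16P 14B 15B • GOP1 (starting at 1I) is CLOSED: it needs no pictures from another GOP • GOP2 (starting at 10I) is OPEN: its B pictures 8B and 9B also reference 7P from GOP1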
Bi-directional Prediction • The prediction from the previous frame, the prediction from the future frame, or an average of both can be used as the final prediction • The prediction error is then coded and transmitted [Figure: a block in the CURRENT B-FRAME with its BEST MATCH in the PREVIOUS P-FRAME and its BEST MATCH in the FUTURE P-FRAME]
BI-Directional Motion Estimation • Forward, backward, or average prediction: • one or two motion vectors per 16x16 block
Forward/Backward/Interpolative Decision • In a B frame, for each input macroblock, calculate the prediction: • With forward motion vectors • With backward motion vectors • With a weighted average of the forward and backward predictions • Selection is based on minimization of an error measurement (MSE, MAE, etc.)
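The decision described on this slide can be sketched as follows. This is an illustration under stated assumptions, not the standard's or any encoder's actual procedure: the error measure is MSE, the "weighted average" is taken as an equal-weight rounded average, and the function names are made up for the example. Blocks are plain lists of rows.

```python
def mean_squared_error(block_a, block_b):
    """MSE between two equally sized blocks (lists of rows)."""
    squared = [
        (a - b) ** 2
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    ]
    return sum(squared) / len(squared)


def average_blocks(forward, backward):
    """Equal-weight average with rounding (an assumed choice of weights)."""
    return [
        [(f + b + 1) // 2 for f, b in zip(row_f, row_b)]
        for row_f, row_b in zip(forward, backward)
    ]


def choose_b_prediction(current, forward_pred, backward_pred):
    """Pick the prediction mode with the smallest MSE for one macroblock."""
    candidates = {
        "forward": forward_pred,
        "backward": backward_pred,
        "interpolated": average_blocks(forward_pred, backward_pred),
    }
    errors = {mode: mean_squared_error(current, pred)
              for mode, pred in candidates.items()}
    best_mode = min(errors, key=errors.get)
    return best_mode, errors


# Tiny 2x2 "macroblock": forward is 2 too low, backward is 2 too high.
current = [[10, 20], [30, 40]]
forward = [[8, 18], [28, 38]]
backward = [[12, 22], [32, 42]]
best_mode, errors = choose_b_prediction(current, forward, backward)
print(best_mode, errors)
```

In this toy case the two single-direction errors cancel in the average, which illustrates the noise-averaging benefit attributed to B pictures.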
MPEG Encoder [Figure: encoder block diagram. Blocks shown: Source Input Picture, Frame Re-Order, Motion Estimation, Predictor, DCT, Quantizer, Inverse Quantizer, IDCT, Variable Length Coding, Multiplexing, Buffer, Regulator. Signals shown: Vectors, Modes]
MPEG Decoder [Figure: decoder block diagram. Blocks shown: Buffer, Variable Length Decoding, IDCT, Adder, Previous Picture Store, Future Picture Store, Forward Motion Compensation, Backward Motion Compensation, Interpolated Motion Compensation, Display Buffer. Output: Decoded Video]
MPEG Picture Types • Generated number of bits: I > P > B. For example, I~300 kbits, P~100-65 kbits (fast/slow motion), B~18-7 kbits (fast/slow motion), per frame. • B Pictures: • Advantages: best prediction and compression; handle object occlusion and entrance into the scene; noise averaging. • Drawbacks: encoder delay, high complexity, large encoder buffers required.
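The per-picture bit counts above translate into an average bit rate once a GOP structure is fixed. The sketch below assumes the I B B P ... pattern from the GOP slide (an I picture every 15 pictures, an anchor every 3) and 30 frames/s; both are assumptions for illustration, and the function name is hypothetical.

```python
def gop_bit_rate(n, m, bits_i, bits_p, bits_b, frames_per_second):
    """Average bit rate for a GOP of n pictures with anchor spacing m."""
    num_i = 1
    num_p = n // m - 1          # anchors other than the I picture
    num_b = n - num_i - num_p   # everything else is a B picture
    bits_per_gop = num_i * bits_i + num_p * bits_p + num_b * bits_b
    return bits_per_gop * frames_per_second / n


# Slide figures: I ~ 300 kbit, P ~ 100/65 kbit, B ~ 18/7 kbit (fast/slow motion)
fast = gop_bit_rate(15, 3, 300_000, 100_000, 18_000, 30)
slow = gop_bit_rate(15, 3, 300_000, 65_000, 7_000, 30)
print(f"fast motion: {fast / 1e6:.2f} Mb/s, slow motion: {slow / 1e6:.2f} Mb/s")
# fast motion: 1.76 Mb/s, slow motion: 1.26 Mb/s
```

Under these assumptions both cases land close to the 1.2 to 1.5 Mbps range quoted for MPEG-1, with the I picture alone taking a third or more of each GOP's bits.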
MPEG-1 Picture [Figure: the Y luminance plane and the Cb and Cr chrominance planes, each divided into slices (SLICE 1, SLICE 2, SLICE 3, ... SLICE 14, SLICE 15), with each slice made up of macroblocks]
Motion Estimation/Compensation • Motion estimation on 16x16 luminance blocks • Chrominance motion vectors by dividing luminance motion vectors and truncating • Half-pel update on integer motion vectors for improved performance • Supports maximum motion vector range of -512 to +511.5 pels for half-pel motion vectors, and -1024 to +1023 pels for full-pel motion vectors
MPEG-1 Constraint Parameter Set • Horizontal size <= 720 pels • Vertical size <= 576 pels • Total number of Macroblocks/picture <= 396 • Total number of Macroblocks/second <= 396x25 = 330x30 • Picture rate <= 30 frames/second • Bit rate <= 1.86 Mbits/second • Decoder Buffer <= 376832 bits
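The constraint set above is easy to apply as a checklist. The sketch below uses the limits exactly as listed on this slide; the function name and the choice to return the list of violated limits are illustrative assumptions, and macroblock counts assume 16x16 macroblocks with sizes rounded up.

```python
import math


def violated_constraints(width, height, frames_per_second, bit_rate, buffer_bits):
    """Return the names of the slide's limits that a stream exceeds."""
    mb_per_picture = math.ceil(width / 16) * math.ceil(height / 16)
    checks = {
        "horizontal size": width <= 720,
        "vertical size": height <= 576,
        "macroblocks/picture": mb_per_picture <= 396,
        "macroblocks/second": mb_per_picture * frames_per_second <= 396 * 25,
        "picture rate": frames_per_second <= 30,
        "bit rate": bit_rate <= 1_860_000,
        "decoder buffer": buffer_bits <= 376_832,
    }
    return [name for name, ok in checks.items() if not ok]


# 352x240 at 30 fps: 330 macroblocks/picture, 9900 macroblocks/second
sif_ntsc = violated_constraints(352, 240, 30, 1_500_000, 327_680)
# 352x288 at 25 fps: 396 macroblocks/picture, 9900 macroblocks/second
sif_pal = violated_constraints(352, 288, 25, 1_500_000, 327_680)
# Full 720x576 at 25 fps fits the size limits but not the macroblock limits
full_pal = violated_constraints(720, 576, 25, 1_500_000, 327_680)
print(sif_ntsc, sif_pal, full_pal)
```

The two SIF variants sit exactly on the 9900 macroblocks/second limit, which is where the "396x25 = 330x30" identity on the slide comes from.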
MPEG-1 Compared to H.261 • Bi-directional motion compensation with half-pixel accuracy • Visually weighted quantization • In Intra-mode, the DC-coefficient is encoded similarly to JPEG • I, P, and B picture types organized as a flexible Group of Pictures (GOP) • Slice structure instead of Group of Blocks (GOB) • Supports maximum motion vector range of -512 to +511.5 pixels for half-pixel motion vectors, and -1024 to +1023 pixels for full-pixel motion vectors • Flexible format: picture sizes up to 4k x 4k, with 360 x 240 (SIF) normally used • Variety of picture rates: 23.98, 24, 25, 29.97, 30, 50, 59.94, 60
Simulation Model 3 (SM3) • A specific reference implementation of an MPEG-1 encoder, including details which were not specified in the standard • Motion estimation: one forward and/or one backward vector per MB with half-pixel resolution; 2-step search: (i) full search in the range of +/- 7 pixels, (ii) search of the 8 neighboring half-pel positions • Methods for MC / No MC and Intra / Inter decision • Quantizer, rate control
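The 2-step search can be sketched in a few dozen lines. This is not the SM3 code itself: it assumes SAD as the matching cost, bilinear interpolation for half-pel samples, and simply skips candidates that fall outside the reference picture. All names are hypothetical, and frames are plain lists of rows.

```python
def sample_half_pel(ref, y2, x2):
    """Bilinear sample of ref at position (y2/2, x2/2), given in half-pels."""
    y0, fy = divmod(y2, 2)
    x0, fx = divmod(x2, 2)
    total = (ref[y0][x0] + ref[y0][x0 + fx]
             + ref[y0 + fy][x0] + ref[y0 + fy][x0 + fx])
    return (total + 2) // 4


def sad_at(cur, ref, top, left, size, mv_y2, mv_x2):
    """SAD for a motion vector in half-pel units; None if out of bounds."""
    y2_start, x2_start = 2 * top + mv_y2, 2 * left + mv_x2
    y2_end, x2_end = y2_start + 2 * (size - 1), x2_start + 2 * (size - 1)
    if (y2_start < 0 or x2_start < 0
            or y2_end > 2 * (len(ref) - 1) or x2_end > 2 * (len(ref[0]) - 1)):
        return None
    return sum(
        abs(cur[top + r][left + c]
            - sample_half_pel(ref, y2_start + 2 * r, x2_start + 2 * c))
        for r in range(size) for c in range(size)
    )


def two_step_search(cur, ref, top, left, size=16, search_range=7):
    """Full integer-pel search, then refinement over 8 half-pel neighbors."""
    best_mv, best_cost = (0, 0), sad_at(cur, ref, top, left, size, 0, 0)
    # Step (i): full search over integer displacements
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            cost = sad_at(cur, ref, top, left, size, 2 * dy, 2 * dx)
            if cost is not None and cost < best_cost:
                best_mv, best_cost = (2 * dy, 2 * dx), cost
    # Step (ii): the 8 half-pel positions around the integer winner
    center_y, center_x = best_mv
    for hy in (-1, 0, 1):
        for hx in (-1, 0, 1):
            if hy == 0 and hx == 0:
                continue
            cost = sad_at(cur, ref, top, left, size, center_y + hy, center_x + hx)
            if cost is not None and cost < best_cost:
                best_mv, best_cost = (center_y + hy, center_x + hx), cost
    return (best_mv[0] / 2, best_mv[1] / 2), best_cost


def pattern(y, x):
    """Synthetic test texture (not from the source)."""
    return (3 * y * y + 5 * x * x + 7 * x * y) % 251


ref_frame = [[pattern(y, x) for x in range(32)] for y in range(32)]
# Current frame is the reference shifted by 2 rows down and 3 columns left
cur_frame = [[pattern(y + 2, x - 3) for x in range(32)] for y in range(32)]
motion_vector, cost = two_step_search(cur_frame, ref_frame, top=12, left=12, size=8)
print(motion_vector, cost)
```

Step (i) evaluates 225 candidates per macroblock while step (ii) adds only 8, which is why half-pel accuracy is cheap relative to the integer search and why the encoder, not the decoder, carries most of the cost.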
SUMMARY • MPEG-1 is mainly for storage media and broadcasting applications • Due to the use of B-pictures, it may result in long end-to-end delay • MPEG encoder is much more expensive than the decoder due to the motion estimation which has large search range and may have half-pel accuracy • MPEG-1 syntax can support a variety of rates and formats for storage media applications • Pre-processing, encoding, and post-processing are open to improvement • Extensions to include added features are possible
MPEG-2 Video Standard • Standardization established in 1995 • A generic video codec addressing a wide variety of applications at rates of 4 Mbit/s to 80 Mbit/s
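MPEG-2 • MPEG-2 is currently applied mainly to audio-visual data at 3 to 10 Mbps. It can vary the compression ratio over a fairly wide range to suit different requirements for picture quality, storage capacity, and bandwidth. MPEG-2 can compress a 120-minute film to 4 to 8 GB so that it fits on a DVD disc. MPEG-2 audio coding can provide left, right, center, and two surround channels, one low-frequency enhancement channel, and up to 7 accompanying audio channels, so a DVD can carry dubbing in 8 languages. Besides being the designated standard for DVD, MPEG-2 can also deliver broadcast-grade digital video for broadcasting, cable TV networks, and cable networks. Because the resolution of today's television sets varies, viewers playing a DVD will not necessarily appreciate the high-definition picture quality that MPEG-2 offers, but they will certainly notice its outstanding audio features, such as multichannel effects.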
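The "120 minutes in 4 to 8 GB" figure can be cross-checked against the quoted 3 to 10 Mbps range. The sketch below assumes 1 GB = 10^9 bytes and treats the rate as the total multiplexed stream rate; both are assumptions, and the function names are illustrative.

```python
def stream_size_gigabytes(bit_rate_mbps, minutes):
    """Size of a constant-rate stream, assuming 1 GB = 1e9 bytes."""
    bits = bit_rate_mbps * 1e6 * minutes * 60
    return bits / 8 / 1e9


def bit_rate_for_size(size_gigabytes, minutes):
    """Average bit rate in Mbps that fills the given size in the given time."""
    return size_gigabytes * 1e9 * 8 / (minutes * 60) / 1e6


low = bit_rate_for_size(4, 120)
high = bit_rate_for_size(8, 120)
print(f"{low:.2f} to {high:.2f} Mbps")  # 4.44 to 8.89 Mbps
```

Both ends fall inside the 3 to 10 Mbps range given for typical MPEG-2 use, so the two statements on the slide are consistent with each other.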
MPEG-2 Video Standard • Primarily for coding interlaced video at 4 - 15 Mb/s for digital broadcast TV and high quality Digital Storage Media; also for HDTV, Cable/Satellite TV, video services over networks (e.g., ATM), and 2-way communications • Started late 1990 after completion of technical work of MPEG-1 • Competitive tests of video algorithms held in Nov. ‘91 • Collaborative phase for developing video coding algorithm • Committee Draft for video part achieved Nov. ‘93 • Standard specifies only bitstream syntax and decoding process
MPEG2 Standards • Developed by ISO-IEC/JTC1/SC29/WG11 together with the ITU-T ATM Video Coding Experts Group • ISO/IEC 13818 parts: 1) Systems 2) Video 3) Audio 4) Conformance Testing 5) Simulation Software Technical Report 6) Digital Storage Media Control Commands 7) Non-Backward Compatible Audio 8) 10-bit Video 9) Real-Time Interface ... • ITU-T H.262: MPEG-2 Video [Figure: timeline with milestone dates 11/93, 11/93, 11/93, 11/94, 11/94, 03/95, 07/96, 10/95, 03/95]
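MPEG2 STANDARDS • MPEG-2 application environment [Figure: system diagram. Sending side: input, MPEG-2 encoder, disk, compressed MPEG data stream, modulation subsystem. Channel: terrestrial/satellite broadcast or cable TV. Receiving side: demodulation subsystem, compressed MPEG data stream, MPEG-2 transport demultiplexer, MPEG-2 audio decoder with audio interface and stereo audio output, MPEG-2 video decoder with DRAM, NTSC/PAL encoder, composite video/audio output]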
Features • Picture quality - good quality NTSC (4-6 Mb/s), excellent quality NTSC (8-10 Mb/s) • Random access/channel switching in limited time - intra-pictures • Trick modes - basic VCR functions • Delay - low delay mode using Simple Profile for visual communications • Error resilience - intra-mv, data-partitioning, priority assignment to video layers • Allow higher chroma resolution - e.g. 4:2:2 and 4:4:4 • Scalability - Data partition, SNR scalability, Spatial scalability, Temporal scalability, Hybrid scalability (up to 3 layers) • Compatibility - decodes MPEG-1 bit-stream, base layer may be decoded by MPEG-1 decoder • Supports multiple video formats and frame rates • Subsets of the standard permit real-time encoders of reasonable complexity
Applications • Each profile supports groups of features for an application area • Simple Profile (SP): low-delay videoconferencing • Main Profile (MP): most important, for general applications • SNR Profile (SNRP): multiple grades of quality • Spatially Scalable Profile (SSP): multiple grades of quality and resolution • High Profile (HP): multiple grades of quality, resolution, and chroma format
PROFILES AND LEVELS (numbers are the maximum allowed) • Profiles and chroma formats: Simple 4:2:0; Main 4:2:0; SNR Scalable 4:2:0; Spatially Scalable 4:2:0; High 4:2:0 or 4:2:2 • High level, 1920x1152 (60 frames/s): Main 80 Mbit/s; High 100 Mbit/s for 3 layers • High-1440 level, 1440x1152 (60 frames/s): Main 60 Mbit/s; Spatially Scalable 60 Mbit/s for 3 layers; High 80 Mbit/s for 3 layers • Main level, 720x576 (30 frames/s): Simple 15 Mbit/s; Main 15 Mbit/s; SNR Scalable 15 Mbit/s for 2 layers; High 20 Mbit/s for 3 layers • Low level, 352x288 (30 frames/s): Main 4 Mbit/s; SNR Scalable 4 Mbit/s for 2 layers
Requirements • CCIR-601 interlaced video with high quality at 4 to 9 Mbps • Random access/channel switching in limited time: frequent access points • Fast forward/reverse: seek and play in FF/FR using access points • Allow video coding of higher chroma resolution formats, e.g. 4:2:2 and 4:4:4 • High quality low delay video coding for video communications • Scalable video coding for multi-quality video applications
MPEG-1 & MPEG-2: What MPEG-2 Adds • Coding of interlaced pictures • Scalability: allows a receiver to decode a subset of the full bitstream in order to display an image sequence at reduced quality, spatial resolution, or temporal resolution
New Features of MPEG-2 • Frame/field picture structure • Frame/field/dual prime adaptive motion compensation • Frame/field adaptive DCT • Alternate scan for DCT coefficients • Picture format: (4:2:0), (4:2:2), (4:4:4) • Nonlinear quantization table
MPEG-2: Resolutions & Formats • Picture sizes extended up to 16k x 16k; 720 x 480 is approximately TV resolution • Supports picture rates: 23.98, 24, 25, 29.97, 30, 50, 59.94, 60 • Supports both progressive and interlaced formats • Supports 4:2:0, 4:2:2, and 4:4:4 sampling formats
Interlaced Video Coding 1) Frame / Field motion compensated predictive coding, with prediction modes: frame, field, dual prime 2) Frame / Field DCT 3) Progressive / Interlaced scan
Frame/Field Format for ME [Figure: a 16x16 frame macroblock alongside its two 16x8 field blocks, one per field]
Low Delay Coding • For face-to-face applications • Total encoding and decoding delay of less than 150 ms can be achieved • Low delay coding by not using B-pictures, using dual-prime prediction for P-frames, intra slices, skip frames
TEST MODEL 5 (TM5) - Frame/field/dual prime and forward/backward ME - Integer pel full search followed by half-pel search - MPEG-1 mode decision: MC/no MC, inter/intra - MPEG-1 and nonlinear quantizer tables - Zigzag scan for inter; alternate scan for intra coding - Quantizer and rate control
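For reference, the zigzag scan mentioned above can be generated rather than tabulated: coefficients are visited diagonal by diagonal, alternating direction. The sketch below covers only the zigzag order; the alternate scan is a fixed table defined by the standard and is not reproduced here. Function names are illustrative.

```python
def zigzag_order(size=8):
    """(row, col) positions of a size x size block in zigzag scan order."""
    positions = [(r, c) for r in range(size) for c in range(size)]

    def key(position):
        r, c = position
        diagonal = r + c
        # Odd diagonals run top-right to bottom-left, even ones the reverse
        return (diagonal, r if diagonal % 2 else -r)

    return sorted(positions, key=key)


def scan_block(block, order):
    """Flatten a 2-D coefficient block into a 1-D list using a scan order."""
    return [block[r][c] for r, c in order]


order = zigzag_order()
first_ten = [r * 8 + c for r, c in order[:10]]
print(first_ten)  # [0, 1, 8, 16, 9, 2, 3, 10, 17, 24]
```

Scanning this way groups the low-frequency DCT coefficients first, so the long runs of zeros at high frequencies compress well under run-length and variable length coding.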
MPEG-4 • An emerging coding standard • Content-based interactivity: To interact with meaningful objects in audiovisual data • Universal access: Access to audiovisual data can be available over a wide range of storage and transmission media • High compression: Especially at low bit rates • Flexible syntax for downloadable algorithms [Figure: MPEG-4 at the convergence of TV/Film, Computer, and Telecommunication, delivered over Wireless, Internet, WWW, ISDN, POTS, Cable]
MPEG-4 Applications • Audiovisual communications and messaging • Multimedia database access • Remote monitoring, surveillance and control • Video on LAN, Internet multimedia, Wireless video • Interactive TV, tele-shopping, home movies • Collaborative environment, distance learning, virtual reality games • Audio/video streaming
MPEG-4 Products • Microsoft supports MPEG-4 video in MS Media Player • Sharp (JP) introduced MPEG-4 ViewCam in January ‘99 • Toshiba MPEG-4 chip • Japan announced the use of MPEG-4 video for wireless services in the IMT2000 project • PacketVideo Inc. provides technology for streaming MPEG-4 video over Internet and 2nd/3rd generation wireless networks