Multimedia Video/Audio Compression T.Sharon-A.Frank
Hybrid coding
• Images:
  • JPEG
• Video/Audio
  • M-JPEG
  • MPEG (1, 2, 4)
• Other codings
  • H.26x
Video Coding Requirements
• Random access
• Fast forward/reverse searches
• Reverse playback
• Audio-visual synchronization
• Robustness to errors
• Low coding/decoding delay
• Editability
• Format flexibility
• Cost tradeoffs
Video Compression
• Spatial (intra-frame) compression:
  • Compresses each frame in isolation, treating it as a bitmapped image.
  • Based on quantization of DCT coefficients.
• Temporal (inter-frame) compression:
  • Compresses sequences of frames by storing only the differences between them.
  • Records the displacement of an object plus the changed pixels in the area exposed by its movement.
  • Based on Motion Compensation (MC).
Spatial Compression
• Image compression applied to each frame.
• Can therefore be lossless or lossy, but lossless compression rarely produces sufficiently high compression ratios for the volume of data.
• Lossy compression implies a loss of quality if the video is decompressed and then recompressed.
• Ideally, work with uncompressed video during post-production.
Temporal Compression
• Key frames are spatially compressed only.
• Key frames are often regularly spaced (e.g., every 12 frames).
• Difference frames store only the differences between the frame and the preceding frame or the most recent key frame.
• Difference frames can be efficiently spatially compressed.
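The key-frame/difference-frame idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real codec stores data: a frame is assumed to be a flat list of pixel values, a difference frame is assumed to be a list of (index, new value) pairs relative to the preceding frame, and the names `encode`, `decode` and `key_interval` are hypothetical.

```python
def encode(frames, key_interval=12):
    """Encode frames as periodic key frames plus sparse difference frames."""
    encoded = []
    previous = None
    for index, frame in enumerate(frames):
        if index % key_interval == 0:
            # Key frame: stored in full (a real codec would compress it spatially).
            encoded.append(("key", list(frame)))
        else:
            # Difference frame: only the pixels that changed since the previous frame.
            changes = [(i, new) for i, (new, old) in enumerate(zip(frame, previous))
                       if new != old]
            encoded.append(("diff", changes))
        previous = frame
    return encoded


def decode(encoded):
    """Rebuild every frame by applying differences to the last decoded frame."""
    frames = []
    current = None
    for kind, payload in encoded:
        if kind == "key":
            current = list(payload)
        else:
            current = list(current)
            for i, value in payload:
                current[i] = value
        frames.append(current)
    return frames


frames = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 1, 2, 0]]
encoded = encode(frames)
print(encoded[1])  # the second frame stores a single changed pixel
```

Decoding a difference frame needs the frames before it, back to the last key frame, which is why regularly spaced key frames matter for random access.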
Motion-JPEG (M-JPEG)
• Purely spatial compression.
• Apply JPEG compression to each video frame.
• Compression rates: 2:1 to 12:1
  • Lossy: up to 5:1 is considered broadcast quality.
• No standard, but the MJPEG-A format is widely supported.
• Excellent when there are rapid scene changes in the video.
• Easy to edit.
Video Compression
Coding of video is carried out in a series of steps:
• Divide the image into blocks, usually:
  • 16x16 luminance
  • 8x8 chrominance (color)
• Use DCT-based techniques for spatial redundancy removal (intra-frame compression).
• Use MC (Motion Compensation) techniques for temporal redundancy removal (inter-frame compression).
• Final stage is two-dimensional run-length coding.
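As a rough illustration of the intra-frame steps listed above, the following Python sketch runs one 8x8 block through a DCT, uniform quantization, a zig-zag scan and run-length coding. Several things here are assumptions rather than part of the slides: a single quantization step of 16 instead of a quantization table, a naive DCT written for readability, and (zero run, value) pairs ending in an "EOB" marker as the run-length format. All function names are hypothetical.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of an N x N block (slow, for illustration)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step=16):
    """Uniform quantization: small coefficients collapse to zero (the lossy step)."""
    return [[round(value / step) for value in row] for row in coeffs]


def zigzag(block):
    """Read the block along anti-diagonals, low frequencies first."""
    order = sorted(
        ((r, c) for r in range(N) for c in range(N)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [block[r][c] for r, c in order]


def run_length(sequence):
    """Encode as (number of preceding zeros, non-zero value) pairs plus an end marker."""
    pairs, zeros = [], 0
    for value in sequence:
        if value == 0:
            zeros += 1
        else:
            pairs.append((zeros, value))
            zeros = 0
    pairs.append("EOB")  # trailing zeros are implied by the end-of-block marker
    return pairs


flat_block = [[128] * N for _ in range(N)]
print(run_length(zigzag(quantize(dct_2d(flat_block)))))
```

For a uniform block all of the energy lands in the first (DC) coefficient, so the 64 samples reduce to one pair and the end marker.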
[Figure: three consecutive video frames]
Motion Compensation
• Motion compensation compensates for inter-frame differences.
• Real-time communication consideration: only the closest previous frame is used for prediction, to reduce the encoding delay.
[Figure: a block in the current frame and its best match in the previous frame]
Motion Compensation Algorithm
• Sends the new location of a block.
• If a block has changed by more than a certain threshold, resends the whole block.
• Refreshes the whole image periodically.
[Figure: a block in the current frame and its best match in the previous frame]
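The algorithm above can be sketched as an exhaustive block-matching search. The details below are assumptions made for illustration only: frames are 2-D lists of luminance values, the match criterion is the sum of absolute differences (SAD), the search covers a small window around the block's position, and the function names, block size and threshold are hypothetical.

```python
def sad(prev, cur, px, py, cx, cy, size):
    """Sum of absolute differences between a block in prev and a block in cur."""
    return sum(
        abs(prev[py + j][px + i] - cur[cy + j][cx + i])
        for j in range(size)
        for i in range(size)
    )


def best_match(prev, cur, cx, cy, size=4, search=2):
    """Search the previous frame around (cx, cy) for the lowest-cost block."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = cx + dx, cy + dy
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(prev, cur, px, py, cx, cy, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best


def encode_block(prev, cur, cx, cy, size=4, search=2, threshold=10):
    """Send a motion vector if the match is good enough, else resend the block."""
    cost, dx, dy = best_match(prev, cur, cx, cy, size, search)
    if cost > threshold:
        return ("intra", [row[cx:cx + size] for row in cur[cy:cy + size]])
    return ("motion", (dx, dy))


# A bright 2x2 square moves one pixel to the right between frames.
prev = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for y in (2, 3):
    for x in (2, 3):
        prev[y][x] = 200
    for x in (3, 4):
        cur[y][x] = 200

print(encode_block(prev, cur, 2, 2))
```

The vector points from the block in the current frame back to where its content sat in the previous frame. A real encoder would also code the residual left after prediction, which this sketch leaves out.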
Frame Types in Compressed Video
• Key Frame
  • Compression is based on the content of this frame.
• Difference/Delta Frame
  • Compression is based on the last key frame.
Bi-directional Motion Compensated Interpolation
MPEG Dynamics
• Delicate balance between intra-frame and inter-frame coding.
• Two basic techniques:
  • Transform-domain, DCT-based compression for the reduction of spatial redundancy (intra-frame).
  • Block-based bi-directional MC for the reduction of temporal redundancy (inter-frame).
The MPEG Standard
Three types of MPEG-2 frames processed by the viewing program:
• I (Intracoded) frames: self-contained JPEG-encoded still pictures.
• P (Predictive) frames: block-by-block difference with the last frame.
• B (Bidirectional) frames: differences with the last and next frame.
Use of MPEG Image Types
<I> Intra-picture/frame/image
• Access points for random access
• Moderate compression
<P> Predicted pictures
• Coded with reference to a past picture
• Used as reference for future predicted pictures
<B> Bi-directional prediction (interpolated pictures)
• Require a past and a future reference for prediction
• Highest compression
MPEG GOPs
• Group of Pictures (GOP):
  • Repeating sequence of I-, P- and B-pictures.
  • Always begins with an I-picture.
• Display order: frames in the order in which they will be displayed.
• Bitstream order: re-ordered so that every P- or B-picture comes after the frames it depends on, allowing reconstruction of the complete frames.
A Typical MPEG Picture Display Order
[Figure: frames in display order, I B B B P B B B I, with forward-prediction arrows; 25 fps (9 I/P, 17 B)]
A Typical MPEG Picture Bitstream Order
• Transmitting order: 1, 5, 2, 3, 4, 9, 6, 7, 8
[Figure: frames 1 to 9 in display order, I B B B P B B B I, with forward-prediction and bi-directional-prediction arrows]
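The re-ordering from display order to bitstream order can be sketched as follows. The sketch assumes that each B-picture depends on the nearest I- or P-picture before and after it, so B-pictures are held back until their future reference has been emitted. The function name `bitstream_order` is hypothetical.

```python
def bitstream_order(display_types):
    """Return 1-based frame numbers in transmission order.

    display_types is a string of picture types in display order, e.g. "IBBBPBBBI".
    """
    order = []
    pending_b = []
    for number, kind in enumerate(display_types, start=1):
        if kind == "B":
            # A B-picture cannot be decoded before its future reference.
            pending_b.append(number)
        else:
            # Emit the I- or P-picture first, then the B-pictures waiting on it.
            order.append(number)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # trailing B-pictures with no later reference
    return order


print(bitstream_order("IBBBPBBBI"))  # [1, 5, 2, 3, 4, 9, 6, 7, 8]
```

For the nine-frame sequence on the slide this reproduces the transmitting order 1, 5, 2, 3, 4, 9, 6, 7, 8.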
MPEG Standards
• MPEG-1
  • 352x240 at 30 fps.
  • Quality is slightly below standard VCR videos.
• MPEG-2
  • 720x480 & 1280x720 at 60 fps, with full CD-quality audio.
  • Sufficient for television (including HDTV).
  • Used on DVD-ROMs.
• MP3
  • Audio compression.
  • Reduces digital sound files by a 12:1 ratio with virtually no loss in quality.
MPEG-1 Compression
• Source Input Format (SIF)
  • 4:2:0 chrominance sub-sampling
  • 352x240 pixel frame
• MPEG-1 compressed SIF video at 30 frames per second has a data rate of 1.86 Mbps (CD video: 40 mins of video at that rate).
• MPEG-1 can be scaled up to larger frames, but cannot handle interlacing.
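To put the 1.86 Mbps figure in context, the short calculation below estimates the uncompressed data rate of SIF video. It assumes 8 bits per sample, which the slide does not state, and the function name `raw_bit_rate` is hypothetical.

```python
def raw_bit_rate(width, height, fps, bits_per_sample=8):
    """Uncompressed bit rate of 4:2:0 video in bits per second."""
    luma_samples = width * height
    # 4:2:0 keeps two chrominance planes, each at half resolution in both directions.
    chroma_samples = 2 * (width // 2) * (height // 2)
    return (luma_samples + chroma_samples) * bits_per_sample * fps


raw = raw_bit_rate(352, 240, 30)
compressed = 1_860_000  # 1.86 Mbps, taking 1 Mbps as 1,000,000 bits per second
print(f"raw: {raw / 1e6:.1f} Mbps, ratio: {raw / compressed:.1f}:1")
```

Under these assumptions the raw stream is about 30.4 Mbps, so MPEG-1 at 1.86 Mbps corresponds to a compression ratio of roughly 16:1.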
MPEG Profiles & Levels
• Profiles define subsets of the features of the data stream.
• Levels define parameters such as frame size and data rate.
• Each profile may be implemented at one or more levels.
• Notation: profile@level, e.g. MP@ML.
MPEG-2 Main Profile & Level
• MPEG-2 Main Profile at Main Level (MP@ML) used for DVD video:
  • CCIR 601 scanning
  • 4:2:0 chrominance sub-sampling
  • 15 Mbits per second
  • Most elaborate representation of MPEG-2 compressed data.
MPEG-4 (1)
• Refinement of MPEG-1 compression:
  • I-pictures compressed by quantizing and Huffman coding DCT coefficients.
  • Improved motion compensation leads to better quality than MPEG-1 at the same bit rates.
• Designed to support a range of multimedia data at bit rates from 10 Kbps to more than 1.8 Mbps.
• Applications from mobile phones to HDTV.
• Video codec becoming popular for Internet use; it is incorporated in QuickTime, RealMedia and DivX.
MPEG-4 (2)
• Standard defines an encoding for multimedia streams made up of different sorts of object: video, still images, animation, 3-D models…
• Higher profiles divide a scene into arbitrarily shaped video objects, where each one may be compressed and transmitted separately; the scene is composed at the receiving end by combining them.
• SP and ASP profiles are restricted to rectangular objects, usually complete frames.
MPEG-4 Profiles & Levels
• Simple Profile (SP), suitable for low-bandwidth streaming over the Internet:
  • P-pictures only
  • Efficient decompression, suitable for PDAs, etc.
  • SP@L1, 64 Kbps, 176x144 pixel frame.
• Advanced Simple Profile (ASP), suitable for broadband streaming:
  • B-pictures
  • Global Motion Compensation
  • Sub-pixel motion compensation
  • ASP@L5, 8000 Kbps, full CCIR 601 frame.
DV Compression
• Starts with chrominance sub-sampling of CCIR 601.
• Constant data rate of 25 Mbits per second; higher quality than MJPEG at the same rate.
• Apply DCT, quantization, run-length and Huffman coding on a zig-zag sequence (like JPEG) to 8x8 blocks of pixels.
• If there is little or no difference between fields (almost static frame), apply DCT to a block containing alternate lines from the odd and even fields.
• If there is motion between fields, apply DCT to two 8x4 blocks (one from each field) separately, leading to more efficient compression of frames with motion.
DVI (Digital Video Interactive)
• Developed by General Electric.
• Uses specialized processors for compression.
• Hardware-only codec: lossless transforms.
• Compression rate: 80:1 to 160:1
  • A 10 sec video clip is compressed to ~2 MB.
• Intel: software version of the DVI algorithms, marketed as Indeo (a software-only codec):
  • There is also an audio version of Indeo.
  • The latest version uses a hybrid wavelet transform as its compression algorithm.
Cinepak
• Developed by Apple and SuperMac.
• Outputs 320x240 (quarter screen) at 15 fps with good quality
  • at a data rate that even slow single-speed and 2x CD-ROM players can deliver.
• Software-only codec supported by Microsoft’s Video for Windows and Apple’s QuickTime.
• Better color definition than other codecs, so good for natural video without graphics or animation.
QuickTime
• Developed by Apple but is now cross-platform.
• Supports Cinepak, Indeo, M-JPEG and MPEG-1, and is extensible to support future codecs, such as DVCAM.
• Synchronizes all types of digital media.
  • For example, video frames are dropped if necessary for synchronization with audio.
Video For Windows
• Microsoft (therefore, not cross-platform).
• Uses the generic AVI (Audio Video Interleave) format, which is provided by MCI (Media Control Interface).
• Supports a number of compression methods, in real time or non-real time, with or without hardware assistance:
  • Cinepak, Indeo, Microsoft Video-1.
ActiveMovie (API from Microsoft)
• Now called DirectShow (supports DVD).
• Solves problems of VfW and QuickTime.
• Cross-platform.
• Supports codecs supported by VfW as well as MPEG audio, WAV audio, MPEG video, and Apple QuickTime video.
• Fully integrated with DirectX technology, allowing use of DirectX components and more graphics card features.
Video Streaming Players
• RealVideo (from RealNetworks)
  • G2 Player also plays RealAudio.
  • Uses a variety of compression techniques.
• RealProducer (also from RealNetworks)
  • Allows you to create streaming audio and video.
  • Free software, just like G2!
H.261 (Px64)
• Video compression for videoconferences
• Compression in real time
• Targeted to ISDN
• Compressed data stream: p*64 Kbits/s (p = 1, …, 30)
• 2 resolutions:
  • Common Intermediate Format (CIF)
  • Quarter CIF (QCIF)
H.261 (Px64) Resolutions
• Common Intermediate Format (CIF): 352x288 luminance, 176x144 chrominance
• Quarter CIF (QCIF): 176x144 luminance, 88x72 chrominance
Image Preparation
• Uncompressed CIF
  • One frame = 288*352*8 + 2*144*176*8 = 1,216,512 bits
  • 30 fps
  • Bandwidth = 1,216,512*30 ≈ 36.5 Mbits/s
• Uncompressed QCIF ≈ 9.1 Mbits/s
• ISDN channels: 64 Kbits/s to 2 Mbits/s => bit reduction required
Desktop Videophone Applications
• Channel capacity (p=1) = 64 Kbits/s
  • QCIF at 10 fps --> 3 Mbits/s
  • Required compression ratio = 3 Mbits/s / 64 Kbits/s ≈ 47
• Channel capacity (p=10) = 640 Kbits/s
  • CIF at 30 fps --> 36.5 Mbits/s
  • Required compression ratio = 36.5 Mbits/s / 640 Kbits/s ≈ 57
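The arithmetic on the last two slides can be checked with a few lines of Python. The sketch assumes 8 bits per sample, chrominance planes at half the luminance resolution in each direction (as in the CIF frame-size formula), and 1 Kbit = 1000 bits. The function names are hypothetical.

```python
def frame_bits(width, height, bits_per_sample=8):
    """Bits in one uncompressed frame: one luminance plane plus two chrominance planes."""
    luma = width * height * bits_per_sample
    chroma = 2 * (width // 2) * (height // 2) * bits_per_sample
    return luma + chroma


def required_ratio(width, height, fps, p):
    """Compression ratio needed to fit the video into p ISDN channels of 64 Kbits/s."""
    source_rate = frame_bits(width, height) * fps
    channel_rate = p * 64_000
    return source_rate / channel_rate


print(frame_bits(352, 288))                        # bits per CIF frame
print(f"{required_ratio(352, 288, 30, 10):.1f}")   # CIF at 30 fps over p=10
print(f"{required_ratio(176, 144, 10, 1):.1f}")    # QCIF at 10 fps over p=1
```

Without intermediate rounding the QCIF case comes out at about 47.5 rather than 47, because the slide starts from the rounded value of 3 Mbits/s.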
Audio Compression
• In general, lossy methods are required because of the complex and unpredictable nature of audio data.
• A CD-quality, stereo, 3-minute song requires over 25 Mbytes.
  • The data rate exceeds the bandwidth of a dial-up Internet connection.
• The difference between the way we perceive sound and the way we perceive images means that a different approach from image compression is needed.
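The size claim above can be checked with a short calculation. The CD parameters used here (44.1 kHz sampling, 16 bits per sample, 2 channels) are the standard CD audio values and are not stated on the slide; the 56 Kbits/s dial-up figure is likewise an assumption for comparison. The function names are hypothetical.

```python
def cd_bit_rate(sample_rate=44_100, bits_per_sample=16, channels=2):
    """Uncompressed audio bit rate in bits per second."""
    return sample_rate * bits_per_sample * channels


def cd_bytes(seconds):
    """Uncompressed size in bytes of a clip of the given length."""
    return cd_bit_rate() * seconds // 8


song = cd_bytes(3 * 60)
dial_up = 56_000  # assumed dial-up modem rate in bits per second
print(f"3-minute song: {song / 1e6:.1f} MB")
print(f"bit rate: {cd_bit_rate() / 1000:.0f} Kbits/s, "
      f"{cd_bit_rate() / dial_up:.0f} times the assumed dial-up rate")
```

Under these assumptions a 3-minute song takes about 31.8 million bytes, consistent with "over 25 Mbytes", and the stream needs roughly 25 times the capacity of the assumed dial-up link.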
Audio Compression Techniques
Standards of Speech Encoding
Basic Steps of Audio Encoding
[Figure: block diagram. Uncompressed audio data → 32 sub-band filter banks → quantization → multiplexer / entropy coder → compressed audio data; a psychoacoustical model provides control.]
MP3
• MP3 = MPEG-1 Audio, Layer 3
• Three layers of audio compression in MPEG-1 (MPEG-2 essentially identical).
• From Layer 1 to Layer 3, the encoding process increases in complexity and the data rate for the same quality decreases
  • e.g. same quality at 192 kbps for Layer 1, 128 kbps for Layer 2, 64 kbps for Layer 3.
• 10:1 compression ratio at high quality.
• Variable bit rate coding (VBR).
Voice Quality - QoS
• The Objective: Provide unfailing, ubiquitous, toll quality service
• The Challenge: Eliminate the impact of delay-insensitive traffic on real-time traffic
[Figure: one-way delay (ms; ticks at 0, 160, 200, 400) against packet loss (%; ticks at 0, 1, 5, 10), with low and high thresholds and regions labelled Acceptable Operation, Marginal Acceptance, Service Level Agreement Violation and Area of Unacceptable Operation]
QoS Parameters Delay Budgets