Principles of Video Compression. Dr. S. M. N. Arosha Senanayake, Senior Member/IEEE Associate Professor in Artificial Intelligence Room No: M2.06 Email: arosha.senanayake@ubd.edu.bn.
Source: Chapter 3, "JPEG: Still Image Compression Standard", of JPEG2000 Standard for Image Compression: Concepts, Algorithms and VLSI Architectures by Tinku Acharya and Ping-Sing Tsai
Topics today…
• Introduction
• Temporal Redundancy Reduction
• Coding for Video Conferencing (H.261, H.263)
SS-4306
Introduction
• Goal: reduce video bit rates while maintaining acceptable image quality
• Exploit the strong correlation both between successive picture frames and among the picture elements within each frame
• Exploit the insensitivity of the human visual system to the loss of certain spatio-temporal visual information
• Use interframe predictive coding
• Standards: H.261, H.263, MPEG-1, MPEG-2 and MPEG-4
Introduction [2]
• Fundamental redundancy reduction principles:
• Spatial redundancy reduction
• Temporal redundancy reduction
• Entropy coding
Temporal Redundancy Reduction
• Use interframe coding
• In static parts of the image sequence, the temporal differences are close to zero and hence are not coded
• Parts that change between frames, due either to illumination variation or to motion of objects, produce a significant image error, which needs to be coded
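The interframe difference described above can be illustrated with a minimal sketch. It assumes frames are stored as lists of rows of luminance values; the function name `frame_difference` and the sample frames are hypothetical and only serve the illustration.

```python
def frame_difference(current, previous):
    """Pixel-wise temporal difference between two equally sized frames."""
    return [[c - p for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(current, previous)]

# Hypothetical 3 x 4 frames: only two pixels change between them.
previous = [[10, 10, 10, 10],
            [10, 10, 10, 10],
            [10, 10, 10, 10]]
current = [[10, 10, 10, 10],
           [10, 80, 90, 10],
           [10, 10, 10, 10]]

diff = frame_difference(current, previous)

# Static pixels give a zero difference; only the changed ones carry information.
changed = sum(1 for row in diff for d in row if d != 0)
print(diff[1], changed)
```

Only the non-zero entries of `diff` would need to be coded, which is the saving the slide refers to.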
Temporal Redundancy Reduction [2]
Figure panels: Interframe; Motion-compensated interframe
Temporal Redundancy Reduction [3] MOTION ESTIMATION
• Estimate the motion of moving objects with the block matching algorithm (BMA)
• Divide the frame into blocks of M × N pixels or, more usually, square blocks of N² pixels
• For a maximum motion displacement of w pixels per frame, match the current block of pixels against the corresponding block at the same coordinates in the previous frame, within a square window of width N + 2w
• The displacement is the one that gives the best match according to a matching criterion
Temporal Redundancy Reduction [4] MOTION ESTIMATION
Figure: The current and previous frames in a search window
Temporal Redundancy Reduction [4] MOTION ESTIMATION
• Matching functions
• Mean Squared Error (MSE)
• Mean Absolute Error (MAE)
• To reduce processing cost, MAE is preferred to MSE and hence is used in all the video codecs
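The two matching functions can be written out as follows. The symbols are assumptions for this sketch: f(m, n) is a pixel of the current N × N block, g(m + i, n + j) is the corresponding pixel of the previous frame displaced by (i, j), and w is the maximum displacement.

```latex
\[
\mathrm{MSE}(i,j) = \frac{1}{N^2}\sum_{m=1}^{N}\sum_{n=1}^{N}
  \bigl(f(m,n) - g(m+i,\,n+j)\bigr)^2,
\qquad -w \le i,j \le w
\]
\[
\mathrm{MAE}(i,j) = \frac{1}{N^2}\sum_{m=1}^{N}\sum_{n=1}^{N}
  \bigl|f(m,n) - g(m+i,\,n+j)\bigr|,
\qquad -w \le i,j \le w
\]
```

MAE replaces the multiplication per pixel with an absolute value, which is where the lower processing cost comes from.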
Temporal Redundancy Reduction [5] MOTION ESTIMATION
• In the simplest case (full search), BMA requires (2w + 1)² evaluations of the matching criterion per block
• Costly
• Motion estimation accounts for almost 50–70 per cent of the overall encoder's complexity
• Faster motion estimation is required!
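The full-search block matching described on the preceding slides can be sketched as below. This is an illustration under stated assumptions, not the method of any particular codec: frames are lists of rows, the names `mae` and `full_search` are hypothetical, and the test frame is synthetic, with the current frame built by shifting the reference frame.

```python
def mae(cur, ref, bx, by, dx, dy, n):
    """Mean absolute error between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total / (n * n)

def full_search(cur, ref, bx, by, n, w):
    """Try every displacement in [-w, w] x [-w, w].
    Returns (best_dx, best_dy, best_mae, positions_checked)."""
    height, width = len(ref), len(ref[0])
    best_dx, best_dy, best_err = 0, 0, float("inf")
    checked = 0
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            # Skip candidate blocks that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            checked += 1
            err = mae(cur, ref, bx, by, dx, dy, n)
            if err < best_err:
                best_dx, best_dy, best_err = dx, dy, err
    return best_dx, best_dy, best_err, checked

# Synthetic reference frame (values are arbitrary, not real image data).
SIZE = 16
ref = [[x * x + 3 * y * y + x * y for x in range(SIZE)] for y in range(SIZE)]

def clamp(v):
    return max(0, min(SIZE - 1, v))

# Current frame: every pixel is copied from the reference at offset (-2, +1).
cur = [[ref[clamp(y + 1)][clamp(x - 2)] for x in range(SIZE)]
       for y in range(SIZE)]

dx, dy, err, checked = full_search(cur, ref, bx=6, by=6, n=4, w=3)
print(dx, dy, err, checked)
```

With w = 3 the search evaluates (2·3 + 1)² = 49 positions for a single block, which shows how quickly the cost grows with the search range.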
Temporal Redundancy Reduction [6] FASTER MOTION ESTIMATION
• Reduce the number of search points by selectively checking only a small number of specific points
• Underlying assumption: the distortion measure decreases monotonically towards the best-matched point
• Approaches
• Two-dimensional logarithmic (TDL) search
• Three-step search (TSS)
• Modified motion estimation algorithm (MMEA)
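As an illustration of one of the approaches listed above, here is a minimal sketch of a three-step search. It assumes a maximum displacement of w = 7 and a caller-supplied distortion function `cost(dx, dy)`; both the function names and the toy cost surface are hypothetical, chosen so that the monotonicity assumption holds.

```python
def three_step_search(cost, w=7):
    """Coarse-to-fine search: test the 8 neighbours of the current centre
    at step sizes 4, 2, 1 (for w = 7), moving the centre to the best point.
    Returns (dx, dy, evaluations)."""
    cx, cy = 0, 0
    best = cost(0, 0)
    evaluations = 1
    step = (w + 1) // 2
    while step >= 1:
        bx, by = cx, cy
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue  # the centre has already been evaluated
                x, y = cx + dx, cy + dy
                if abs(x) > w or abs(y) > w:
                    continue  # outside the allowed search range
                evaluations += 1
                c = cost(x, y)
                if c < best:
                    best, bx, by = c, x, y
        cx, cy = bx, by
        step //= 2
    return cx, cy, evaluations

# Toy distortion surface with a single minimum at (5, -3).
def toy_cost(dx, dy):
    return abs(dx - 5) + abs(dy + 3)

result = three_step_search(toy_cost, w=7)
print(result)
```

On this surface the search needs 25 evaluations, against (2·7 + 1)² = 225 for a full search over the same range. On a real distortion surface with several local minima the result may differ from the full-search optimum.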
Temporal Redundancy Reduction [7] FASTER MOTION ESTIMATION
Figure: Two-dimensional logarithmic search
Temporal Redundancy Reduction [8] FASTER MOTION ESTIMATION
• Try to find the maximum number of steps needed to reach the best estimate!
Temporal Redundancy Reduction [9] HIERARCHICAL MOTION ESTIMATION
• Fast search methods rely on the assumption of monotonic variation of image intensity
• These methods perform well for slow-moving objects, such as those in video conferencing
• For faster motion they often converge to a local minimum of distortion
• Remedy: subsample the image to smaller sizes, so that the motion speed is reduced by the sampling ratio
• This is the hierarchical block matching algorithm (HBMA)
Figure: A three-level image pyramid
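The image pyramid used by hierarchical block matching can be sketched as follows. The sketch assumes a subsampling ratio of 2 per level and a simple 2 × 2 average as the low-pass filter; the function names and the synthetic frame are hypothetical.

```python
def downsample(frame):
    """Halve width and height by averaging each 2 x 2 group of pixels."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [[(frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
              + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) / 4
             for x in range(w)]
            for y in range(h)]

def build_pyramid(frame, levels=3):
    """Level 0 is the full-resolution frame; each further level is half the size."""
    pyramid = [frame]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

# Synthetic 16 x 16 frame (values are arbitrary).
frame = [[x + y for x in range(16)] for y in range(16)]
pyramid = build_pyramid(frame, levels=3)
sizes = [(len(level), len(level[0])) for level in pyramid]

# A displacement of 8 pixels at full resolution shrinks by the sampling
# ratio at each level, so it is only 2 pixels at the coarsest level.
coarse_displacement = 8 / 2 ** (len(pyramid) - 1)
print(sizes, coarse_displacement)
```

A motion vector found at the coarsest level would then be scaled up and refined with a small search at each finer level.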
Temporal Redundancy Reduction [10] GENERIC INTERFRAME VIDEO CODEC
Figure: Generic interframe encoder used in standard video codecs, such as H.261, H.263, MPEG-1, MPEG-2 and MPEG-4
Temporal Redundancy Reduction [11] GENERIC INTERFRAME VIDEO CODEC
Figure: Generic interframe decoder
Coding for Video Conferencing (H.261)
• Allows bit rates between approximately 64 kbit/s and 1920 kbit/s
• Interframe DCT-based coding technique
• Interframe prediction is first carried out in the pixel domain
• The prediction error is then transformed into the frequency domain, where the quantization for bandwidth reduction takes place
• Motion compensation can be included in the prediction stage, although it is optional
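The sequence of steps above (predict in the pixel domain, transform the prediction error, then quantize) can be sketched on a single 8 × 8 block. This is a simplified illustration: the DCT is the direct, unoptimised orthonormal form, the quantizer is a plain uniform one with a hypothetical step of 16 rather than the H.261 quantizer, and all names are assumptions.

```python
import math

N = 8  # block size

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct form, O(N^4))."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = c(u) * c(v) * s
    return out

def encode_block(current, prediction, step):
    """Prediction error in the pixel domain -> DCT -> uniform quantization."""
    error = [[c - p for c, p in zip(cur_row, pred_row)]
             for cur_row, pred_row in zip(current, prediction)]
    coeffs = dct2(error)
    return [[int(round(f / step)) for f in row] for row in coeffs]

# Hypothetical blocks: the current block is uniformly 16 levels brighter
# than its prediction, so the error is flat.
prediction = [[100] * N for _ in range(N)]
current = [[116] * N for _ in range(N)]
levels = encode_block(current, prediction, step=16)
print(levels[0])
```

A flat prediction error ends up in the DC coefficient alone, so 64 pixel differences reduce to one non-zero quantized value.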
Coding for Video Conferencing (H.261) [2]
• Two types of frames: I-frames and P-frames
• An I-frame is usually sent a couple of times each second
• Motion vectors are always limited to a range of ±15 pixels
Figure: Frame sequence
Coding for Video Conferencing (H.261) [3]
Figure: I-frame coding
Coding for Video Conferencing (H.261) [4]
Figure: P-frame coding
Coding for Video Conferencing (H.261) [5]
• Quantization
• The step size is fixed and takes one of 31 even values from 2 to 62
• Step size = 2 × scale, where scale is an integer from 1 to 31
• Exception: for the DC coefficient in an I-frame, a step size of 8 is always used
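The step-size rule above can be sketched as a small helper. The function names are hypothetical, and the rounding used in `quantize` (round-to-nearest for the intra DC coefficient, truncation towards zero otherwise) is an assumption made for illustration; the slide does not specify it.

```python
def h261_step_size(scale, intra_dc=False):
    """Step size as stated on the slide: 8 for the intra DC coefficient,
    otherwise 2 * scale with scale in 1..31."""
    if intra_dc:
        return 8
    if not 1 <= scale <= 31:
        raise ValueError("scale must be between 1 and 31")
    return 2 * scale

def quantize(coeff, scale, intra_dc=False):
    """Map a DCT coefficient to a quantization level (rounding rule assumed)."""
    step = h261_step_size(scale, intra_dc)
    if intra_dc:
        return round(coeff / step)
    return int(coeff / step)  # truncates towards zero

step_sizes = [h261_step_size(s) for s in range(1, 32)]
print(step_sizes[0], step_sizes[-1], len(step_sizes))
print(quantize(100, 5), quantize(1000, 1, intra_dc=True))
```

The list `step_sizes` reproduces the 31 even values from 2 to 62 mentioned on the slide.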
Coding for Video Conferencing (H.261) [6]
Figure: Encoder
Coding for Video Conferencing (H.261) [7]
Figure: Decoder
Video Bitstream Syntax
• Four layers
• Picture Layer
• Group of Blocks (GOB) Layer: a GOB contains 11 × 3 macroblocks; CIF contains 2 × 6 GOBs; QCIF contains 3 GOBs
• Macroblock Layer
• Block Layer
Figure: Syntax of H.261 video bitstream
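The GOB counts above can be checked with a little arithmetic. The picture sizes used here (CIF luminance 352 × 288, QCIF luminance 176 × 144) and the 16 × 16 macroblock size are not stated on the slide and are brought in as assumptions; the function name is hypothetical.

```python
MB_SIZE = 16          # luminance macroblock is 16 x 16 pixels (assumption)
MBS_PER_GOB = 11 * 3  # from the slide: a GOB holds 11 x 3 macroblocks

def gob_count(width, height):
    """Return (number of GOBs, number of macroblocks) for a picture size."""
    macroblocks = (width // MB_SIZE) * (height // MB_SIZE)
    return macroblocks // MBS_PER_GOB, macroblocks

cif = gob_count(352, 288)
qcif = gob_count(176, 144)
print(cif, qcif)
```

CIF gives 22 × 18 = 396 macroblocks, that is 12 = 2 × 6 GOBs, and QCIF gives 11 × 9 = 99 macroblocks, that is 3 GOBs, in agreement with the slide.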
H.263
• An improved video coding standard for video conferencing and other audio-visual services
• Aimed at low-bit-rate communications of less than 64 kbit/s
• Predictive coding for inter-frames
• Transform coding for intra-frames and for difference macroblocks from inter-frame prediction
• Supports the notion of GOBs
H.263 Motion Compensation
• Predicted MV of the current block
• Finding the MV when the current block is on the border
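The slide only names the predicted MV. As commonly described for H.263, the predictor is the component-wise median of the motion vectors of three neighbouring macroblocks (left, above, and above-right), and only the difference from the predictor is coded. The sketch below follows that description; the function names and sample vectors are hypothetical, and border handling is left out.

```python
def median3(a, b, c):
    """Median of three values."""
    return sorted((a, b, c))[1]

def predict_mv(left, above, above_right):
    """Component-wise median of three neighbouring motion vectors (x, y)."""
    return (median3(left[0], above[0], above_right[0]),
            median3(left[1], above[1], above_right[1]))

def mv_difference(mv, left, above, above_right):
    """Difference between the actual MV and its prediction."""
    px, py = predict_mv(left, above, above_right)
    return (mv[0] - px, mv[1] - py)

# Hypothetical neighbouring vectors.
left, above, above_right = (2, -1), (4, 0), (3, 5)
predicted = predict_mv(left, above, above_right)
residual = mv_difference((3, 1), left, above, above_right)
print(predicted, residual)
```

Because neighbouring blocks tend to move together, the residual is usually small and cheap to code.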
H.263 Motion Compensation [2]
• Motion compensation involves half-pixel precision
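Half-pixel precision means the predictor may read samples that lie between pixels of the reference frame. A minimal sketch, assuming bilinear interpolation with rounding and coordinates given in half-pixel units; the function name and the tiny sample frame are hypothetical.

```python
def half_pel_sample(frame, x2, y2):
    """Sample `frame` at position (x2 / 2, y2 / 2).
    Even coordinates hit a real pixel; odd ones fall halfway between pixels."""
    x, y = x2 // 2, y2 // 2
    frac_x, frac_y = x2 % 2, y2 % 2
    a = frame[y][x]
    if not frac_x and not frac_y:
        return a                                  # integer position
    if frac_x and not frac_y:
        return (a + frame[y][x + 1] + 1) // 2     # halfway horizontally
    if frac_y and not frac_x:
        return (a + frame[y + 1][x] + 1) // 2     # halfway vertically
    # Centre of four pixels.
    return (a + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1] + 2) // 4

frame = [[10, 20],
         [30, 40]]
samples = [half_pel_sample(frame, 0, 0),
           half_pel_sample(frame, 1, 0),
           half_pel_sample(frame, 0, 1),
           half_pel_sample(frame, 1, 1)]
print(samples)
```

A motion vector of (0.5, 0.5) would therefore predict each pixel from the rounded average of four reference pixels.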
H.263 Optional Coding Modes
• Unrestricted motion vector mode
• Syntax-based arithmetic coding (SAC)
• Advanced prediction mode
• PB-frames
References:
1. Mohammed Ghanbari, "Standard Codecs: Image Compression to Advanced Video Coding", Chapter 3: Principles of Video Compression
2. Ze-Nian Li & Mark S. Drew, "Fundamentals of Multimedia", Pearson Education, 2004, Chapter 10