Hash Codes for Motion Estimation
Reda Dehy, Guillem Pratx, Hochong Wu
Mar. 11, 05
Introduction: Hash-based ME
• Distributed Coding
• Simple Encoder
• Block-based Motion Compensation
[Block diagram labels: Input Video, Hash Transform, Hash Code, Encode, “Side Info”, Hash Code, Motion Estimation, Motion Info, Decode]
Dehy, Pratx, Wu
Introduction: Hash-based ME (2)
• A low-rate hash code is used instead of the original blocks to perform motion estimation at a remote machine.
Outline
• Constraints & Assumptions
• Downsampling / Bit Mask
• DCT-based Hash Code
• Feature-based Hash Code
• Results
• Conclusions
1. Constraints & Assumptions
• 8x8 blocks.
• Bit rate < 0.15 bits per pixel (bpp).
• The previous frame is assumed to be reconstructed perfectly at the decoder.
• Integer-pel accuracy, block-based motion estimation.
• No processing is done in the encoder other than the hash transforms.
• No constraints on speed of execution (e.g. no fast block search algorithms are implemented).
2. Downsampling / Bit-Mask Hash
• [Diagram labels: Original Block, average, Hash Block]
• Step 1: Quantize the average pixel value of each block (step size Q)
• Step 2: Encode the quantized average using DPCM; send the hash bits directly
• Other block decompositions are possible (2x2, 3x3)
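The two encoder steps above (quantize the block average, then DPCM) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the raster scan order for DPCM, and the rounding rule are assumptions, and the construction of the bit mask itself is not specified on the slide, so it is left out.

```python
import numpy as np

def block_means(frame, block=8):
    """Average pixel value of each block x block tile (frame cropped to whole tiles)."""
    h, w = frame.shape
    hb, wb = h // block, w // block
    tiles = frame[:hb * block, :wb * block].astype(np.float64)
    return tiles.reshape(hb, block, wb, block).mean(axis=(1, 3))

def quantize(values, q):
    """Uniform quantizer: index of the nearest multiple of the step size q."""
    return np.rint(values / q).astype(np.int64)

def dpcm_encode(indices):
    """Raster-order DPCM (assumed order): first index, then successive differences."""
    flat = indices.ravel()
    return np.concatenate(([flat[0]], np.diff(flat)))

def dpcm_decode(residuals, shape):
    """Invert dpcm_encode by accumulating the differences."""
    return np.cumsum(residuals).reshape(shape)

# Toy 16x16 frame: one bright block among three darker ones.
frame = np.full((16, 16), 100)
frame[:8, 8:] = 120
indices = quantize(block_means(frame), q=10)
residuals = dpcm_encode(indices)
```

In a real coder the DPCM residuals would then be entropy coded; that stage is omitted here.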
Motion Estimation
• Search inside a 30x30 window for x and y
• Minimize the cost function $D(x,y) = |\text{hash} - \text{block}(x,y)| + \lambda\,|A_H - A_B|$, where $|\text{hash} - \text{block}(x,y)|$ is the SAD between the hash block (mean $A_H$) and a nearby block in the previous frame (mean $A_B$)
• $\lambda = 0.1\,Q^{-1}$
• If two blocks have the same $D(x,y)$, choose the one that minimizes $(x-x_0)^2 + (y-y_0)^2$, where $(x_0,y_0)$ is the position of the motion-estimated block
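A sketch of the decoder-side search described above may help. Several things here are assumptions rather than facts from the slides: the candidate block is averaged down to the hash resolution before the SAD is taken, the weight on the mean term is read as 0.1/Q, the window is taken as ±15 pixels, and all names (`downsample`, `search`, `lam`, `radius`) are hypothetical.

```python
import numpy as np

def downsample(block, n=2):
    """Average a square block down to n x n (assumed form of the hash block)."""
    s = block.shape[0] // n
    return block.astype(np.float64).reshape(n, s, n, s).mean(axis=(1, 3))

def search(hash_block, prev, x0, y0, q, radius=15, size=8, n=2):
    """Return the (x, y) in prev that minimizes D; ties go to the position nearest (x0, y0)."""
    lam = 0.1 / q                      # assumed reading of the slide's weight
    a_h = hash_block.mean()            # mean of the received hash block
    h, w = prev.shape
    best_key, best_pos = None, None
    for y in range(max(0, y0 - radius), min(h - size, y0 + radius) + 1):
        for x in range(max(0, x0 - radius), min(w - size, x0 + radius) + 1):
            cand = downsample(prev[y:y + size, x:x + size], n)
            sad = np.abs(hash_block - cand).sum()
            d = sad + lam * abs(a_h - cand.mean())
            # Lexicographic key: cost first, squared distance as tie-breaker.
            key = (d, (x - x0) ** 2 + (y - y0) ** 2)
            if best_key is None or key < best_key:
                best_key, best_pos = key, (x, y)
    return best_pos

# Toy previous frame with one bright 8x8 patch at x=12, y=9.
prev = np.zeros((40, 40))
prev[9:17, 12:20] = 200
found = search(downsample(prev[9:17, 12:20]), prev, x0=10, y0=10, q=8)
```

The exhaustive double loop matches the stated assumption that execution speed is not constrained.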
Results: Rate-PSNR curves (1)
Results: Rate-PSNR curves (2)
Results: Rate-PSNR curves (3)
Results: Rate-PSNR curves (4)
Results: Bit-Mask Hash Method
• Performance varies with the hash block size and its spatial orientation; 2x2 is best
• Performance varies significantly between sequences
• The DC code clearly improves performance
• Still far from the maximum achievable PSNR
• Not efficient at higher bit rates
• Sensitive to rotation, luminosity changes, edge displacement…
• Overall good results and a simple implementation
3. DCT-Based Hash Code (1)
• Widely used in image processing
  • Existing knowledge about the DCT
• Intuitive
  • DC & LFs concentrate energy
  • HFs give more texture information
• Scalable
  • Accommodates different bit rates (trade-off)
DCT-Based Hash Code (2)
• Hash Transform: Orig 8x8 Block → 8x8 DCT Transform → Quantize each coefficient
• [Table: Sample Quantization Table (8x8)]
• DCT Coeff Indexing Convention:
   1   9  17  25  33  41  49  57
   2  10  18  26  34  42  50  58
   3  11  19  27  35  43  51  59
   4  12  20  28  36  44  52  60
   5  13  21  29  37  45  53  61
   6  14  22  30  38  46  54  62
   7  15  23  31  39  47  55  63
   8  16  24  32  40  48  56  64
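The hash transform on this slide (8x8 DCT, then per-coefficient quantization) can be sketched as below. This is an illustration under stated assumptions: the orthonormal DCT-II normalization, the convention that a step size of 0 means the coefficient is not sent, and the example table `qtable` are all hypothetical and are not the values from the slide's sample table.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix C, so the 2-D DCT of X is C @ X @ C.T."""
    k = np.arange(n).reshape(-1, 1)    # frequency index
    i = np.arange(n).reshape(1, -1)    # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

C8 = dct_matrix(8)

def dct_hash(block, qtable):
    """Quantize the DCT coefficients of one 8x8 block.

    Assumption: a step of 0 in qtable marks a coefficient that is dropped.
    """
    coeffs = C8 @ block.astype(np.float64) @ C8.T
    out = np.zeros((8, 8), dtype=np.int64)
    keep = qtable > 0
    out[keep] = np.rint(coeffs[keep] / qtable[keep]).astype(np.int64)
    return out

# Hypothetical table: keep only DC and the two lowest AC coefficients.
qtable = np.zeros((8, 8))
qtable[0, 0] = 10
qtable[0, 1] = qtable[1, 0] = 20

flat_hash = dct_hash(np.full((8, 8), 100), qtable)
```

A constant block has all its energy in the DC coefficient, which is why only one entry of `flat_hash` is nonzero.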
DCT-Based Hash Code (3)
• Entropy Encoding / Decoding
  • DC Coeffs: DPCM + VLC
  • AC Coeffs: Zig-zag scan + VLC
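The scan and prediction steps that precede the VLC can be sketched as follows. The JPEG-style zig-zag order and the trimming of trailing zeros are assumptions (the slide does not say which zig-zag variant is used), the function names are hypothetical, and the variable-length coding stage itself is not shown.

```python
def zigzag_order(n=8):
    """(row, col) pairs in JPEG-style zig-zag order (assumed variant)."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Walk anti-diagonals; odd diagonals run top to bottom, even ones bottom to top.
    return sorted(
        cells,
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def scan_ac(qblock):
    """AC coefficients of one quantized block in zig-zag order, trailing zeros dropped."""
    seq = [int(qblock[r][c]) for r, c in zigzag_order(len(qblock))][1:]
    while seq and seq[-1] == 0:
        seq.pop()
    return seq

def dpcm_dc(dc_values):
    """DC coefficients of consecutive blocks as first value plus differences."""
    return [dc_values[0]] + [b - a for a, b in zip(dc_values, dc_values[1:])]

# Toy quantized block: DC = 80 and a single AC coefficient at row 1, column 0.
qblock = [[0] * 8 for _ in range(8)]
qblock[0][0] = 80
qblock[1][0] = 3
```

Note that this sketch addresses coefficients by (row, column), whereas the slide's indexing convention numbers them 1 to 64 down the columns.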
DCT-Based Hash Code (4)
• Motion Estimation
  • [-15:15] x [-15:15] search window
  • MSE computed in the hash domain
  • Prefer localized motion vectors
DCT-Based Hash Code (5)
• Parameter Optimization
  • The quantization table is the core: it determines rate-distortion performance
• Parameters
  • DCT coefficients to use
  • Quantizer step sizes
• Training Algorithm
• [Table: Sample Quantization Table (8x8)]
DCT-Based Hash Code (6)
• Results: frames 10 to 20
• Training set: first 2 frames
4. Feature-based Hash Code
• Invariant to small rotations, scale changes, and luminosity changes
• Gradient computed over the whole image (no boundary artifacts)
• [Diagram labels: Block, Gradient map, Average, Hash Code + Optional DC component]
Quantization and coding (1)
• 3-level quantization for each vector
  • One parameter: threshold
• The X and Y coordinates of each sub-block are encoded jointly using Variable Length Codes
• [Figure labels: example quantized vectors (-1,1) and (-1,0)]

Rate: 0.12 bpp
        -1       0       1
-1    1.5%    1.2%    2.5%
 0    4.1%   78.7%    3.6%
 1    1.7%    2.7%    3.5%

Rate: 0.087 bpp
        -1       0       1
-1    0.2%    0.5%    1.0%
 0    2.6%   88.6%    2.7%
 1    1.2%    2.4%    0.7%
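The 3-level quantizer with a single threshold can be sketched as below, together with an empirical entropy estimate for the jointly coded (X, Y) symbol. The strict inequalities at the threshold, the symbol numbering, and the function names are assumptions; the entropy is only a lower bound on what a symbol-by-symbol variable-length code can achieve, not the rate of the code used in the slides.

```python
import numpy as np

def quantize3(gx, gy, threshold):
    """Map each gradient component to -1, 0 or +1 using a single threshold."""
    def level(v):
        return np.where(v > threshold, 1, np.where(v < -threshold, -1, 0))
    return level(np.asarray(gx)), level(np.asarray(gy))

def joint_entropy_bits(qx, qy):
    """Empirical entropy, in bits per vector, of the joint (X, Y) symbol."""
    # Number the nine possible (X, Y) pairs 0..8.
    symbols = (qx.ravel() + 1) * 3 + (qy.ravel() + 1)
    p = np.bincount(symbols, minlength=9) / symbols.size
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Four toy gradient vectors, threshold 1.0.
qx, qy = quantize3([0.2, 5.0, -3.0, 0.0], [0.1, -0.4, 2.0, 0.0], threshold=1.0)
bits = joint_entropy_bits(qx, qy)
```

Raising the threshold pushes more vectors to (0, 0), which lowers the entropy and hence the achievable rate.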
Quantization and coding (2)
• The DC component is uniformly quantized and DPCM encoded.
• For lower rates, the DC component can be dropped.
Motion estimation (1)
• In a 10 x 10 window in frame 1, find the block that minimizes the following distortion:
  $D(k,l) = \|\mathrm{hash}_2 - \mathrm{hash}_1(k,l)\|^2 + a\,|DC_2 - DC_1(k,l)|^2 + b\,\Delta(k,l)$
• $a$, $b$: weight coefficients
• $\Delta(k,l)$: measure of the intensity continuity between adjacent blocks
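The distortion above can be evaluated per candidate as in this sketch. The continuity term Δ is taken as a precomputed input because its exact definition is not given here, and the candidate tuple layout, default weights, and function names are hypothetical.

```python
import numpy as np

def distortion(hash2, hash1, dc2, dc1, delta, a=1.0, b=1.0):
    """D = ||hash2 - hash1||^2 + a*|DC2 - DC1|^2 + b*delta for one candidate."""
    diff = np.asarray(hash2, dtype=np.float64) - np.asarray(hash1, dtype=np.float64)
    return float((diff ** 2).sum() + a * (dc2 - dc1) ** 2 + b * delta)

def best_candidate(hash2, dc2, candidates, a=1.0, b=1.0):
    """candidates: iterable of ((k, l), hash1, dc1, delta); returns the (k, l) with the smallest D."""
    return min(
        candidates,
        key=lambda c: distortion(hash2, c[1], dc2, c[2], c[3], a, b),
    )[0]

# Two toy candidates: a perfect hash match with poor continuity,
# and a slightly worse hash match with perfect continuity.
hash2 = [1, 0, -1, 0]
candidates = [
    ((0, 0), [1, 0, -1, 0], 100, 5.0),
    ((1, 0), [1, 0, 0, 0], 100, 0.0),
]
```

With these toy values the choice flips as the weight `b` changes, which illustrates how the continuity term trades hash fidelity against smooth block boundaries.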
Motion estimation (2)
• Δ(k,l):
  • Edges: the difference is computed only if one of the gradients (given by the received hash codes) is 0.
  • This reduces blocking artifacts and improves PSNR without increasing the rate.
• [Figure label: Previously found blocks]
Blocking Artifact
Applying continuity rule (1)
Applying continuity rule (2)
Results: Rate-distortion curves for 3 sequences, using feature-based hash codes
5. Results
6. Conclusions
• Satisfactory motion estimation can be performed at the decoder by transmitting a low-bit-rate (R < 0.15 bpp) hash code as side information.
• Downsampling Approach
• DCT-based Approach
• Feature-based Approach
• Complexity vs. performance
Acknowledgments
• We’d like to thank Prof. Girod, David, and Shantanu.