Efficient Motion Field Representation in the Wavelet Domain for Video Compression

Xin Li and Shawmin Lei
Image Coding and Communication Group, Digital Video Department, Sharp Labs of America

Goal
• Develop a better understanding of the relationship between motion and intensity uncertainty models
  What is the best way to resolve the intensity uncertainty by exploiting the dependency in the spatial and temporal domains? (The intra-inter prediction switch does not seem to be optimal.)
• Make wavelets really work for video
  What are the fundamental advantages of resolving the intensity uncertainty in the wavelet domain? (Wavelets are already highly successful for images, so why not for video?)

Recent Advances
• Inter-frame Wavelet Coding (MCTF+WT): the MC-EZBC coder of RPI
  [Diagram blocks: MCTF, WT, EZBC]
• Wavelet In-band Coding (WT+MCP/MCTF): our WT-MCP coder
  [Diagram blocks: WT, MCP/MCTF, LZC]

Spatial first or Temporal first?
• Three reasons for doing spatial first
- Difficulties with ME in the spatial domain: occlusion and aperture problems
- Inefficiency of applying WT to MC residues: the wavelet basis is not superior to the DCT basis
- Scalability concerns: WT+MCP/MCTF offers a consistent representation at low spatial/temporal resolution

Importance of Phase
• Phase carries important information about motion accuracy (Li et al., ICIP 2001)
[Diagram labels: current frame, previous frame, wavelet transform, phase shifting filter, Phase (0,0), Phase (1,0), Phase (0,1), Phase (1,1), motion estimation and compensation]
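
Why phase matters can be illustrated with a minimal 1D sketch. It assumes a one-level Haar transform with circular extension, which is a simplification chosen here and not the filter bank or the phase shifting filter of the slides. The helper `haar_phase` is hypothetical. When the current frame is the previous frame moved by one sample, its phase-0 coefficients do not match the phase-0 coefficients of the previous frame, but they do match (up to an index shift) the coefficients decimated at phase 1.

```python
import numpy as np

def haar_phase(x, phase):
    """One-level 1D Haar analysis of x, decimated at the given phase (0 or 1).

    Circular extension is assumed. Phase 1 pairs samples (1,2), (3,4), ...
    instead of (0,1), (2,3), ...
    """
    x = np.asarray(x, dtype=float)
    y = np.roll(x, -phase)              # move the chosen phase to index 0
    even, odd = y[0::2], y[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return low, high

rng = np.random.default_rng(0)
prev = rng.standard_normal(16)          # stand-in for one row of the previous frame
curr = np.roll(prev, 1)                 # current frame: previous frame moved by one sample

low_p0, high_p0 = haar_phase(prev, 0)
low_p1, high_p1 = haar_phase(prev, 1)
low_c0, high_c0 = haar_phase(curr, 0)

# Same phase: the high band of the current frame is not predicted well.
err_same_phase = np.abs(high_c0 - high_p0).max()
# Other phase, shifted by one coefficient: the prediction is exact.
err_other_phase = np.abs(high_c0 - np.roll(high_p1, 1)).max()
print("max error, phase 0 reference:", err_same_phase)
print("max error, phase 1 reference:", err_other_phase)
```

In 2D the same reasoning gives the four phases (0,0), (1,0), (0,1) and (1,1) listed on the slide.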

Importance of Modeling Singularity
Fact: 2D image singularities are attributed to two sources:
- geometric: 3D depth discontinuities (e.g. occlusion)
- photometric: reflectance variation (e.g. texture)
Proposition: It is beneficial to model the two singularity sources separately and pursue an appropriate representation for each:
- geometric: temporal analysis (motion compensation)
- photometric: spatial analysis (wavelet transform)

Advantages of ME in the Wavelet Domain
• Automatically solves the occlusion problem
  WT can be viewed as the intra-prediction stage
• Turns aperture into an advantage
  WT structures image information into bands with distinct orientations
• Facilitates hierarchical motion estimation
  WT provides a multi-resolution decomposition of video frames

Occlusion and Aperture
[Figure labels: occlusion, aperture, uncovered area; subbands LL, HL, LH, HH]
Implications
• More flexible motion field representation
- only need to resolve the uncertainty of significant coefficients
- respect geometry (to match edge/band orientation)

Hierarchical ME in the Wavelet Domain
ME results at a low resolution can be propagated across scales to aid ME at a higher resolution
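
The coarse-to-fine idea can be sketched as follows. This is an illustration under stated assumptions, not the coder described in the slides: a 2x2 average stands in for the LL band, a single global displacement is estimated instead of a block-wise field, borders are circular, and the function names `downsample`, `search` and `hierarchical_me` are hypothetical.

```python
import numpy as np

def downsample(frame):
    """2x2 averaging, used here as a stand-in for the LL band of a wavelet transform."""
    return 0.25 * (frame[0::2, 0::2] + frame[1::2, 0::2]
                   + frame[0::2, 1::2] + frame[1::2, 1::2])

def search(curr, prev, center, radius):
    """Full search around `center`: return the (dy, dx) that minimises the SAD
    between curr and prev circularly shifted by (dy, dx)."""
    best, best_cost = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            cost = np.abs(curr - np.roll(prev, (dy, dx), axis=(0, 1))).sum()
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

def hierarchical_me(curr, prev, coarse_radius=4):
    """Estimate the displacement at half resolution, then refine at full resolution."""
    coarse = search(downsample(curr), downsample(prev), (0, 0), coarse_radius)
    # A vector found at half resolution is doubled before it seeds the fine search.
    seed = (2 * coarse[0], 2 * coarse[1])
    return search(curr, prev, seed, 1)

rng = np.random.default_rng(1)
prev_frame = rng.standard_normal((32, 32))
curr_frame = np.roll(prev_frame, (6, -4), axis=(0, 1))
print(hierarchical_me(curr_frame, prev_frame))
```

The coarse search covers displacements of up to 8 full-resolution pixels with 81 candidates, and the refinement adds only 9 more, which is the usual motivation for propagating vectors across scales.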

Basic WT+MCP Video Coder
• fixed block size motion model
• half-pel motion accuracy
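
What a fixed-block, half-pel prediction involves can be sketched as below. The slides do not state the interpolation filter or the domain in which interpolation is done, so this sketch assumes bilinear interpolation on a pixel-domain reference with circular borders. The names `upsample2_bilinear` and `predict_block` and the sign convention of the motion vector are assumptions made for illustration.

```python
import numpy as np

def upsample2_bilinear(ref):
    """Return the reference on a half-pel grid (2x size), bilinear, circular borders."""
    h, w = ref.shape
    right = np.roll(ref, -1, axis=1)             # neighbour at x + 1
    down = np.roll(ref, -1, axis=0)              # neighbour at y + 1
    diag = np.roll(ref, (-1, -1), axis=(0, 1))   # neighbour at (y + 1, x + 1)
    up = np.empty((2 * h, 2 * w))
    up[0::2, 0::2] = ref                         # integer positions
    up[0::2, 1::2] = 0.5 * (ref + right)         # horizontal half-pel
    up[1::2, 0::2] = 0.5 * (ref + down)          # vertical half-pel
    up[1::2, 1::2] = 0.25 * (ref + right + down + diag)
    return up

def predict_block(ref_up, y, x, block, mv):
    """Predict the block whose top-left corner is (y, x) in the current frame.

    mv = (mvy, mvx) is given in half-pel units; the prediction is read from the
    reference at (y + mvy / 2, x + mvx / 2).
    """
    h2, w2 = ref_up.shape
    rows = (2 * (y + np.arange(block)) + mv[0]) % h2
    cols = (2 * (x + np.arange(block)) + mv[1]) % w2
    return ref_up[np.ix_(rows, cols)]

rng = np.random.default_rng(2)
ref = rng.standard_normal((8, 8))
ref_up = upsample2_bilinear(ref)
# A vector of (0, 1) means half a pixel to the right: each predicted sample is
# the average of two horizontally adjacent reference samples.
pred = predict_block(ref_up, 2, 2, 4, (0, 1))
print(pred.shape)
```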

Experiment Results
• MPEG4: 256 kbps, 31.4 dB
• WT+MCP: 236 kbps, 31.4 dB

Advanced Motion Models (I)
• 3D lifting decomposition: WT+MCTF
  When updating is performed with an over-complete expansion, we resort to a phase shifting filter (PSF) to preserve the reversibility of the lifting scheme
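
The reversibility that lifting provides can be shown with a small sketch. It assumes a Haar-like temporal lifting step on two frames, with a circular shift standing in for motion compensation; the phase shifting filter of the slide is not modelled, and the names `mctf_forward` and `mctf_inverse` are hypothetical. The point is that the inverse undoes the update and predict steps in reverse order with the same operators, so reconstruction holds for any motion vector, accurate or not.

```python
import numpy as np

def warp(frame, mv):
    """Stand-in for motion compensation: circular shift by mv = (dy, dx)."""
    return np.roll(frame, mv, axis=(0, 1))

def unwarp(frame, mv):
    """Reverse mapping used in the update step."""
    return np.roll(frame, (-mv[0], -mv[1]), axis=(0, 1))

def mctf_forward(a, b, mv):
    """One temporal lifting step on the frame pair (a, b)."""
    high = b - warp(a, mv)               # predict: residual of b given a
    low = a + 0.5 * unwarp(high, mv)     # update: temporal low-pass frame
    return low, high

def mctf_inverse(low, high, mv):
    """Undo the lifting steps in reverse order."""
    a = low - 0.5 * unwarp(high, mv)
    b = high + warp(a, mv)
    return a, b

rng = np.random.default_rng(3)
frame_a = rng.standard_normal((8, 8))
frame_b = np.roll(frame_a, (1, 2), axis=(0, 1)) + 0.1 * rng.standard_normal((8, 8))

for mv in [(1, 2), (3, -1)]:             # the true vector, then a wrong one
    low, high = mctf_forward(frame_a, frame_b, mv)
    rec_a, rec_b = mctf_inverse(low, high, mv)
    print(mv, np.allclose(rec_a, frame_a), np.allclose(rec_b, frame_b),
          "residual energy:", float((high ** 2).sum()))
```

A wrong vector costs compression efficiency, visible as higher residual energy, but not reversibility.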

Advanced Motion Models (II)
• Towards implicit object-based motion models
- Major deviation from current video coding practice
  Motion information is exploited to resolve the location uncertainty, instead of the intensity uncertainty, of image singularities
• The motion field does not need to be explicitly coded
• and therefore allows rather sophisticated models
  We are currently exploring a layered representation capable of modeling both camera and object motions

Fully Scalable WT+MCTF Coder
• Resolution/temporal scalability
  easily achieved by the 3D lifting decomposition structure
• FGS capability
- base layer: singularity location
- enhancement layer: sign/magnitude
[Diagram labels: low-resolution anchors, in-band classification]

Concluding Remarks
• MCTF can be efficiently performed in the wavelet domain as long as phase is carefully considered
• There are fundamental advantages to representing video signals in the wavelet domain
• New video coding paradigm: exploit an implicit motion model to resolve the location uncertainty of 2D singularities in the wavelet domain
