Efficient Motion Field Representation in the Wavelet Domain for Video Compression
Xin Li and Shawmin Lei
Image Coding and Communication Group, Digital Video Department, Sharp Labs of America

Goal
• Develop a better understanding of the relationship between motion and intensity uncertainty models
  - What is the best way to resolve intensity uncertainty by exploiting dependencies in the spatial and temporal domains?
  - (The intra/inter prediction switch does not seem to be optimal.)
• Make wavelets really work for video
  - What are the fundamental advantages of resolving intensity uncertainty in the wavelet domain?
  - (Wavelets are already highly successful for images, so why not video?)

Recent Advances
• Inter-frame Wavelet Coding (MCTF+WT)
  - MC-EZBC coder of RPI (block diagram labels: MCTF, WT, EZBC)
• Wavelet In-band Coding (WT+MCP/MCTF)
  - our WT-MCP coder (block diagram labels: WT, MCP/MCTF, LZC)

Spatial first or Temporal first?
• Three reasons for doing spatial first
  - Difficulties with ME in the spatial domain: occlusion and aperture problems
  - Inefficiency of applying WT to MC residues: the wavelet basis is not superior to the DCT basis
  - Scalability concerns: WT+MCP/MCTF offers a consistent representation at low spatial/temporal resolution

Importance of Phase
• Phase carries important information about motion accuracy (Li et al., ICIP 2001)
[Figure labels: current frame, previous frame, wavelet transform, phase shifting filter, Phase (0,0), Phase (1,0), Phase (0,1), Phase (1,1), motion estimation and compensation]

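The role of phase can be illustrated with a minimal sketch. It assumes a Haar low-pass (LL) band, circular frame borders, and a `(row, column)` reading of the phase labels; none of these choices is stated on the slide, and `haar_ll` is a hypothetical helper rather than the authors' phase shifting filter. A one-pixel motion is invisible to the critically sampled phase (0,0) band of the previous frame, but one of the other three phases matches it exactly.

```python
import numpy as np

def haar_ll(frame, phase=(0, 0)):
    """Haar LL band sampled at the given (row, column) phase.

    Phase (0, 0) is the usual critically sampled band; the other three
    phases start the 2x2 averaging grid one pixel later (circularly).
    """
    py, px = phase
    shifted = np.roll(frame, shift=(-py, -px), axis=(0, 1))
    return (shifted[0::2, 0::2] + shifted[0::2, 1::2]
            + shifted[1::2, 0::2] + shifted[1::2, 1::2]) / 4.0

rng = np.random.default_rng(0)
previous = rng.random((8, 8))
# Assumed toy motion: the whole frame moves one pixel to the left.
current = np.roll(previous, shift=(0, -1), axis=(0, 1))

phases = [(0, 0), (1, 0), (0, 1), (1, 1)]
# Matching error between the current LL band and each phase of the previous one.
errors = {p: np.abs(haar_ll(current) - haar_ll(previous, p)).sum() for p in phases}
best_phase = min(errors, key=errors.get)  # (0, 1) matches exactly in this toy case
```

Searching over all four phases of the reference band is what lets an in-band coder recover odd-pixel displacements that decimation would otherwise hide.
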
Importance of Modeling Singularity
Fact: 2D image singularities are attributed to two sources:
  - geometric: 3D depth discontinuities (e.g. occlusion)
  - photometric: reflectance variation (e.g. texture)
Proposition: It is beneficial to model the two singularity sources separately and pursue an appropriate representation for each:
  - geometric: temporal analysis (motion compensation)
  - photometric: spatial analysis (wavelet transform)

Advantages of ME in the Wavelet Domain
• Automatically solves the occlusion problem
  - WT can be viewed as the intra-prediction stage
• Turns aperture into an advantage
  - WT structures image information into bands with distinct orientations
• Facilitates hierarchical motion estimation
  - WT provides a multi-resolution decomposition of video frames

Occlusion and Aperture
[Figure labels: occlusion, aperture, uncovered area; subbands LL, HL, LH, HH]
Implications
• More flexible motion field representation
  - only need to resolve the uncertainty of significant coefficients
  - respect geometry (to match edge/band orientation)

Hierarchical ME in the Wavelet Domain
• ME results at a low resolution can be propagated across scales to aid ME at a higher resolution

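The coarse-to-fine propagation can be sketched as follows. This is a simplified illustration under loud assumptions: a single global translation instead of a block-wise motion field, a Haar LL pyramid, circular borders, and only phase (0,0) at each level. The function names and the `levels` and `coarse_radius` parameters are hypothetical and do not come from the slides.

```python
import numpy as np

def ll_band(frame):
    """Haar low-pass (LL) band: mean of each 2x2 block."""
    return (frame[0::2, 0::2] + frame[0::2, 1::2]
            + frame[1::2, 0::2] + frame[1::2, 1::2]) / 4.0

def best_shift(cur, ref, center, radius):
    """Exhaustive search around `center` for the circular shift (dy, dx)
    of `ref` that minimises the sum of absolute differences with `cur`."""
    best, best_cost = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            cost = np.abs(cur - np.roll(ref, (dy, dx), axis=(0, 1))).sum()
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

def hierarchical_me(cur, ref, levels=2, coarse_radius=2):
    """Estimate a global shift coarse-to-fine over a Haar LL pyramid."""
    pyramid = [(cur, ref)]
    for _ in range(levels):
        c, r = pyramid[-1]
        pyramid.append((ll_band(c), ll_band(r)))
    # Full search only at the coarsest level, where frames are small.
    c, r = pyramid[-1]
    mv = best_shift(c, r, (0, 0), coarse_radius)
    # Propagate: double the vector, then refine within +/-1 at each finer level.
    for c, r in reversed(pyramid[:-1]):
        mv = best_shift(c, r, (2 * mv[0], 2 * mv[1]), 1)
    return mv
```

With two levels, a coarse search radius of 2 covers displacements of roughly 8 pixels at full resolution while evaluating far fewer candidates than a full-resolution search of the same range.
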
Basic WT+MCP Video Coder
• fixed-block-size motion model
• half-pel motion accuracy

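A fixed-size block search with half-pel accuracy might look like the sketch below. The slide does not give the interpolation filter, block size, or search range, so bilinear interpolation, an 8x8 block, and a radius of 2 pixels are assumptions, and the search runs on plain pixel arrays rather than on wavelet bands. Both function names are hypothetical.

```python
import numpy as np

def upsample2_bilinear(frame):
    """Interpolate onto a half-pel grid so that up[2*y, 2*x] == frame[y, x]."""
    h, w = frame.shape
    up = np.zeros((2 * h - 1, 2 * w - 1))
    up[0::2, 0::2] = frame
    up[0::2, 1::2] = (frame[:, :-1] + frame[:, 1:]) / 2.0
    up[1::2, 0::2] = (frame[:-1, :] + frame[1:, :]) / 2.0
    up[1::2, 1::2] = (frame[:-1, :-1] + frame[:-1, 1:]
                      + frame[1:, :-1] + frame[1:, 1:]) / 4.0
    return up

def half_pel_block_match(cur, ref, y0, x0, block=8, radius=2):
    """Return the displacement (dy, dx), in pixels and in steps of 0.5,
    that minimises the SAD for the block of `cur` whose corner is (y0, x0)."""
    up = upsample2_bilinear(ref)
    target = cur[y0:y0 + block, x0:x0 + block]
    best, best_cost = (0.0, 0.0), np.inf
    for hy in range(-2 * radius, 2 * radius + 1):      # half-pel units
        for hx in range(-2 * radius, 2 * radius + 1):
            ys, xs = 2 * y0 + hy, 2 * x0 + hx
            ye, xe = ys + 2 * (block - 1), xs + 2 * (block - 1)
            if ys < 0 or xs < 0 or ye >= up.shape[0] or xe >= up.shape[1]:
                continue  # candidate block falls outside the reference
            candidate = up[ys:ye + 1:2, xs:xe + 1:2]
            cost = np.abs(target - candidate).sum()
            if cost < best_cost:
                best, best_cost = (hy / 2.0, hx / 2.0), cost
    return best
```

Every block uses the same size and the same search window, which is what makes the model "fixed block size"; only the returned vector varies from block to block.
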
Experiment Results
• MPEG-4: 256 kbps, 31.4 dB
• WT+MCP: 236 kbps, 31.4 dB (about 8% lower bit rate at the same 31.4 dB)

Advanced Motion Models (I)
• 3D lifting decomposition: WT+MCTF
  - When updating is performed with an over-complete expansion, we resort to a phase shifting filter (PSF) to preserve the reversibility of the lifting scheme

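Why lifting stays reversible can be seen in a generic Haar-style temporal lifting step. The sketch below is not the authors' PSF-based in-band update: it assumes an integer-pixel global motion vector, a circular-shift `warp`, and a 0.5 update weight, all chosen only for illustration. The inverse simply undoes the update and the predict steps in reverse order, so reconstruction holds whatever the predict and update operators are, provided the decoder applies the same ones.

```python
import numpy as np

def warp(frame, mv):
    """Toy motion compensation: circular shift by an integer vector (dy, dx)."""
    return np.roll(frame, mv, axis=(0, 1))

def mctf_forward(even, odd, mv):
    """One motion-compensated temporal lifting step on a frame pair."""
    high = odd - warp(even, mv)                        # predict step
    low = even + 0.5 * warp(high, (-mv[0], -mv[1]))    # update step
    return low, high

def mctf_inverse(low, high, mv):
    """Undo the lifting steps in reverse order."""
    even = low - 0.5 * warp(high, (-mv[0], -mv[1]))    # undo update
    odd = high + warp(even, mv)                        # undo predict
    return even, odd
```

When the motion model is exact, that is when `odd` equals `warp(even, mv)`, the high-pass frame is zero and the low-pass frame equals the even frame, so all energy is carried by a single frame.
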
Advanced Motion Models (II)
• Towards implicit object-based motion models
  - A major deviation from current video coding practice: motion information is exploited to resolve the location uncertainty, instead of the intensity uncertainty, of image singularities
  - The motion field does not need to be explicitly coded, which allows rather sophisticated models
• We are currently exploring a layered representation capable of modeling both camera and object motions

Fully Scalable WT+MCTF Coder
• Resolution/temporal scalability is easily achieved by the 3D lifting decomposition structure
• FGS capability
  - base layer: singularity location
  - enhancement layer: sign/magnitude
[Figure labels: low-resolution anchors, in-band classification]

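The base/enhancement split can be sketched on one band of wavelet coefficients. The slide does not say how significance is decided, so a plain magnitude threshold stands in for the in-band classification here; `split_layers`, `merge_layers`, and `threshold` are hypothetical names introduced only for this illustration.

```python
import numpy as np

def split_layers(coeffs, threshold):
    """Split a coefficient band into a location layer and a value layer.

    Base layer: boolean map of significant coefficients (where they are).
    Enhancement layer: sign and magnitude of those coefficients only.
    """
    significant = np.abs(coeffs) >= threshold
    signs = np.sign(coeffs[significant]).astype(np.int8)
    magnitudes = np.abs(coeffs[significant])
    return significant, signs, magnitudes

def merge_layers(significant, signs, magnitudes):
    """Rebuild the band; insignificant positions are set to zero."""
    band = np.zeros(significant.shape)
    band[significant] = signs * magnitudes
    return band
```

A decoder that receives only the significance map already knows where the singularities are; the sign and magnitude data then refine the values at those positions.
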
Concluding Remarks
• MCTF can be performed efficiently in the wavelet domain as long as phase is carefully considered
• There are fundamental advantages to representing video signals in the wavelet domain
• New video coding paradigm: exploit an implicit motion model to resolve the location uncertainty of 2D singularities in the wavelet domain