Video Frames Interpolation Using Adaptive Warping. Ying Chen Lou Major Advisor: M.J.T. Smith Co-advisor: Edward Delp Nov. 15, 2010. Outline. Background Generic motion model Video spatial interpolation Video compression Video frame rate up-conversion Summary and future work. Motivation.
Video Frames Interpolation Using Adaptive Warping Ying Chen Lou Major Advisor: M.J.T. Smith Co-advisor: Edward Delp Nov. 15, 2010
Outline • Background • Generic motion model • Video spatial interpolation • Video compression • Video frame rate up-conversion • Summary and future work
Motivation • Spatial interpolation • Conversion from SDTV to HDTV • Zooming of region of interest (ROI) • Surveillance/forensics • Medical imaging • Satellite imaging • Temporal interpolation (frame rate up-conversion) • 3:2 pull-down 24Hz -> 30Hz • Avoid flicker and blurring on LCD • Loss of frames in transmission (Figure: frame rate up-converter)
Challenges and Goal • Core: motion estimation • To derive a generic motion model which can be used for different applications • Motion needs to be accurate • Ill-posed problem (aperture problem) • Suitable for different types of motion • Translational (panning) • Zoom in/out • Rotational • Low to moderate computational intensity (Figure: local window used for ME)
Illustrative Example: Block Matching vs. Optical Flow (Figure: motion fields estimated by block matching and by optical flow)
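For readers who want a concrete picture of the block matching side of this comparison, here is a minimal full-search sketch. It is not taken from the slides: the sum of absolute differences (SAD) cost, the function names, and the list-of-lists frame layout are all assumptions chosen for illustration.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def block_match(cur, ref, top, left, size, search):
    """Full search: return ((dy, dx), cost) minimizing SAD within +/- search pixels.

    Hypothetical helper for illustration; candidates that would leave the
    reference frame are skipped.
    """
    h, w = len(ref), len(ref[0])
    target = get_block(cur, top, left, size)
    best = (0, 0)
    best_cost = sad(target, get_block(ref, top, left, size))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > h or x + size > w:
                continue
            cost = sad(target, get_block(ref, y, x, size))
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost
```

Block matching assigns one translational vector per block, which is why it copes poorly with zoom and rotation compared with a dense optical flow field.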
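Motion Estimation Method (Warping) • Optical flow equation (OFE): I[n1, n2, k] = I[n1 + d1[n1, n2, k], n2 + d2[n1, n2, k], k + δk] • where d1 and d2 are the pixel displacements along n1 and n2, and δk is the frame offset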
Warping Method (cont’d) • Assumes that the pixel displacement functions d1 and d2 within some region R of an image I[n1,n2,k] can be written in terms of two parameter vectors • The two vectors are composed of the displacement parameters to be estimated • The bilinear displacement parameters are computed by minimizing the mean squared error function (MSEF)
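The slide's displacement equations did not survive extraction, so the following is only a sketch of what a bilinear displacement model and its mean squared error could look like. The four-coefficient form a0 + a1*n1 + a2*n2 + a3*n1*n2, the treatment of n1 as the row index and n2 as the column index, the border clamping, and all function names are assumptions, not the author's stated formulation.

```python
def bilinear_displacement(params, n1, n2):
    """Assumed bilinear form: d = a0 + a1*n1 + a2*n2 + a3*n1*n2."""
    a0, a1, a2, a3 = params
    return a0 + a1 * n1 + a2 * n2 + a3 * n1 * n2


def sample(frame, y, x):
    """Bilinearly interpolate frame at real-valued (y, x), clamped to the border."""
    h, w = len(frame), len(frame[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0
    top = (1 - fx) * frame[y0][x0] + fx * frame[y0][x1]
    bottom = (1 - fx) * frame[y1][x0] + fx * frame[y1][x1]
    return (1 - fy) * top + fy * bottom


def warp_mse(cur, ref, region, a, b):
    """MSE over region between cur[n1][n2] and ref at the displaced position.

    region is (top, left, height, width); a and b are the assumed parameter
    vectors for d1 (rows) and d2 (columns).
    """
    top, left, height, width = region
    total = 0.0
    for n1 in range(top, top + height):
        for n2 in range(left, left + width):
            d1 = bilinear_displacement(a, n1, n2)
            d2 = bilinear_displacement(b, n1, n2)
            diff = cur[n1][n2] - sample(ref, n1 + d1, n2 + d2)
            total += diff * diff
    return total / (height * width)
```

An estimator would search for the vectors a and b that minimize `warp_mse` over the region; the optimization procedure itself is not described on the slide and is left out here.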
Quad-tree Divisions • Uniform division vs. quad-tree division • Quad-tree division makes the algorithm adaptive and more efficient
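The adaptive idea behind quad-tree division can be sketched as a recursive split that refines a block only where the motion model fits poorly. The splitting criterion (an error above a threshold), the minimum block size, and the `error_fn` callback are assumptions made for this illustration; the slides do not state the actual criterion.

```python
def quadtree_split(top, left, size, error_fn, threshold, min_size):
    """Return leaf blocks as (top, left, size) tuples.

    A block is split into four quadrants while error_fn(top, left, size)
    exceeds threshold and the block is still larger than min_size.
    error_fn is a hypothetical callback, e.g. the warping MSE of the block.
    """
    if size <= min_size or error_fn(top, left, size) <= threshold:
        return [(top, left, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_split(top + dy, left + dx, half,
                                     error_fn, threshold, min_size)
    return leaves
```

Compared with a uniform grid of small blocks, this spends parameters only in regions where a single model fails, which is one plausible reading of "adaptive and more efficient".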
Application I: Spatial Interpolation • Motivation • Characteristics of lecture videos • Large static backgrounds • Little to medium motion of foreground • Possible to store several high resolution frames and retrieve them later • Simplest scenario • Periodically transmit one high resolution frame; the remaining frames are sent at low resolution • Extend the proposed method to other types of video
Proposed method I: Full-band Warping (FWWA) • Interpolation • Bilinear • Adaptive Synthesis Filter Banks • Warping • Block matching-based and optical flow-based • Quadtree splitting
Robustness Issue • Address the robustness issue • Obtain reliable motion vectors • Maintain the sharpness • Challenges • No corresponding pixels in the reference frame • Ambiguity in objectively judging improved sharpness against distortion
Proposed method II: The Composite Algorithm • Incorporate advanced spatial interpolation algorithms • Bidirectional warping • Forward and backward warping • To solve “no corresponding pixels in the reference frame” • Hierarchical motion structure • To solve “ambiguity of improvement in sharpness and geometric distortion”
Hierarchical Motion Structure ZF3 – forward warped frame ZF3’ – downsampled forward warped frame ZB3 – backward warped frame ZB3’ – downsampled backward warped frame X3 – original MxN low resolution frame
Experiment Setup • Assessment of the composite algorithm • Compare with bilinear, bicubic, NEDI, VA, and the full-band warping algorithm (FWWA) • Investigate the impact of different spatial interpolation methods on the composite warping algorithm • Incorporate different spatial interpolation methods in the first step • The remaining stages are the same • 3 sets of video sequences, CIF, 50 frames • Low motion (‘talking head’ lecture video) • High detail • High motion
Assessment of the Composite Algorithm • Every 5th frame as a high resolution reference frame • Decimate frames to get low resolution frames • Use a 21-tap lowpass filter • Downsample factor 2 • Use original frames as ground truth • Competing methods • Bilinear, Bicubic • simple image interpolation methods • New Edge-Directed Interpolation (NEDI) • an advanced image interpolation method • A super resolution method proposed by Patrick Vandewalle (VA) [1] • Full band warping (FWWA) [1] P. Vandewalle, S. Susstrunk, and M. Vetterli, “A frequency domain approach to registration of aliased images with application to superresolution,” EURASIP Journal on Applied Signal Processing, 2006
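The decimation step in this setup (lowpass filter, then downsample by a factor of 2) can be sketched as follows. The slides use a 21-tap lowpass filter whose coefficients are not given, so a 3-tap kernel (0.25, 0.5, 0.25) stands in here purely as an assumption; border replication and the function names are also illustrative choices.

```python
def lowpass_1d(signal, taps):
    """Filter a 1-D signal with symmetric taps, replicating border samples."""
    half = len(taps) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, t in enumerate(taps):
            j = min(max(i + k - half, 0), n - 1)
            acc += t * signal[j]
        out.append(acc)
    return out


def decimate_by_2(frame, taps=(0.25, 0.5, 0.25)):
    """Separable lowpass filtering followed by keeping every second sample.

    The default taps are a placeholder, not the 21-tap filter from the slides.
    """
    rows = [lowpass_1d(row, taps) for row in frame]            # filter along rows
    cols = [lowpass_1d(col, taps) for col in zip(*rows)]       # filter along columns
    filtered = [list(r) for r in zip(*cols)]                   # transpose back
    return [row[::2] for row in filtered[::2]]
```

Swapping in a longer kernel only requires passing a different `taps` tuple whose coefficients sum to 1.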
Talking Head Video Original frame Bilinear interpolation Comp FWWA
High Detail Video Original frame Bilinear interpolation Comp FWWA
High Motion Video Original frame Bilinear interpolation Comp FWWA
(a) Bilinear (b) NEDI (c) VA (d) Bicubic + Comp
Conclusion • The composite methods achieve good spatial interpolation results • Accommodate complex motion • Outperform competing methods subjectively and objectively • Improvement comes mostly from the warping process • A combination of bicubic interpolation and warping gives the best overall performance • Complexity is not too high • Subjective and objective results are satisfactory • Performs particularly well for lecture videos and high detail videos
Application II: Video Compression • H.264/AVC retains edges but removes texture at low bitrates
Goal and Proposed Method • Goal • Propose a coding method which keeps the high frequency components • Achieve as high visual quality as possible • Maintain integration with the H.264/AVC coder, which is well engineered • Be robust • Three assumptions • Smaller resolution requires fewer bits • Sequences with low motion don’t need full resolution coding • Key frames get more bits • Proposed method • Adaptive warping • Spatio-temporal
Overall System encoder decoder
Experiment Setup • New algorithm compared against H.264 in the following setup • H.264 codec • Every Nth frame as an I frame, the rest coded as P or B frames • Proposed method • Every Nth frame used as the reference frame • Other frames are decimated (LL subband) and coded • Total bit rate is sum of full resolution reference frame and the quarter resolution LL subbands • Bit rates are the same in both cases
Subjective Result (a) H.264 (b) proposed method 3rd frame for Salesman sequence @ 80kbits/s
Conclusion • The proposed method achieves better visual quality at low bitrates • The gap decreases as the bitrate increases • At high bitrates, H.264 has more bits to spend on high frequency components and thus achieves better quality • The proposed method works better for sequences with more detail • Room for improvement remains • Explore tradeoffs in spatio-temporal decimation rates • Send reference frames more frequently for video with large motion and less often for video with small motion • For long lecture videos, reference frames can be sent until the scene is fully covered, and none after that
Application III: Video Frame Rate Up-Conversion • Overview of FRUC • No motion vectors • Frame repetition, frame averaging • Use motion vectors • Use motion vectors from the decoder directly • Advantage: Low complexity • Disadvantage: Not true motion • Perform motion estimation again • Advantage: True motion • Disadvantage: High computational complexity (Figure: frame rate up-converter)
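The two FRUC approaches that use no motion vectors are simple enough to show directly. This is a minimal sketch with frames as lists of rows; the function names are illustrative and not from the slides.

```python
def frame_repetition(prev, cur):
    """Insert a copy of the previous frame between prev and cur."""
    return [row[:] for row in prev]


def frame_average(prev, cur):
    """Insert the pixel-wise mean of the two neighboring frames."""
    return [[(p + c) / 2 for p, c in zip(row_p, row_c)]
            for row_p, row_c in zip(prev, cur)]
```

Repetition tends to produce motion judder and averaging tends to produce ghosting around moving objects, which is the usual motivation for motion compensated interpolation.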
Goal and Proposed Method • Goal (1) True motion vectors (2) Relatively low complexity • Challenges (1) Highly accurate MVs (2) Low percentage of MV re-estimation (3) Occlusion (4) Blocking artifacts • Approach • Decoded video sequences from the decoder • Additional information from the decoder • System diagram • Inputs: previous reconstructed frame X1; current residual frame, reconstructed frame X2, and its MVs • Blocks: MV Reliability Check, MV Re-estimation, Motion Compensated Interpolation, Small Block Merging • Output: interpolated frame
MV reliability check • MVs are categorized into 3 groups
Small Block Merging • Avoid broken edges and maintain object structure
MV Re-estimation • Key in the system • Accurate MVs are required for FRUC • Low complexity • A combination of Optical Flow-based and Block Matching-based motion estimation (Warping method)
Motion Compensated Interpolation N=2, k=1
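The interpolation formula on this slide was lost in extraction. One common form of bidirectional motion compensated interpolation, offered here only as an assumed sketch, places the new frame at temporal position k/N between the previous and current frames and blends the two motion-aligned pixels. The dense per-pixel MV field, the rounding to integer positions, the border clamping, and the names are all assumptions.

```python
def clamp(v, lo, hi):
    """Limit v to the closed interval [lo, hi]."""
    return max(lo, min(v, hi))


def mc_interpolate(prev, cur, mv_field, n=2, k=1):
    """Blend motion-aligned pixels of prev and cur at temporal position k/n.

    mv_field[y][x] = (dy, dx) is the assumed motion from prev to cur for the
    pixel at (y, x) of the interpolated frame.
    """
    h, w = len(prev), len(prev[0])
    alpha = k / n  # 0 at the previous frame, 1 at the current frame
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dy, dx = mv_field[y][x]
            py = clamp(round(y - alpha * dy), 0, h - 1)
            px = clamp(round(x - alpha * dx), 0, w - 1)
            cy = clamp(round(y + (1 - alpha) * dy), 0, h - 1)
            cx = clamp(round(x + (1 - alpha) * dx), 0, w - 1)
            row.append((1 - alpha) * prev[py][px] + alpha * cur[cy][cx])
        out.append(row)
    return out
```

With N=2 and k=1 this reduces to averaging the pixel half a motion vector back in the previous frame with the pixel half a motion vector forward in the current frame.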
Occlusion • Uni-directional interpolation N=2, k=1
Overlapped Boundary Motion Compensation (OBMC) • Goal: • To reduce the blocking artifacts. • Selectively perform OBMC to reduce the computational complexity • (BAD > T)
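The selective step can be sketched as a test on each block boundary, with OBMC applied only where the test fires. Reading BAD as a boundary absolute difference is an assumption (the slide does not expand the acronym), as are the mean-based definition, the restriction to vertical boundaries, and the function names below.

```python
def boundary_abs_diff(frame, top, boundary_col, size):
    """Mean absolute difference across a vertical block boundary.

    Compares column boundary_col - 1 with column boundary_col over `size` rows
    starting at `top`. This definition of BAD is assumed for illustration.
    """
    rows = range(top, top + size)
    return sum(abs(frame[y][boundary_col - 1] - frame[y][boundary_col])
               for y in rows) / size


def boundaries_needing_obmc(frame, block_size, threshold):
    """List (top, boundary_col) pairs whose BAD exceeds the threshold T."""
    h, w = len(frame), len(frame[0])
    flagged = []
    for top in range(0, h - block_size + 1, block_size):
        for col in range(block_size, w, block_size):
            if boundary_abs_diff(frame, top, col, block_size) > threshold:
                flagged.append((top, col))
    return flagged
```

Only the flagged boundaries would then go through the overlapped compensation itself, which is where the saving in computation comes from.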
Experiment Setup • JM 11.0 • GOP: IPP…P, 15th I frame, fixed QP • Code odd frames and skip every other frame • 15fps • Transform 8x8=2 mode on • Search range = 16 • Standard CIF video sequences used: • Akiyo, News, Salesman, Foreman, Carphone • Flower Garden, Tempete • Football, Table Tennis • Test against DMCFI, correlation-based motion selection [1] [1] Ai-Mei Huang and T. Nguyen, “Correlation-based motion vector processing for motion compensated interpolation,” ICIP 2008
Visual Result (1) 384kb/s (a) Orig (b) DMCFI 20.54dB (c) Correlation-based 20.48dB (d) Proposed 24.19dB
Visual Result (2) 512kb/s (a) Orig (b) DMCFI 20.01dB (c) Correlation-based 20.17dB (d) Proposed 20.13dB
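The dB figures on these result slides are presumably PSNR values measured against the original frame, although the slides do not say so explicitly. Under that assumption, and assuming 8-bit pixels with a peak value of 255, the measure can be computed as in this sketch.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    peak=255 assumes 8-bit samples; identical frames give infinity.
    """
    total = 0.0
    count = 0
    for row_r, row_t in zip(reference, test):
        for r, t in zip(row_r, row_t):
            total += (r - t) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak * peak / mse)
```

For FRUC evaluation the reference would be the skipped original frame and the test frame its interpolated replacement.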
Conclusion • Proposed an FRUC method that combines optical flow-based and block matching-based motion estimation • Reduced computational complexity • Reduced blocking artifacts • Achieves better visual quality for low motion video sequences and performs on par with other methods for high motion video sequences
Summary • Provide a generic framework to achieve • Spatial enlargement of video frames • Video compression • Frame rate up-conversion • Achieve higher objective and subjective quality • Improve the robustness by using FW, BW and hierarchical motion structure
Future Work • Continue to refine the model • Apply to higher resolution video • Incorporate Subjective Video Quality Analysis • Reference frame recycling • Adaptively select the position of high quality reference frame
Application I: Spatial Interpolation • Related Work • Frame restoration • Frame interpolation • Bilinear, bicubic, spline, … • Adaptive Synthesis Filter Banks • New edge-directed interpolation • Superresolution (SR)