CSVT 2008. RD Optimized Coding for Motion Vector Predictor Selection. Guillaume Laroche, Joel Jung, Beatrice Pesquet-Popescu.
In a nutshell…
• To reduce the bitrate, the paper proposes two schemes:
  • A competition-based spatio-temporal scheme for motion vector prediction
  • A competition-based SKIP mode that increases the number of skipped macroblocks
Outline
• Introduction
• MV prediction and selection
• MV and SKIP mode competition
• Competition-based MV coding
• Competition-based SKIP mode
• Multiple reference frames
• MV competition for B-slices
• Experimental results
• Conclusion
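Introduction – MV Prediction (MVP)
• mvcol is the motion vector of the block in Frame N-1 that is collocated with the current block (vector mv) in Frame N
[Figure: in Frame N, the current block (mv) and its spatial neighbors mva, mvb, mvc and mvd; in Frame N-1, the collocated block (mvcol) and its eight neighbors mv0 to mv7]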
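Introduction – MV Prediction (MVP) (cont’d)
• We choose the median of the neighboring vectors as the predictor: $p = \mathrm{median}(mv_a, mv_b, mv_c)$
• The motion vector residual is given by: $\varepsilon_{mv} = mv - p$
  • εmv: motion vector residual
  • mv: motion vector
  • p: motion vector predictor (MVp)
[Figure: current partition E with its neighboring partitions A, B and C, shown for three partition layouts]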
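The median prediction and residual on this slide can be sketched as follows. This is a minimal illustration that assumes vectors are (x, y) integer tuples and that all three neighbors are available; the function names are hypothetical and not taken from the paper.

```python
from statistics import median

def median_predictor(mv_a, mv_b, mv_c):
    """Component-wise median of three neighboring motion vectors (x, y)."""
    return (median([mv_a[0], mv_b[0], mv_c[0]]),
            median([mv_a[1], mv_b[1], mv_c[1]]))

def mv_residual(mv, p):
    """Residual eps_mv = mv - p, computed per component."""
    return (mv[0] - p[0], mv[1] - p[1])

# Illustrative values only: three neighbor vectors and the vector to code.
p = median_predictor((4, -2), (6, 0), (5, 3))
eps = mv_residual((5, 1), p)
print(p, eps)
```

Only the residual is written to the bitstream, so a predictor close to mv keeps the coded values small.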
Introduction – MVP in SKIP mode
• For a skipped MB, only the mode itself needs to be transmitted
• Mostly used in static background areas
Introduction – MVP in DIRECT mode
• Two types: spatial and temporal
• Spatial direct mode uses neighboring MVs to predict the MV
• In temporal direct mode, the list0 and list1 predicted vectors are obtained by scaling the collocated vector
[Figure: current B frame with Ref0, Ref1 and Ref2; collocated vectors mvcolL0 and mvcolL1; temporal distances dL0, dL0L1 and dL0L2]
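As a rough sketch of the temporal direct scaling, the list0 vector is the collocated vector scaled by a ratio of temporal distances, and the list1 vector is the difference between that result and the collocated vector. The names tb and td follow the usual H.264 description and do not come from the slide; how they map onto dL0 and dL0L1 in the figure is an assumption. Floating-point arithmetic is used here for readability, whereas the standard uses fixed-point scaling.

```python
def temporal_direct(mv_col, tb, td):
    """Simplified temporal direct derivation.

    tb: temporal distance from the current B frame to its list0 reference.
    td: temporal distance spanned by the collocated vector mv_col.
    """
    mv_l0 = (mv_col[0] * tb / td, mv_col[1] * tb / td)
    mv_l1 = (mv_l0[0] - mv_col[0], mv_l0[1] - mv_col[1])
    return mv_l0, mv_l1

# Illustrative case: a B frame halfway between its two references.
mv_l0, mv_l1 = temporal_direct((8, -4), tb=1, td=2)
print(mv_l0, mv_l1)
```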
Introduction – MV Selection
• By minimizing the RD criterion: $J = D + \lambda R$
  • D: distortion
  • λR: weighted rate
• With the corresponding bitrate components: $\lambda R = \lambda_r R_r + \lambda_m R_m + \lambda_{mv} R_{mv} + \lambda_o R_o$
  • Rr: the rate of the block residue (luma + chroma)
  • Rm: the rate of the macroblock mode (SKIP or intra/inter prediction and macroblock partition type)
  • Rmv: the rate of the motion vector residue
  • Ro: the rate of the other components (header, CBP…)
Introduction – MV Selection (cont’d)
• For SKIP mode, the RD criterion becomes: $J_{SKIP} = D_{SKIP} + \lambda_m R_m$, because none of Ro, Rr or Rmv needs to be transmitted in SKIP mode.
• In practice, the cost λmRm is negligible compared with the distortion.
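The mode decision described on these two slides can be sketched as picking the candidate with the lowest cost J. The sketch assumes one Lagrange weight per rate component, keyed "r", "m", "mv" and "o"; the mode names, distortions and bit counts are made-up illustrative numbers, not results from the paper.

```python
def rd_cost(distortion, rates, lambdas):
    """J = D + sum of lambda_k * R_k over the rate components present."""
    return distortion + sum(lambdas[k] * rates[k] for k in rates)

# Hypothetical weights and candidates, for illustration only.
lambdas = {"r": 10.0, "m": 10.0, "mv": 10.0, "o": 10.0}
candidates = {
    # mode: (distortion, rate components in bits)
    "INTER_16x16": (900.0, {"r": 40, "m": 2, "mv": 8, "o": 4}),
    "SKIP": (1500.0, {"m": 1}),  # SKIP only pays for the mode itself
}

costs = {mode: rd_cost(d, r, lambdas) for mode, (d, r) in candidates.items()}
best_mode = min(costs, key=costs.get)
print(costs, best_mode)
```

With these numbers the SKIP rate term is tiny, so whether SKIP wins depends almost entirely on its distortion.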
Competition-based MV coding
• Predictor set:
  • Spatial predictors: mva, mvb, mvc, mvd, the H.264 median predictor mvH.264, and the extended spatial predictor mvspaEXT
  • mvspaEXT is the median of mva, mvb and mvc if all three vectors are available; otherwise it equals mva if available, otherwise mvb, otherwise mvc, or 0 if none is available
[Figure: current block (mv) in Frame N with neighbors mva, mvb, mvc and mvd]
Ref: J. Jung and G. Laroche, “Competition-based scheme for motion vector selection and coding,” ITU-T VCEG, Klagenfurt, Austria, 2006, Doc. VCEG-AC06
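The fallback chain of the extended spatial predictor might look like the sketch below. It assumes that the three-vector case is a component-wise median, that an unavailable neighbor is represented by None, and that "0" means the zero vector; the function name is hypothetical.

```python
from statistics import median

def mv_spa_ext(mv_a, mv_b, mv_c):
    """Extended spatial predictor with availability fallback.

    Each argument is an (x, y) tuple, or None if that neighbor is unavailable.
    """
    available = [mv for mv in (mv_a, mv_b, mv_c) if mv is not None]
    if len(available) == 3:
        return (median(v[0] for v in available),
                median(v[1] for v in available))
    if available:
        # First available neighbor, in the order mva, mvb, mvc.
        return available[0]
    return (0, 0)

print(mv_spa_ext((1, 2), (3, 4), (5, 0)))
print(mv_spa_ext(None, (3, 4), (5, 0)))
print(mv_spa_ext(None, None, None))
```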
Competition-based MV coding (cont’d)
• Predictor set:
  • Temporal predictors: mvcol, mvtf, mvtm5, mvtm9
[Figure: the collocated block in Frame N-1 (mvcol) with its eight neighbors mv0 to mv7, and the current block in Frame N (mv) with neighbors mva, mvb, mvc and mvd]
[Figure: current frame, Ref0 and Ref1, showing mvcol, mvH.264 and mvtf for the collocated block and the current block]
Competition-based MV coding (cont’d)
• Predictor set:
  • Spatial-temporal predictors
  • This predictor gives a higher importance to the mvcol value
Competition-based MV coding (cont’d)
• Choices of MV predictor:
  • Adaptive choices
    • Based on content or statistical criteria
    • No need to transmit the index of the mode if the decoder is able to determine the mode
  • Exhaustive choices
    • All possible predictors are tested
    • A mode needs to be transmitted in the bitstream
• An index i and a residual εmvi are associated with each predictor: $\varepsilon_{mv_i} = mv - p_i$ for each $p_i \in P$, where n is the number of predictors in the defined predictor set P
Competition-based MV coding (cont’d)
• For the selection of the MV, the rate of the motion vector residue Rmv is replaced by Rmv/mm to yield: $J = D + \lambda_r R_r + \lambda_m R_m + \lambda_{mv} R_{mv/mm} + \lambda_o R_o$
• Rmv/mm contains the cost of the residual εmvi and the cost of the index information i
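For a fixed motion vector, the exhaustive competition can be sketched as choosing the predictor whose residual plus index costs the fewest bits. The bit model below is an assumption for illustration: residual components are counted with signed Exp-Golomb code lengths, and the index with a fixed-length code of ceil(log2(n)) bits. The paper's actual index coding may differ, and the function names are hypothetical.

```python
from math import ceil, log2

def se_bits(v):
    """Length in bits of the signed Exp-Golomb code for integer v."""
    code_num = 2 * v - 1 if v > 0 else -2 * v
    return 2 * (code_num + 1).bit_length() - 1

def select_predictor(mv, predictors):
    """Return (index, residual, rate in bits) of the cheapest predictor."""
    n = len(predictors)
    index_bits = ceil(log2(n)) if n > 1 else 0  # assumed fixed-length index
    best = None
    for i, p in enumerate(predictors):
        eps = (mv[0] - p[0], mv[1] - p[1])
        rate = se_bits(eps[0]) + se_bits(eps[1]) + index_bits
        if best is None or rate < best[2]:
            best = (i, eps, rate)
    return best

# Illustrative predictor set P and motion vector.
predictors = [(0, 0), (4, 1), (5, -3)]
best_index, best_eps, best_rate = select_predictor((5, 1), predictors)
print(best_index, best_eps, best_rate)
```

The index cost is paid whichever predictor wins, so a larger set P only helps when the residual savings outweigh the extra index bits.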
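Competition-based SKIP mode
• The SKIP RD criterion becomes a minimization over the candidate predictors: $J_{SKIP} = \min_{p_i \in P_s} J_{SKIP_i}$
  • JSKIPi: RD cost of the SKIP mode with predictor pi
  • DSKIPi: distortion related to pi
  • Ps: the set of motion vector predictors for the SKIP mode
• If the SKIP mode is chosen, the index of the predictor is sent.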
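A possible sketch of the SKIP competition: each predictor in Ps yields its own distortion, and the cost adds the bits for the mode and for the predictor index. The exact form of JSKIPi is not given on the slide, so the single weight lam, the per-predictor index_bits and all numbers below are assumptions for illustration.

```python
def best_skip_predictor(distortions, index_bits, lam, mode_bits=1):
    """Pick the SKIP predictor with the lowest assumed RD cost.

    distortions[i]: distortion when the block is skipped with predictor p_i.
    index_bits[i]:  bits needed to signal index i (assumed values).
    """
    costs = [d + lam * (mode_bits + b)
             for d, b in zip(distortions, index_bits)]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs

# Two hypothetical predictors: the second has lower distortion
# but a more expensive index.
best, costs = best_skip_predictor([1200.0, 1180.0], [1, 3], lam=20.0)
print(best, costs)
```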
Multiple reference frames
• Assuming an object moves at constant speed, the predictor mvcolR0 is scaled according to the temporal distance to the reference picture used by the current block and the temporal distance between Ref0 and Refj
  • mvScolR0: scaled predictor
  • Ref0: previous reference frame
[Figure: current frame with Ref0, Refi and Refj; vectors mv and mvcolR0; temporal distances di and dj]
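Under the constant-speed assumption, the scaling can be sketched as a multiplication by a ratio of temporal distances. The sketch assumes that d_j is the distance spanned by the collocated vector (Ref0 to Refj) and d_i the distance from the current frame to the reference it uses; this reading of the figure, the function name and the floating-point arithmetic are all assumptions.

```python
def scale_collocated(mv_col, d_i, d_j):
    """Scale mv_col, which spans d_j frames, to a target distance of d_i frames."""
    if d_j == 0:
        raise ValueError("d_j must be non-zero")
    return (mv_col[0] * d_i / d_j, mv_col[1] * d_i / d_j)

# Illustrative case: the collocated vector spans 3 frames,
# the current block refers to a frame 1 frame away.
scaled = scale_collocated((6, -3), d_i=1, d_j=3)
print(scaled)
```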
Multiple reference frames (cont’d)
• Another predictor: the sum of temporally successive collocated vectors
• Assume that all MVs in each reference frame point only to the immediately preceding frame
• In this configuration, mvScoli is the scaled MV collocated in Refi, pointing to Refi+1
• The predictor mvTsumj is the sum of these successive temporal predictors
  • j: the reference frame number of the current predictor block
Multiple reference frames (cont’d)
• We consider mvtfsumj, a sum of predictors derived from the predictor mvtf
• mvStfRi is the MV at the position given by mvStfRi-1 in Refi-1, pointing to Refi, except mvStfR0, which is mvScol0
[Figure: current B frame and Ref0 to Ref3, with the chained vectors mvStfR0, mvStfR1, mvStfR2 and mvStfR3]
MV competition for B-slices
• No modification of the Direct mode is proposed
• The MV resulting from the spatial Direct mode is not considered in the set of predictors
• Consider the case of N successively coded B-frames
[Figure: current B frame with Ref0, Ref1 and Ref2; collocated vectors mvcolL0 and mvcolL1; temporal distances dL0, dL0L1 and dL0L2]
MV competition for B-slices (cont’d)
• Vectors mvcolB-1L0 and mvcolB-1L1 are each used to scale one pair of predictors
[Figure: current B frame, the previously coded B frame B-1, Ref0 and Ref1; collocated vectors mvcolB-1L0 and mvcolB-1L1; temporal distances dL0, dL0B-1 and dL0L1]
Experimental result
• Bitrate saving on the first and second B-frames for CIF sequences
• First predictor: mvH.264
• mvcolL1: MV collocated in the future frame, without scaling
• mvBcol = (collocated block == intra mode ? mva : mvScolL1)
• The results for mvScolL0 and mvScolL1 show that the MV field of a B-frame is more correlated with the future reference frame
Test Conditions
• Two profiles: Baseline profile and High profile
• 32×32 search range
• 8×8 transform
• 4 reference frames
• Test set: 9 CIF, 4 SD (640×480) and 2 720p (1280×720) sequences
• QP = 28, 32, 36, 40
Experimental results – Baseline: Predictor sets
• Predictor sets:
  • 11 predictors in the set P
• Percentage of selection of each proposed predictor for MV competition, for the CIF test set in the Baseline profile
Experimental results – Baseline: Predictor sets (cont’d)
• Comparing P sets containing two predictors
  • For all CIF sequences, mvH.264 is combined one by one with each predictor
• Bitrate savings for the different pairs of predictors
Experimental results – Baseline: Predictor sets (cont’d)
• Selecting the optimal number of predictors in the sets
  • P sets of MV predictors
  • Ps sets of SKIP mode MV predictors
Experimental results – Baseline: Predictor selection for MV competition
• Spatial and temporal predictor competition
• Temporal predictors are useful
• The temporal selection is correlated with the reference frame
Experimental results – Baseline: SKIP mode competition
• Percentage increase in the number of macroblocks encoded with the SKIP mode
• Chart annotation: sequences with large objects and fluid motion
• A spatial predictor as the second predictor is less efficient for sequences with static background
Experimental results – Baseline: Global bitrate reduction
• A compression gain is obtained for all test sequences
• For sequences with simple or no motion, the SKIP mode is widely used, so the gains are lower
• Sequences with fast or complex motion take full advantage of the temporal prediction
Experimental results – Baseline: Global bitrate reduction (cont’d)
• RD curves for 4 sequences of the test set
• At low bitrate, motion information tends to become a significant part of the total bitstream
• The bitrate reduction is not related to the resolution, but is related to the frame rate
Experimental results – High profile
• The problem changes because of the presence of B pictures and multiple reference frames
• Is the P set used for the P-frames in the Baseline profile still suited to the High profile, where the temporal distance between P-frames is larger?
• Which set is best suited to the B-frames, and is it the same for all the B-frames between two P-frames?
Experimental results – High: Choice of the predictor sets
• The same sets as the ones proposed for the Baseline profile give the best results
• The temporal distance between two P-frames is larger, so the temporal correlation between motion vector fields is smaller
• Distribution of the predictor selection in the High IBBP profile for the P- and B-frames
• Bitrate saving in the High IBBP profile (only computed for CIF sequences)
Experimental results – High: Global bitrate reduction
• Bitrate reduction for each sequence
• The gain is lower than in the Baseline profile, which is explained by the results obtained on P-frames
Conclusion
• The average bitrate reductions for the Baseline and High profiles are 7.7% and 4.3%, respectively.
• The MV predictors are selected via an RD criterion that considers the cost of the residual and of the predictor index.
• Adapting the predictor set to the statistical characteristics of the sequence should further increase the bitrate saving.