Block Matching using Fast Walsh Search
ELE5430 Pattern Recognition
W.K. Cham, Professor
Department of Electronic Engineering
The Chinese University of Hong Kong
Block Matching using Fast Walsh Search (2)
• Introduction (revision of video coding and motion estimation, using H.261 as an example)
• Block Matching
• Fast Walsh Search
• Experimental Results
Fast Walsh Search (3)
• Like full-search block matching (FSBM), we wish to search through all candidates within the search area in the reference frame, to avoid being trapped in a local minimum.
• The computation requirement should be similar to that of fast search algorithms such as three-step search [1] and diamond search [2].
[1] T. Koga, K. Iinuma, A. Hirano, Y. Iijima, T. Ishiguro, “Motion compensated interframe coding for video conferencing,” in Proc. Nat. Telecommun. Conf., New Orleans, LA, Nov. 1981, pp. G5.3.1-5.3.5.
[2] S. Zhu, K.-K. Ma, “A new diamond search algorithm for fast block-matching motion estimation,” IEEE Transactions on Image Processing, Vol. 9, No. 2, Feb. 2000, pp. 287-290.
Flowchart of the Fast Walsh Search (4)
Start → Projection of target block onto WT domain → Projection of candidates onto WT domain → Eliminate mismatch candidates using the WT projections → Search for the best match candidate with SADDCC → End
Fast Walsh Search: searches all candidates but performs the comparison in the transform domain.
Y. Hel-Or, H. Hel-Or, “Real time pattern matching using projection kernels,” IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 27, No. 9, Sept. 2005, pp. 1430-1445.
WT: Walsh Transform
Why transform? (5)
• The information in a block of pixels can be represented by a few coefficients.
• Use a few coefficients, instead of all pixels in the block, to find the distance.
Why Walsh Transform? (6)
• High information packing ability
  • Compresses the energy of highly correlated signals (e.g. image blocks) into a few transform coefficients
• Fast computation
  • Involves INTEGER additions and subtractions only
  • A fast pruning algorithm fully exploits previous intermediate steps to allow an efficient WT
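The "integer additions and subtractions only" property can be illustrated with a minimal sketch of a fast 1D Walsh-Hadamard transform. The function name `fwht`, the unnormalised scaling and the natural (Hadamard) coefficient ordering are assumptions made for this illustration; the slides do not specify them, and this is not the pruning algorithm the slide refers to.

```python
def fwht(values):
    """Unnormalised fast Walsh-Hadamard transform in natural (Hadamard) order.

    Hypothetical helper for illustration: only additions and subtractions
    are used, so integer input gives integer output.
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    half = 1
    while half < n:
        # One butterfly stage: combine entries that are `half` apart.
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                a, b = out[i], out[i + half]
                out[i], out[i + half] = a + b, a - b
        half *= 2
    return out
```

Applying `fwht` twice returns the input scaled by its length, which is one quick way to check an implementation of this kind.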
Sum-of-absolute difference (SAD) (7)
In block matching, we search inside the search area in the reference frame to locate the macroblock that is closest to the current macroblock in the target frame.
[Figure: current block T(x,y), candidate block C(x+m,y+n) displaced by the motion vector, and difference MB D(x+m,y+n).]
Let T(x,y) and C(x+m,y+n) be the N×N square matrices representing the pixel values of the current block at (x,y) in the target frame and of one of the candidate blocks located at (x+m,y+n) in the reference frame. Let D(x+m,y+n) = T(x,y) − C(x+m,y+n).
Sum-of-absolute difference (SAD) is the most commonly used matching error in block matching. It is equal to
$$d = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| D_{ij}(x+m,y+n) \right|,$$
where $D_{ij}(x+m,y+n)$ is the (i,j)th element of D(x+m,y+n) = T(x,y) − C(x+m,y+n).
The computation of SAD can be reduced by using the Partial SAD (PSAD), which is an approximation of SAD. We apply the 2D WT to D(x+m,y+n) to form cD.
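As a point of reference for what PSAD approximates, the sketch below shows a plain full search that evaluates the SAD of every candidate in a square search window. The function names, the `(dx, dy)` motion-vector convention and the nested-list frame representation are assumptions for illustration only.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, top, left, n):
    """Extract the n x n block whose top-left pixel is (top, left)."""
    return [row[left:left + n] for row in frame[top:top + n]]


def full_search(target_block, reference, top, left, search_range):
    """Return (best SAD, (dx, dy)) over all candidates in the search window.

    (top, left) is the position of the target block in its own frame;
    candidates that fall outside the reference frame are skipped.
    """
    n = len(target_block)
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + n > len(reference) or c + n > len(reference[0]):
                continue
            cost = sad(target_block, get_block(reference, r, c, n))
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best
```

Every candidate costs N² absolute differences here, which is the per-candidate work that the transform-domain comparison aims to reduce.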
Approximating SAD using PSAD (9)
As D(x+m,y+n) = T(x,y) − C(x+m,y+n) and the WT is linear,
$$c_D(u,v) = c_T(u,v) - c_C(u,v),$$
where cD(u,v), cT(u,v) and cC(u,v) are the (u,v)th WT coefficients of D(x+m,y+n), T(x,y) and C(x+m,y+n).
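The relation between the coefficients of D, T and C follows from the linearity of the transform, and a small numerical check makes this concrete. The sketch below assumes an unnormalised 2D transform built from a Sylvester-type Hadamard matrix in natural order; the helper names are hypothetical.

```python
def hadamard(n):
    """Sylvester-construction Hadamard matrix of order n (a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("order must be a power of two")
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def wht2d(block):
    """Unnormalised 2D transform H * block * H (H is symmetric)."""
    h = hadamard(len(block))
    return matmul(matmul(h, block), h)
```

Because the coefficients of the target block are shared by all candidates, they only need to be computed once; each candidate then contributes its own coefficients and a subtraction.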
The (u,v)th DCT coefficients (10)
[Figure: 8×8 array of basis pictures indexed by u = 0..7 and v = 0..7; the marked example has Wu = W1 and Wv = W2.]
Define the PSAD, dpsa(q), as the sum of magnitudes of q WT coefficients of D(x+m,y+n), i.e.
$$d_{psa}(q) = \frac{1}{N^2} \sum_{(u,v) \in S_q} \left| c_D(u,v) \right|$$
• where Sq is a set of q indices corresponding to basis pictures (BPs), with the indices selected along a zig-zag path similar to that of the JPEG standard.
• Note that the 1/N² is part of the equation of the inverse 2D Walsh transform, which converts the transform-domain cD(u,v) into the pixel-domain dpsa(q).
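A minimal sketch of a PSAD computation along these lines is given below. It assumes sequency-ordered Walsh functions (Hadamard rows sorted by their number of sign changes), a JPEG-style zig-zag over the (u,v) indices, and the 1/N² scaling mentioned in the note above; the function names are hypothetical and the coefficients are computed by direct projection rather than by a fast scheme.

```python
def walsh_matrix(n):
    """Hadamard rows reordered by number of sign changes (sequency order)."""
    h = [[1]]
    while len(h) < n:
        h = [r + r for r in h] + [r + [-v for v in r] for r in h]

    def sign_changes(row):
        return sum(a != b for a, b in zip(row, row[1:]))

    return sorted(h, key=sign_changes)


def zigzag(n):
    """JPEG-style zig-zag order of the (u, v) indices of an n x n array."""
    order = []
    for s in range(2 * n - 1):
        diag = [(u, s - u) for u in range(n) if 0 <= s - u < n]
        order.extend(diag if s % 2 else diag[::-1])
    return order


def psad(target, candidate, q):
    """Sum of magnitudes of the first q zig-zag coefficients of T - C, over N^2."""
    n = len(target)
    w = walsh_matrix(n)
    diff = [[t - c for t, c in zip(rt, rc)] for rt, rc in zip(target, candidate)]
    total = 0
    for u, v in zigzag(n)[:q]:
        coeff = sum(w[u][i] * diff[i][j] * w[v][j]
                    for i in range(n) for j in range(n))
        total += abs(coeff)
    return total / (n * n)
```

With q = 1 only the (0,0) coefficient is used, so under these assumptions the PSAD reduces to the magnitude of the mean pixel difference.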
Alternative View (12)
[Figure: the difference vector d(x,m,y,n) and its projections onto the basis vectors b(u,v), b(u,v+1) and b(u+1,v).]
• Arrange the matrices in lexicographical order:
  • D(x,m,y,n) → d(x,m,y,n)
  • W_u W_v^T → b(u,v)
Projecting d(x,m,y,n) onto the q BPs whose indices are in Sq, dpsa(q) is defined as:
$$d_{psa}(q) = \frac{1}{N^2} \sum_{(u,v) \in S_q} \left| b(u,v)^T d(x,m,y,n) \right|$$
• dpsa(q) is monotonically increasing and approaches d as q increases.
• Using dpsa(q) as the matching error allows each candidate in the reference frame to be checked and thus avoids being trapped in a local minimum.
• Experimental results show that q=1 provides sufficient accuracy for N=8.
Fast Projection Scheme (14)
[Figure: WH tree for a 1D pattern of size 8.]
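The idea behind a tree-structured projection can be sketched for a 1D signal as follows: the projection of every window onto a kernel of length 2L is obtained from two projections onto a kernel of length L with one addition or subtraction. This is a simplified variant written for illustration (kernels are built as [k, k] and [k, −k], so the leaf ordering is not claimed to match the tree in the figure), and the function name is hypothetical.

```python
def wh_tree_projections(signal, size):
    """Project every length-`size` window of `signal` onto `size` WH kernels.

    Returns a list of (kernel, projections) pairs, where projections[i]
    is the inner product of the kernel with signal[i:i + size].
    """
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a power of two")
    # Root of the tree: the length-1 kernel [1] projects each sample onto itself.
    nodes = [([1], list(signal))]
    length = 1
    while length < size:
        children = []
        for kernel, proj in nodes:
            count = len(proj) - length  # number of windows of length 2 * length
            plus = [proj[i] + proj[i + length] for i in range(count)]
            minus = [proj[i] - proj[i + length] for i in range(count)]
            children.append((kernel + kernel, plus))
            children.append((kernel + [-k for k in kernel], minus))
        nodes = children
        length *= 2
    return nodes
```

Each level reuses the results of the level above it, so a projection at a leaf costs one operation per window per level rather than a full inner product.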
Early Elimination of Mismatch Candidates (15)
If the PSAD of a candidate in the reference frame is greater than a threshold Tpsa, the candidate is rejected from further consideration.
Dilemma in threshold decision:
• If Tpsa is too large, most of the candidates still remain in the next iteration → considerable computation
• If Tpsa is too small, the candidate with the least SAD may be rejected → large matching error
Observation: a single Tpsa is too large for smooth blocks while too small for high activity blocks.
Two-level Threshold Scheme (17)
• Divide target blocks into two classes:
  • smooth target block
    • with few intensity changes
    • requires a small Tpsa
  • high activity target block
    • with salient features
    • requires a large Tpsa
[Figure: an example smooth block and an example “high activity” block.]
Classification of blocks into different classes (18)
Define the activity level La(M) of each target block in terms of SM, the set that contains the first M indices (m,n) along the zigzag path, excluding (0,0). A block is smooth if La(M) < Tf and high activity otherwise. Experimental results show that good results can be obtained with M = 3, Tf = 300, Tpsa = 2 for smooth blocks (Tpsas) and Tpsa = 30 for high activity blocks (Tpsah).
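The two-level scheme can be sketched as a classification step followed by threshold-based rejection. The exact form of La(M) is not recoverable from the extracted slide text, so the sketch assumes it is the sum of magnitudes of the first M non-DC coefficients of the target block along the zigzag path; that form, the function names and the dictionary layout are all assumptions. The constants are the values quoted on the slide.

```python
T_F = 300          # activity threshold Tf from the slide
T_PSA_SMOOTH = 2   # Tpsa used for smooth target blocks
T_PSA_HIGH = 30    # Tpsa used for high activity target blocks


def activity_level(ac_coeffs_zigzag, m=3):
    """Assumed form of La(M): sum of magnitudes of the first M AC coefficients.

    `ac_coeffs_zigzag` lists the target block's coefficients in zigzag
    order with the (0,0) coefficient already excluded.
    """
    return sum(abs(c) for c in ac_coeffs_zigzag[:m])


def choose_threshold(ac_coeffs_zigzag, m=3):
    """Pick the PSAD threshold according to the block class."""
    if activity_level(ac_coeffs_zigzag, m) < T_F:
        return T_PSA_SMOOTH
    return T_PSA_HIGH


def eliminate(candidate_psads, threshold):
    """Keep the motion vectors whose PSAD does not exceed the threshold."""
    return [mv for mv, value in candidate_psads.items() if value <= threshold]
```

For example, `eliminate({(0, 0): 1.5, (1, 0): 2.0, (0, 1): 7.25}, choose_threshold([10, -20, 5]))` keeps only the first two candidates, since the block is classified as smooth and the threshold is 2.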
Examples of Smooth Blocks (19)
[Figure: examples of smooth blocks; the legend marks the blocks classified as smooth.]
Block Matching Using SADDCC (20)
The PSADs of the remaining candidates for a small number of BPs may be very similar, and it is inefficient to approximate SAD using PSAD when the number of BPs grows.
[Figure: the N×N target block and one of the remaining N×N candidates in the reference frame.]
Decompose each block into k² equal-size, non-overlapping sub-blocks, where k = 2^m and m ∈ Z⁺.
• Divide the target block and the remaining candidate blocks into k×k subblocks (k² subblocks each) of size Nk×Nk, where Nk = N/k
• E.g.: N=8, k=4
• SADDCC, ddcc, is defined as the sum of absolute differences of the DC coefficients of corresponding subblocks
[Figure: a remaining candidate reference block.]
This can be viewed as finding the SAD of the subsampled blocks.
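A minimal sketch of the SADDCC computation is shown below. It takes the DC coefficient of a subblock to be the unnormalised sum of its pixels, which is an assumption (dividing by Nk² would give the subblock mean, i.e. the subsampled block); the function names are hypothetical.

```python
def subblock_dc(block, k):
    """DC value (here: pixel sum) of each of the k x k subblocks of `block`."""
    n = len(block)
    if n % k:
        raise ValueError("k must divide the block size")
    nk = n // k
    return [[sum(block[r][c]
                 for r in range(i * nk, (i + 1) * nk)
                 for c in range(j * nk, (j + 1) * nk))
             for j in range(k)]
            for i in range(k)]


def sad_dcc(target, candidate, k):
    """Sum of absolute differences of the DC values of corresponding subblocks."""
    dc_t = subblock_dc(target, k)
    dc_c = subblock_dc(candidate, k)
    return sum(abs(a - b)
               for row_t, row_c in zip(dc_t, dc_c)
               for a, b in zip(row_t, row_c))
```

Under this unnormalised convention the value never exceeds the SAD, it grows as k doubles, and k = N reproduces the SAD itself, so k trades accuracy against the number of subblock comparisons.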
[Figure: k=4, Sq = {(u,v) | 0 ≤ u,v ≤ k−1}.]
It can be proved that dapsa(k) is closer to the SAD d than the PSAD dpsa(q) for q = k² and Sq = {(u,v) | 0 ≤ u,v ≤ k−1}.