Domain Transformation-Based Efficient Cost Aggregation for Local Stereo Matching. Cuong Cao Pham and Jae Wook Jeon, Member, IEEE. IEEE Transactions on Circuits and Systems for Video Technology, 2012.
Outline • Introduction • Framework • Proposed Algorithm • Compute Costs • Cost Aggregation: Domain Transformation • Optimization & Refinement • Experimental Results • Conclusion
Background • Global stereo algorithms: high accuracy but low speed • Local stereo algorithms: high speed but low accuracy • The key step in local methods: cost aggregation • Adaptive support-weight [4]: the best-known local method and the state-of-the-art local algorithm; it narrows the gap between global and local methods → but its running time grows excessively with the support window size [4] K.-J. Yoon and I.-S. Kweon, “Adaptive Support-Weight Approach for Correspondence Search,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 4, pp. 650-656, 2006.
Related Work • Adaptive Weight [4]: bilateral filter • Cost-volume filtering [21]: guided filter • Geodesic Diffusion [27]: anisotropic diffusion → geodesic diffusion [21] C. Rhemann, A. Hosni, M. Bleyer, C. Rother, and M. Gelautz, “Fast Cost-Volume Filtering for Visual Correspondence and Beyond,” in Proc. IEEE Intl. Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 3017-3024, 2011. [27] L. De-Maeztu, A. Villanueva, and R. Cabeza, “Near Real-Time Stereo Matching Using Geodesic Diffusion,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 2, pp. 410-416, 2012.
Objective • Present a cost aggregation technique: • Achieve high precision • Fast execution • Using Domain transformation • Domain transformation: • Aggregation of 2D cost data → a sequence of 1D filters • Lower computational requirements
Pixel-wise Cost Computation • Truncated absolute difference (TAD) of the color, truncated at Tc • TAD of the gradient, truncated at Tg • Final cost data: combination of the two terms, weighted by λ • Ii(p): intensity value of the i-th color channel in the RGB color space at pixel p of the image I • Tc: user-defined truncation value
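The equations on this slide were not preserved, so the following is only a sketch of one plausible form of the cost for a single scanline. It assumes that the color term is the mean absolute RGB difference truncated at Tc, that the gradient term is the absolute difference of horizontal gray-level gradients truncated at Tg, and that λ weights the color term. The function name `matching_cost`, the central-difference gradient, the border clamping, and the matching convention x - d are illustrative assumptions, not taken from the slides.

```python
def matching_cost(left, right, d, lam=0.1, t_c=7 / 255, t_g=2 / 255):
    """Blend truncated color and gradient differences for one scanline.

    left, right: lists of (r, g, b) tuples with values in [0, 1].
    d: candidate disparity; left pixel x is compared with right pixel x - d.
    The exact cost form is an assumption (the slide equations are missing).
    """
    def gray(px):
        return sum(px) / 3.0

    def grad_x(row, x):
        # Central difference on the gray signal, clamped at the borders.
        lo, hi = max(x - 1, 0), min(x + 1, len(row) - 1)
        return (gray(row[hi]) - gray(row[lo])) / 2.0

    costs = []
    for x in range(len(left)):
        xr = max(x - d, 0)  # clamp so the match stays inside the image
        color = sum(abs(a - b) for a, b in zip(left[x], right[xr])) / 3.0
        grad = abs(grad_x(left, x) - grad_x(right, xr))
        costs.append(lam * min(color, t_c) + (1 - lam) * min(grad, t_g))
    return costs
```

Calling it once per candidate disparity and per row yields the cost volume that the aggregation step then filters. The defaults are the values {λ, Tc, Tg} = {0.1, 7/255, 2/255} quoted later in the slides.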
Aggregation 1D Cost Data • Inspired by the domain transformation technique[14] • Dimensionality reduction technique • Defines a geodesic distance-preserving representation of a 2D image embedded in 5D (x, y, Ir, Ig, Ib) as a real line. • Aggregation of 2D cost data → a sequence of 1D filters • Reduce computational time [14] Eduardo S. L. Gastal and Manuel M. Oliveira, “Domain Transform for Edge-Aware Image and Video Processing,” ACM Trans. Graph., vol. 30, no. 4, 2011.
Aggregation 1D Cost Data • 1D discrete signal: $C_{d,y}[n]$, row $y$ of the cost slice $C_d$ • Feedback comb filter [32]: $C'_{d,y}[n] = (1 - a)\,C_{d,y}[n] + a\,C'_{d,y}[n-1]$ • $C_{d,y}$: input signal • $C'_{d,y}$: output signal • $a \in [0, 1]$: feedback coefficient • $a$ constant → non-edge-aware filter [32] J. Smith, “Introduction to Digital Filters with Audio Applications,” W3K Publishing, 2007.
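A minimal sketch of a feedback comb filter with a constant coefficient, applied to one row of a cost slice. It assumes the recursion out[n] = (1 - a) * in[n] + a * out[n-1], with the feedback state initialized to the first sample; the name `comb_filter` is illustrative.

```python
def comb_filter(signal, a):
    """Recursive smoothing: out[n] = (1 - a) * signal[n] + a * out[n - 1]."""
    if not signal:
        return []
    out = []
    prev = signal[0]  # initialize the feedback state with the first sample
    for s in signal:
        prev = (1.0 - a) * s + a * prev
        out.append(prev)
    return out
```

Because `a` is the same everywhere, a step in the input is smeared across the edge, which is why this variant is not edge-aware.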
Aggregation 1D Cost Data • Two similar samples: set a high value of a • Two dissimilar samples: set a low value of a (at a discontinuity, this stops the propagation chain) • Edge-aware feedback comb filter: $C'_{d,y}[n] = (1 - a^{g})\,C_{d,y}[n] + a^{g}\,C'_{d,y}[n-1]$ • g: chosen metric representing the dissimilarity between two samples • Compute g as the distance between the two samples in the 1D domain (transformed from the corresponding row of the guidance image I)
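The edge-aware variant can be sketched by raising the coefficient to the power g, so that a large dissimilarity drives the feedback weight toward zero. This assumes 0 < a < 1 and that `dist[n]` holds the distance g between samples n and n-1 (`dist[0]` is unused); both names are illustrative.

```python
def edge_aware_comb_filter(signal, dist, a):
    """out[n] = (1 - a**g) * signal[n] + a**g * out[n - 1], with g = dist[n]."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        w = a ** dist[n]  # small when samples n-1 and n are dissimilar
        out.append((1.0 - w) * signal[n] + w * out[-1])
    return out
```

With all distances equal to 1 this reduces to the constant-coefficient filter; a very large distance at one position effectively restarts the recursion there.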
Domain Transformation • $I: \mathbb{R}^2 \to \mathbb{R}^3$ (a 2D RGB color image) • $p = (x_p, y_p)$: spatial coordinates • $I(p) = (r_p, g_p, b_p)$: range coordinates • Goal: find a transform $t: \mathbb{R}^2 \to \mathbb{R}$ which preserves the original distances between points on the curve C traced by the signal (given by some metric)
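Domain transformation • L1 distance between two neighboring points $(x, I(x))$ and $(x+h, I(x+h))$ in the original domain $\mathbb{R}^2$: $h + |I(x+h) - I(x)|$ • Distance between the two corresponding samples in the new domain $\mathbb{R}$: $g_t(x+h) - g_t(x)$, where $g_t(x) = t(x, I(x))$ is the transformation operator at point x • The two must be equal: $g_t(x+h) - g_t(x) = h + |I(x+h) - I(x)|$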
Domain transformation • Divide both sides by h and take the limit as h → 0: $g_t'(x) = 1 + |I'(x)|$ • The value at any point u in the transformed domain (by taking the integral of $g_t'(x)$ from 0 to u): $g_t(u) = \int_0^u \left(1 + |I'(x)|\right) dx$
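Domain transformation • The value at any point u in the transformed domain: $g_t(u) = \int_0^u \left(1 + |I'(x)|\right) dx$ • The distance between any two points u and v (u ≤ v) in the transformed domain: $g_t(v) - g_t(u) = \int_u^v \left(1 + |I'(x)|\right) dx$ (corresponds to the arc length from u to v of the signal I)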
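Domain transformation • The distance between any two points u and v (u ≤ v): $g_t(v) - g_t(u) = \int_u^v \left(1 + |I'(x)|\right) dx$ • We can also control the influence of spatial and intensity range information, similar to the bilateral filter • Embedding the values of $\sigma_s$ and $\sigma_r$: $g_t(u) = \int_0^u \left(1 + \frac{\sigma_s}{\sigma_r}\,|I'(x)|\right) dx$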
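Domain transformation • Select the maximum absolute difference over the color channels to define the distance between two neighboring points in the original domain: $\max_i |I_i(x_n) - I_i(x_{n-1})|$ • The final distance g: $g = 1 + \frac{\sigma_s}{\sigma_r}\,\max_i |I_i(x_n) - I_i(x_{n-1})|$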
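As a sketch, the per-sample distance g for one row of the guidance image could be computed as below. It assumes the discrete form g = 1 + (σs/σr) · max over channels of the absolute difference between neighboring pixels, with colors scaled to [0, 1]; the function name and the convention that the first entry is an unused 0 are illustrative.

```python
def transformed_distances(row, sigma_s, sigma_r):
    """Distance g between each pixel and its left neighbor in the 1D domain.

    row: list of (r, g, b) tuples in [0, 1]. The first entry of the result
    is 0.0 because the first pixel has no left neighbor.
    """
    if not row:
        return []
    g = [0.0]
    for n in range(1, len(row)):
        # Largest per-channel absolute difference between neighbors.
        diff = max(abs(c1 - c0) for c1, c0 in zip(row[n], row[n - 1]))
        g.append(1.0 + (sigma_s / sigma_r) * diff)
    return g
```

In flat regions g stays at 1, so the recursive filter behaves like a plain smoother; across a color edge g becomes large and the feedback weight a**g collapses.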
Domain transformation [Figure: left image; result of the non-edge-aware filter; result of the edge-aware filter]
Aggregation 2D Cost Data • 1. Left → Right • 2. Right→Left • 3. Top → Bottom • 4. Bottom→ Top
Aggregation 2D Cost Data [Figure: aggregation results after the L→R, R→L, T→B, and B→T passes]
Aggregation 2D Cost Data • $C_{d,x}$ is the 1D discrete signal taken from each column x along the y direction of the cost slice $C_d$ • Feedback coefficient: $a = \exp(-\sqrt{2}/\sigma_H)$ • $\sigma_H$: kernel standard deviation (implicitly set to $\sigma_s$) • $\sigma_s \in [10, 300]$ and $\sigma_r \in [0.01, 0.3]$ yield good results.
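Aggregation 2D Cost Data • Algorithm: for each disparity d, filter the cost slice Cd with the four 1D passes in order (left → right, right → left, top → bottom, bottom → top)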
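The algorithm listing on this slide was not preserved. The following sketch shows how the four passes could be combined for one cost slice, under stated assumptions: the feedback coefficient is a = exp(-√2/σH) with σH = σs (the form used by the recursive domain-transform filter of [14]), the distance is g = 1 + (σs/σr) · max channel difference, and each pass filters in place on the result of the previous one. The names `aggregate_slice` and `sweep` are illustrative.

```python
import math


def aggregate_slice(cost, guide, sigma_s, sigma_r):
    """Edge-aware aggregation of one cost slice with four recursive 1D passes.

    cost:  H x W list of lists (matching costs for one disparity).
    guide: H x W list of lists of (r, g, b) tuples in [0, 1].
    """
    a = math.exp(-math.sqrt(2.0) / sigma_s)  # sigma_H assumed equal to sigma_s
    ratio = sigma_s / sigma_r

    def weight(p, q):
        dist = 1.0 + ratio * max(abs(u - v) for u, v in zip(p, q))
        return a ** dist

    def sweep(values, pixels):
        # Forward pass, then backward pass, along one line of the image.
        out = list(values)
        for n in range(1, len(out)):
            w = weight(pixels[n], pixels[n - 1])
            out[n] = (1.0 - w) * out[n] + w * out[n - 1]
        for n in range(len(out) - 2, -1, -1):
            w = weight(pixels[n], pixels[n + 1])
            out[n] = (1.0 - w) * out[n] + w * out[n + 1]
        return out

    # Passes 1 and 2: left -> right and right -> left on every row.
    rows = [sweep(cost[y], guide[y]) for y in range(len(cost))]
    width = len(rows[0]) if rows else 0
    # Passes 3 and 4: top -> bottom and bottom -> top on every column.
    cols = [
        sweep([r[x] for r in rows], [line[x] for line in guide])
        for x in range(width)
    ]
    # Transpose the column results back to row-major order.
    return [[cols[x][y] for x in range(width)] for y in range(len(rows))]
```

The work per pixel is constant regardless of the effective support size, which is the property the slides contrast with window-based adaptive weights.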
Optimization & Refinement • Winner-take-all: select disparities • Left-right consistency check: handle occluded regions • Weighted median filter: remove noise
Winner-take-all • Winner-take-all (WTA) strategy: $d_p = \arg\min_{d \in S_d} C'_d(p)$ • $S_d$: the set of all possible disparities • $C'_d$: aggregated cost
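A minimal sketch of winner-take-all selection over an aggregated cost volume, assuming the volume is indexed as `cost_volume[d][y][x]` and that ties go to the smallest disparity (both are illustrative choices):

```python
def winner_take_all(cost_volume):
    """Pick, for every pixel, the disparity index with the lowest cost."""
    num_d = len(cost_volume)
    height = len(cost_volume[0])
    width = len(cost_volume[0][0])
    return [
        [min(range(num_d), key=lambda d: cost_volume[d][y][x])
         for x in range(width)]
        for y in range(height)
    ]
```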
Left-right consistency check • The disparity maps obtained at this stage contain errors in the occluded regions. • Perform a left-right consistency check • A pixel in the left disparity map is marked as invalidated when its value differs from that of the corresponding pixel in the right disparity map by more than one • Each invalidated pixel is assigned the minimum of the disparities of its two closest validated pixels [Figure: left and right images with validated and invalidated pixels marked]
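The check and the fill step can be sketched for a single scanline as follows. It assumes integer disparities, that left pixel x corresponds to right pixel x - d, that a pixel whose match falls outside the image is invalidated, and that the two closest validated pixels are searched along the same row; none of these details are stated on the slide.

```python
def lr_check_and_fill(disp_left, disp_right, max_diff=1):
    """Invalidate inconsistent pixels on one scanline, then fill them."""
    width = len(disp_left)
    valid = []
    for x, d in enumerate(disp_left):
        xr = x - d  # matching column in the right view (assumed convention)
        ok = 0 <= xr < width and abs(d - disp_right[xr]) <= max_diff
        valid.append(ok)

    filled = list(disp_left)
    for x in range(width):
        if valid[x]:
            continue
        # Nearest validated neighbors to the left and to the right, if any.
        left = next((disp_left[i] for i in range(x - 1, -1, -1) if valid[i]), None)
        right = next((disp_left[i] for i in range(x + 1, width) if valid[i]), None)
        candidates = [v for v in (left, right) if v is not None]
        if candidates:
            filled[x] = min(candidates)
    return valid, filled
```

Taking the minimum favors the background disparity, which is the usual choice for occluded pixels.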
Weighted Median Filter • A weighted median filter is used to: remove streak-like artifacts and remove the small amount of remaining noise • Bilateral filter weights are used to compute the weighted median • The validated pixels are not affected by this operation.
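A sketch of the two ingredients: a weighted median over the disparities in a window, and a bilateral weight for each neighbor. The exponential form of the weight and the names `weighted_median`, `bilateral_weight`, `gamma_s`, and `gamma_r` are assumptions; the slides only state that bilateral filter weights are used.

```python
import math


def weighted_median(values, weights):
    """Value at which the cumulative weight first reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    acc = 0.0
    for v, w in pairs:
        acc += w
        if acc >= half:
            return v
    return pairs[-1][0]


def bilateral_weight(dx, dy, color_p, color_q, gamma_s, gamma_r):
    """Weight of neighbor q for center p (assumed Gaussian-like form)."""
    spatial = (dx * dx + dy * dy) / (gamma_s * gamma_s)
    color = sum((a - b) ** 2 for a, b in zip(color_p, color_q)) / (gamma_r * gamma_r)
    return math.exp(-spatial - color)
```

For an invalidated pixel, one would collect the disparities in its window, weight each by `bilateral_weight` against the center pixel's color, and replace the center with `weighted_median` of the result.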
Consistency Map vs. Final Disparity [Figure: consistency map with invalidated pixels marked; final disparity map]
Experimental Results • Middlebury stereo evaluation • Middlebury dataset • Real-world image • Camcorder data • Execution time • CUDA implementation
Middlebury Evaluation - 1 • Compare with the best-performing algorithms inspired by well-known edge-aware filters • Adaptive Weight [4]: 35×35 support window with γs = 17 and γr = 7.5 • Cost-volume filtering [21]: 19×19 support window and ε = 0.0004 • Geodesic Diffusion [27]: iterated n = 24 times with γc = 40 and l0 = 0.15 • InfoPermeable [31]: exponential function with σ = 25 • Proposed: σs = 25 and σr = 0.1 [31] C. Cigla and A. A. Alatan, “Efficient Edge-Preserving Stereo Matching,” in ICCV Workshop on LDRMV, 2011.
Middlebury Evaluation - 1 • Compare the performance of the raw cost aggregation • The same pixel-wise cost computation and disparity optimization steps were used for all methods to ensure a fair comparison. • The TAD of the color and of the gradient is used for computing matching costs • {λ, Tc, Tg} = {0.1, 7/255, 2/255} • Guidance image used for the aggregation stage: pre-filtered with a 3×3 median filter to reduce high-frequency information that is not actually useful
Experimental Results • Only non-occluded and discontinuity regions
Middlebury Evaluation - 2 • Without refinement vs. with refinement • {λ, Tc, Tg, σs, σr} = {0.1, 7/255, 2/255, 45, 0.006} • 3×3 median filter: filters the guidance image used for the aggregation stage • Weighted median filter: used in the disparity refinement stage, with r = 21, γs = 81, and γr = 0.04
Experimental Results [Figure: disparity maps with refinement and without refinement]
Real-world Image • Camcorder data: • Cafe (640×360, 32 possible disparities) • Newspaper (512×384, 32 possible disparities) • Book_Arrival (512×384, 60 possible disparities)
Execution time • Implemented in C++ • PC with an AMD Athlon 64 X2 Dual Core 3800+ at 2.00 GHz • Only the execution time of the aggregation performed on the left view is measured • No occlusion handling or post-processing times were included.
Execution time [Figure: execution time versus support window size / number of iterations; window size (2n+1)×(2n+1), n iterations]
Conclusion • Solves the excessive time consumption bottleneck of adaptive-weight • Integrates the appealing properties of domain transformation into the cost aggregation • Uses a sequence of 1D operations • Lower computational requirements • Lower memory costs • Fast and accurate local method