On Building an Accurate Stereo Matching System on Graphics Hardware • Xing Mei, Xun Sun, Mingcai Zhou, Shaohui Jiao, Haitao Wang, Xiaopeng Zhang • Samsung Advanced Institute of Technology, China Lab • Computer Vision Workshops, 2011 IEEE
Outline • Introduction • Related Works • Algorithm • CUDA Implementation • Experimental Results • Conclusion
Introduction Dense two-frame stereo matching • Compute a disparity map from stereo images. • Broad applications: 3D reconstruction, view interpolation
Related Works • Local methods • Compute each pixel’s disparity independently over a local support region. • Fast but inaccurate. • Global methods • Solve the stereo problem as an energy minimization. • Accurate but slow, because the global optimizers (graph cuts, belief propagation) are time-consuming.
Related Works • Propagation-based methods • Produce quasi-dense or dense disparity results from a set of seed pixels. • Relatively fast, but sensitive to early wrong matches. • Some methods use segmented regions as the guided propagation unit, which is computationally expensive.
Related Works • This work introduces a simple guided unit for propagation: pixel-wise 1D line segments. • No image segmentation is required. • Simple, fast and accurate.
Algorithm • Framework • Input: stereo images • Steps: cost initialization, cost aggregation, scanline optimization, disparity refinement • Output: disparity map
Disparity Cost Computing • Cost measures: AD, BT, gradient-based measures, non-parametric transforms (rank/census [3]), ... • Combinations: SAD + gradient [6], AD + census • AD (absolute difference) • Relies on the constant color assumption • Helps with repetitive structures • Census • Encodes local image structures • Helps with textureless regions [3] H. Hirschmuller and D. Scharstein. “Evaluation of stereo matching costs on images with radiometric differences.” IEEE TPAMI, 31(9), 2009. [6] A. Klaus, M. Sormann, and K. Karner. “Segment-based stereo matching using belief propagation and a self-adapting dissimilarity measure.” ICPR, 2006.
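AD-Census Cost Initialization • The initial cost is the sum of an AD term and a census term. • p: pixel in the left image • d: disparity level • pd = (x − d, y): the corresponding pixel in the right image • A robust function on variable c is applied to each term. • The census term is the Hamming distance [22] between the census bit strings of p and pd. [22] R. Zabih and J. Woodfill. “Non-parametric local transforms for computing visual correspondence.” In Proc. ECCV, 1994.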
Census Transform • Census transform window: (figure)
Census Hamming Distance • XOR the census bit strings of the left-image and right-image pixels, then count the set bits. • In the slide’s example, the Hamming distance is 3.
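The census step above can be sketched in a few lines. This is a minimal illustration on grayscale values, not the authors' implementation: the function names, the 3×3 default window, the "darker than the centre" bit convention and the zero bit for out-of-image neighbours are all assumptions of this sketch.

```python
def census_bits(img, x, y, half_w=1, half_h=1):
    """Census bit string (packed into an int) for pixel (x, y).

    Each neighbour in the window contributes one bit: 1 if it is darker
    than the centre pixel, else 0. Neighbours that fall outside the image
    contribute a 0 bit (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    centre = img[y][x]
    bits = 0
    for dy in range(-half_h, half_h + 1):
        for dx in range(-half_w, half_w + 1):
            if dx == 0 and dy == 0:
                continue  # the centre pixel is not compared with itself
            ny, nx = y + dy, x + dx
            bits <<= 1
            if 0 <= ny < h and 0 <= nx < w and img[ny][nx] < centre:
                bits |= 1
    return bits


def hamming(a, b):
    """Number of differing bits between two census bit strings."""
    return bin(a ^ b).count("1")


# Hypothetical 3x3 patches around the two pixels being compared.
left = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
right = [[10, 60, 30], [40, 50, 45], [20, 80, 99]]
distance = hamming(census_bits(left, 1, 1), census_bits(right, 1, 1))  # 3
```

The census cost depends only on the relative ordering of intensities inside the window, which is why it tolerates brightness offsets between the two images.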
AD-Census Cost Initialization • Each of the two costs (AD and census) is passed through a robust function of its value c before they are added.
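The equations on this slide did not survive extraction. As a hedged sketch, the combination can be written with the exponential robust function 1 − exp(−c/λ), which is the form we recall from the published paper; treat that form, the function names and the two λ defaults as assumptions rather than values confirmed by these slides.

```python
import math


def robust(c, lam):
    """Map a non-negative cost c into [0, 1); lam controls the saturation."""
    return 1.0 - math.exp(-c / lam)


def ad_cost(left_rgb, right_rgb):
    """Mean absolute difference over the three colour channels."""
    return sum(abs(a - b) for a, b in zip(left_rgb, right_rgb)) / 3.0


def ad_census_cost(c_ad, c_census, lam_ad=10.0, lam_census=30.0):
    """Sum of the two robustified costs; the result lies in [0, 2)."""
    return robust(c_census, lam_census) + robust(c_ad, lam_ad)


# Hypothetical colours of p (left image) and pd (right image), plus a
# census Hamming distance of 3 computed elsewhere.
cost = ad_census_cost(ad_cost((120, 80, 60), (110, 85, 60)), 3)
```

Because each term saturates below 1, a single large outlier in either measure cannot dominate the combined cost.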
AD-Census Cost Initialization • AD-Census measure produces proper disparity results for both repetitive structures and textureless regions.
Cross-based Cost Aggregation [23] • Cross construction • The arm endpoints P1, P2 of pixel P are located where rule 1 or rule 2 is violated: • R1: Color self-similarity in the line region (smooth depth assumption) • R2: Arm length limitation (avoids over-smoothing) [23] K. Zhang, J. Lu, and G. Lafruit. “Cross-based local stereo matching using orthogonal integral images.” IEEE TCSVT, 2009.
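Rules R1 and R2 translate into a short loop per arm. The sketch below grows the left arm of one pixel on a single scanline of intensities; the function name, the use of scalar intensities instead of colours, and the parameters `tau` and `max_len` are assumptions made for illustration.

```python
def left_arm_length(row, x, tau, max_len):
    """Length of the left arm of pixel x on a scanline of intensities.

    The arm stops at the image border, when the next pixel differs from
    row[x] by tau or more (R1), or when it reaches max_len pixels (R2).
    """
    length = 0
    while length < max_len:
        nx = x - (length + 1)
        if nx < 0:
            break  # image border
        if abs(row[nx] - row[x]) >= tau:
            break  # R1 violated: not similar enough to the anchor pixel
        length += 1
    return length


row = [10, 12, 11, 50, 52, 51, 53]
arm = left_arm_length(row, 6, tau=5, max_len=10)  # stops at the 11 -> 50 edge
```

The right, up and down arms follow the same pattern with the step direction changed.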
Cross-based Cost Aggregation • Enhanced cross construction (using pixel p’s left arm and its endpoint pixel pl as an example)
Cross-based Cost Aggregation • Cost aggregation • Run this step for 4 iterations to get stable cost values. • In iterations 1 and 3, aggregate horizontally and then vertically. • In iterations 2 and 4, aggregate vertically and then horizontally. • Alternating the order reduces errors at depth discontinuities.
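The alternating schedule can be sketched for one disparity slice as follows. The arm tables are assumed to be precomputed and clipped to the image, and each pass averages over the arm span; that normalization, like the function names, is an assumption of this sketch and is not stated on the slide.

```python
def pass_horizontal(cost, left, right):
    """Average each pixel's cost over its horizontal arm span."""
    out = []
    for y, row in enumerate(cost):
        new_row = []
        for x in range(len(row)):
            seg = row[x - left[y][x]: x + right[y][x] + 1]
            new_row.append(sum(seg) / len(seg))
        out.append(new_row)
    return out


def pass_vertical(cost, up, down):
    """Average each pixel's cost over its vertical arm span."""
    h, w = len(cost), len(cost[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            seg = [cost[yy][x] for yy in range(y - up[y][x], y + down[y][x] + 1)]
            out[y][x] = sum(seg) / len(seg)
    return out


def aggregate(cost, arms, iterations=4):
    """Alternate the pass order: H then V on odd iterations, V then H on even."""
    left, right, up, down = arms
    for it in range(iterations):
        if it % 2 == 0:  # iterations 1 and 3
            cost = pass_vertical(pass_horizontal(cost, left, right), up, down)
        else:            # iterations 2 and 4
            cost = pass_horizontal(pass_vertical(cost, up, down), left, right)
    return cost
```

Running the horizontal pass first makes the support region the union of the horizontal arms of all pixels on p's vertical arm; swapping the order gives the transposed shape, so alternating covers both.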
Cross-based Cost Aggregation • Our aggregation method can better handle large textureless regions and depth discontinuities.
Cross-based Cost Aggregation [21] K.-J. Yoon and I.-S. Kweon. “Adaptive support-weight approach for correspondence search.” IEEE TPAMI, 2006. [23] K. Zhang, J. Lu, and G. Lafruit. “Cross-based local stereo matching using orthogonal integral images.” IEEE TCSVT,2009.
Scanline Optimization [2] • 4 scanline optimization processes are performed independently: • 2 horizontal directions • 2 vertical directions [2] H. Hirschmuller. “Stereo processing by semiglobal matching and mutual information.” IEEE TPAMI, 2008.
Scanline Optimization • r: scanline direction • p − r: the previous pixel along the same direction • P1, P2: penalties on disparity changes between neighboring pixels (P1 ≤ P2) [8] [8] S. Mattoccia, F. Tombari, and L. D. Stefano. “Stereo vision enabling precise border localization within a scanline optimization framework.” In Proc. ACCV, pages 517–527, 2007.
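The path-cost equation itself is missing from the extracted slide. The sketch below uses the standard semiglobal-matching recurrence from [2] for one left-to-right pass over a single row; the function name and the choice of constant penalties `p1` and `p2` for the whole row are assumptions of this sketch.

```python
def scanline_left_to_right(c1_row, p1, p2):
    """Path costs along one row, scanning left to right.

    c1_row[x][d] is the aggregated cost of pixel x at disparity d.
    A disparity change of one level costs p1, a larger jump costs p2.
    """
    nd = len(c1_row[0])
    cr = [list(c1_row[0])]  # the first pixel has no predecessor
    for x in range(1, len(c1_row)):
        prev = cr[-1]
        prev_min = min(prev)
        cur = []
        for d in range(nd):
            best = prev[d]                         # same disparity
            if d > 0:
                best = min(best, prev[d - 1] + p1)  # one level down
            if d + 1 < nd:
                best = min(best, prev[d + 1] + p1)  # one level up
            best = min(best, prev_min + p2)         # any larger jump
            # Subtracting prev_min keeps the values from growing along the path.
            cur.append(c1_row[x][d] + best - prev_min)
        cr.append(cur)
    return cr


path = scanline_left_to_right([[0, 5, 5], [5, 0, 5]], p1=1, p2=3)
# path[1] is [5, 1, 8]
```

The other three directions reuse the same loop with the traversal order reversed or applied to columns.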
Scanline Optimization • The final cost C2(p, d) averages the path costs from the four directions. • The disparity with the minimum C2 value is selected as pixel p’s intermediate result.
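A minimal sketch of this selection step for one pixel, assuming the final cost is the plain average of the four directional path costs (the equation is missing from the extracted slide; the function name is hypothetical):

```python
def final_disparity(path_costs):
    """Average the per-direction path costs and pick the cheapest disparity.

    path_costs holds one list per scanline direction; each list gives the
    path cost of the pixel at every disparity level.
    """
    nd = len(path_costs[0])
    c2 = [sum(c[d] for c in path_costs) / len(path_costs) for d in range(nd)]
    best_d = min(range(nd), key=lambda d: c2[d])
    return best_d, c2


best_d, c2 = final_disparity([[4, 1, 3], [5, 2, 3], [3, 1, 4], [4, 0, 2]])
# best_d is 1 and c2 is [4.0, 1.0, 3.0]
```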
Multi-step Disparity Refinement • Outlier Handling • Outlier Detection • Iterative Region Voting • Proper Interpolation • Depth Discontinuity Adjustment • Sub-pixel Enhancement
Outlier Handling--Detection • Outliers: pixels p that fail the left-right consistency check DL(p) = DR(p − (DL(p), 0)). • Outliers are further classified into occlusion and mismatch points. • For an outlier p, the intersection of its epipolar line with DR is checked. • If there is no intersection, p is labelled “occlusion”; otherwise “mismatch”.
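The consistency check and the two labels can be sketched on a single scanline. In this illustration an "intersection" means that some disparity d satisfies DR(x − d) = d; the function name, the label strings and the search over every d that stays inside the image are assumptions of the sketch.

```python
def classify_row(dl, dr):
    """Label each pixel of one scanline as 'ok', 'occlusion' or 'mismatch'.

    dl and dr are the left and right integer disparity maps for that row.
    """
    w = len(dl)
    labels = []
    for x in range(w):
        xr = x - dl[x]
        if 0 <= xr < w and dr[xr] == dl[x]:
            labels.append("ok")  # passes the left-right check
            continue
        # Outlier: does any disparity d agree with the right disparity map?
        hit = any(dr[x - d] == d for d in range(x + 1))
        labels.append("mismatch" if hit else "occlusion")
    return labels


labels = classify_row([0, 1, 1, 3], [1, 0, 1, 2])
# labels is ['occlusion', 'ok', 'occlusion', 'mismatch']
```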
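Outlier Handling--Iterative Region Voting • Construct cross-based regions and apply a robust voting scheme to the reliable disparities d in each outlier’s region. • Sp: the number of reliable pixels in the region of p • τS, τH: threshold values • 5 iterations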
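One vote for a single outlier can be sketched as below. The acceptance test (enough reliable pixels, and a large enough share for the winning disparity) is our reading of how Sp, τS and τH are used; the function name and the use of `None` to mark outliers are assumptions of the sketch.

```python
from collections import Counter


def vote(region_disps, tau_s, tau_h):
    """Return the winning disparity for an outlier, or None if the vote fails.

    region_disps lists the disparities in the outlier's cross-based region,
    with None marking pixels that are themselves outliers.
    """
    reliable = [d for d in region_disps if d is not None]
    s_p = len(reliable)  # number of reliable pixels in the region
    if s_p == 0:
        return None
    d_star, count = Counter(reliable).most_common(1)[0]
    if s_p > tau_s and count / s_p > tau_h:
        return d_star
    return None


winner = vote([3, 3, 3, 4, None, 3], tau_s=4, tau_h=0.4)  # 3
```

Pixels fixed in one round count as reliable in the next, which is why the step is repeated.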
Outlier Handling--Proper Interpolation • Occlusion points • The pixel with the lowest disparity value is selected for interpolation, • since an occluded pixel most likely comes from the background. • Mismatch points • The pixel with the most similar color is selected for interpolation.
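The two selection rules fit in one small function. How the candidate pixels are gathered is not stated on the slide, so the sketch simply takes them as input; the function name and the scalar intensity used for the colour comparison are assumptions.

```python
def interpolate(label, pixel_value, candidates):
    """Pick a replacement disparity for an outlier pixel.

    candidates is a list of (disparity, intensity) pairs taken from nearby
    reliable pixels.
    """
    if label == "occlusion":
        # Occluded pixels most likely belong to the background.
        return min(d for d, _ in candidates)
    # Mismatch: copy from the candidate that looks most like this pixel.
    return min(candidates, key=lambda c: abs(c[1] - pixel_value))[0]


cands = [(5, 100), (2, 180), (7, 120)]
occluded = interpolate("occlusion", 118, cands)   # 2
mismatched = interpolate("mismatch", 118, cands)  # 7
```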
Depth Discontinuity Adjustment • For each pixel p on a disparity edge, two pixels p1, p2 from both sides of the edge are collected. • DL(p) is replaced by DL(p1) or DL(p2) if one of the two has a smaller matching cost than C2(p, DL(p)).
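A sketch of the replacement rule for one edge pixel follows. When both neighbours give a lower cost the slide does not say which one wins; taking the cheaper one is an assumption of this sketch, as are the function name and the `cost_at` callback.

```python
def adjust_edge_pixel(d_p, d_p1, d_p2, cost_at):
    """Return the disparity to keep for an edge pixel p.

    cost_at(d) gives the matching cost of p at disparity d; d_p1 and d_p2
    are the disparities of the pixels on either side of the edge.
    """
    best = d_p
    for cand in (d_p1, d_p2):
        if cost_at(cand) < cost_at(best):
            best = cand
    return best


# Hypothetical costs of p at the three disparities involved.
costs = {4: 0.9, 3: 0.5, 8: 0.7}
adjusted = adjust_edge_pixel(4, 3, 8, costs.get)  # 3
```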
Sub-pixel Enhancement [20] • Quadratic polynomial interpolation • Followed by a 3×3 median filter [20] Q. Yang, L. Wang, R. Yang, H. Stewenius, and D. Nister. “Stereo matching with color-weighted correlation, hierarchical belief propagation and occlusion handling.” IEEE TPAMI, 2009.
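Quadratic interpolation fits a parabola through the costs at d − 1, d and d + 1 and returns the position of its minimum. The sketch below is the textbook form of that fit; the function name and the fallback for a flat or inverted parabola are assumptions, and the median filter is not shown.

```python
def subpixel(d, c_minus, c_0, c_plus):
    """Refine integer disparity d using the costs at d - 1, d and d + 1."""
    denom = 2.0 * (c_plus + c_minus - 2.0 * c_0)
    if denom <= 0:
        return float(d)  # no proper minimum: keep the integer disparity
    return d - (c_plus - c_minus) / denom


# Samples of the parabola (x - 2.25) ** 2 at x = 1, 2, 3.
refined = subpixel(2, 1.5625, 0.0625, 0.5625)  # 2.25
```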
Multi-step Disparity Refinement • The average error percentages after performing each refinement step.
CUDA Implementation • Compute Unified Device Architecture (CUDA) is a programming interface for parallel computation tasks on NVIDIA graphics hardware. • The computation task is coded into a kernel function. • The allocation of threads is controlled with two hierarchical concepts: grid and block. • A kernel creates a grid with multiple blocks, and each block consists of multiple threads.
CUDA Implementation • Cost Initialization: • Parallelized with W × H threads. • Threads are organized into a 2D grid, with the block size set to 32 × 32. • Each thread computes a cost value for a pixel at a given disparity. • For the census transform, a square window is required for each pixel, which means loading more data into shared memory for fast access.
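Covering a W × H image with 32 × 32 blocks means rounding the block counts up, so the last row and column of blocks may be only partly used. A minimal sketch of that arithmetic, with a hypothetical helper name and an image size picked purely for illustration:

```python
def grid_dims(width, height, block=(32, 32)):
    """Number of blocks per axis needed to cover a width x height image."""
    bx, by = block
    return ((width + bx - 1) // bx, (height + by - 1) // by)


blocks = grid_dims(450, 375)  # (15, 12)
```

Inside a kernel, threads whose coordinates fall beyond the image bounds would simply return without doing any work.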
CUDA Implementation • Cross-based Cost Aggregation: • A grid with W × H threads. • Cross construction: the block size is W or H, to handle a scanline efficiently. • Cost aggregation: the block size is 32 × 32. • Data reuse through shared memory is considered in both steps.
CUDA Implementation • Scanline Optimization: • This step is different, because the process is sequential in the scanline direction and parallel in the orthogonal direction. • W × D or H × D threads • Disparity Refinement: • W × H threads
Experimental Results • Device: a PC with a Core 2 Duo 2.20 GHz CPU and an NVIDIA GeForce GTX 480 graphics card • Parameter settings: (table) • Sources: Middlebury (http://vision.middlebury.edu/stereo/), HHI database (“book arrival”), Microsoft i2i database (“Ilkay”)
Experimental Results • The GPU-friendly system brings an impressive 140× speedup. • The average proportions of the GPU running time for the four computation steps are 1%, 70%, 28% and 1%, respectively. • The iterative cost aggregation step and the scanline optimization process dominate the running time.
Experimental Results • First row: disparity maps generated with our system. • Second row: disparity error maps with threshold 1. • Errors in unoccluded and occluded regions are marked in black and gray respectively.
Experimental Results Snapshots on ’book arrival’ stereo video
Experimental Results Snapshots on ’Ilkay’ stereo video