Local Disparity Estimation with Three-Moded Cross Census and Advanced Support Weight
Zucheul Lee, Student Member, IEEE, Jason Juang, and Truong Q. Nguyen, Fellow, IEEE
IEEE Transactions on Multimedia, June 2013
Outline
• Introduction
• Related Works
• Proposed Method
• Experimental Results
• Conclusion
• Our Method Improvement
Introduction
• Proposes an efficient one-pass local method, with no iteration, applicable to both stereo images and videos.
• Three-moded cross census transform with a noise buffer
• Advanced support weight
• Motion information used to impose temporal consistency
Related Works
• Main concerns in local methods
  • Accuracy of the similarity measure
  • Proper support window
• Similarity measure
  • SAD, SSD, NCC
  • Rank, Census
• Support window
  • Fixed window shape: Adaptive window, Multiple-window, LASW
  • Arbitrary window shape: Segment-support, Disparity calibration, Patch match
Related Works
• Problems of stereo matching on videos
  • Lack of datasets
  • Temporal inconsistency
• Countermeasures
  • Median filter with optical flow
  • TV method [12]
[12] R. Khoshabeh, S. H. Chan, and T. Q. Nguyen, “Spatio-Temporal Consistency in Video Disparity Estimation,” ICASSP, pp. 885-888, 2011.
Proposed Method
• Similarity Measure
• Optical Flow
• Conditional Support
• Correlated Support
• Disparity Computation
• Occlusion Filling
• TV Refinement
Proposed Method: Similarity Measure
Three-moded Cross Census
• A bigger window carries more information.
• Trade-off: a larger window captures more spatial information but is more exposed to occluded areas, which puts accuracy at risk.
• (a) 5x5 vs. (b) 9x9
• Proposes the cross-square shape (c), which contains more spatial information while being less exposed to the occlusion area.
Figure: Blue: three different census windows. (a) 5x5. (b) 9x9. (c) Cross-square. Gray: occlusion area. Red: center.
Three-moded Cross Census
Figure: (a) Original census without noise. (b) Original census with noise. (c) Three-moded census with noise.
• Three-moded comparison with a noise buffer [14][15]
• Gives more flexibility in flat areas.
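The idea of a three-moded census with a noise buffer can be sketched as follows. This is a minimal illustration, not the paper's definition: the ternary codes (0, 1, 2), the buffer half-width `alpha`, and the exact cross-square layout built by `cross_square_offsets` are all assumptions made for the example. Plain Python lists are used to keep the sketch dependency-free.

```python
def cross_square_offsets(arm=4, core=1):
    """Hypothetical cross-square window: a small square plus long cross arms.

    The real window layout in the paper may differ; this only illustrates
    'more spatial reach along the axes, fewer pixels overall than a 9x9'.
    """
    offsets = set()
    for dy in range(-core, core + 1):
        for dx in range(-core, core + 1):
            offsets.add((dy, dx))          # central square
    for d in range(-arm, arm + 1):
        offsets.add((0, d))                # horizontal arm
        offsets.add((d, 0))                # vertical arm
    offsets.discard((0, 0))                # the center is not compared to itself
    return sorted(offsets)


def three_moded_census(img, y, x, offsets, alpha):
    """Ternary census code of pixel (y, x) on a grayscale image (list of rows).

    Code 2: neighbor brighter than center by more than alpha.
    Code 0: neighbor darker than center by more than alpha.
    Code 1: difference falls inside the noise buffer [-alpha, alpha].
    """
    h, w = len(img), len(img[0])
    center = img[y][x]
    codes = []
    for dy, dx in offsets:
        # Clamp to the image so border pixels reuse the nearest valid neighbor.
        ny = min(max(y + dy, 0), h - 1)
        nx = min(max(x + dx, 0), w - 1)
        diff = img[ny][nx] - center
        if diff > alpha:
            codes.append(2)
        elif diff < -alpha:
            codes.append(0)
        else:
            codes.append(1)
    return tuple(codes)


def hamming(code_a, code_b):
    """Number of positions where two census codes disagree."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b)
```

With `alpha = 0` the function degenerates to an ordinary sign comparison, so small intensity noise in a flat region flips codes; a positive `alpha` absorbs that noise and keeps the code stable.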
Cost initialization
• Hamming distance (census)
• Color distance
• Gradient distance
• Combination of the three costs
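The slide's formulas did not survive extraction, so the following is only a hedged sketch of how three per-pixel costs of this kind are commonly combined. The truncation thresholds, the weighted-sum form, and the weight names (`w_census`, `w_color`, `w_grad`) are assumptions for illustration, not the paper's definitions.

```python
def color_cost(left_px, right_px, trunc):
    """Mean absolute RGB difference between two pixels, truncated at `trunc`."""
    ad = sum(abs(a - b) for a, b in zip(left_px, right_px)) / 3.0
    return min(ad, trunc)


def gradient_cost(grad_left, grad_right, trunc):
    """Absolute difference of two horizontal gradients, truncated at `trunc`."""
    return min(abs(grad_left - grad_right), trunc)


def combined_cost(census_hamming, color, gradient,
                  w_census=1.0, w_color=1.0, w_grad=1.0):
    """Assumed combination: a plain weighted sum of the three costs."""
    return w_census * census_hamming + w_color * color + w_grad * gradient
```

For example, `combined_cost(2, color_cost((10, 20, 30), (13, 20, 36), 10), gradient_cost(5, -2, 4), 1.0, 0.5, 0.25)` adds a Hamming distance of 2 to a half-weighted color cost and a quarter-weighted gradient cost.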
Cost initialization
• Disparity maps on “Laundry” computed with different similarity measures.
Figure: (a) Left image. (b) Color. (c) Color + Census. (d) Color + Census + Gradient.
Proposed Method: Optical Flow, Conditional Support, Correlated Support
Cost Aggregation
• Gestalt Grouping [16]
  • Proximity
  • Similarity
  • Common fate
• Support weight considerations
  • Color space
  • Relationship between proximity and similarity
  • Motion term addition for videos
[16] A. Desolneux, L. Moisan, and J.-M. Morel, “From Gestalt Theory to Image Analysis: A Probabilistic Approach,” NY: Springer, 2008.
Color Space
• RGB vs. CIELab
• RGB produces a more selective distance than CIELab in regions of similar color.
• CIELab is particularly sensitive to errors in low RGB signals [18].
Figure: (a) Support window. (b) RGB color difference. (c) CIELab color difference.
[18] C. Connolly and T. Fleiss, “A study of efficiency and accuracy in the transformation from RGB to CIELAB color space,” IEEE Trans. Image Processing, 1997.
Support Weight on Images • Conditional support (a) Left image. (b) Original support. (c) Conditional support.
Support Weight on Videos • Correlated support
Cost Aggregation
• Cost aggregation: conditional/correlated support weight [4]
• Disparity computation: winner-take-all
[4] Kuk-Jin Yoon and In So Kweon, “Adaptive support-weight approach for correspondence search,” IEEE Transactions on PAMI, vol. 28, Apr. 2006.
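As a point of reference, the baseline adaptive support weight aggregation of [4], followed by winner-take-all, can be sketched as below. This illustrates the original scheme only; the conditional and correlated support weights proposed in the paper are not reproduced. The data layout (`raw_cost[d][y][x]`, images as rows of RGB tuples) and the parameter names `gamma_c` and `gamma_p` are assumptions made for the example.

```python
import math


def support_weight(color_p, color_q, dist_pq, gamma_c, gamma_p):
    """Weight of neighbor q for center p: similar color and close position score high."""
    dc = math.sqrt(sum((a - b) ** 2 for a, b in zip(color_p, color_q)))
    return math.exp(-(dc / gamma_c + dist_pq / gamma_p))


def aggregate_cost(left, right, raw_cost, y, x, d, radius, gamma_c, gamma_p):
    """Support-weighted average of raw_cost[d] around (y, x).

    Weights from the left window and from the disparity-shifted right window
    are multiplied, so a neighbor contributes only if both views agree on it.
    """
    h, w = len(left), len(left[0])
    if x - d < 0:
        return float("inf")                # matching pixel falls outside the right image
    num = den = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            qy, qx = y + dy, x + dx
            if not (0 <= qy < h and 0 <= qx < w and qx - d >= 0):
                continue                   # skip neighbors outside either image
            dist = math.hypot(dy, dx)
            wl = support_weight(left[y][x], left[qy][qx], dist, gamma_c, gamma_p)
            wr = support_weight(right[y][x - d], right[qy][qx - d], dist, gamma_c, gamma_p)
            num += wl * wr * raw_cost[d][qy][qx]
            den += wl * wr
    return num / den if den > 0 else float("inf")


def winner_take_all(costs):
    """Index (disparity) with the smallest aggregated cost."""
    return min(range(len(costs)), key=costs.__getitem__)
```

A disparity for one pixel is then `winner_take_all([aggregate_cost(left, right, raw_cost, y, x, d, radius, gamma_c, gamma_p) for d in range(max_d + 1)])`, where `max_d` is a hypothetical search-range bound.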
Proposed Method: Disparity Computation, Occlusion Filling, TV Refinement
Occlusion Filling
• Find the first non-occluded pixel (x+s, y) in the left neighborhood.
• Construct the arbitrarily shaped support region (yellow) [22].
• Only non-occluded pixels vote.
Figure: [Left image] Black: background; White: occlusion; Gray: foreground.
[22] Ke Zhang, Jiangbo Lu, and Gauthier Lafruit, “Cross-based local stereo matching using orthogonal integral images,” IEEE Trans. Circuits and Systems for Video Technology, 2009.
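The three steps above can be sketched in simplified form. As an explicit simplification, the cross-based support region of [22] is replaced here by a small square window around the anchor pixel; the `occluded` mask is assumed to be given (for instance from a left-right consistency check), and the function name and `radius` parameter are hypothetical.

```python
from collections import Counter


def fill_occlusions(disp, occluded, radius=1):
    """Fill occluded disparities by voting among non-occluded neighbors.

    disp:     disparity map as a list of rows of integers
    occluded: same-shaped map of booleans, True where the pixel is occluded
    """
    h, w = len(disp), len(disp[0])
    out = [row[:] for row in disp]
    for y in range(h):
        for x in range(w):
            if not occluded[y][x]:
                continue
            # Step 1: walk left to the first non-occluded pixel on this scanline.
            ax = x - 1
            while ax >= 0 and occluded[y][ax]:
                ax -= 1
            if ax < 0:
                continue                   # no anchor found; leave the pixel unchanged
            # Step 2 (simplified): square window around the anchor instead of
            # the cross-based region of [22].
            votes = Counter()
            for qy in range(max(0, y - radius), min(h, y + radius + 1)):
                for qx in range(max(0, ax - radius), min(w, ax + radius + 1)):
                    # Step 3: only non-occluded pixels vote.
                    if not occluded[qy][qx]:
                        votes[disp[qy][qx]] += 1
            out[y][x] = votes.most_common(1)[0][0]
    return out
```

Votes are read from the original map rather than from already filled values, so the result does not depend on the scan order.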
TV Refinement [12]
• Assumption: the disparity map is piecewise smooth.
• Regularizer: total variation norm
• Solve the TV minimization problem of [12].
• D: forward difference operator
• Add (βx, βy, βt) to control the relative emphasis of the horizontal, vertical, and temporal terms.
[12] R. Khoshabeh, S. H. Chan, and T. Q. Nguyen, “Spatio-Temporal Consistency in Video Disparity Estimation,” ICASSP, pp. 885-888, 2011.
[13] Z. Lee, R. Khoshabeh, J. Juang, and T. Q. Nguyen, “Local Stereo Matching using Motion Cue and Modified Census in Video Disparity Estimation,” Signal Processing Conference (EUSIPCO), Aug. 2012.
http://www.camdemy.com/media/6148
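The equations on this slide were lost, so the following is only one plausible written-out form of the problem, in the TV/L1 style associated with [12]. The symbols g (the initial disparity sequence), f (the refined sequence), and μ (a regularization weight), as well as the choice of an L1 data term, are assumptions for illustration; D and (βx, βy, βt) are the quantities named on the slide.

```latex
\begin{aligned}
\hat{f} &= \arg\min_{f}\; \mu \,\lVert f - g \rVert_{1} + \lVert D f \rVert_{2}, \\
\lVert D f \rVert_{2} &= \sum_{i} \sqrt{\beta_x^{2}\,[D_x f]_i^{2} + \beta_y^{2}\,[D_y f]_i^{2} + \beta_t^{2}\,[D_t f]_i^{2}}
\end{aligned}
```

Under this form, a larger βt penalizes frame-to-frame changes more strongly and so enforces temporal consistency, while βx and βy control spatial smoothing.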
Experimental Results
• Comparison of original and cross-square census window performance.
Experimental Results
• Comparison of the original census and the three-moded census with a noise buffer on “Computer”.
Figure: Left image. Original census. Three-moded census. Original census on noise-added image. Three-moded census on noise-added image.
Figure: (a) Left frames. (b) LASW. (c) Cost-filter. (d) Proposed method. (e) After occlusion filling. (f) After TV [12].
Conclusion
• The census with a noise buffer and the cross-square window increase accuracy and robustness to image noise in flat areas.
• The conditional and correlated support models yield more precise support weights.
• The proposed method is NOT sensitive to the parameter values.
• The proposed method outperforms the other local methods on both stereo images and videos.
Our Method Improvement
• Weighted parameter in the aggregation step
• Full-image Guided Filter [*]
[*] Qingqing Yang, Dongxiao Li, Lianghao Wang, and Ming Zhang, “Full-Image Guided Filtering for Fast Stereo Matching,” IEEE Signal Processing Letters, March 2013.
Result
Figure: Original Aggregation. New Aggregation. After Refinement.
TV Refinement
• Solve the TV problem with the augmented Lagrangian method [*].
[*] S. H. Chan, R. Khoshabeh, K. B. Gibson, P. E. Gill, and T. Q. Nguyen, “An augmented Lagrangian method for total variation video restoration,” in ICASSP, May 2011.