Stereo Video

Stereo Video • Temporally Consistent Disparity Maps from Uncalibrated Stereo Videos • Real-time Spatiotemporal Stereo Matching Using the Dual-Cross-Bilateral Grid • Temporally Consistent Disparity and Optical Flow via Efficient Spatio-temporal Filtering • Efficient Spatio-temporal Local Stereo Matching Using Information Permeability Filtering

A. Temporally Consistent Disparity Maps from Uncalibrated Stereo Videos. Michael Bleyer and Margrit Gelautz. International Symposium on Image and Signal Processing and Analysis (ISPA) 2009

B. Real-time Spatiotemporal Stereo Matching Using the Dual-Cross-Bilateral Grid. Christian Richardt, Douglas Orr, Ian Davies, Antonio Criminisi, and Neil A. Dodgson. The European Conference on Computer Vision (ECCV) 2010

C. Temporally Consistent Disparity and Optical Flow via Efficient Spatio-temporal Filtering. Asmaa Hosni, Christoph Rhemann, Michael Bleyer, and Margrit Gelautz. The Pacific-Rim Symposium on Image and Video Technology (PSIVT) 2011

D. Efficient Spatio-temporal Local Stereo Matching Using Information Permeability Filtering. Cuong Cao Pham, Vinh Dinh Nguyen, and Jae Wook Jeon. International Conference on Image Processing (ICIP) 2012

Outline • Introduction • Related Works • Methods and Results • A. Median Filter • B. Temporal DCB Grid • C. Spatial-temporal Weighted Smoothing • D. Three-pass Aggregation • Comparison • Conclusion

Introduction

Introduction • Most stereo matching work focuses only on static image pairs. • Conventional methods estimate disparities from spatial and color information. • The main problem when extending these methods to video is flickering between frames. • Solution: • Build on local methods (for real-time performance) • Enforce temporal consistency (to suppress flickering)

Related Works

Related Works • About Local Methods • The key to a local method lies in the cost aggregation step. • Cost data is aggregated from the neighboring pixels within a finite-size window. • The best-known aggregation schemes are edge-preserving: • Adaptive support weight • Geodesic diffusion • Bilateral filter • Guided filter
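To make the aggregation step concrete, here is a minimal sketch of edge-aware cost aggregation along a single scanline. It is not taken from any of the four papers: the function name `aggregate_costs`, the exponential weight, and the parameters `radius`, `sigma_color` and `sigma_space` are illustrative assumptions in the spirit of bilateral and adaptive support-weight aggregation.

```python
import math

def aggregate_costs(costs, intensities, radius=2, sigma_color=10.0, sigma_space=2.0):
    """Average matching costs over a finite window with edge-aware weights.

    costs[i] is the raw matching cost of pixel i for one disparity candidate,
    intensities[i] is the guidance intensity of pixel i (hypothetical inputs).
    """
    n = len(costs)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        # Only neighbors inside the window [i - radius, i + radius] contribute.
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Weight drops with intensity difference and with spatial distance.
            w = math.exp(-abs(intensities[i] - intensities[j]) / sigma_color
                         - abs(i - j) / sigma_space)
            num += w * costs[j]
            den += w
        out.append(num / den)
    return out

# Costs on either side of a strong intensity edge are not mixed.
smoothed = aggregate_costs([1.0, 1.0, 1.0, 9.0, 9.0, 9.0],
                           [0, 0, 0, 255, 255, 255])
```

Because the weights collapse across the intensity edge, the low costs on the left and the high costs on the right stay separated, which is the property that makes the aggregation edge-preserving.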

Related Works • Single-frame stereo matching

Related Works • Spatio-temporal stereo matching • The disparity difference between two successive frames is minimized to enforce temporal consistency.

Methods and Results

A. Median filter

A. Median filter • Computing one disparity map takes 1 second. • But video contains about 30 to 60 frames per second. • => Cannot achieve real-time performance. • No quantitative data or comparison is given.

B. Temporal DCB Grid • Bilateral Grid • It runs faster and uses less memory as σ increases. • Dual-Cross-Bilateral Grid

B. Temporal DCB Grid • Dichromatic DCB Grid • Comparison(fps) 200x

B. Temporal DCB Grid • Temporal DCB Grid • Last n = 5 frames, each weighted by wi • i = 0: current frame • i = 1: previous frame • Weighted sum
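The weighted sum over the last n frames can be sketched as follows. The slide does not give the weights wi, so the decaying weights used here are purely a placeholder, and dividing by the weight total is an assumption of this sketch. The name `temporal_blend` and the flat-list representation of each frame's grid are also hypothetical.

```python
def temporal_blend(frames, weights):
    """Blend per-cell values of the last n frames.

    frames[0] is the current frame (i = 0), frames[1] the previous one (i = 1),
    and so on; each frame is a flat list of per-cell values.
    """
    if len(frames) != len(weights):
        raise ValueError("one weight per frame is required")
    total = sum(weights)  # normalization is an assumption of this sketch
    cells = len(frames[0])
    return [sum(w * f[c] for w, f in zip(weights, frames)) / total
            for c in range(cells)]

# n = 5 frames with placeholder weights that favor recent frames.
n = 5
placeholder_weights = [0.5 ** i for i in range(n)]
history = [[1.0, 2.0] for _ in range(n)]
blended = temporal_blend(history, placeholder_weights)
```

With identical frames the blend returns the frame unchanged; when frames differ, older frames pull the result toward their values in proportion to their weight, which is what damps frame-to-frame flicker.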

B. Temporal DCB Grid 16fps 14fps

B. Temporal DCB Grid Source data

B. Temporal DCB Grid • Only uses intensity information • Only near-real-time

C. Spatial-temporal Weighted Smoothing • Cost initialization • Construct a spatio-temporal cost volume for each disparity d. • Cost aggregation • Smooth the cost volume with a spatio-temporal filter (guided filter [1]). • Disparity computation • Select the disparity with the lowest cost (WTA). • Refinement • Weighted median filter • [1] Rhemann, C., Hosni, A., Bleyer, M., Rother, C., Gelautz, M.: Fast Cost-Volume Filtering for Visual Correspondence and Beyond. CVPR (2011) and PAMI (2013)
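The cost-volume and winner-takes-all steps can be illustrated on a single scanline. This is only a sketch under simplifying assumptions: the absolute intensity difference as matching cost, the convention that left pixel x matches right pixel x - d, and the names `cost_volume` and `winner_takes_all` are all illustrative, and the aggregation step is left out. In the video setting the same volume would be indexed by (x, y, t) for each disparity d.

```python
def cost_volume(left, right, max_disp):
    """cost[d][x] = |left[x] - right[x - d]| for each disparity candidate d."""
    big = 1e9  # cost for candidates that fall outside the right image
    return [[abs(left[x] - right[x - d]) if x - d >= 0 else big
             for x in range(len(left))]
            for d in range(max_disp + 1)]

def winner_takes_all(cost):
    """Pick, for every pixel, the disparity with the lowest cost."""
    width = len(cost[0])
    return [min(range(len(cost)), key=lambda d: cost[d][x])
            for x in range(width)]

# The left scanline is the right scanline shifted by two pixels.
right_line = [10, 20, 30, 40, 50, 60]
left_line = [0, 0, 10, 20, 30, 40]
disparity = winner_takes_all(cost_volume(left_line, right_line, max_disp=2))
```

In the method on the slide, each slice `cost[d]` would be smoothed by the spatio-temporal filter before the minimum is taken.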

C. Spatial-temporal Weighted Smoothing

C. Spatial-temporal Weighted Smoothing • Cost initialization • Cost aggregation • wk: window of dimensions wx × wy × wt • Smoothness parameter

C. Spatial-temporal Weighted Smoothing • The guided filter weights can be implemented by a sequence of linear operations. • All summations are 3D box filters and can be computed in O(N) time.
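The O(N) claim rests on the fact that a box sum can be read off a running (prefix) sum in constant time per element, independent of the window size. Below is a one-dimensional sketch; the function name `box_sum` and the border handling (windows are clipped at the ends) are choices made for this illustration. A 3D box filter can be obtained by applying the same pass along x, then y, then t.

```python
def box_sum(values, radius):
    """Sum of values in [i - radius, i + radius] for every i, in O(N) total."""
    # prefix[k] holds the sum of the first k values.
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - radius)       # window start, clipped at the border
        hi = min(n, i + radius + 1)   # window end (exclusive), clipped
        out.append(prefix[hi] - prefix[lo])
    return out

sums = box_sum([1, 2, 3, 4, 5], radius=1)
```

Each output costs one subtraction regardless of `radius`, so the running time does not grow with the filter size.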

C. Spatial-temporal Weighted Smoothing • Disparity computation: winner-takes-all • Refinement: weighted median filter => Only applied to reduce single-frame errors.
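A weighted median picks the value at which the accumulated weight first reaches half of the total weight. The sketch below shows only that selection rule for one pixel's neighborhood; how the weights are obtained is not specified on the slide, so the weights here are arbitrary example numbers and the name `weighted_median` is illustrative.

```python
def weighted_median(values, weights):
    """Return the value where the cumulative weight first reaches half the total."""
    pairs = sorted(zip(values, weights))  # sort candidates by value
    half = sum(weights) / 2.0
    acc = 0.0
    for value, weight in pairs:
        acc += weight
        if acc >= half:
            return value
    raise ValueError("weights must be positive and non-empty")

# With equal weights the outlier 100 is rejected ...
equal = weighted_median([1, 2, 100], [1, 1, 1])
# ... but a heavily weighted neighbor can dominate the result.
dominated = weighted_median([1, 2, 100], [1, 1, 5])
```

Giving similar-looking neighbors larger weights lets the filter remove isolated disparity errors without blurring across object boundaries.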

C. Spatial-temporal Weighted Smoothing • Temporal vs. frame-by-frame processing. • 2nd row: Disparity maps computed by a frame-by-frame implementation show flickering artifacts. • 3rd row: The proposed method exploits temporal information and thus removes most of these artifacts.

C. Spatial-temporal Weighted Smoothing

D. Three-pass cost aggregation • Three-pass cost aggregation technique based on information permeability (adaptive support-weight) [2]. • [2] Yoon, K.J., Kweon, I.S.: Locally Adaptive Support-Weight Approach for Visual Correspondence Search. In: CVPR (2005)

D. Three-pass cost aggregation Frame i+1 Frame i Frame i-1

D. Three-pass cost aggregation • Shows the effectiveness of using temporal information in addition to spatial information. • Matching cost initialization • v = (x, y, t) represents the spatial and temporal position of a voxel. • Similarity (weight) function

D. Three-pass cost aggregation • Spatial aggregation: horizontal, then vertical
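One pass of permeability-style aggregation can be sketched as a pair of recursive sweeps along a line, where each step passes on the accumulated cost scaled by a similarity weight between adjacent pixels. The details below are assumptions for illustration, not the paper's exact formulation: the exponential weight in `permeability`, the parameter `sigma`, and the way the two sweeps are combined (adding them and subtracting the pixel's own cost once so it is not counted twice). The vertical pass would run the same recursion along columns, and the temporal pass along t.

```python
import math

def permeability(a, b, sigma=10.0):
    """Similarity weight in (0, 1] between two adjacent intensities (assumed form)."""
    return math.exp(-abs(a - b) / sigma)

def aggregate_1d(costs, intensities, sigma=10.0):
    """Two recursive sweeps (forward and backward) along one line."""
    n = len(costs)
    forward = list(costs)
    for x in range(1, n):
        forward[x] += permeability(intensities[x], intensities[x - 1], sigma) * forward[x - 1]
    backward = list(costs)
    for x in range(n - 2, -1, -1):
        backward[x] += permeability(intensities[x], intensities[x + 1], sigma) * backward[x + 1]
    # Combine both directions; the pixel's own cost appears in both sweeps.
    return [forward[x] + backward[x] - costs[x] for x in range(n)]

# In a uniform region every pixel collects the full line sum.
uniform = aggregate_1d([1.0, 2.0, 3.0], [7, 7, 7])
# A strong edge blocks the flow of cost between the two regions.
edged = aggregate_1d([1.0, 1.0, 5.0, 5.0], [0, 0, 1000, 1000])
```

Each sweep needs one multiplication and one addition per element, so the cost per voxel is constant and independent of the effective support size.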

D. Three-pass cost aggregation • Temporal aggregation: forward and backward • Disparity computation: WTA • Refinement • Consistency check • 3 × 3 median filter
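The slide does not say which consistency check is used. Assuming it is the usual left-right cross-check, a scanline sketch looks like this: a left-view pixel x with disparity d is kept only if the right-view pixel x - d reports nearly the same disparity. The function name `consistency_check`, the tolerance `tol`, the sign convention, and marking rejected pixels with `None` are all choices of this sketch.

```python
def consistency_check(disp_left, disp_right, tol=1):
    """Keep left-view disparities confirmed by the right view, else None."""
    checked = []
    for x, d in enumerate(disp_left):
        xr = x - d  # matching position in the right view (assumed convention)
        ok = 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= tol
        checked.append(d if ok else None)
    return checked

left_disp = [0, 1, 2, 2, 2, 5]
right_disp = [2, 2, 2, 2, 0, 0]
checked = consistency_check(left_disp, right_disp)
```

Pixels marked `None` are typically occluded or mismatched and would be filled from valid neighbors before the final median filter.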

D. Three-pass cost aggregation • Computational complexity • Only six multiplications and nine additions per voxel • It is still more efficient than the adaptive support-weight approach. • Without motion estimation

D. Three-pass cost aggregation

Comparison

Comparison • No post-processing • Including post-processing: consistency check and 3 × 3 median filter

Conclusion

Conclusion • All four methods are based on edge-preserving filtering. • They extend these concepts to the time dimension. • They only handle scenes with slow motion. • They do not perform well on dynamic scenes that contain large object motions.
