Depth Enhancement Technique by Sensor Fusion: Joint Bilateral Filter Approaches Speaker Min-Koo Kang November 14, 2012
Outline • Introduction • Why depth data is important • How to acquire depth data • Depth upsampling: state-of-the-art approach • Background • Interpolation filters: Nearest Neighbor / Bilinear / Bicubic / Bilateral • Bilateral filter-based depth upsampling • Joint Bilateral Upsampling (JBU) filter / SIGGRAPH 2007 • Pixel Weighted Average Strategy (PWAS) / ICIP 2010 • Unified Multi-Lateral (UML) filter / AVSS 2011 • Generalized depth enhancement framework / ECCV 2012 • Concluding remarks
Introduction • Why depth data is important • How to acquire depth data • State-of-the-art approaches
Why is depth data important? • Used in various fields; depth acquisition is one of the most important techniques in computer vision • Important factors: speed, accuracy, resolution • Applications: human-computer interaction, 3D reconstruction, virtual view generation in 3DTV
How to acquire depth data? • Depth acquisition method comparison: laser scanning, stereo vision sensor, range sensor • The range sensor offers the most appropriate overall performance, except for its low resolution • The low resolution can be overcome by depth map up-sampling
Problem definition • Disparity estimation by a range sensor delivers a low-resolution depth map, while rendering requires a full-resolution depth map • Main objectives / requirements: • Cost-effectiveness (potential for real-time on consumer electronics platforms) • Align depth map edges with image edges • Remove inaccuracies (caused by heuristics in disparity estimation) • Temporal stability (esp. at edges and detailed areas) • Two processing steps: upsampling and refinement
Depth upsampling • Definition • Conversion of a low-resolution depth map into a high-resolution one • Approach • Most state-of-the-art methods are based on the sensor fusion technique, i.e., they use an image sensor and a range sensor together (Figures: depth map up-sampling by bicubic interpolation vs. up-sampling using image and range sensors together)
Background • Interpolation filters: Nearest Neighbor / Bilinear / Bicubic / Bilateral
Single Image-based Interpolation • Conventional interpolation filters
Upsampling examples • The main types of artifacts are most easily seen at sharp edges, and include aliasing (jagged edges), blurring, and edge halos (see illustration) • (Figure: input image upsampled with Nearest Neighbor, Bilinear, and Bicubic interpolation, at 0%, 16.7%, and 25% sharpening)
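For reference, bilinear interpolation can be sketched in a few lines of NumPy. This is a minimal illustration, not code from any of the cited papers; the function and parameter names are mine:

```python
import numpy as np

def bilinear_upsample(img, factor):
    """Upsample a 2-D array by an integer factor using bilinear interpolation."""
    h, w = img.shape
    H, W = h * factor, w * factor
    # Low-resolution coordinates for every high-resolution pixel.
    ys = np.linspace(0, h - 1, H)
    xs = np.linspace(0, w - 1, W)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    # Interpolate along x on the two bracketing rows, then along y.
    top = (1 - wx) * img[np.ix_(y0, x0)] + wx * img[np.ix_(y0, x1)]
    bot = (1 - wx) * img[np.ix_(y1, x0)] + wx * img[np.ix_(y1, x1)]
    return (1 - wy) * top + wy * bot
```

Bicubic interpolation follows the same pattern but fits a cubic through a 4x4 neighborhood, which sharpens edges at the cost of possible halos, as the slide's figure shows.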
Single Image-based Interpolation • Bilateral filtering: smoothing an image without blurring its edges
Bilateral filtering applications • (Figure: a noisy input image; naïve denoising by a Gaussian filter blurs edges; the bilateral filter denoises while preserving edges)
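The edge-preserving behavior illustrated above can be reproduced with a brute-force bilateral filter. This is a minimal sketch for clarity, not an efficient implementation; the per-pixel window loop is O(radius²):

```python
import numpy as np

def bilateral_filter(img, sigma_s=1.5, sigma_r=0.1, radius=3):
    """Brute-force bilateral filter: each output pixel is a weighted average
    whose weights combine spatial closeness (w_s) and intensity similarity
    (w_r), so averaging effectively stops at strong edges."""
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            patch = img[y0:y1, x0:x1].astype(float)
            yy, xx = np.mgrid[y0:y1, x0:x1]
            w_s = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
            w_r = np.exp(-((patch - img[y, x]) ** 2) / (2 * sigma_r ** 2))
            wgt = w_s * w_r
            out[y, x] = (wgt * patch).sum() / wgt.sum()
    return out
```

With a small sigma_r, pixels across a strong step edge receive near-zero range weight, which is why the edge survives while noise on either side is averaged away.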
Bilateral filter-based depth upsampling • Joint Bilateral Upsampling (JBU) filter / SIGGRAPH 2007 • Pixel Weighted Average Strategy (PWAS) / ICIP 2010 • Unified Multi-Lateral (UML) filter / AVSS 2011 • Generalized depth enhancement framework / ECCV 2012
Joint bilateral filtering • Multi-modal filtering • Range term defined by one modality • Filtering performed on another modality • Propagates properties from one modality to another • Edge-preserving properties
Joint bilateral upsampling (JBU) • First publication on bilateral filters for upsampling, at SIGGRAPH 2007 • J. Kopf, Univ. of Konstanz (Germany), provided reference software • [Kopf2007] solution: • High-resolution image in the range term • Low-resolution input, high-resolution output Kopf et al., "Joint Bilateral Upsampling", SIGGRAPH 2007
Joint bilateral upsampling (JBU) • Representative formulation: D~(p) = (1/k_p) Σ_{q∈N(p)} f_S(||p − q||) f_I(||I(p) − I(q)||) D(q), with normalization k_p = Σ_{q∈N(p)} f_S(||p − q||) f_I(||I(p) − I(q)||) • N(p): target pixel p(i, j)'s neighborhood • f_S(·): spatial weighting term, applied to the pixel position p • f_I(·): range weighting term, applied to the pixel value I(q) • f_S(·) and f_I(·) are Gaussian functions with standard deviations σ_S and σ_I, respectively (Figures: upsampled depth map; rendered 3D view) Kopf et al., "Joint Bilateral Upsampling", SIGGRAPH 2007
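A direct, unoptimized reading of this formulation might look like the sketch below. This is my own illustration, not Kopf's reference software: the spatial kernel acts on positions in the low-resolution grid, the range kernel is evaluated on the high-resolution guidance image, and the depth samples come from the low-resolution map:

```python
import numpy as np

def jbu(depth_lr, image_hr, factor, sigma_s=2.0, sigma_r=0.1, radius=2):
    """Joint bilateral upsampling sketch: produce a high-resolution depth map
    by averaging low-resolution depth samples, weighted by spatial distance
    and by guidance-image similarity at the high-resolution pixel p."""
    H, W = image_hr.shape
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            num = den = 0.0
            # p's projection into the low-resolution grid.
            yl, xl = y / factor, x / factor
            for qy in range(max(0, int(yl) - radius),
                            min(depth_lr.shape[0], int(yl) + radius + 1)):
                for qx in range(max(0, int(xl) - radius),
                                min(depth_lr.shape[1], int(xl) + radius + 1)):
                    fs = np.exp(-((qy - yl) ** 2 + (qx - xl) ** 2) / (2 * sigma_s ** 2))
                    # Range term: compare guidance values at p and at q's
                    # corresponding high-resolution position.
                    gq = image_hr[min(qy * factor, H - 1), min(qx * factor, W - 1)]
                    fi = np.exp(-((image_hr[y, x] - gq) ** 2) / (2 * sigma_r ** 2))
                    wgt = fs * fi
                    num += wgt * depth_lr[qy, qx]
                    den += wgt
            out[y, x] = num / den
    return out
```

Because the range term comes from the guidance image, the upsampled depth edge snaps to the image edge instead of the blocky low-resolution depth boundary.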
Is JBU ideal enough? • Limitations of JBU: • It rests on a heuristic assumption about the relationship between depth and intensity data • Depth discontinuities sometimes have no corresponding edges in the 2-D image • Remaining problems: • Erroneous copying of 2-D texture into actually smooth geometries of the depth map (texture copying) • An unwanted artifact known as edge blurring (Figures: high-resolution guidance image, red = non-visible depth discontinuities; low-resolution depth map, red = zoomed area; JBU-enhanced depth map, zoomed)
Pixel Weighted Average Strategy (PWAS) • Pixel Weighted Average Strategy for Depth Sensor Data Fusion • F. Garcia, proposed at ICIP 2010 • [Garcia2010] solution: • Use of a *credibility map to cope with texture copying & edge blurring • The credibility map indicates unreliable regions in the depth map (low credibility where the depth gradient is large) • Representative formulation: J(p) = (1/k_p) Σ_{q∈N(p)} f_S(||p − q||) f_I(||I(p) − I(q)||) Q(q) D(q) • D: given depth map; Q: credibility map; I: guiding intensity image *credibility: trustworthiness, reliability Garcia et al., "Pixel Weighted Average Strategy for Depth Sensor Data Fusion", ICIP 2010
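At a single resolution, the PWAS idea can be sketched as JBU weights multiplied by a per-sample credibility. The gradient-based credibility form below is my assumption, consistent with the slide's statement that unreliability concentrates where the depth gradient is large; names and constants are illustrative:

```python
import numpy as np

def credibility_map(depth, sigma_q=1.0):
    """Credibility sketch: low credibility where the depth gradient is large,
    since depth samples near discontinuities are unreliable."""
    gy, gx = np.gradient(depth.astype(float))
    return np.exp(-(gy ** 2 + gx ** 2) / (2 * sigma_q ** 2))

def pwas(depth, image, sigma_s=2.0, sigma_r=0.1, sigma_q=0.5, radius=2):
    """PWAS sketch: a joint bilateral average over depth where each
    contributing sample q is additionally weighted by its credibility Q(q)."""
    Q = credibility_map(depth, sigma_q)
    h, w = depth.shape
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            yy, xx = np.mgrid[y0:y1, x0:x1]
            fs = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
            fi = np.exp(-((image[y0:y1, x0:x1] - image[y, x]) ** 2) / (2 * sigma_r ** 2))
            wgt = fs * fi * Q[y0:y1, x0:x1]
            out[y, x] = (wgt * depth[y0:y1, x0:x1]).sum() / wgt.sum()
    return out
```

Down-weighting low-credibility samples keeps unreliable depth values near discontinuities from contaminating their neighbors.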
(Figures: high-resolution image; low-resolution depth; JBU result; PWAS result)
Again, is PWAS ideal enough? • Limitation of PWAS: • The degree of smoothing depends on the gradients of the low-resolution depth map • Remaining problems: • Erroneous depth values around depth edges are not well compensated • This contradicts the spatial weighting term f_S(·) • The texture copying issue still remains in homogeneous regions of the depth map (Figures: high-resolution guidance image, red = non-visible depth discontinuities; JBU-enhanced depth map, zoomed; PWAS-enhanced depth map, zoomed)
Unified Multi-Lateral (UML) filter • To reduce the texture copying issue, the same author proposed a combined version of two PWAS filters • F. Garcia, proposed at AVSS 2011 • [Garcia2011] solution: • Combination of two PWAS filters, where the second filter (J_3) has both its spatial and range kernels acting on D • Use of the credibility map Q as the blending function, i.e., β = Q • Representative formulation: J_UML(p) = (1 − β(p)) J_PWAS(p) + β(p) J_3(p) • Depth pixels with high reliability are not influenced by the 2-D data, which avoids texture copying Garcia et al., "A New Multilateral Filter for Real-Time Depth Enhancement", AVSS 2011
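The UML blend can be sketched compactly with one weighted-average helper used twice: once with the range kernel on the guidance image (the PWAS term) and once with the range kernel on the depth map itself (J_3). The gradient-based credibility map is again my assumption; helper names are mine, not Garcia's:

```python
import numpy as np

def _wavg(depth, guide, Q, sigma_s, sigma_r, radius):
    """Weighted average of depth with a spatial kernel, a range kernel
    evaluated on `guide`, and per-sample credibility Q."""
    h, w = depth.shape
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            yy, xx = np.mgrid[y0:y1, x0:x1]
            fs = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
            fr = np.exp(-((guide[y0:y1, x0:x1] - guide[y, x]) ** 2) / (2 * sigma_r ** 2))
            wgt = fs * fr * Q[y0:y1, x0:x1]
            out[y, x] = (wgt * depth[y0:y1, x0:x1]).sum() / wgt.sum()
    return out

def uml(depth, image, sigma_s=2.0, sigma_r=0.1, sigma_q=0.5, radius=2):
    """UML sketch: blend a PWAS term (range kernel on the image) with a J_3
    term (range kernel on the depth itself), using beta = Q."""
    gy, gx = np.gradient(depth.astype(float))
    Q = np.exp(-(gy ** 2 + gx ** 2) / (2 * sigma_q ** 2))  # credibility map
    j_pwas = _wavg(depth, image, Q, sigma_s, sigma_r, radius)
    j3 = _wavg(depth, depth, Q, sigma_s, sigma_r, radius)
    # Credible depth pixels follow J_3, ignoring image texture.
    return (1 - Q) * j_pwas + Q * j3
```

Where Q is high (flat, reliable depth), the output ignores the 2-D data entirely, which is exactly what suppresses texture copying.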
Depth map enhancement examples JBU PWAS UML 2D guidance image JBU PWAS UML 2D guidance image
Again, is UML ideal enough? • Limitations of UML: • The behavior of the filter strongly depends on the credibility map • Where the credibility value is low, the filter works as the normal PWAS filter, reducing the edge blurring artifact by weakening the smoothing effect around depth edges • Where the credibility value is high, a relatively high weight is allocated to J_3, and the filter works toward reducing the texture copying artifact • Remaining problems: • Is the credibility map really credible? • It only considers the depth gradient, but occlusions, shadowing, and homogeneous regions are also unreliable in general depth data • The edge blurring artifact still exists when a depth edge has no corresponding image edge due to similar object colors
Depth map enhancement examples (Figures: intensity image; ground truth; depth map downsampled 9x; JBU, PWAS, and UML results)
Generalized depth enhancement filter by sensor fusion • Generalizes the previous UML filter, not only for active sensors (RGB-D) but also for a more traditional stereo camera • F. Garcia, proposed at ECCV 2012 • [Garcia2012] solution: • Passive sensing: extension of the credibility map for general depth data; object boundaries, occlusions, and homogeneous regions are considered • Active sensing: an adaptive blending function β(p) copes with the edge blurring issue, and the second term (J_3(p)) in UML is substituted by D(p) • Representative formulation: J_5(p) = (1 − β(p)) J_PWAS(p) + β(p) D(p) • The smoothing effect is reduced in credible depth regions • The same computational complexity as PWAS • The new β(p) prevents edge blurring when object colors are similar across a depth edge Garcia et al., "Generalized depth enhancement filter by sensor fusion", ECCV 2012
Generalized depth enhancement filter by sensor fusion • Formulation of a new credibility map Q(p): • Boundary map Q_b(p): plays the role of Q(p) in J_2 • Occlusion map Q_o(p): marks occluded regions, detected by a left/right consistency check • Homogeneity map Q_h(p): the shape of the correlation cost at each pixel is analyzed • A homogeneous region gives a flat correlation cost; a repetitive pattern gives multiple minima • The first minimum value C(p, d_1) at depth d_1 is compared with the second minimum C(p, d_2) at d_2 Garcia et al., "Generalized depth enhancement filter by sensor fusion", ECCV 2012
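The homogeneity term can be illustrated with a simple ratio test on the two smallest correlation costs. This is a hypothetical sketch: the slide names C(p, d_1) and C(p, d_2) but does not reproduce the exact formula, and for simplicity this version uses the second-smallest cost rather than the second local minimum:

```python
import numpy as np

def homogeneity_credibility(cost_volume):
    """Hypothetical homogeneity credibility Q_h: a flat cost curve
    (homogeneous region) or several near-equal minima (repetitive pattern)
    give C(p, d1) close to C(p, d2), hence low credibility; a single distinct
    minimum gives high credibility. cost_volume has shape (H, W, D): one
    correlation cost per pixel per depth hypothesis."""
    sorted_costs = np.sort(cost_volume, axis=2)
    c1 = sorted_costs[:, :, 0]  # smallest cost C(p, d1)
    c2 = sorted_costs[:, :, 1]  # second-smallest cost, standing in for C(p, d2)
    return 1.0 - c1 / np.maximum(c2, 1e-12)
```

This kind of ratio test is a standard confidence measure in stereo matching; whatever exact form is used, the intent on the slide is the same: distrust depth where the cost curve gives no clear winner.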
Generalized depth enhancement filter by sensor fusion • Formulation of the blending function β(p): • Q_I is defined analogously to Q_D, but considering ∇I • The function u(·) is a step function • If the edge blurring condition is satisfied, i.e., Q_D < τ_D (Q_D = Q as defined in PWAS) and Q_I > τ_I, then β(p) = 1 • The constants τ_I and τ_D are empirically chosen thresholds • Otherwise, β(p) = Q_D(p), and J_5(p) works similarly to the conventional UML filter Garcia et al., "Generalized depth enhancement filter by sensor fusion", ECCV 2012
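The branching rule described above is straightforward to write down. The threshold values in this sketch are placeholders, not the paper's:

```python
import numpy as np

def blending_beta(Q_D, Q_I, tau_D=0.3, tau_I=0.7):
    """Sketch of the adaptive blending function: where the edge-blurring
    condition holds (low depth credibility Q_D, i.e. a depth edge, but high
    image credibility Q_I, i.e. no corresponding image edge), set beta = 1 so
    J_5 keeps the raw depth D(p) untouched; elsewhere beta = Q_D(p),
    recovering UML-like behaviour. tau_D and tau_I are empirically chosen
    thresholds (the values here are illustrative)."""
    edge_blur = (Q_D < tau_D) & (Q_I > tau_I)
    return np.where(edge_blur, 1.0, Q_D)
```

Setting β = 1 at such pixels means the filter simply passes the sensor depth through, so no image-guided smoothing can blur a depth edge that the image cannot see.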
Experimental results – passive sensing Garcia et al., “Generalized depth enhancement filter by sensor fusion”, ECCV 2012
Experimental results – passive sensing • RMS: Root Mean Square • PBMP : Percentage of Bad Matching Pixels • SSIM : Structural SIMilarity Garcia et al., “Generalized depth enhancement filter by sensor fusion”, ECCV 2012
Experimental results – active sensing (Figures: image; U_I; Q_D; β) Garcia et al., "Generalized depth enhancement filter by sensor fusion", ECCV 2012
Experimental results – active sensing Garcia et al., “Generalized depth enhancement filter by sensor fusion”, ECCV 2012
Now then, do we have an optimal solution? • Limitations: • When the initial depth has low credibility and the image value at that position is also problematic, there is no way to improve the depth there • For example, texture copying can still occur in occluded and homogeneous regions; this conflicts with the UML filter concept • Under the edge blurring condition, distortions can spread near depth edges • Remaining problems: • Since the roles of Q_b, Q_o, and Q_h are not fully independent, there is a risk of over-weighting • For example, boundary and occlusion regions overlap • Likewise, a wrongly estimated depth in a homogeneous region may be judged as an occlusion by the left/right consistency check
Conclusion • The joint bilateral upsampling approach propagates properties from one modality to another • The credibility map decides system performance • Defining the blending function can be another critical factor • Many empirical parameters make the practical automated use of such fusion filters challenging • Another open question is a clear rule for when smoothing by filtering should be avoided and when a simple binary decision should be taken instead