1. Local Methods Michael Bleyer
LVA Stereo Vision
2. What happened last time? Fundamentals of stereo vision
Epipolar geometry and rectification
Conversion Disparity => Depth
Challenges in stereo matching
Ambiguity
Occlusion Problem
Assumptions
Smoothness Assumption
Uniqueness Assumption
Middlebury stereo benchmark
3. What is Going to Happen Today? Principle of Local Methods
Advantages/Problems
Adaptive Windows
Multiple Window Methods
Adaptive Support Weight Methods
Slanted Surfaces / Plane Sweeping
Occlusion Handling in Local Stereo
5. A Naive Stereo Algorithm For each pixel p of the left image:
Compare the color of p against the color of each pixel on the same horizontal scanline in the right image.
Select the pixel of most similar color as the matching point.
Result: [figure]
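The naive algorithm can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lecture's implementation: images are grayscale lists of rows, the pair is rectified so that a left pixel at column x matches a right pixel at column x - d, and the search is limited by a hypothetical parameter `d_max` that the slide does not mention.

```python
def naive_disparity(left, right, d_max):
    """Per-pixel matching: pick the most similar pixel on the same scanline."""
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best_cost, best_d = None, 0
            # Candidates lie on the same row, shifted left by d (assumption).
            for d in range(min(d_max, x) + 1):
                cost = abs(left[y][x] - right[y][x - d])
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[y][x] = best_d
    return disparity


left = [[10, 20, 30, 40, 50, 60]]
right = [[30, 40, 50, 60, 70, 80]]
print(naive_disparity(left, right, d_max=3))  # [[0, 1, 2, 2, 2, 2]]
```

On real images many candidates have nearly the same color as p, so a single pixel is rarely enough to pick the right match.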
6. Window-Based Matching (1) Instead of matching a single pixel, we center a small window on the pixel and match the whole window in the right image.
7. Window-Based Matching (2) In a formal way, the disparity $d_p$ of a pixel p in the left image is computed as
$$d_p = \operatorname*{argmin}_{0 \le d \le d_{max}} \sum_{q \in W_p} c(q, q - d)$$
where
argmin returns the value at which the function takes a minimum
$d_{max}$ is a parameter defining the maximum disparity (search range).
$W_p$ is the set of all pixels inside the window centered on p.
q - d denotes the pixel of the right image that lies d pixels to the left of q on the same scanline.
c(p,q) is a function that computes the color difference between a pixel p of the left and a pixel q of the right image (e.g. summed-up absolute differences in RGB values).
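A direct, unoptimized reading of the window-based matching rule might look as follows. It is a sketch under assumptions: grayscale images as lists of rows, absolute intensity difference as the pixel cost c, square windows given by a hypothetical `radius` parameter, and right-image coordinates clamped at the border (a border policy chosen here for brevity, not taken from the slides).

```python
def window_disparity(left, right, d_max, radius):
    """Window-based matching with summed absolute differences (SAD)."""
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best_cost, best_d = None, 0
            for d in range(d_max + 1):
                cost = 0
                # Sum the pixel costs c(q, q - d) over the window W_p.
                for qy in range(max(0, y - radius), min(height, y + radius + 1)):
                    for qx in range(max(0, x - radius), min(width, x + radius + 1)):
                        xr = max(0, qx - d)  # assumption: clamp at the border
                        cost += abs(left[qy][qx] - right[qy][xr])
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[y][x] = best_d
    return disparity


row = [5, 9, 1, 7, 3, 8, 2, 6, 4, 0]
left = [row[:] for _ in range(3)]
right = [row[2:] + [0, 0] for _ in range(3)]  # same content, shifted by 2 pixels
disparity = window_disparity(left, right, d_max=3, radius=1)
print(disparity[1][3:9])  # [2, 2, 2, 2, 2, 2]
```

The four nested loops make the cost of this version grow with the window area.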
10. Computational Aspects A direct implementation sums the costs of all window pixels for every pixel and every disparity, so run-time depends on the window size => computationally quite slow
There is a trick known as the sliding window technique to remove this dependency
Enables real-time implementations
Forms the core of all commercial stereo solutions (e.g. Point Grey, Small Vision System)
11. Sliding Window Technique For computing the aggregated matching cost $A_{x,y}$ for window $W_{x,y}$, we need to compute
$A_{x,y} = c_1 + c_2 + c_3 + c_4 + c_5$
with $c_1$ to $c_5$ denoting the matching costs of individual pixels.
The aggregated cost for the window one pixel to the right of $W_{x,y}$ is computed by
$A_{x+1,y} = c_2 + c_3 + c_4 + c_5 + c_6$.
13. Sliding Window Technique We can write $A_{x+1,y}$ as
$A_{x+1,y} = A_{x,y} - c_1 + c_6$.
No dependency on window size (due to incremental computation).
To aggregate square windows, first aggregate image rows, then columns (see [Muehlmann, IJCV02]).
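The incremental update can be illustrated with a small Python sketch. The function names are hypothetical, the input is a map of per-pixel matching costs for one fixed disparity, and only windows that lie fully inside the map are returned (an assumption made to keep the example short).

```python
def sliding_sum_1d(costs, size):
    """Sums of every full window of `size` consecutive costs, in O(n)."""
    window = sum(costs[:size])  # first window is summed directly
    sums = [window]
    for x in range(size, len(costs)):
        # Add the cost entering the window, subtract the one leaving it.
        window += costs[x] - costs[x - size]
        sums.append(window)
    return sums


def sliding_sum_2d(costs, size):
    """Square windows: aggregate along rows first, then along columns."""
    row_sums = [sliding_sum_1d(row, size) for row in costs]
    col_sums = [sliding_sum_1d(list(col), size) for col in zip(*row_sums)]
    return [list(row) for row in zip(*col_sums)]


print(sliding_sum_1d([1, 2, 3, 4, 5, 6], 5))                  # [15, 20]
print(sliding_sum_2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 2))  # [[12, 16], [24, 28]]
```

Each window sum costs one addition and one subtraction per pass, whatever the window size.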
14. Result of Window-Based Matching The window size is a crucial parameter.
16. Problem of Untextured Regions There has to be a certain amount of color variation inside the window.
17. Aperture Problem There needs to be a certain amount of texture with vertical orientation, because purely horizontal structures look the same at every shift along the scanline.
18. Problem of Repetitive Patterns There needs to be a certain amount of non-repetitive texture.
19. Effect of these Problems [figure]
20. Foreground Fattening Problem (1) By using a window as the matching primitive, we have applied an implicit smoothness assumption:
All pixels within the window are assumed to have the same disparity.
This leads to a systematic error in regions close to disparity discontinuities.
21. Foreground Fattening Problem (2) Background regions close to disparity discontinuities tend to be erroneously assigned the foreground disparity.
22. Foreground Fattening Problem (3) Foreground objects are clearly enlarged.
23. Large Versus Small Windows Large windows are better at handling:
Untextured Regions
Aperture Problem
Repetitive Patterns
Small windows are better at handling:
Foreground Fattening Effect
Problem:
There is no ‘optimal’ window size that can handle all these problems at once.
25. Adaptive Windows Combine advantages of small and large windows.
Estimate a ‘good’ window individually for each pixel.
Research on local stereo mostly concentrates on this topic.
26. Adaptive Windows of [Fusiello, CVPR97] Center 9 windows at each pixel
Assumption:
At least one window does not overlap a disparity discontinuity.
Take the window which has minimum aggregated costs among all 9 windows.
Computationally very efficient (if implemented smartly)
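One way to read the nine-window idea is that the pixel sits at the center, at the four corners, or at the four edge midpoints of the window, which is the same as looking at the centered windows of nine neighboring positions. The sketch below follows that reading. It is an illustration under assumptions, not the code of [Fusiello, CVPR97]: `costs` holds per-pixel matching costs for one fixed disparity, and windows that are not fully inside the image are skipped.

```python
def box_cost(costs, cx, cy, radius):
    """Sum of the pixel costs in the square window centered on (cx, cy)."""
    return sum(costs[y][x]
               for y in range(cy - radius, cy + radius + 1)
               for x in range(cx - radius, cx + radius + 1))


def nine_window_cost(costs, x, y, radius):
    """Minimum aggregated cost among the 9 shifted windows containing (x, y)."""
    height, width = len(costs), len(costs[0])
    best = None
    for dy in (-radius, 0, radius):
        for dx in (-radius, 0, radius):
            cx, cy = x + dx, y + dy
            inside = radius <= cx < width - radius and radius <= cy < height - radius
            if not inside:
                continue  # assumption: ignore windows leaving the image
            cost = box_cost(costs, cx, cy, radius)
            if best is None or cost < best:
                best = cost
    return best


# High costs on the right mimic a disparity discontinuity next to the pixel.
costs = [[0, 0, 0, 9, 9] for _ in range(5)]
print(box_cost(costs, 2, 2, 1))          # 27: the centered window straddles the edge
print(nine_window_cost(costs, 2, 2, 1))  # 0: a shifted window avoids it
```

If the centered window sums are computed once with the sliding window technique, the nine candidates are simple lookups.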
30. Adaptive Windows of [Hirschmueller, IJCV02] Divide window into 9 sub-windows.
Compute matching score by aggregating costs over the 5 best sub-windows (i.e., the ones of minimum color dissimilarity)
There exist many other variations of this idea
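The sub-window selection as described on this slide could be sketched as follows. The sketch follows the slide's wording (3 x 3 grid of sub-windows, keep the 5 cheapest) and is not claimed to match the exact scheme of [Hirschmueller, IJCV02]. The function name and the `keep` parameter are hypothetical, and `window_costs` is assumed to be a square patch of per-pixel costs whose side is a multiple of 3.

```python
def five_best_of_nine(window_costs, keep=5):
    """Aggregate only the `keep` cheapest of the 9 sub-windows of a window."""
    sub = len(window_costs) // 3  # side length of one sub-window
    block_sums = []
    for by in range(3):
        for bx in range(3):
            block_sums.append(sum(window_costs[y][x]
                                  for y in range(by * sub, (by + 1) * sub)
                                  for x in range(bx * sub, (bx + 1) * sub)))
    # Sub-windows with high dissimilarity (e.g. across a boundary) are dropped.
    return sum(sorted(block_sums)[:keep])


window = [[1, 2, 9],
          [3, 4, 9],
          [5, 6, 9]]
print(five_best_of_nine(window))  # 15
```

In the example the expensive right column is excluded from the score.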
32. Slight improvements in regions close to disparity boundaries can be observed.
Results still only of moderate quality.
33. Adaptive Support Weight Approaches (1) Formula for computing the aggregated matching costs of pixel p at disparity d:
$$A(p, d) = \sum_{q \in W_p} c(q, q - d)$$
where $W_p$ is the set of all pixels in the window centered on p.
Problem:
All pixels in $W_p$ have the same influence on the aggregated costs => foreground fattening
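34. Adaptive Support Weight Approaches (2) Let us modify this formula:
$$A(p, d) = \sum_{q \in W_p} w(p, q) \cdot c(q, q - d)$$
where w(p,q) is a weight function that determines the likelihood with which p and q lie on the same disparity.
Ideally:
$$w(p, q) = \begin{cases} 1 & \text{if p and q lie on the same disparity} \\ 0 & \text{otherwise} \end{cases}$$
Advantage:
Only pixels that lie on the same disparity contribute to the aggregated matching costs => no foreground fattening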
36. How to compute the weights? (1) We can make use of monocular cues, i.e. cues that are present when only one image is available.
Humans are very skilled in interpreting monocular cues:
You can tell the relative depth ordering from just this single image.
37. How to compute the weights? (2) Which pixel lies on the same disparity as the window center pixel: p or q?
39. How to compute the weights? (3) Which pixel lies on the same disparity as the window center pixel: p or q? (p and q have the same color)
41. Weight Computation of [Yoon,CVPR05]
$$w(p, q) = \exp\left(-\left(\frac{\Delta c_{pq}}{\gamma_c} + \frac{\Delta g_{pq}}{\gamma_g}\right)\right)$$
$\Delta c_{pq}$ computes the color dissimilarity between p and q (e.g., summed-up difference of RGB values)
$\Delta g_{pq}$ computes the difference in coordinates (Euclidean distance)
$\gamma_c$ and $\gamma_g$ are user parameters
The exponential function converts low color difference and low spatial distance into a high weight.
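A support weight of this form can be written down directly. The sketch below is only an illustration: it uses the summed absolute RGB difference mentioned on the slide as the color dissimilarity, and the parameter values in the example are arbitrary choices, not the settings of [Yoon,CVPR05].

```python
import math


def support_weight(color_p, color_q, pos_p, pos_q, gamma_c, gamma_g):
    """Weight that is high for similar color and small spatial distance."""
    # Color dissimilarity: summed absolute difference of the RGB channels.
    delta_c = sum(abs(a - b) for a, b in zip(color_p, color_q))
    # Spatial distance: Euclidean distance between the pixel coordinates.
    delta_g = math.dist(pos_p, pos_q)
    return math.exp(-(delta_c / gamma_c + delta_g / gamma_g))


center = ((120, 80, 40), (10, 10))
same = support_weight(center[0], (120, 80, 40), center[1], (10, 10), 10.0, 10.0)
similar = support_weight(center[0], (118, 82, 41), center[1], (12, 10), 10.0, 10.0)
different = support_weight(center[0], (20, 200, 90), center[1], (12, 10), 10.0, 10.0)
print(same, similar, different)
```

A pixel with identical color and position gets weight 1, and the weight decays as either difference grows.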
42. Example Weight Windows of [Yoon,CVPR05] The blue box marks the window center pixel p.
The intensity values represent weight values, i.e. white if w(p,q) = 1 and black if w(p,q) = 0.
43. Results of [Yoon,CVPR05] Adaptive support weight approaches deliver the best performance among local methods at the current state of the art.
44. Discussion of [Yoon,CVPR05] Pros:
Good-quality results in regions close to disparity boundaries and untextured regions.
Cons:
High computational costs:
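The sliding window trick does not work for adaptive support weights, because every window has its own set of weights and sums cannot be reused between neighboring windows.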
45. Weight Computation of [Hosni,ICIP09] (1) The window center pixel and p are close in color and spatial position. Why can we still see that they lie on different disparities?
50. Weight Computation of [Hosni,ICIP09] (2) We call p and the window center pixel c connected if there is a path leading from one pixel to the other along which the color remains approximately constant.
In our example, p and c are not connected.
Idea:
Weights should be proportional to the amount of connectivity
51. Weight Computation of [Hosni,ICIP09] (3)
$$w(p, q) = \exp\left(-\frac{D(p, q)}{\gamma}\right)$$
$\gamma$ is a user parameter
D(p,q) denotes the geodesic distance. The geodesic distance computes the costs of the shortest path that connects p with q in the color volume:
$$D(p, q) = \min_{P \in \mathcal{P}_{p,q}} d(P)$$
$\mathcal{P}_{p,q}$ is the set of all paths that lead from p to q. A path P is a sequence of spatially neighbouring points $\{p_1, p_2, \dots, p_n\}$. The costs d() of a path are defined as:
$$d(P) = \sum_{i=2}^{n} c(p_i, p_{i-1})$$
where c() is the color difference.
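The geodesic weights can be illustrated with a shortest-path search inside the window. The sketch below uses Dijkstra's algorithm on an 8-connected grayscale window, which is a choice made here for clarity and is not claimed to be how [Hosni,ICIP09] computes the distances. The function name and the value of `gamma` are hypothetical.

```python
import heapq
import math


def geodesic_weights(window, center, gamma):
    """Weights exp(-D / gamma), with D the cheapest color path to `center`."""
    height, width = len(window), len(window[0])
    dist = [[math.inf] * width for _ in range(height)]
    cy, cx = center
    dist[cy][cx] = 0.0
    heap = [(0.0, cy, cx)]
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y][x]:
            continue  # outdated queue entry
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (dy or dx) and 0 <= ny < height and 0 <= nx < width:
                    # Step cost: color difference of spatially neighbouring pixels.
                    nd = d + abs(window[y][x] - window[ny][nx])
                    if nd < dist[ny][nx]:
                        dist[ny][nx] = nd
                        heapq.heappush(heap, (nd, ny, nx))
    return [[math.exp(-value / gamma) for value in row] for row in dist]


# A dark bar separates two regions of identical color.
window = [[100, 0, 100],
          [100, 0, 100],
          [100, 0, 100]]
weights = geodesic_weights(window, center=(1, 0), gamma=10.0)
print(weights[0][0], weights[1][2])
```

The pixel on the far side of the bar has the same color as the center, yet every path to it must cross the bar, so its weight is close to zero.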
52. [Hosni,ICIP09] vs [Yoon,CVPR05] White arrows show situations where pixels that lie on a different disparity than the center pixel are erroneously given high weights.
The connectivity cue can overcome these situations.
53. Results of [Hosni,ICIP09] Currently, the best-performing local method in the Middlebury online database (http://vision.middlebury.edu/stereo/)
65. Results of the Plane Sweeping Algorithm of [Gallup,CVPR07]
68. Occlusion Problem There are pixels that are only visible in one of the two views (red pixels in the images).
For occluded pixels, there exists no correspondence => We cannot estimate disparity
73. Effect of Occlusions Occluded pixels are typically matched wrongly (if not explicitly accounted for)
74. Left-Right Consistency Check (1) We compute 2 disparity maps
The left image is chosen as reference frame
The right image is chosen as reference frame
Left-right consistency check:
For each pixel pl of the left view:
We look up pl's matching point mr in the right view using the left disparity map.
For the pixel mr, we look up its matching point ql in the left view using the right disparity map.
If pl = ql, the disparity is assumed to be correct. Otherwise, the disparity is invalidated.
Repeat this test by checking all pixels of the right view.
Check typically fails for
Occluded pixels
Mismatched pixels
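The check for the left view can be sketched as follows. The sketch assumes integer disparity maps stored as lists of rows, the convention that a left pixel at column x with disparity d matches the right pixel at column x - d (and a right pixel maps back by adding its disparity), and a hypothetical marker `INVALID` for rejected pixels. Checking the right view works the same way with the roles of the two maps swapped.

```python
INVALID = -1  # assumed marker for invalidated disparities


def left_right_check(disp_left, disp_right):
    """Invalidate left disparities that do not map back to their own pixel."""
    height, width = len(disp_left), len(disp_left[0])
    checked = [row[:] for row in disp_left]
    for y in range(height):
        for x in range(width):
            xr = x - disp_left[y][x]  # matching point m_r in the right view
            # m_r must exist and its own match q_l must be the pixel we started from.
            if xr < 0 or xr + disp_right[y][xr] != x:
                checked[y][x] = INVALID
    return checked


disp_left = [[1, 1, 1, 3, 1]]   # the value 3 is a mismatch
disp_right = [[1, 1, 1, 1, 1]]
print(left_right_check(disp_left, disp_right))  # [[-1, 1, 1, -1, 1]]
```

The first pixel fails because its match would lie outside the right image, the fourth because the right view maps back to a different left pixel.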
75. Left-Right Consistency Check (2) [figures]
82. Check Implements the Uniqueness Assumption Recall the uniqueness assumption:
A pixel has at most 1 correspondence in the other image.
After computing the disparity map of the left image we have situations where 2 or more pixels of the left image have the same matching point in the right image (violation of uniqueness constraint):
mr can either project back to pl or to ql. At least one pixel (pl or ql) is therefore invalidated.
Invalidated pixels have 0 matching points. All valid pixels have exactly 1 matching point (= Uniqueness assumption fulfilled).
83. Occlusion Filling We have detected occlusions by left-right consistency checking.
We need to assign a “disparity” to occluded pixels to derive a dense disparity map.
86. Occlusion Filling – A Simple Algorithm For each invalid pixel p
Find first valid pixel pl left of p
Find first valid pixel pr right of p
Set disparity dp = min (dpl, dpr) (Occluded pixels have the depth of the background)
Problem:
Generates horizontal streaks
Apply smoothing (Median Filter)
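The filling rule and the subsequent smoothing can be sketched as follows. This is an illustration under assumptions: invalid pixels carry a hypothetical marker `INVALID`, a pixel with a valid neighbor on only one side takes that neighbor's value, and the median filter uses a 3 x 3 window clipped at the image border (the slides do not give a filter size).

```python
INVALID = -1  # assumed marker for pixels rejected by the consistency check


def fill_occlusions(disparity):
    """Give each invalid pixel the smaller of its nearest valid row neighbors."""
    filled = [row[:] for row in disparity]
    for y, row in enumerate(disparity):
        for x, d in enumerate(row):
            if d != INVALID:
                continue
            left = next((v for v in reversed(row[:x]) if v != INVALID), None)
            right = next((v for v in row[x + 1:] if v != INVALID), None)
            candidates = [v for v in (left, right) if v is not None]
            if candidates:
                # The smaller disparity belongs to the background.
                filled[y][x] = min(candidates)
    return filled


def median_filter_3x3(disparity):
    """Median over the 3 x 3 neighborhood to reduce horizontal streaks."""
    height, width = len(disparity), len(disparity[0])
    smoothed = [row[:] for row in disparity]
    for y in range(height):
        for x in range(width):
            values = sorted(disparity[qy][qx]
                            for qy in range(max(0, y - 1), min(height, y + 2))
                            for qx in range(max(0, x - 1), min(width, x + 2)))
            smoothed[y][x] = values[len(values) // 2]
    return smoothed


print(fill_occlusions([[5, INVALID, INVALID, 2, 2]]))  # [[5, 2, 2, 2, 2]]
```

Because each row is filled independently, neighboring rows can receive different values, which is where the horizontal streaks come from.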
87. Summary Principle of local methods
Advantages:
Fast
Problems:
Moderate quality
Adaptive windows
Multiple window methods
Adaptive support weight methods
Plane sweeping
Occlusion handling
88. References A. Fusiello, V. Roberto, and E. Trucco. Efficient stereo with multiple windowing. In Conference on Computer Vision and Pattern Recognition, pages 858-863, 1997.
D. Gallup, J. Frahm, P. Mordohai, Q. Yang, and M. Pollefeys. Real-time plane-sweeping stereo with multiple sweeping directions. In Conference on Computer Vision and Pattern Recognition, 2007.
H. Hirschmueller, P. Innocent, and J. Garibaldi. Real-time correlation-based stereo vision with reduced border errors. International Journal of Computer Vision, 47:229-246, 2002.
A. Hosni, M. Bleyer, M. Gelautz, and C. Rhemann. Local stereo matching using geodesic support weights. In International Conference on Image Processing, 2009.
K. Muehlmann, D. Maier, J. Hesser, and R. Maenner. Calculating dense disparity maps from color stereo images, an efficient implementation. International Journal of Computer Vision, 47(1):79-88, 2002.
K.J. Yoon and I.S. Kweon. Locally adaptive support-weight approach for visual correspondence search. In Conference on Computer Vision and Pattern Recognition, pages 924-931, 2005.