Vision: Inferring Information from Clues

Outline:
Stereo vision as an AI problem
Stereograms
Geometry of stereograms
Computing correspondences
Letting cues vote for hypotheses
Polar representation of a line
Hough transform
Gestalt grouping
CSE 415 -- (c) S. Tanimoto, 2004 Inference in Vision

Stereo Vision as an AI Problem
Projection from three dimensions to two loses information.
With two projections, we can recover some of that information.
Recovering the missing information is an inference problem.
The missing information is constrained by knowledge about the real world and assumptions about the scene.
Using knowledge and assumptions to make inferences is a standard approach in artificial intelligence.

Stereograms
Two-view stereograms:
1. Spatially separated left-eye/right-eye pair (including virtual-reality goggles).
2. Superimposed, with separation using color filters.
3. Superimposed, with temporal shuttering.
4. Superimposed, with separation using polarizing filters.
Single-view stereograms:
1. Magic-eye pictures with a depth-modulated carrier.
2. Wallpaper offering depth effects due to its periodicity.

Geometry of Stereograms

Computing Correspondence
Approach 1: Extract features and find a consistent matching of features across the two views.
Approach 2: Directly compute a disparity map by performing local correlations between the views.
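
Approach 2 can be illustrated with a minimal block-matching sketch. It rests on assumptions that are not in the slides: the two views are rectified, so matches lie on the same scanline; a point at column x in the left view appears at column x - d in the right view; and the sum of absolute differences (SAD) stands in for a correlation score. The names scanline_disparity, disparity_map, max_disparity, and half_window are hypothetical.

```python
def scanline_disparity(left, right, max_disparity, half_window=1):
    """Estimate per-pixel disparity along one pair of rectified scanlines.

    For each column x in `left`, compare a small window around x with
    windows in `right` shifted left by d = 0..max_disparity, and keep the
    shift with the lowest sum of absolute differences (SAD).
    """
    n = len(left)
    disparities = [0] * n  # border columns are left at 0
    for x in range(half_window, n - half_window):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disparity + 1):
            if x - d - half_window < 0:
                break  # the shifted window would fall off the image
            cost = sum(
                abs(left[x + k] - right[x - d + k])
                for k in range(-half_window, half_window + 1)
            )
            if cost < best_cost:  # strict test keeps the smallest d on ties
                best_d, best_cost = d, cost
        disparities[x] = best_d
    return disparities


def disparity_map(left_image, right_image, max_disparity, half_window=1):
    """Apply scanline_disparity to every row of two equally sized images."""
    return [
        scanline_disparity(l_row, r_row, max_disparity, half_window)
        for l_row, r_row in zip(left_image, right_image)
    ]
```

In textureless regions every shift matches equally well, so the estimate there is arbitrary; this is one place where the knowledge and assumptions about the scene mentioned earlier have to resolve the ambiguity.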

Inferring Trends via Voting Methods
The classical Hough transform identifies prominent lines in a scene by letting each edge point vote for the line(s) it is on.
Voting methods can do well under noisy conditions.
Votes are tallied in an array of accumulators, indexed by θ and ρ (the polar parameters of a line):
ρ = x cos θ + y sin θ.

Letting a Point Vote for all the Lines that Pass Through It
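
The set of lines through a single point can be enumerated with a short sketch. It assumes θ is sampled at n_theta evenly spaced values over [0, π), which is a common choice but not one stated in the slides; votes_for_point and n_theta are hypothetical names.

```python
import math


def votes_for_point(x, y, n_theta=180):
    """Return one (theta, rho) pair per theta sample.

    Each pair describes a line through (x, y), using
    rho = x cos(theta) + y sin(theta).
    """
    votes = []
    for i in range(n_theta):
        theta = math.pi * i / n_theta  # theta sampled over [0, pi)
        rho = x * math.cos(theta) + y * math.sin(theta)
        votes.append((theta, rho))
    return votes
```

Plotted in the (θ, ρ) plane, these votes trace a sinusoid whose amplitude is the distance of (x, y) from the origin. Sinusoids from collinear points cross at the parameters of their common line.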

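Hough Transform: Polar representation
ρ = x cos θ + y sin θ.
Here ρ is the perpendicular distance from the origin (0, 0) to the line through (x, y), and θ is the angle that perpendicular makes with the x-axis.
[Figure: a line through the point (x, y), with ρ, the origin (0, 0), and θ labeled.]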
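
As a quick check of the polar form, the sketch below tests whether points satisfy ρ = x cos θ + y sin θ for a given line. The example line x + y = 2 (θ = π/4, ρ = √2) and the names rho_of, on_line, and tol are illustrative assumptions.

```python
import math


def rho_of(x, y, theta):
    """Signed distance from the origin of the line through (x, y) with normal angle theta."""
    return x * math.cos(theta) + y * math.sin(theta)


def on_line(x, y, theta, rho, tol=1e-9):
    """True if (x, y) lies on the line with polar parameters (theta, rho)."""
    return abs(rho_of(x, y, theta) - rho) < tol


# The line x + y = 2 has normal angle pi/4 and distance sqrt(2) from the origin.
theta, rho = math.pi / 4, math.sqrt(2)
print([on_line(x, y, theta, rho) for x, y in [(0, 2), (1, 1), (2, 0)]])  # [True, True, True]
print(on_line(3, 3, theta, rho))  # False
```

Unlike the slope-intercept form, this parameterization stays bounded for vertical lines, where θ = 0 and ρ equals the line's x-coordinate.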

Hough Transform (Cont.)
Nondirectional, unweighted Hough transform:
H(θ, ρ) = Σ_x Σ_y f(x, y) δ(x cos θ + y sin θ - ρ),
where δ(u) = 1 if |u| < 1, and 0 otherwise.
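
The double sum can be sketched directly in code. The discretization is an assumption rather than part of the formula: θ takes n_theta evenly spaced values over [0, π), ρ takes integer values, and the accumulator is a dictionary instead of a fixed array. The names hough_transform, strongest_line, and n_theta are hypothetical.

```python
import math
from collections import defaultdict


def hough_transform(image, n_theta=180):
    """Tally H(theta, rho) for a 2D list `image`, where image[y][x] is f(x, y).

    Returns a dict mapping (theta_index, rho_bin) to the accumulated votes.
    """
    accumulator = defaultdict(int)
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value == 0:
                continue  # zero pixels contribute nothing to the sum
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                r = x * math.cos(theta) + y * math.sin(theta)
                # delta(u) = 1 when |u| < 1, so only integer rho bins within
                # one unit of r receive this pixel's vote (at most two bins).
                for rho_bin in (math.floor(r), math.floor(r) + 1):
                    if abs(r - rho_bin) < 1:
                        accumulator[(t, rho_bin)] += value
    return accumulator


def strongest_line(accumulator, n_theta=180):
    """Return (theta, rho_bin, votes) for one cell holding the most votes."""
    (t, rho_bin), votes = max(accumulator.items(), key=lambda item: item[1])
    return math.pi * t / n_theta, rho_bin, votes
```

For a binary edge image containing a vertical segment at x = 3, every pixel on the segment votes for the cell at θ = 0, ρ = 3. Neighbouring cells can tie with it because δ spreads each vote across nearby ρ bins, so a practical peak detector usually needs some form of local-maximum selection.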

Gestalt Grouping

Gestalt Grouping
Texture element = “texel”
Texel directionality
Texel granularity
Alignments of endpoints
Spacing of texels
Groups serve as cues for surfaces and objects.
