Urban Scene Analysis James Elder & Patrick Denis York University
Phase IV Objectives • Single-View 3D Reconstruction • Scene Dynamics • Scene Segmentation and Labelling
Ultimate Goal • Our ultimate goal is to automate this process!
Immediate Goal • Automatic estimation of the three vanishing points corresponding to the “Manhattan directions”.
Manhattan Frame Geometry • An edge is aligned with a vanishing point if the normal of its interpretation plane (the plane through the camera centre and the edge) is orthogonal to the vanishing point vector on the Gauss sphere
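The orthogonality test above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a pinhole camera with a known focal length in pixels and edge endpoints measured relative to the principal point, and the function names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit length (a point on the Gauss sphere)."""
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def interpretation_plane_normal(p1, p2, focal_length):
    """Unit normal of the plane through the camera centre and an image edge.

    p1, p2: edge endpoints (x, y) in pixels, relative to the principal point.
    focal_length: assumed known, in pixels.
    """
    ray1 = (p1[0], p1[1], focal_length)
    ray2 = (p2[0], p2[1], focal_length)
    return normalize(cross(ray1, ray2))

def alignment_error_deg(normal, vp_direction):
    """Deviation from orthogonality in degrees; 0 means perfectly aligned."""
    c = abs(dot(normal, normalize(vp_direction)))
    return math.degrees(math.asin(min(1.0, c)))

# A vertical image edge is consistent with the vertical direction (0, 1, 0)
# and inconsistent with the horizontal direction (1, 0, 0).
n = interpretation_plane_normal((100.0, -50.0), (100.0, 50.0), 500.0)
vertical_error = alignment_error_deg(n, (0.0, 1.0, 0.0))
horizontal_error = alignment_error_deg(n, (1.0, 0.0, 0.0))
```

Working with unit vectors keeps the measurement on the Gauss sphere, so the error is an angle that does not depend on where the edge falls in the image.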
Mixture Model • Each edge Eij in the image is generated by one of four possible kinds of scene structure: • m1-3: a line in one of the three Manhattan directions • m4: non-Manhattan structure • The observable properties of each edge Eij are: • position • angle • The likelihoods of these observations are co-determined by: • The causal process (m1-4) • The rotation Ψ of the Manhattan frame relative to the camera • [Figure: image containing edges E11, E12, E21, E22, each generated by a cause mi, with the Manhattan frame rotation Ψ]
Mixture Model • Our goal is to estimate the Manhattan frame Ψ from the observable data Eij. • [Figure: image containing edges E11, E12, E21, E22, each generated by a cause mi, with the Manhattan frame rotation Ψ]
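To make the mixture model concrete, the sketch below computes, for one edge, the posterior probability of each cause m1-4 given a candidate frame Ψ. The slides do not specify the likelihood functions, so everything numeric here is an assumption for illustration: a Gaussian error model on the angular deviation for the Manhattan causes, a uniform density over orientation for the non-Manhattan cause, equal priors, and the function names themselves.

```python
import math

def gaussian(x, sigma):
    """Zero-mean Gaussian density (assumed error model, not from the slides)."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def responsibilities(deviations_deg, priors, sigma_deg=2.0):
    """Posterior probability of each cause m1..m4 for a single edge.

    deviations_deg: angular error of the edge with respect to each of the
        three Manhattan vanishing points under the current estimate of Ψ.
    priors: four mixing weights for m1..m4 (hypothetical values).
    sigma_deg: assumed standard deviation of the angular error.
    """
    likelihoods = [gaussian(d, sigma_deg) for d in deviations_deg]
    # Non-Manhattan structure: uniform density over 180 degrees of orientation.
    likelihoods.append(1.0 / 180.0)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

# An edge almost aligned with the first Manhattan direction.
aligned = responsibilities([0.5, 40.0, 85.0], [0.25, 0.25, 0.25, 0.25])
# An edge far from all three directions is attributed to m4.
outlier = responsibilities([30.0, 40.0, 50.0], [0.25, 0.25, 0.25, 0.25])
```

In an EM-style search these posteriors would weight each edge's contribution when Ψ is re-estimated; that update step is not shown here.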
Design Criteria • Accuracy • Speed
Design Decisions • Features • Dense gradient map • Sparse sub-pixel localized edges • Measurement Space • Image • Gauss Sphere • Search Method • Coarse-to-Fine (Coughlan & Yuille 2001) • Quasi-Newton • EM • Quasi-EM
Accuracy • [Bar chart: angular error (deg) by error type (Horizontal VPs, Vertical VP). Methods compared: MW; Edge-Based Coarse-to-Fine MW Params; Edge-Based Newton MW Params; Edge-Based Newton; Edge-Based EM; Edge-Based Quasi-EM; Edge-Based Quasi-EM GS]
Speed • [Bar chart: time (sec) by method. Methods compared: MW; Edge-Based Coarse-to-Fine MW Params; Edge-Based Newton MW Params; Edge-Based Newton; Edge-Based EM; Edge-Based Quasi-EM; Edge-Based Quasi-EM GS]
Conclusions • We have developed an algorithm for automatically estimating the Manhattan frame from a single camera. • This algorithm is 40% more accurate and roughly 3 times faster than the leading prior method. • This algorithm will be used as a basis for single-view reconstruction of urban scenes.
Single-View Reconstruction • Potential Research Objectives for Phase IV • Recover connected Manhattan cuboids • Connected, labelled line segments • Connected, labelled rectangular facets • Estimate scale factor • From pedestrian, vehicle traffic • From building features whose size is approximately known (e.g., doors) • Integrate with other data sources • Existing 3D models on coarser scale • 3D models from cameras with overlapping fields of view
Projects: Pre-Attentive and Attentive Sensing • [Figure: pan/tilt sensor producing a wide-field image and a foveal image]
Statistical Integration of Weak Cues • [Figure: region log likelihood ratio maps for the motion, foreground and skin cues, and the joint region log likelihood ratio]
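One common way to integrate weak cues of this kind is to add their log likelihood ratios, which is exact only if the cues are conditionally independent given the hypothesis. The slide does not state how the joint ratio was computed, so the sketch below is an assumption for illustration, with hypothetical names and made-up values.

```python
def joint_log_likelihood_ratio(cue_maps):
    """Sum per-cue log likelihood ratio maps element by element.

    cue_maps: list of equally sized 2D lists, one per cue.
    Assumes the cues are conditionally independent (not stated in the slides).
    """
    rows = len(cue_maps[0])
    cols = len(cue_maps[0][0])
    return [[sum(m[r][c] for m in cue_maps) for c in range(cols)]
            for r in range(rows)]

# Made-up 1x2 maps: weak evidence per cue combines into stronger joint evidence.
motion = [[1.0, -2.0]]
foreground = [[0.5, -1.0]]
skin = [[2.0, 0.5]]
joint = joint_log_likelihood_ratio([motion, foreground, skin])
```

A positive joint value favours the target hypothesis at that location and a negative value favours the background.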
Attentive Feedback Loop • [Diagram: components labelled spatial prior, motion kernel, mean body indicator, likelihood, prior, posterior, random sampler, gaze command, gaze control, attentive sensor, high-resolution face detection, non-max suppression, confirmed face location]
Scene Dynamics • Potential Research Objectives for Phase IV • Person re-identification • Individuation (counting) in crowds
Scene Segmentation • Potential Research Objectives for Phase IV • Application to urban scenes • Scene layout • Ground plane • Buildings • Vegetation • Sky • Material recognition • Integrated text recognition