This presentation covers research objectives and design criteria for single-view 3D reconstruction of urban scenes, focusing on Manhattan frame estimation and an algorithm that improves on the leading prior method in both accuracy and speed. Future research aims include recovery of connected Manhattan cuboids, scale factor estimation, and integration with other data sources.
Urban Scene Analysis James Elder & Patrick Denis York University
Phase IV Objectives • Single-View 3D Reconstruction • Scene Dynamics • Scene Segmentation and Labelling
Ultimate Goal • Our ultimate goal is to automate this process!
Immediate Goal • Automatic estimation of the three vanishing points corresponding to the “Manhattan directions”.
Manhattan Frame Geometry • An edge is aligned to a vanishing point if the interpretation plane normal is orthogonal to the vanishing point vector in the Gauss Sphere
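The orthogonality condition above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a calibrated pinhole camera with a known focal length in pixels and image coordinates measured from the principal point, and the function names `interpretation_plane_normal` and `alignment_error_deg` are hypothetical.

```python
import numpy as np

def interpretation_plane_normal(p1, p2, focal_length):
    """Unit normal of the plane through the camera centre and an image segment.

    p1, p2: segment endpoints in pixels, relative to the principal point.
    focal_length: focal length in pixels (assumed known).
    """
    r1 = np.array([p1[0], p1[1], focal_length], dtype=float)
    r2 = np.array([p2[0], p2[1], focal_length], dtype=float)
    n = np.cross(r1, r2)  # normal to both viewing rays
    return n / np.linalg.norm(n)

def alignment_error_deg(normal, vp_direction):
    """Deviation from orthogonality in degrees (0 means perfectly aligned)."""
    v = np.asarray(vp_direction, dtype=float)
    v = v / np.linalg.norm(v)
    return abs(np.degrees(np.arcsin(np.clip(normal @ v, -1.0, 1.0))))

# A horizontal image segment is consistent with the camera x-axis direction.
n = interpretation_plane_normal((-10, 5), (10, 5), focal_length=100.0)
err_x = alignment_error_deg(n, [1, 0, 0])
err_y = alignment_error_deg(n, [0, 1, 0])
```

Here `err_x` is zero because the segment's interpretation plane contains the x direction, while `err_y` is large, so the edge would not be assigned to a vertical vanishing point.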
Mixture Model • Each edge Eij in the image is generated by one of four possible kinds of scene structure: • m1-3: a line in one of the three Manhattan directions • m4: non-Manhattan structure • The observable properties of each edge Eij are: • position • angle • The likelihoods of these observations are co-determined by: • The causal process (m1-4) • The rotation Ψ of the Manhattan frame relative to the camera • [Figure: image edges E11, E12, E21, E22, each linked to a cause mi and to the frame rotation Ψ]
Mixture Model • Our goal is to estimate the Manhattan frame Ψ from the observable data Eij. • [Figure: image edges E11, E12, E21, E22, each linked to a cause mi and to the frame rotation Ψ]
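To make the mixture model concrete, the sketch below computes, for a candidate frame rotation, the posterior probability that each edge came from m1-3 or m4. The slides do not give the likelihood forms, so everything here is an assumption: edges are represented by their interpretation plane normals on the Gauss sphere, Manhattan causes use a half-normal density on the angular deviation with a hypothetical spread `sigma_deg`, the non-Manhattan cause is uniform over 0 to 90 degrees, and the prior weights are placeholders.

```python
import numpy as np

def responsibilities(normals, R, sigma_deg=2.0, priors=(0.3, 0.3, 0.3, 0.1)):
    """Posterior over causes (m1, m2, m3, m4) for each edge.

    normals: (N, 3) unit interpretation plane normals.
    R: (3, 3) rotation whose columns are the Manhattan directions
       expressed in the camera frame (the candidate frame).
    """
    normals = np.asarray(normals, dtype=float)
    priors = np.asarray(priors, dtype=float)
    # Deviation from orthogonality to each Manhattan direction, in degrees.
    dev = np.degrees(np.arcsin(np.clip(np.abs(normals @ R), 0.0, 1.0)))
    # Assumed half-normal likelihood for the three Manhattan causes.
    manhattan = 2.0 * np.exp(-0.5 * (dev / sigma_deg) ** 2) / (sigma_deg * np.sqrt(2.0 * np.pi))
    # Assumed uniform likelihood over [0, 90] degrees for non-Manhattan edges.
    outlier = np.full((len(normals), 1), 1.0 / 90.0)
    weighted = np.hstack([manhattan, outlier]) * priors
    return weighted / weighted.sum(axis=1, keepdims=True)

aligned = np.array([0.0, 2000.0, -100.0])
aligned /= np.linalg.norm(aligned)           # orthogonal to the x direction
oblique = np.ones(3) / np.sqrt(3.0)          # far from all three directions
post = responsibilities([aligned, oblique], np.eye(3))
```

With the identity rotation as the candidate frame, the first edge is attributed mainly to m1 and the second to m4. An estimator would search over rotations to maximize the total data likelihood; these per-edge posteriors correspond to the E-step of an EM-style search under the assumptions stated above.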
Design Criteria • Accuracy • Speed
Design Decisions • Features: dense gradient map; sparse sub-pixel localized edges • Measurement Space: image; Gauss Sphere • Search Method: Coarse-to-Fine (Coughlan & Yuille 2001); Quasi-Newton; EM; Quasi-EM
Accuracy • [Figure: bar chart of angular error (deg) by error type (horizontal VPs, vertical VP) for each method: MW; Edge-Based Coarse-to-Fine MW Params; Edge-Based Newton MW Params; Edge-Based Newton; Edge-Based EM; Edge-Based Quasi-EM; Edge-Based Quasi-EM GS]
Speed • [Figure: bar chart of time (sec) by method: MW; Edge-Based Coarse-to-Fine MW Params; Edge-Based Newton MW Params; Edge-Based Newton; Edge-Based EM; Edge-Based Quasi-EM; Edge-Based Quasi-EM GS]
Conclusions • We have developed an algorithm for automatically estimating the Manhattan frame from a single camera. • This algorithm is 40% more accurate and roughly 3 times faster than the leading prior method. • This algorithm will be used as a basis for single-view reconstruction of urban scenes.
Single-View Reconstruction • Potential Research Objectives for Phase IV • Recover connected Manhattan cuboids: connected, labelled line segments; connected, labelled rectangular facets • Estimate scale factor: from pedestrian and vehicle traffic; from building features whose size is approximately known (e.g., doors) • Integrate with other data sources: existing 3D models on a coarser scale; 3D models from cameras with overlapping fields of view
Projects: Pre-Attentive and Attentive Sensing • [Figure labels: wide-field image, foveal image, pan, tilt]
Statistical Integration of Weak Cues • [Figure: log likelihood ratio maps for the motion, foreground, skin, and joint regions]
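The slide does not state how the joint map is formed from the individual cues. One common approach, assumed here purely for illustration, treats the cues as conditionally independent, in which case the joint log likelihood ratio is the sum of the per-cue ratios. The function names and the toy values below are hypothetical.

```python
import numpy as np

def joint_log_likelihood_ratio(*cue_llrs):
    """Sum per-pixel log likelihood ratio maps.

    Valid only under the assumed conditional independence of the cues.
    """
    return np.sum(np.stack([np.asarray(c, dtype=float) for c in cue_llrs]), axis=0)

def posterior_from_llr(llr, prior=0.5):
    """Convert a log likelihood ratio map to a posterior probability map."""
    log_prior_odds = np.log(prior / (1.0 - prior))
    return 1.0 / (1.0 + np.exp(-(llr + log_prior_odds)))

# Toy 1x2 maps: each cue is weak alone, but they reinforce each other.
motion = np.array([[1.0, -2.0]])
foreground = np.array([[0.5, -1.0]])
skin = np.array([[2.0, 1.0]])
joint = joint_log_likelihood_ratio(motion, foreground, skin)
posterior = posterior_from_llr(joint)
```

In the first pixel all three cues agree, so the joint ratio is larger than any single cue; in the second, the skin cue is outvoted by motion and foreground.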
Attentive Feedback Loop • [Diagram labels: attentive sensor, high-resolution face detection, confirmed face location, mean body indicator, motion kernel, spatial prior, prior, likelihood, posterior, non-max suppression, random sampler, gaze control, gaze command]
Scene Dynamics • Potential Research Objectives for Phase IV • Person re-identification • Individuation (counting) in crowds
Scene Segmentation • Potential Research Objectives for Phase IV • Application to urban scenes • Scene layout • Ground plane • Buildings • Vegetation • Sky • Material recognition • Integrated text recognition