Urban Scene Analysis

James Elder & Patrick Denis, York University

Presentation Transcript


  1. Urban Scene Analysis James Elder & Patrick Denis York University

  2. Phase IV Objectives • Single-View 3D Reconstruction • Scene Dynamics • Scene Segmentation and Labelling

  3. Single-View Reconstruction

  4. Streetscape

  5. Ultimate Goal • Our ultimate goal is to automate this process!

  6. Immediate Goal • Automatic estimation of the three vanishing points corresponding to the “Manhattan directions”.

  7. Ground Truth Database

  8. Manhattan Frame Geometry • An edge is aligned to a vanishing point if the interpretation plane normal is orthogonal to the vanishing point vector in the Gauss Sphere
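The alignment test on slide 8 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation. It assumes a pinhole camera with a known focal length and principal point, both in pixels, and all function names are hypothetical. An edge's two endpoints are back-projected to rays, their cross product gives the interpretation plane normal, and the edge is aligned to a vanishing point when that normal is (nearly) orthogonal to the vanishing direction on the Gauss sphere.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def normalize(a):
    """Scale a 3-vector to unit length (a point on the Gauss sphere)."""
    norm = math.sqrt(dot(a, a))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return (a[0] / norm, a[1] / norm, a[2] / norm)


def interpretation_plane_normal(p1, p2, focal, principal_point=(0.0, 0.0)):
    """Unit normal of the plane through the camera centre and an image edge.

    p1, p2 are the edge endpoints in pixels; focal and principal_point are
    assumed pinhole-camera parameters (not given in the slides).
    """
    cx, cy = principal_point
    ray1 = (p1[0] - cx, p1[1] - cy, focal)
    ray2 = (p2[0] - cx, p2[1] - cy, focal)
    return normalize(cross(ray1, ray2))


def alignment_error_deg(normal, vp_direction):
    """Deviation from orthogonality, in degrees (0 means perfectly aligned)."""
    c = abs(dot(normal, normalize(vp_direction)))
    return math.degrees(math.asin(min(1.0, c)))


# A horizontal image edge is consistent with the camera x-axis direction
# and inconsistent with the camera y-axis direction.
n = interpretation_plane_normal((-10.0, 5.0), (10.0, 5.0), focal=100.0)
err_x = alignment_error_deg(n, (1.0, 0.0, 0.0))
err_y = alignment_error_deg(n, (0.0, 1.0, 0.0))
```

In practice the error would be compared against a tolerance, or fed into a likelihood model, rather than tested for exact zero.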

  9. Mixture Model
     • Each edge Eij in the image is generated by one of four possible kinds of scene structure:
       • m1-3: a line in one of the three Manhattan directions
       • m4: non-Manhattan structure
     • The observable properties of each edge Eij are:
       • position
       • angle
     • The likelihoods of these observations are co-determined by:
       • The causal process (m1-4)
       • The rotation Ψ of the Manhattan frame relative to the camera
     [Figure: image edges E11, E12, E21 and E22, each generated by a cause mi under the Manhattan frame rotation Ψ]

  10. Mixture Model • Our goal is to estimate the Manhattan frame Ψ from the observable data Eij. [Figure: image edges E11, E12, E21 and E22, each generated by a cause mi under the Manhattan frame rotation Ψ]
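The mixture model on slides 9 and 10 can be illustrated with a small Python sketch. The slides do not give the likelihood functions, so everything numeric here is an assumption made for illustration: each edge is summarised by its angular deviation (in degrees) from each of the three Manhattan directions under a candidate rotation Ψ, the Manhattan causes m1-3 use a half-normal density on that deviation, the non-Manhattan cause m4 uses a uniform density over 0 to 90 degrees, and the priors and `sigma_deg` are hypothetical values.

```python
import math

# Hypothetical priors for the causes m1, m2, m3 (Manhattan) and m4 (other).
DEFAULT_PRIORS = (0.2, 0.2, 0.2, 0.4)


def cause_likelihoods(deviations_deg, sigma_deg=2.0):
    """Likelihood of one edge under each of the four causes.

    deviations_deg holds the edge's angular deviation from each of the three
    Manhattan directions, computed for a candidate frame rotation.
    """
    scale = 2.0 / (sigma_deg * math.sqrt(2.0 * math.pi))  # half-normal density
    manhattan = [scale * math.exp(-0.5 * (d / sigma_deg) ** 2)
                 for d in deviations_deg]
    outlier = 1.0 / 90.0  # uniform over [0, 90] degrees (assumed)
    return manhattan + [outlier]


def responsibilities(deviations_deg, priors=DEFAULT_PRIORS, sigma_deg=2.0):
    """Posterior probability of each cause m1..m4 for one edge."""
    joint = [p * l for p, l in
             zip(priors, cause_likelihoods(deviations_deg, sigma_deg))]
    total = sum(joint)
    return [j / total for j in joint]


def log_likelihood(all_deviations_deg, priors=DEFAULT_PRIORS, sigma_deg=2.0):
    """Mixture log-likelihood of all edges for one candidate rotation."""
    return sum(
        math.log(sum(p * l for p, l in
                     zip(priors, cause_likelihoods(d, sigma_deg))))
        for d in all_deviations_deg)


# An edge that matches the first Manhattan direction almost exactly.
posterior = responsibilities([0.0, 85.0, 88.0])
```

Under these assumptions, estimating Ψ amounts to finding the rotation whose deviations maximise `log_likelihood`, and `responsibilities` is the quantity an EM-style search would recompute at each iteration.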

  11. Design Criteria • Accuracy • Speed

  12. Design Decisions
      • Features
        • Dense gradient map
        • Sparse sub-pixel localized edges
      • Measurement Space
        • Image
        • Gauss Sphere
      • Search Method
        • Coarse-to-Fine (Coughlan & Yuille 2001)
        • Quasi-Newton
        • EM
        • Quasi-EM

  13. Accuracy [Bar chart: angular error (deg), axis from 0 to 12, by error type (horizontal VPs, vertical VP). Methods compared: MW; Edge-Based Coarse-to-Fine, MW Params; Edge-Based Newton, MW Params; Edge-Based Newton; Edge-Based EM; Edge-Based Quasi-EM; Edge-Based Quasi-EM GS]

  14. Speed [Bar chart: time (sec) by method, with axis ticks from 0 to 25 and from 140 to 160. Methods compared: MW; Edge-Based Coarse-to-Fine, MW Params; Edge-Based Newton, MW Params; Edge-Based Newton; Edge-Based EM; Edge-Based Quasi-EM; Edge-Based Quasi-EM GS]

  15. Conclusions • We have developed an algorithm for automatically estimating the Manhattan frame from a single camera. • This algorithm is 40% more accurate and roughly 3 times faster than the leading prior method. • This algorithm will be used as a basis for single-view reconstruction of urban scenes.

  16. Single-View Reconstruction
      • Potential Research Objectives for Phase IV
        • Recover connected Manhattan cuboids
          • Connected, labelled line segments
          • Connected, labelled rectangular facets
        • Estimate scale factor
          • From pedestrian, vehicle traffic
          • From building features whose size is approximately known (e.g., doors)
        • Integrate with other data sources
          • Existing 3D models on coarser scale
          • 3D models from cameras with overlapping fields of view

  17. Projects: Pre-Attentive and Attentive Sensing [Figure labels: foveal image, tilt, pan, wide-field image]

  18. Statistical Integration of Weak Cues [Figure panels: Motion Region, Foreground Region, Skin Region and Joint Region Log Likelihood Ratio, each on a scale from -4 to 4]
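One common way to combine weak cues such as the motion, foreground and skin maps on slide 18 is to add their log likelihood ratios pixel by pixel. The slide does not state how the joint map was computed, so the sketch below is an assumption: it treats the cues as conditionally independent (a naive Bayes combination), and the function name and toy values are hypothetical.

```python
def joint_log_likelihood_ratio(*cue_maps):
    """Sum per-pixel log likelihood ratios from several cue maps.

    Each cue map is a list of rows of floats, and all maps must share the
    same shape. Summing the ratios assumes the cues are conditionally
    independent given the person / no-person hypothesis.
    """
    if not cue_maps:
        raise ValueError("at least one cue map is required")
    rows, cols = len(cue_maps[0]), len(cue_maps[0][0])
    for cue in cue_maps:
        if len(cue) != rows or any(len(row) != cols for row in cue):
            raise ValueError("all cue maps must have the same shape")
    return [[sum(cue[r][c] for cue in cue_maps) for c in range(cols)]
            for r in range(rows)]


# Toy 1x2 maps: the first pixel is supported by all three cues, the second
# is supported weakly by skin colour only.
motion = [[1.0, -2.0]]
foreground = [[0.5, -1.0]]
skin = [[2.0, 0.5]]
joint = joint_log_likelihood_ratio(motion, foreground, skin)
```

A positive joint value favours the person hypothesis at that pixel, so individually weak cues can add up to a confident detection.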

  19. Wide-Field Person Detection

  20. Attentive Feedback Loop [Diagram labels: confirmed face location, mean body indicator, motion kernel, spatial prior, gaze command, prior, posterior, random sampler, gaze control, high-resolution face detection, non-max suppression, likelihood, attentive sensor, motion kernel]

  21. Attentive High-Res Video Surveillance

  22. Pose-Invariant Face Recognition (with Simon Prince, UCL)

  23. Projects: 3D Facial Estimation and Modelling

  24. Scene Dynamics • Potential Research Objectives for Phase IV • Person re-identification • Individuation (counting) in crowds

  25. Scene Segmentation

  26. Using Prior Knowledge: Example

  27. Experimental Results

  28. Scene Segmentation • Potential Research Objectives for Phase IV • Application to urban scenes • Scene layout • Ground plane • Buildings • Vegetation • Sky • Material recognition • Integrated text recognition
