1 / 30

Integrating Recognition for Cognitive Scene Interpretation

Explore the challenges and pathways for recognizing and reconstructing scenes in urban traffic analysis, enhancing driver assistance systems with real-time structure-from-motion and dense reconstruction.

picard
Download Presentation

Integrating Recognition for Cognitive Scene Interpretation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Integrating Recognition and Reconstruction for Cognitive Scene Interpretation Bastian Leibe, Nico Cornelis, Kurt Cornelis, Luc Van GoolComputer Vision Laboratory ETH Zurich Sicily Workshop, Syracusa, 22.09.2006 & VISICSKU Leuven CVPR’06 Video ProceedingsDAGM’06

  2. Motivation • Urban traffic scene analysis from a moving vehicle • Detect objects in the image • Localize them in 3D • Build up a metric scene model • Applications e.g. in driver assistance systems B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  3. Challenges Motion blur Brightly-lit areas Lense flaring Dark shadows + Intra-category variability, multiple viewpoints, partial occlusion, ... B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  4. Cognitive Loop with 3D Geometry • Connect recognition and reconstruction • Reconstruction pathway delivers scene geometry  Greatly improves recognition performance • Recognition detects objects that disturb reconstruction  More accurate geometry estimate B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  5. Outline • Hardware setup • Reconstruction pathway • Real-time Structure-from-Motion • Real-time dense reconstruction • Recognition pathway • Local-feature based object detection • Incorporation of scene geometry • Temporal integration in world coordinate frame • Feedback to reconstruction • Results and Conclusion B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  6. Hardware Setup • Stereo camera rig mounted on top of the vehicle • Calibrated w.r.t. wheel base points • Video streams captured at 25 fps, 360288 resolution B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  7. Real-Time Structure-from-Motion • Basis: very fast feature matching • Simple features • Optimized for urban environment • Only computed on green channel of a single camera • Rest: standard SfM pipeline [Cornelis et al., CVPR’06] B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  8. Real-Time Dense Reconstruction • Dense reconstruction on rectified images • Ruled surface assumption to speed-up dense reconstruction • Correlation measure: Sum of per-pixel SSDs along vertical lines • Line-sweep algorithm with ordering constraints (DP) • Fast computation on GPU • Errors introduced by pixels not belonging to facades! [Cornelis et al., CVPR’06] B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  9. Real-Time Dense Reconstruction (2) • Merge dense reconstructions using known camera poses. • “Voted polygon carving” on 2D projection B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  10. Real-Time Dense Reconstruction (2) • Merge dense reconstructions using known camera poses. • “Voted polygon carving” on 2D projection • Surfaces registered on world map using GPS B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  11. Original 3D Reconstruction Textured 3D Model • Run-times • SfM + Bundle adjustment: 26-30 fps on CPU • Dense reconstruction: 26 fps on GPU B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  12. Information Flow into Recognition • For each frame, 3D reconstruction delivers • External camera calibration • Ground plane estimate  Used for improving recognition of the next frame. B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  13. Appearance-Based Car Detection • Bank of 5 single-view ISM detectors • Each based on 3 local cues • Harris-Laplace, Hessian-Laplace, and DoG interest regions • Local Shape Context descriptors • Semi-profile detectors additionally mirrored • Not real-time yet… [Leibe, Mikolajczyk, Schiele,06] B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  14. … … … training images(+reference segmentation) … Appearance codebook y y s s x x y y s s x x Spatial occurrence distributions Implicit Shape Model - Representation • Learn appearance codebook • Extract patches at interest points • Agglomerative clustering  codebook • Learn spatial distributions • Match codebook to training images • Record matching positions on object B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  15. Matched Codebook Entries Probabilistic Voting y s x 3D Voting Space(continuous) Backprojectionof Maxima BackprojectedHypotheses Implicit Shape Model - Recognition Interest Points [Leibe & Schiele,04] B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  16. Matched Codebook Entries Probabilistic Voting Interest Points y s x 3D Voting Space(continuous) Segmentation Backprojectionof Maxima p(figure)Probabilities BackprojectedHypotheses Implicit Shape Model - Recognition [Leibe & Schiele,04] B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  17. recognitionscore (2D) 2D/3D Interactions • Likelihood of 3D hypothesis H given image I and 2D detections h: • 2D recognition score • Expressed in terms of per-pixel p(figure) probabilities B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  18. 3D prior y s x 2D/3D Interactions • Likelihood of 3D hypothesis H given image I and 2D detections h: • 3D prior • Distance prior (uniform range) • Size prior (Gaussian)  Significantly reduced search space Search corridor B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  19. 2D/3D transfer 2D/3D Interactions • Likelihood of 3D hypothesis H given image I and 2D detections h: • 2D/3D transfer • Two image-plane detections are consistent if they correspond to the same 3D object  Multi-viewpoint integration  Multi-camera integration B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  20. Detections Using Ground Plane Constraints left camera1175 frames B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  21. Quantitative Results • Detection performance on first 600 frames • All cars annotated that were >50% visible • Ground plane constraint significantly improves precision • Performance: 0.2 fp/image at 50% recall B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  22. Temporal Integration • Temporal integration in world coordinate frame • Using external camera calibration from SfM. • Each detection transfers to a 3D observation H. • Find superset of 3D hypotheses . • Estimate orientation using cluster shape & detected viewpoints. • Select set of 3D hypotheses that best explain the observations. B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  23. likelihood ofmembership tohypothesis temporaldecay penalty forphysicaloverlap Hypothesis Selection for 3D Detections • Quadratic Boolean Optimization Problem (from MDL) • Individual scores (diagonal terms) • Interaction costs (off-diagonal terms) [Leonardis et al,95] B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  24. Result of Temporal Integration B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  25. Online 3D Car Location Estimates B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  26. 3D Estimates After Convergence B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  27. Feedback into 3D Reconstruction • Feedback of detections & segmentation maps • Used to discard features on cars for SfM • Used to mask out cars in dense reconstruction  More accurate 3D estimates in the next frame. B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  28. Original 3D Reconstruction Another Application: 3D City Modeling Enhancing your driving experience… B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  29. Conclusion • System for traffic scene analysis integrating • Structure-from-Motion • Dense 3D Reconstruction • Object detection and localization in 2D and 3D • Temporal integration in world coordinate frame • Cognitive Loop between 2D and 3D processing • Reconstruction delivers camera calibration, ground plane • 3D context tremendously improves recognition performance • Car detection, segmentation makes 3D estimation more accurate • System applied to challenging real-world task • Real-time 3D reconstruction (26-30 fps) • Accurate object detection & 3D pose estimation results B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool

  30. Thank you very much for your attention! B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool http://www.vision.ethz.ch/ http://www.esat.kuleuven.be/psi/visics/

More Related