590 likes | 875 Views
CV 輪講 Putting Objects in Perspective. 藤吉研究室 土屋成光 2008 年 7 月 1 日. Back ground. 一般物体認識 / 画像シーン認識 低解像度 見えの違い 奥行きによるサイズの違い ⇒局所的な認識法が通用しない 人間は物体間の関係を利用 三次元構造のモデル化 局所的な認識手法を高精度に. Putting Objects in Perspective. Derek Hoiem , Alexei A. Efros , Martial Hebert
E N D
CV輪講Putting Objects in Perspective 藤吉研究室 土屋成光 2008年7月1日
Background • 一般物体認識/画像シーン認識 • 低解像度 • 見えの違い • 奥行きによるサイズの違い ⇒局所的な認識法が通用しない • 人間は物体間の関係を利用 • 三次元構造のモデル化 • 局所的な認識手法を高精度に
Putting Objects in Perspective Derek Hoiem,Alexei A. Efros,Martial Hebert Carnegie Mellon UniversityRobotics Institute CVPR2006
Local Object Detection False Detections True Detection Missed Missed True Detections Local Detector: [Dalal-Triggs 2005]
Object Surface? Support? Surface Estimation Image Support Vertical Sky V-Center V-Right V-Porous V-Solid V-Left [Hoiem, Efros, Hebert ICCV 2005] Software available online
Object Size in the Image Image World
Object Size ↔ Camera Viewpoint Loose Viewpoint Prior Input Image
Object Size ↔ Camera Viewpoint Loose Viewpoint Prior Input Image
Object Size ↔ CameraViewpoint Viewpoint Object Position/Sizes
Object Size ↔ Camera Viewpoint Viewpoint Object Position/Sizes
Object Size ↔ Camera Viewpoint Viewpoint Object Position/Sizes
Object Size ↔ Camera Viewpoint Viewpoint Object Position/Sizes
Efficient from surfaceand viewpoint Image P(surfaces) P(viewpoint) P(object) P(object | surfaces) P(object | viewpoint)
Efficient from surfaceand viewpoint Image P(surfaces) P(viewpoint) P(object | surfaces, viewpoint) P(object)
Scene Parts Are All Interconnected Objects 3D Surfaces Camera Viewpoint
Input to Algorithm Object Detection Surface Estimates Viewpoint Prior Local Car Detector Local Ped Detector Local Detector: [Dalal-Triggs 2005] Surfaces: [Hoiem-Efros-Hebert 2005]
Approximate Model Objects 3D Surfaces Viewpoint
Inference over Tree Viewpoint θ Local Object Evidence Local Object Evidence Objects ... o1 on Local Surface Evidence Local Surface Evidence Local Surfaces … s1 sn
Viewpoint estimation Viewpoint Prior Viewpoint Final Likelihood Likelihood Horizon Height Horizon Height
ObjectIdentitie • Localdetector
Surface Geometry Probability map
Object detection Car: TP / FP Ped: TP / FP Initial (Local) Final (Global) Car Detection 4 TP / 1 FP 4 TP / 2 FP Ped Detection 4 TP / 0 FP 3 TP / 2 FP Local Detector: [Dalal-Triggs 2005]
Experiments on LabelMe Dataset • Testing with LabelMe dataset: 422 images • 923 Cars at least 14 pixels tall • 720 Peds at least 36 pixels tall
Each piece of evidence improves performance Pedestrian Detection Car Detection Local Detector from [Murphy-Torralba-Freeman 2003]
Can be used with any detector that outputs confidences Car Detection Pedestrian Detection Local Detector: [Dalal-Triggs 2005] (SVM-based)
Accurate Horizon Estimation [Murphy-Torralba-Freeman 2003] [Dalal- Triggs 2005] Horizon Prior Median Error: 8.5% 4.5% 3.0% 90% Bound:
Qualitative Results Car: TP / FP Ped: TP / FP Initial: 2 TP / 3 FP Final: 7 TP / 4 FP Local Detector from [Murphy-Torralba-Freeman 2003]
Qualitative Results Car: TP / FP Ped: TP / FP Initial: 1 TP / 14 FP Final: 3 TP / 5 FP Local Detector from [Murphy-Torralba-Freeman 2003]
Qualitative Results Car: TP / FP Ped: TP / FP Initial: 1 TP / 23 FP Final: 0 TP / 10 FP Local Detector from [Murphy-Torralba-Freeman 2003]
Qualitative Results Car: TP / FP Ped: TP / FP Initial: 0 TP / 6 FP Final: 4 TP / 3 FP Local Detector from [Murphy-Torralba-Freeman 2003]
Geometric Context • Estimate surface ground: green, sky: blue, vertical: red, o:porous, x: solid
Perspective Geometric Cues Color Texture Location
Robust Spatial Support • oversegmentation RGB Pixels Superpixels [Felzenszwalb and Huttenlocher 2004]
Multiple Segmentations • 単一のセグメントではセグメントエラーの可能性 • 複数のセグメント数でセグメンテーション Multiple Segmentations Superpixels …
Labeling Segments … … 各セグメント結果を統合
Learn from training images • 前準備 • multiple segmentationの算出 • 各セグメントのラベルの算出 – ground, vertical, sky, or “mixed” • boosted decision trees による密度計算 • 8 nodes per tree • Logistic regression version of Adaboost [Collins and Schapire and Singer 2002] Homogeneity Likelihood Label Likelihood
Image Labeling Labeled Segmentations … Learned from training images Labeled Pixels
meters meters Summary & Future Work Ped Ped Car • Reasoning in 3D: • Object to object • Scene label • Object segmentation
Conclusion • Image understanding is a 3D problem • Must be solved jointly • This paper is a small step • Much remains to be done
CV輪講Recovering Occlusion Boundaries from a Single Image,Closing the Loop in Scene Interpretation 藤吉研究室 土屋成光 2008年8月26日
Background • 一般物体認識/画像シーン認識 • 低解像度 • 見えの違い • 奥行きによるサイズの違い ⇒局所的な認識法が通用しない • 人間は物体間の関係を利用 • 三次元構造のモデル化 • 局所的な認識手法を高精度に
Recovering Occlusion Boundaries from a Single Image Derek Hoiem,Andrew N. Stein, Alexei A. Efros,Martial Hebert Carnegie Mellon UniversityRobotics Institute ICCV’07
単画像からのオクルージョン理解 • オクルージョン,境界理解 • 物体を探索する際に必須 • Edge, region, depthによって推定
手法の流れ • 千領域にセグメンテーション Watershed with Pb soft boundaries • Region, Boundary, 3D Cuesの算出 depth : horizon + junction to ground • Boundaryの算出 Conditional random field (CRF) • Boundaryを用いて更にセグメンテーション