Coherent Scene Understanding with 3D Geometric Reasoning. Jiyan Pan 12/3/2012. Task. Detect objects. Identify surface regions. Geometrically coherent in the 3D world. Estimate ground plane. Infer gravity direction. 3D geometric context. Coordinate system.
Coherent Scene Understanding with 3D Geometric Reasoning
Jiyan Pan
12/3/2012
Task
• Detect objects
• Identify surface regions
• Geometrically coherent in the 3D world
• Estimate ground plane
• Infer gravity direction
• 3D geometric context
Coordinate system
Variables of global 3D geometries: ng ((inverse) gravity), np (ground plane orientation), hp (ground plane height)
[Figure: camera center, image plane, and ground plane. Labels: focal length f, object vertical orientation nv, object landmarks, object depth, real world height H, object pitch and roll angles. Symbols shown: xt, xb, dt, db, α, γ, θ]
Deterministic relationships
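The deterministic relationships can be illustrated with a minimal sketch, which is not the formulation on the slide. It assumes a pinhole camera at the origin, pixel coordinates measured from the principal point, a unit ground normal np pointing away from the ground, and a ground plane defined by np · X = -hp. The function names are hypothetical. The bottom landmark is back-projected onto the ground to get the object depth, and the top landmark then gives the real world height H along nv.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def scale(a, s):
    return tuple(x * s for x in a)

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def backproject_to_ground(pixel, f, n_p, h_p):
    """Intersect the viewing ray of a pixel with the plane n_p . X = -h_p (assumed convention)."""
    ray = (pixel[0], pixel[1], f)
    denom = dot(n_p, ray)
    if denom >= 0:
        raise ValueError("ray does not hit the ground in front of the camera")
    return scale(ray, -h_p / denom)

def object_height_and_depth(top_pixel, bottom_pixel, f, n_p, h_p, n_v):
    """Return (H, depth) for an object whose bottom landmark touches the ground."""
    x_b = backproject_to_ground(bottom_pixel, f, n_p, h_p)
    ray_t = (top_pixel[0], top_pixel[1], f)

    def horizontal(v):
        # Component of v orthogonal to the object's vertical orientation n_v.
        return sub(v, scale(n_v, dot(v, n_v)))

    xh, rh = horizontal(x_b), horizontal(ray_t)
    # Pick the point on the top ray whose horizontal position best matches the bottom point.
    s = dot(xh, rh) / dot(rh, rh)
    height = dot(sub(scale(ray_t, s), x_b), n_v)
    return height, x_b[2]

# Level camera 1.5 units above the ground, image y axis pointing down (assumed example values).
H, depth = object_height_and_depth((0.0, -30.0), (0.0, 150.0), 1000.0,
                                   (0.0, -1.0, 0.0), 1.5, (0.0, -1.0, 0.0))
```

With these example values the object stands about 10 units from the camera and is about 1.8 units tall.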
Coordinate system
[Figure: camera center, image plane, and ground plane. Labels: (inverse) gravity ng, ground plane orientation np, ground plane height hp, focal length f, object vertical orientation nv, object landmarks, object depth, real world height H, object pitch and roll angles. Symbols shown: xt, xb, dt, db, α, γ, θ]
Probabilistic relationships
• Derived from appearance
• Prior knowledge
Can we solve them all for a coherent solution?
• Non-linear
• Non-deterministic
• Even invalid equations from false detections
[Diagram: global 3D context and local 3D context, with local entities marked √ or X]
[Diagram: global 3D context and local 3D context, with local entities marked √, X, or ?]
“Chicken and egg” problem:
• Local entities could be validated by global 3D context
• Global 3D context is induced from local entities
Possible solution (All in PGM)
• Put both global 3D geometries and local entities in a PGM [1]
• Precision issue: continuous variables have to be quantized
• Complexity issue: a pairwise potential would contain up to ~1e6 entries: 100 (pitch) × 100 (roll) × 100 (height)
[Diagram: graphical model with nodes Ground, Gravity, o1, o2, ok]
[1] D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008
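The arithmetic behind the complexity issue can be checked with a short sketch. The bin counts come from the slide. The 8 bytes per entry (float64) is an assumption added here for illustration, and the count covers only the joint states of the three quantized global variables.

```python
import math

# Quantization levels quoted on the slide.
bins = {"pitch": 100, "roll": 100, "height": 100}

# Number of joint states of the quantized global 3D geometry.
num_global_states = math.prod(bins.values())

# Assumed storage cost: one float64 per table entry.
bytes_per_entry = 8
table_megabytes = num_global_states * bytes_per_entry / 1e6
```

A pairwise table against an object node would multiply this count by that node's number of states, so the figure grows quickly with finer quantization.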
Possible solution (Fixed global geometries as hypotheses)
• The task is much easier under a fixed hypothesis of the global 3D geometries
[Diagram: graphical model with the Ground and Gravity nodes fixed and their links to o1, o2, ok crossed out]
How to generate global 3D geometry hypotheses?
[Diagram: hypotheses ω1, ω2, ω3 over object nodes o1, o2, ok]
Possible solution (Hypotheses by exhaustive search)
• Exhaustive search over the quantized space of global 3D geometries [2]
• Computational complexity tends to limit the search space
[2] S. Bao et al. Toward coherent object detection and scene layout understanding. IVC, 2011
Possible solution (Hypotheses by Hough voting)
• Each local entity casts a vote in the Hough voting space of the global 3D geometries, and peaks are selected [3]
• False detections could corrupt the votes
• Would applying EM help? Not likely if false detections dominate
[Diagram: local entities L1 to L7]
[3] M. Sun et al. Object detection with geometrical context feedback loop. BMVC, 2010
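A toy one-dimensional sketch shows how false detections can corrupt a Hough vote. It is not the voting scheme of [3]. Each local entity votes for a quantized value of a single global quantity, and the bin width, the sample values, and the function name are assumptions made for illustration.

```python
from collections import Counter

def hough_peak(estimates, bin_width):
    """Return the center of the bin that collects the most votes."""
    votes = Counter(round(e / bin_width) for e in estimates)
    # Break ties in favor of the lower bin so the result is deterministic.
    best_bin, _ = max(votes.items(), key=lambda kv: (kv[1], -kv[0]))
    return best_bin * bin_width

# Estimates from true detections cluster around 10 (hypothetical units).
true_votes = [9.8, 10.1, 10.3, 9.9]
# Estimates from false detections happen to agree with each other around 25.
false_votes = [25.2, 24.9, 25.1, 24.8, 25.4]

clean_peak = hough_peak(true_votes, 1.0)
corrupted_peak = hough_peak(true_votes + false_votes, 1.0)
```

Once the false detections outnumber the true ones in a bin, the selected peak moves from the bin at 10 to the bin at 25.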
Our solution
• We take a RANSAC-like approach: randomly mix the contributions of local entities
• Compared to averaging over all local entities: more robust against outliers
• Compared to directly using estimates from each single local entity: more robust against noise
[Diagram: local entities L1 to L7]
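The random mixing idea can be sketched as follows. The slides do not say how the mixture is formed, so this sketch assumes each hypothesis is a random convex combination of a small random subset of per-entity direction estimates, renormalized to unit length. The subset size, the weighting, and all function names are hypothetical.

```python
import math
import random

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)

def random_mixture(estimates, rng, subset_size=3):
    """Mix a random subset of unit-vector estimates with random positive weights."""
    chosen = rng.sample(estimates, min(subset_size, len(estimates)))
    weights = [rng.random() + 1e-9 for _ in chosen]
    mixed = tuple(sum(w * v[i] for w, v in zip(weights, chosen)) for i in range(3))
    return normalize(mixed)

def generate_hypotheses(estimates, num_mixtures, seed=0):
    """Generate num_mixtures hypotheses of a global direction (e.g. gravity)."""
    rng = random.Random(seed)
    return [random_mixture(estimates, rng) for _ in range(num_mixtures)]

def angle_deg(a, b):
    cosine = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.degrees(math.acos(cosine))

def min_hypothesis_error(hypotheses, truth):
    """Smallest angular error among the hypotheses."""
    return min(angle_deg(h, truth) for h in hypotheses)
```

A small subset limits how many outliers can enter one mixture, while the weighted average smooths the noise of individual estimates. The sketch does not handle a subset whose vectors cancel to zero.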
[Plot: Gravity Direction. Minimum hypothesis error vs. number of random mixtures (0 to 50), for Individual, Mixture, and Average]
[Plot: Ground Plane Orientation. Minimum hypothesis error vs. number of random mixtures (0 to 50), for Individual, Mixture, and Average]
3D geometric context #1: Common ground (global)
[Figure: objects marked valid or invalid (#1) relative to the ground plane and the ground plane orientation]
3D geometric context #2: Gravity direction (global)
[Figure: object marked invalid (#2); (inverse) gravity, ground plane, and ground plane orientation shown]
3D geometric context #3: Depth ordering (local)
[Figure: objects marked incompatible (#3); (inverse) gravity, ground plane, and ground plane orientation shown]
3D geometric context #4: Space occupancy (local)
[Figure: objects marked incompatible (#4); (inverse) gravity, ground plane, and ground plane orientation shown]
Given a global 3D geometry hypothesis
Global geometric compatibility for an object:
• Orientation:
• Height:
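The slide's formulas for these two terms did not survive extraction, so the following is only an illustrative sketch under assumed Gaussian-shaped scores, not the author's definitions. The orientation score compares an object's vertical orientation nv with the hypothesized gravity direction ng, and the height score compares the implied real world height with a class prior. The parameters sigma_deg, prior_mean, and prior_std are hypothetical.

```python
import math

def angle_deg(a, b):
    cosine = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.degrees(math.acos(cosine))

def orientation_compatibility(n_v, n_g, sigma_deg=10.0):
    """Score in (0, 1]; 1 when the object's vertical axis matches gravity exactly."""
    return math.exp(-0.5 * (angle_deg(n_v, n_g) / sigma_deg) ** 2)

def height_compatibility(height, prior_mean, prior_std):
    """Score in (0, 1]; 1 when the implied height equals the prior mean."""
    z = (height - prior_mean) / prior_std
    return math.exp(-0.5 * z * z)

# An upright object of plausible size scores high on both terms (assumed values).
upright = orientation_compatibility((0.0, -1.0, 0.0), (0.0, -1.0, 0.0))
plausible = height_compatibility(1.7, prior_mean=1.7, prior_std=0.1)
```

Under a wrong hypothesis, a true detection would typically tilt away from ng or imply an implausible height, which lowers one or both scores.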
Given a global 3D geometry hypothesis
Global geometric compatibility for a surface:
• Orientation: local estimates vs. … or …
• Location: horizontal surface region vs. ground horizon
Given a global 3D geometry hypothesis
Local geometric compatibility for two objects:
• Depth ordering:
• Space occupancy:
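The two pairwise checks can be sketched in a few lines. The slide's own formulas are not recoverable, so this sketch makes two assumptions: an occluding object must not be farther away than the object it occludes, and each object occupies a circular footprint on the ground that must not overlap another footprint. All names and the tolerance parameter are hypothetical.

```python
import math

def depth_order_compatible(depth_occluder, depth_occluded, tolerance=0.0):
    """The object seen in front must not be farther away than the one behind it."""
    return depth_occluder <= depth_occluded + tolerance

def space_compatible(center_a, radius_a, center_b, radius_b):
    """Circular ground footprints (assumed shape) must not overlap."""
    return math.dist(center_a, center_b) >= radius_a + radius_b

# Two candidates 0.5 units apart on the ground cannot both be 0.4-unit-radius objects.
overlapping = space_compatible((0.0, 10.0), 0.4, (0.5, 10.0), 0.4)
separated = space_compatible((0.0, 10.0), 0.4, (2.0, 10.0), 0.4)
```

Both checks depend on the depths implied by the global 3D geometry hypothesis, so a pair can be compatible under one hypothesis and incompatible under another.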
Given a global 3D geometry hypothesis
Objective function of the CRF:
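The objective function itself is missing from the extracted text. As a generic illustration only, and not the author's objective, a CRF over binary accept/reject labels is often written as a sum of unary costs and pairwise penalties. The sketch below assumes that form and minimizes it by brute force, which is feasible only for a handful of candidates. The cost values are made up.

```python
from itertools import product

def energy(labels, unary, pairwise):
    """unary[i] = (cost if rejected, cost if accepted);
    pairwise[(i, j)] = penalty paid when i and j are both accepted."""
    total = sum(unary[i][y] for i, y in enumerate(labels))
    total += sum(cost for (i, j), cost in pairwise.items()
                 if labels[i] == 1 and labels[j] == 1)
    return total

def best_labeling(unary, pairwise):
    """Exhaustively search all 2^n labelings for the lowest energy."""
    n = len(unary)
    return min(product((0, 1), repeat=n),
               key=lambda ys: energy(ys, unary, pairwise))

# Three candidate detections; candidates 0 and 1 are geometrically incompatible.
unary = [(1.0, 0.2), (0.5, 0.4), (0.3, 2.0)]
pairwise = {(0, 1): 5.0}
labels = best_labeling(unary, pairwise)
```

In this example the search accepts candidate 0 and rejects candidates 1 and 2. Repeating the minimization for every global 3D geometry hypothesis and keeping the lowest-energy result is one way to pick a best hypothesis, although the slides do not state the selection rule.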
[Diagram: global 3D context, local 3D context with entities marked √ or X, and the best hypothesis]
[Figures: example detection results, with legend]
• 3D reasoning agrees with raw detector
• 3D reasoning recovers detection rejected by raw detector
• 3D reasoning rejects detection accepted by raw detector
3D geometric reasoning improves object detection performance
[Plot: Deformable Part Model Detector. True positive rate vs. false positives per image (0 to 5), for Baseline, Hoiem, and Ours]
[Plot: Dalal-Triggs Detector. True positive rate vs. false positives per image (0 to 1.2), for Baseline, Hoiem, and Ours]
D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008
M. Sun et al. Object detection with geometrical context feedback loop. BMVC, 2010
Contributions of different geometric context
[Plot: Detection ROC Curve. True positive rate vs. false positives per image (0 to 5), for Det, Det+IdvlGeo, Det+PairGeo, and Det+FullGeo]
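Curves of this kind, true positive rate against false positives per image, can be computed with a short sketch. It assumes each detection has already been matched to ground truth and flagged as a true or false positive. The matching rule, the function name, and the sample data are not from the slides.

```python
def detection_roc(detections, num_ground_truth, num_images):
    """detections: list of (score, is_true_positive) pairs.
    Returns (false positives per image, true positive rate) points,
    one per detection, as the score threshold is lowered."""
    points = []
    true_pos = false_pos = 0
    for _, is_tp in sorted(detections, key=lambda d: d[0], reverse=True):
        if is_tp:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / num_images, true_pos / num_ground_truth))
    return points

# Four scored detections over two images containing four ground-truth objects (made-up data).
curve = detection_roc([(0.9, True), (0.8, False), (0.7, True), (0.4, False)],
                      num_ground_truth=4, num_images=2)
```

Rescoring detections with geometric compatibility changes their order in this sweep, which is how a reasoning stage can raise the true positive rate at a fixed number of false positives per image.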