330 likes | 448 Views
3D LayoutCRF. Derek Hoiem Carsten Rother John Winn. Goal 1: Object Description. Object Description: Bounding Box Viewpoint Color Pose Subclass. Goal 2: Object Segmentation. Key Idea. Combine object-level and pixel-level reasoning. Recognition Requires Object-Level Reasoning.
E N D
3D LayoutCRF Derek Hoiem Carsten Rother John Winn
Goal 1: Object Description • Object Description: • Bounding Box • Viewpoint • Color • Pose • Subclass
Key Idea • Combine object-level and pixel-level reasoning
Recognition Requires Object-Level Reasoning • Position • Shape/Size • Viewpoint/Pose • Style/Color
Solution: Window Detector? • 45 degree range of viewpoints • Minor scale/position variation
Recognition Requires Part-Level Reasoning • Propose good global model
Recognition Requires Part-Level Reasoning • Propose good global model • Occlusions
Context Requires Both Object and Part-Level Info • Size relationships require object model
Context Requires Both Object and Part-Level Info • Surface relationships require occlusion info Not visibly sitting on ground Visibly sitting on ground
Our Object/Part Model … T1 Tm Ti = { bounding box, viewpoint, color model, instance cost } h1 h2 h3 h4 hj object parts h5 h6 h7 h8 part consistency occlusions … … h9 h10 h11 hn x Extension from [Winn Shotton 2006]
Modeling Viewpoint Parameterized by Bounding Box and Corner
L F Training Annotation 3D Parts Model Assigning Parts from Model Training Image Assigned Parts
Relabeling • Allowing slight deformations, relabel training data Training Image Original Labels New Labels
Height Range Eight Viewpoint/Scale Ranges • Appearance (but not location) constant within each range
Modeling Part Appearance • Template patches (normalized xcorr) • Intensity / Color Image Edges (DT)
Modeling Part Appearance • Randomized decision trees • 25 trees, 250 leaf nodes • Once: • Learn structure on 50,000 object / 50,000 background pixels • For each appearance model: • Learn parameters on all pixels (850 LabelMe images)
Inference Input Image
Inference Input Image • Proposals • One per appearance model • Objects proposed by connected components
Proposal Stage Model • CRF Inference (TRW-BP) h1 h2 h3 h4 hi object parts h5 h6 h7 h8 part consistency occlusions … … h9 h10 h11 hn x
Inference Input Image Proposals • Refinement • One per proposal • Incorporate viewpoint, size information
Refinement Stage Model T1 Ti = { bounding box, viewpoint } h1 h2 h3 h4 hi object parts h5 h6 h7 h8 part consistency occlusions … … h9 h10 h11 hn x
Inference Input Image Proposals Refinement • Arbitration • Includes color model, instance penalty (graph cuts)
Preliminary Results on UIUC • Trained on 20, tested on rest • Quantitatively comparable to best
T1 h1 h2 h3 h4 h5 h6 h7 h8 … … h9 h10 h11 hn x Preliminary Results on UIUC Without Instance Cost With Instance Cost
Preliminary Results on PASCAL’06 • 25 images • One proposal (viewpoint within 45 degrees, scale of 26-38 pixels)
Preliminary Results on PASCAL’06 Without Color Model With Color Model
Conclusion • Combined object-level and pixel-level reasoning • Object-level: Position/Size, Viewpoint, Color • Pixel-level: Part appearance, Occlusion reasoning • Good preliminary results