Toward Object Discovery and Modeling via 3-D Scene Comparison Evan Herbst, Peter Henry, Xiaofeng Ren, Dieter Fox University of Washington; Intel Research Seattle
Overview • Goal: learn about an environment by tracking changes in it over time • Detect objects that occur in different places at different times • Handle textureless objects • Avoid appearance/shape priors • Represent a map with static + dynamic parts
Algorithm Outline • Input: two RGB-D videos • Mapping & reconstruction of each video • Interscene alignment • Change detection • Spatial regularization • Outputs: reconstructed static background; segmented movable objects
Scene Reconstruction • Mapping based on RGB-D Mapping [Henry et al. ISER’10] • Visual odometry, loop-closure detection, pose-graph optimization, bundle adjustment
Scene Reconstruction • Mapping based on RGB-D Mapping [Henry et al. ISER’10] • Surface representation: surfels
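As a concrete illustration of the surface representation mentioned above, here is one plausible per-surfel record in Python; the field names and types are assumptions for illustration, not the authors' exact data structure.

```python
# Minimal surfel record (illustrative; field names/types are assumptions,
# not the exact representation used by RGB-D Mapping).
from dataclasses import dataclass
import numpy as np

@dataclass
class Surfel:
    position: np.ndarray    # 3-D center of the disc, in the map frame
    normal: np.ndarray      # unit surface normal
    radius: float           # disc radius, typically grows with viewing distance
    color: np.ndarray       # RGB estimate accumulated over observations
    confidence: float       # support from the number of agreeing observations
```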
Scene Differencing • Given two scenes, find the parts that differ • Surfaces in the two scenes are similar iff the object did not move between visits • Comparison at each surface point
Scene Differencing • Given two scenes, find the parts that differ • Comparison at each surface point • Start by globally aligning the scenes (one alignment step is sketched below) [figures: alignment illustrated in 2-D and in 3-D]
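A minimal sketch of one rigid-alignment step, assuming corresponding 3-D points between the two reconstructions are already available (e.g. from feature matching); the full global-alignment pipeline used in the paper is not reproduced here.

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst.

    src, dst: (N, 3) arrays of corresponding points from the two scenes.
    Only the closed-form step is shown; finding the correspondences
    (feature matching, ICP iterations) is outside this sketch.
    """
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_mean).T @ (dst - dst_mean)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against a reflection solution
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t
```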
Naïve Scene Differencing • Easy algorithm: closest point within δ → same (sketched below) • Ignores color, surface orientation • Ignores occlusions
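A minimal sketch of this baseline, assuming scipy is available; the distance threshold value is illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def naive_diff(points_a, points_b, delta=0.02):
    """Mark a point of scene A as changed when scene B has no point within
    delta meters. Ignores color, surface orientation, and occlusion, which
    is exactly what the probabilistic model on the next slides addresses."""
    tree = cKDTree(points_b)
    dist, _ = tree.query(points_a, k=1)
    return dist > delta   # True = labeled "changed"
```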
Scene Differencing • Model probability that a surface point moved • Sensor readings z • Expected measurement z* • m ∈ {0, 1} [figure: measurements z0, z1, z2, z3 of the expected surface z* taken from frames 0, 10, 25, 49]
Sensor Models • Model probability that a surface point moved • Sensor readings z; expected measurement z* • By Bayes (see the sketch below) • Two sensor measurement models • With no expected surface • With expected surface (both detailed on the following slides)
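The Bayes step referred to above, written out in notation assumed from these slides (z_1..z_n are the measurements of a surface point, z* its expected measurement, m whether it moved); the exact form on the original slide may differ.

```latex
p(m \mid z_{1:n}, z^{*}) \;\propto\; p(m)\,\prod_{i=1}^{n} p(z_i \mid m, z^{*}),
\qquad m \in \{0, 1\}
```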
Sensor Models • Two sensor measurement models • With expected surface • Depth: uniform + exponential + Gaussian [1] • Color: uniform + Gaussian • Orientation: uniform + Gaussian [figure: depth model centered at zd*] [1] Thrun et al., Probabilistic Robotics, 2005
Sensor Models • Two sensor measurement models • With expected surface • Depth: uniform + exponential + Gaussian [1] • Color: uniform + Gaussian • Orientation: uniform + Gaussian • With no expected surface • Depth: uniform + exponential • Color: uniform • Orientation: uniform (depth terms sketched below) [figure: depth model centered at zd*] [1] Thrun et al., Probabilistic Robotics, 2005
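A sketch of the depth terms of the two measurement models, assuming standard beam-model ingredients; the mixture weights, scales, and maximum range below are illustrative placeholders, not the values used in the paper.

```python
import numpy as np

Z_MAX = 10.0  # illustrative maximum depth range (m)

def p_depth_expected(z, z_star, sigma=0.03, lam=0.5,
                     w_hit=0.7, w_short=0.2, w_rand=0.1):
    """Depth likelihood (scalar z) when a surface is expected at z_star:
    Gaussian around z_star + exponential for early returns + uniform."""
    p_hit = np.exp(-0.5 * ((z - z_star) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    p_short = lam * np.exp(-lam * z) if z < z_star else 0.0
    p_rand = 1.0 / Z_MAX
    return w_hit * p_hit + w_short * p_short + w_rand * p_rand

def p_depth_unexpected(z, lam=0.5, w_short=0.3, w_rand=0.7):
    """Depth likelihood when no surface is expected: exponential + uniform."""
    return w_short * lam * np.exp(-lam * z) + w_rand / Z_MAX
```

These two likelihoods are what feed the per-surfel Bayes update sketched earlier.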
Example Result [figures: Scene 1 and Scene 2]
Spatial Regularization • Points treated independently so far • MRF to label each surfel moved or not moved • Data term given by pointwise evidence • Smoothness term: Potts, weighted by curvature (energy written out below)
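Written as an energy, the labeling problem described above takes the usual pairwise MRF form; the symbols below (λ_ij for the curvature-weighted Potts weight, N for the surfel neighborhood graph) are my shorthand, not the paper's notation.

```latex
E(m) \;=\;
\underbrace{\sum_i -\log p\!\left(m_i \mid z^{(i)}_{1:n}, z^{*(i)}\right)}_{\text{data term}}
\;+\;
\underbrace{\sum_{(i,j) \in \mathcal{N}} \lambda_{ij}\,\big[\,m_i \neq m_j\,\big]}_{\text{Potts smoothness}}
```

with λ_ij modulated by local surface curvature, per the slide's "weighted by curvature".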
Spatial Regularization • Points treated independently so far • MRF to label each surfel moved or not moved [figures: Scene 1 and Scene 2, pointwise vs. regularized labelings]
Experiments • Trained MRF on four scenes (1.4M surfels) • Tested on twelve scene pairs (8.0M surfels) • 70% error reduction wrt max-class baseline [figures: Baseline vs. Ours]
Experiments • Results: complex scene
Experiments • Results: large object
Conclusion • Segment movable objects in 3-D using scene changes over time • Represent a map as static + dynamic parts • Extensible sensor model for RGB-D sensors • Next steps • All scenes in one optimization • Model completion from many scenes • Train more-supervised object segmentation
Using More Than 2 Scenes • Given our framework, it is straightforward to combine evidence from multiple scenes (one possible form is sketched below) • w_scene could be chosen to weight all scenes (rather than frames) equally, or to upweight those taken under good lighting • Other ways to subsample frames: as in keyframe selection in mapping
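One plausible form of that combination, with per-scene weights w_s playing the role of w_scene; this is a reconstruction of what the missing equation likely expressed, not a verbatim copy.

```latex
p(m \mid \text{scenes } 1..S) \;\propto\; p(m)\,
\prod_{s=1}^{S} \Big[\, \prod_{i} p\!\left(z^{(s)}_i \mid m, z^{*(s)}\right) \Big]^{\,w_s}
```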
First Sensor Model: Surface Didn't Move • Modeling sensor measurements • Depth: uniform + exponential + Gaussian * • Color, normal: uniform + Gaussian; mixing controlled by probability that the beam hit the expected surface [figure: depth model centered at zd*] * Fox et al., "Markov Localization…", JAIR '99
Experiments • Trained MRF on four scenes (2.7M surfels) • Tested on twelve scene pairs (8.0M surfels) • 250k moved surfels; we get 4.5k FP, 51k FN • 65% error reduction wrt max-class baseline • Extract foreground segments as "objects"
Overview • Many visits to same area over time • Find objects by motion
(extra) Related Work • Probabilistic sensor models • Depth only • Depth & color, with extra independence assumptions • Static + dynamic maps • In 2-D • Usually not modeling objects
Spatial Regularization • Pointwise only so far • MRF to label each surfel moved or not moved • Data term given by pointwise evidence • Smoothness term: Potts, weighted by curvature
Depth-Dependent Color/Normal Model • Modeling sensor measurements • Combine depth/color/normal (one assumed form is sketched below)
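One assumed way to write the combination, treating the color and normal channels as conditionally independent given the depth reading and the moved/not-moved hypothesis m; this reconstructs what the missing equation likely expressed, not the paper's exact form.

```latex
p(z \mid m, z^{*}) \;=\;
p(z_d \mid m, z_d^{*})\;
p(z_c \mid m, z_c^{*}, z_d)\;
p(z_n \mid m, z_n^{*}, z_d)
```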