DTAM: Dense Tracking and Mapping in Real-Time
Newcombe, Lovegrove & Davison, ICCV11

Amaury Dame
Active Vision Lab
Oxford Robotics Research Group
adame@robots.ox.ac.uk

Introduction
Input:
• Single hand-held RGB camera
Objective:
• Dense mapping
• Dense tracking
[Figures: input image; 3D dense map]
Amaury Dame, Active Vision Lab, Oxford Robotics Research Group, 28.02.2013

System overview

Depth map estimation
Principle:
• S depth hypotheses are considered for each pixel of the reference image Ir
• Each corresponding 3D point is projected onto a bundle of images Im
• The depth hypothesis that best preserves color consistency between the reference image and the bundle of images is kept
Formulation, with the following quantities:
• the pixel position and depth hypothesis
• the number of valid reprojections of the pixel in the bundle
• the photometric error between the reference and the current image
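One way to write the cost described on this slide, assuming it follows the notation of the DTAM paper. The symbols $\mathbf{u}$ (pixel), $d$ (inverse depth hypothesis), $\mathcal{I}(r)$ (set of bundle images with a valid reprojection), $\rho_r$ (photometric error), $\pi$ (projection), $K$ (intrinsics) and $T_{mr}$ (relative pose) are taken from that paper and may differ from the ones on the original slide:

```latex
C_r(\mathbf{u}, d) = \frac{1}{|\mathcal{I}(r)|} \sum_{m \in \mathcal{I}(r)} \left\| \rho_r(I_m, \mathbf{u}, d) \right\|_1,
\qquad
\rho_r(I_m, \mathbf{u}, d) = I_r(\mathbf{u}) - I_m\!\left( \pi\!\left( K \, T_{mr} \, \pi^{-1}(\mathbf{u}, d) \right) \right)
```

The kept hypothesis for pixel $\mathbf{u}$ is then the $d$ that minimizes $C_r(\mathbf{u}, d)$.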

Depth map estimation
Reprojection in image bundle
[Figure: example reference image pixel; photo error vs. depth hypotheses; reprojection of depth hypotheses on one image of the bundle]
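The search over depth hypotheses can be sketched on a toy problem. Everything below is an illustrative assumption rather than the DTAM implementation: images are single rows, the cameras are only translated sideways, the scene is a textured plane at depth 2, and the names `photo_cost` and `best_depth` are hypothetical.

```python
import math

F, CX, WIDTH = 100.0, 32.0, 64   # hypothetical focal length, principal point, row width
PLANE_DEPTH = 2.0                # toy scene: textured fronto-parallel plane


def texture(x):
    """Scene brightness at world position x (arbitrary smooth pattern)."""
    return math.sin(3.0 * x) + 0.5 * math.sin(7.3 * x + 1.0)


def render(cam_x):
    """Render one image row for a camera at x = cam_x looking at the plane."""
    return [texture(cam_x + PLANE_DEPTH * (u - CX) / F) for u in range(WIDTH)]


def sample(row, u):
    """Linearly interpolate a row at a sub-pixel position; None if out of view."""
    if u < 0 or u > len(row) - 1:
        return None
    i = min(int(u), len(row) - 2)
    a = u - i
    return (1 - a) * row[i] + a * row[i + 1]


def photo_cost(ref_row, ref_x, bundle, u, depth):
    """Mean absolute intensity difference over the valid reprojections."""
    world_x = ref_x + depth * (u - CX) / F          # back-project the reference pixel
    errors = []
    for cam_x, row in bundle:
        u_m = F * (world_x - cam_x) / depth + CX    # reproject into a bundle image
        value = sample(row, u_m)
        if value is not None:                       # keep valid reprojections only
            errors.append(abs(ref_row[u] - value))
    return sum(errors) / len(errors) if errors else float("inf")


def best_depth(ref_row, ref_x, bundle, u, hypotheses):
    """Return the hypothesis with the lowest averaged photometric error."""
    return min(hypotheses, key=lambda d: photo_cost(ref_row, ref_x, bundle, u, d))


ref_row = render(0.0)
bundle = [(c, render(c)) for c in (0.1, 0.2)]
hypotheses = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
estimate = best_depth(ref_row, 0.0, bundle, 20, hypotheses)
```

In this toy setup the selected hypothesis for pixel 20 is the true plane depth, because only there do both reprojections land on the same texture value as the reference pixel.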

Depth map filtering approach
Problem:
• Uniform regions in the reference image do not give a sufficiently discriminative photometric error
Idea:
• Assume that depth is smooth in uniform regions
• Use a total variation approach in which the depth map is the unknown of the functional to optimize:
  • the photometric error defines the data term
  • the smoothness constraint defines the regularization

Depth map filtering approach
Formulation:
• First term: regularization constraint. g is defined so that it is close to 0 at strong image gradients and 1 in uniform regions, so that gradients of the depth map are penalized in uniform regions
• Second term: data term defined by the photometric error
• Huber norm: differentiable replacement for the L1 norm that preserves discontinuities better than the L2 norm
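A possible form of this energy, assuming the notation of the DTAM paper. The symbols $\xi$ (depth map), $\lambda$ (data weight), $\alpha$, $\beta$ (edge-weight parameters) and $\epsilon$ (Huber threshold) come from that paper, not from the text of this slide:

```latex
E(\xi) = \int_\Omega \Big\{ g(\mathbf{u}) \, \|\nabla \xi(\mathbf{u})\|_\epsilon + \lambda \, C(\mathbf{u}, \xi(\mathbf{u})) \Big\} \, d\mathbf{u},
\qquad
g(\mathbf{u}) = e^{-\alpha \|\nabla I_r(\mathbf{u})\|_2^{\beta}}

\|x\|_\epsilon =
\begin{cases}
\dfrac{\|x\|_2^2}{2\epsilon} & \text{if } \|x\|_2 \le \epsilon \\[4pt]
\|x\|_1 - \dfrac{\epsilon}{2} & \text{otherwise}
\end{cases}
```

With this weight, $g$ equals 1 where the reference image gradient vanishes and decays towards 0 at strong edges, which matches the behaviour described in the first bullet.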

Total variation optimisation
[Figure: L2 norm vs. L1 norm. QU(f1)=1, QU(f2)=0.1, QU(f3)=0.01; TV(f1)=1, TV(f2)=1, TV(f3)=1]
[Figure: regularisation effect, image denoising [Pock08]]
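The quoted values can be reproduced with a small sketch. It assumes that f1, f2 and f3 rise from 0 to 1 over 1, 10 and 100 samples respectively (the figure itself is not available here), that QU is the sum of squared differences, and that TV is the sum of absolute differences; the helper names are hypothetical.

```python
def ramp(steps, length=200):
    """Signal rising from 0 to 1 in `steps` equal increments, then staying at 1."""
    return [min(i / steps, 1.0) for i in range(length)]


def quadratic(f):
    """Sum of squared neighbour differences (L2-type regulariser)."""
    return sum((b - a) ** 2 for a, b in zip(f, f[1:]))


def total_variation(f):
    """Sum of absolute neighbour differences (L1-type regulariser)."""
    return sum(abs(b - a) for a, b in zip(f, f[1:]))


f1, f2, f3 = ramp(1), ramp(10), ramp(100)
qu = [quadratic(f) for f in (f1, f2, f3)]
tv = [total_variation(f) for f in (f1, f2, f3)]
```

The quadratic penalty strongly favours the smeared-out edge, while total variation assigns the same cost to a sharp and a gradual transition, which is why it does not blur discontinuities.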

Depth map filtering approach
Formulation:
• Problem: optimizing this equation directly requires linearising the cost volume, which is expensive, and the cost volume has many local minima
Approximation:
• Introduce an auxiliary variable, which can be optimized with a heuristic search
• The second term pulls the original and auxiliary variables together
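A minimal sketch of the search over the auxiliary variable for a single pixel. It assumes the coupling takes the quadratic form (d - a)^2 / (2*theta) + lam * C(a) used in the DTAM paper, and a brute-force scan over the sampled hypotheses stands in for the search step; `search_auxiliary`, `theta` and `lam` are hypothetical names.

```python
def search_auxiliary(cost_row, depths, d, theta, lam):
    """Return the hypothesis a minimising (d - a)^2 / (2*theta) + lam * C(a)."""
    def energy(k):
        return (d - depths[k]) ** 2 / (2.0 * theta) + lam * cost_row[k]
    return depths[min(range(len(depths)), key=energy)]


depths = [0.5, 1.0, 1.5, 2.0, 2.5]
cost_row = [0.9, 0.2, 0.6, 0.1, 0.8]   # photometric cost per hypothesis, two local minima
# Weak coupling: the search is free to jump to the global minimum of the cost.
loose = search_auxiliary(cost_row, depths, d=1.1, theta=10.0, lam=1.0)
# Strong coupling: the auxiliary variable stays close to the current depth d.
tight = search_auxiliary(cost_row, depths, d=1.1, theta=0.01, lam=1.0)
```

Because the scan evaluates every hypothesis, the local minima of the cost volume are not an obstacle for this step; only the smooth regularization term is left to the continuous optimizer.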

Total variation optimisation
Classical approaches:
• Time marching scheme: steepest descent method
• Linearization of the Euler-Lagrange equation
Problem: the optimization is badly conditioned as the gradient tends to 0 (uniform regions)
Reformulation of the regularization with a primal-dual method:
• A dual variable p is introduced so that the TV norm is expressed as a maximization over p
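The standard identity behind this reformulation, written here for an unweighted TV norm of a function $u$ (the symbols $u$ and $\Omega$ are assumptions; the slide's own equation is not available):

```latex
\int_\Omega |\nabla u| \, d\mathbf{x}
= \max_{\|p\|_\infty \le 1} \int_\Omega p \cdot \nabla u \, d\mathbf{x},
\qquad
\text{since } |x| = \max_{|p| \le 1} \, p \cdot x
\ \text{ (attained at } p = x / |x| \text{ for } x \ne 0\text{)}
```

The right-hand side is linear in $\nabla u$, so the division by $|\nabla u|$ that makes the Euler-Lagrange equation ill-conditioned in uniform regions no longer appears.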

Increasing solution accuracy?
[Figure: before; after one iteration]
Approach:
• Q is well modeled, so a Newton step on Q is performed to update the estimate a
• Equivalent to using Epsilon?
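A single Newton step of this kind can be sketched with finite differences. The sketch assumes that the quantity being refined is sampled at uniformly spaced hypotheses and that derivatives are taken numerically around the discrete minimum; `newton_refine` is a hypothetical helper, not a function from DTAM.

```python
def newton_refine(energies, depths, k):
    """Refine hypothesis k with one Newton step from central finite differences."""
    if k == 0 or k == len(depths) - 1:
        return depths[k]                      # no neighbours on both sides
    h = depths[k + 1] - depths[k]             # assumes uniform spacing
    grad = (energies[k + 1] - energies[k - 1]) / (2.0 * h)
    curv = (energies[k + 1] - 2.0 * energies[k] + energies[k - 1]) / (h * h)
    if curv <= 0.0:
        return depths[k]                      # not locally convex: keep the sample
    return depths[k] - grad / curv


depths = [0.0, 0.5, 1.0, 1.5, 2.0]
energies = [(d - 1.2) ** 2 + 0.3 for d in depths]    # toy energy, minimum at 1.2
k = min(range(len(depths)), key=energies.__getitem__)
refined = newton_refine(energies, depths, k)
```

For a quadratic energy the step lands on the continuous minimum, which lies between two samples; for other shapes it is only an approximation.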

Dense tracking
Inputs:
• 3D textured model of the scene
• Pose at the previous frame
Tracking as a registration problem:
• First, inter-frame rotation estimation: the previous image is aligned with the current image to estimate a coarse inter-frame rotation
• The estimated pose is used to project the 3D model into a 2.5D image
• The 2.5D image is registered with the current frame to find the current pose
Two template matching problems.

SSD optimisation
Problem: align a template image T(x) with an input image I(x).
Formulation: find the transformation that best maps the pixels of the template onto those of the current image by minimizing the sum of squared differences (SSD) between them, where p are the displacement parameters to be optimized.
Hypothesis: a coarse approximation p0 of the template position is known.
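In the notation of [Baker IJCV04], where $W(\mathbf{x}; \mathbf{p})$ denotes the warp with parameters $\mathbf{p}$ (this symbol is an assumption, since the slide's own equation is not available), the objective can be written as:

```latex
\hat{\mathbf{p}} = \arg\min_{\mathbf{p}} \sum_{\mathbf{x}} \big[ I(W(\mathbf{x}; \mathbf{p})) - T(\mathbf{x}) \big]^2
```

The sum runs over the pixels $\mathbf{x}$ of the template, and the optimization starts from $\mathbf{p}_0$.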

SSD optimisation
Problem: minimize the SSD. The current estimate of p is iteratively updated to reach the minimum of the function.
Formulations:
• Direct additive
• Direct compositional
• Inverse
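For reference, these formulations can be written as follows in the notation of [Baker IJCV04], with warp $W(\mathbf{x}; \mathbf{p})$ and increment $\Delta\mathbf{p}$. Reading "Inverse" as the inverse compositional variant is an assumption:

```latex
\begin{aligned}
\text{Direct additive:} \quad
& \min_{\Delta\mathbf{p}} \sum_{\mathbf{x}} \big[ I(W(\mathbf{x}; \mathbf{p} + \Delta\mathbf{p})) - T(\mathbf{x}) \big]^2,
&& \mathbf{p} \leftarrow \mathbf{p} + \Delta\mathbf{p} \\
\text{Direct compositional:} \quad
& \min_{\Delta\mathbf{p}} \sum_{\mathbf{x}} \big[ I(W(W(\mathbf{x}; \Delta\mathbf{p}); \mathbf{p})) - T(\mathbf{x}) \big]^2,
&& W(\mathbf{x}; \mathbf{p}) \leftarrow W(\mathbf{x}; \mathbf{p}) \circ W(\mathbf{x}; \Delta\mathbf{p}) \\
\text{Inverse compositional:} \quad
& \min_{\Delta\mathbf{p}} \sum_{\mathbf{x}} \big[ T(W(\mathbf{x}; \Delta\mathbf{p})) - I(W(\mathbf{x}; \mathbf{p})) \big]^2,
&& W(\mathbf{x}; \mathbf{p}) \leftarrow W(\mathbf{x}; \mathbf{p}) \circ W(\mathbf{x}; \Delta\mathbf{p})^{-1}
\end{aligned}
```

In the inverse variant the roles of template and image are swapped, so the Jacobian and Hessian depend only on the template and can be precomputed.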

SSD optimisation
Example: direct additive method
• Minimize the SSD with respect to an increment Δp added to the current estimate p
• Linearize with a first-order Taylor expansion
• Solve for the increment in closed form
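The direct additive iteration can be illustrated on a one-dimensional toy problem. The sketch assumes a pure translation warp W(x; p) = x + p and an analytically defined image, so that no interpolation is needed; the functions `image`, `image_grad` and `align_translation` are hypothetical.

```python
import math


def image(x):
    """Continuous 1D input image I(x): a smooth bump (toy data)."""
    return math.exp(-((x - 5.0) ** 2) / 4.0)


def image_grad(x):
    """Analytic derivative of I(x)."""
    return -(x - 5.0) / 2.0 * image(x)


def align_translation(template, xs, p0, iterations=20):
    """Direct additive Gauss-Newton for the warp W(x; p) = x + p."""
    p = p0
    for _ in range(iterations):
        num = 0.0
        den = 0.0
        for x, t in zip(xs, template):
            residual = image(x + p) - t       # I(W(x; p)) - T(x)
            jac = image_grad(x + p)           # dI/dp, since dW/dp = 1
            num += jac * residual
            den += jac * jac                  # 1x1 Gauss-Newton Hessian
        p -= num / den                        # delta_p = -H^-1 * sum(J * residual)
    return p


xs = [0.25 * i for i in range(41)]            # sample positions 0..10
true_shift = 0.7
template = [image(x + true_shift) for x in xs]
estimate = align_translation(template, xs, p0=0.0)
```

Each pass recomputes the image gradient at the current warp, which is the cost the inverse formulation avoids.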

SSD robustified
Problem: under occlusion, the occluded pixels shift the optimum of the function, so they have to be excluded from the optimization.
Method:
• Only the pixels whose difference is below a threshold are selected
• The threshold is iteratively updated to become more selective as the optimization approaches the optimum
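The selection scheme can be sketched on a deliberately simple stand-in problem: estimating a constant brightness offset between template and image instead of a full warp. The source does not state how the threshold is updated, so the geometric `decay` and the `floor` below are assumptions, as is the name `robust_offset`.

```python
def robust_offset(template, image, threshold=1.0, decay=0.7, floor=0.05, iterations=15):
    """Estimate b minimising sum((image - template - b)^2) over selected pixels."""
    b = 0.0
    for _ in range(iterations):
        residuals = [i - t - b for t, i in zip(template, image)]
        selected = [r for r in residuals if abs(r) < threshold]
        if not selected:
            break                                   # nothing passes: keep last estimate
        b += sum(selected) / len(selected)          # least-squares update on inliers
        threshold = max(floor, threshold * decay)   # become more selective
    return b


template = [0.1 * k for k in range(20)]
image = [t + 0.3 for t in template]
image[5:9] = [5.0, 5.0, 5.0, 5.0]                   # "occluded" pixels
plain = sum(i - t for t, i in zip(template, image)) / len(image)
robust = robust_offset(template, image)
```

The plain least-squares estimate is pulled far from the true offset of 0.3 by the four occluded pixels, while the thresholded estimate ignores them.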

Template matching
Applications to DTAM:
• First rotation estimation: the template is the previous image, which is matched with the current image. The warp is defined on the space of all rotations. The initial estimate of p is the identity.
• Full pose estimation: the template is the 2.5D image and the warp is defined by the full 3D motion, that is, a rigid-body transformation (rotation and translation). The initial pose is given by the pose estimated at the previous frame and the inter-frame rotation estimate.

Conclusion
• First live fully dense reconstruction system...
• Limitation from the smoothness assumption on depth...

Important references
• [Pock Thesis08] Fast total variation for Computer Vision
• [Baker IJCV04] Lucas-Kanade 20 years on: A unifying framework
