Multi-camera Tracking of Articulated Human Motion using Motion and Shape Cues

Aravind Sundaresan and Rama Chellappa, Center for Automation Research, University of Maryland, College Park, MD, USA

What is motion capture? Motion capture (mocap) is the process of analysing and expressing human motion in mathematical terms. It involves initialisation, pose estimation and tracking. Applications: motion analysis for clinical studies, human-computer interaction, computer animation. Marker-based systems have shortcomings: they are cumbersome, introduce artefacts and are time-consuming. A marker-less system is desirable.

Calibration and human body model. We use eight cameras in our capture setup, recording 640x480 grey-scale images at 30 fps. The cameras are calibrated using the algorithm of Svoboda. We use an articulated human body model with super-quadrics for the body segments. The model is described by joint locations and super-quadrics; the pose is described by joint angles.
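
To make the body-segment representation concrete, here is a minimal sketch of a super-quadric inside-outside test. It assumes the standard superellipsoid form with semi-axes (a1, a2, a3) and shape exponents e1, e2; the slides do not state which super-quadric variant the authors use, so the function name and parameters are illustrative only.

```python
def superquadric_inside_outside(point, radii, e1=1.0, e2=1.0):
    """Standard superellipsoid inside-outside function (an assumed form).

    Returns a value < 1 inside the segment, 1 on its surface and > 1 outside.
    `point` is (x, y, z) in the segment's local frame, `radii` is (a1, a2, a3).
    With e1 = e2 = 1 the shape is an ordinary ellipsoid.
    """
    x, y, z = (abs(c) / r for c, r in zip(point, radii))
    xy_term = (x ** (2.0 / e2) + y ** (2.0 / e2)) ** (e2 / e1)
    return xy_term + z ** (2.0 / e1)


# A limb-like segment: thin in x and y, long in z (hypothetical dimensions).
limb_radii = (1.0, 2.0, 3.0)
centre_value = superquadric_inside_outside((0.0, 0.0, 0.0), limb_radii)
surface_value = superquadric_inside_outside((1.0, 0.0, 0.0), limb_radii)
outside_value = superquadric_inside_outside((0.0, 0.0, 6.0), limb_radii)
```

A test of this kind is one way to decide which model segment a 3-D point belongs to; smaller exponents give boxier segments.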

Overview. Use images from multiple cameras. Compute the 2-D pixel displacement between t and t+1. Predict the 3-D pose at t+1 using the pixel displacement. Compute a spatial energy function as a function of pose. Minimise the energy function to obtain the pose at t+1.

Tracking framework. Use motion and spatial cues for tracking. Motion cues use texture; they suffer from error accumulation because they estimate only the change in pose. Spatial cues are obtained from silhouettes, edges, etc.; they suffer from instability because solutions are stable only “locally”. The two are combined in a predictor-corrector framework. Predictor: compute motion(t) from pixel displacement, then predict pose(t+1) from pose(t) and motion(t). Corrector: assimilate the spatial cues into a single energy function, then correct pose(t+1) by minimising the energy function.
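
The predictor-corrector idea can be illustrated with a deliberately small toy. The sketch below is not the authors' tracker: the pose is a single scalar angle, the motion estimate carries a constant bias (standing in for error accumulation), and the spatial energy is a quadratic around the true pose that is searched only locally. All names and numbers are assumptions made for illustration.

```python
def predict(pose, motion):
    """Predictor: integrate the estimated change in pose."""
    return pose + motion


def correct(pose, energy, step=0.01, max_iters=200):
    """Corrector: local descent on the spatial energy, starting from the prediction."""
    for _ in range(max_iters):
        best = min((pose - step, pose, pose + step), key=energy)
        if best == pose:
            break
        pose = best
    return pose


true_motion = 0.10       # true change in pose per frame (toy value)
estimated_motion = 0.12  # biased motion estimate, so errors accumulate

drift_only = 0.0         # predictor alone
tracked = 0.0            # predictor followed by corrector
for frame in range(10):
    true_pose = true_motion * (frame + 1)

    def spatial_energy(p, target=true_pose):
        # Toy spatial cue: energy is lowest when the pose matches the image.
        return (p - target) ** 2

    drift_only = predict(drift_only, estimated_motion)
    tracked = correct(predict(tracked, estimated_motion), spatial_energy)
```

After ten frames the motion-only estimate has drifted by ten times the per-frame bias, while the corrected estimate stays within one search step of the true pose. The prediction matters because the corrector only searches near its starting point.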

Pixel registration and displacement. Project the model onto each image to obtain, for every pixel, its body-part label and 3-D location, together with a mask for each body part. Find dense pixel correspondence using a parametric optical-flow-based algorithm for each segment, minimising the mean squared error (MSE).
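
As a rough illustration of per-segment parametric flow, the sketch below estimates a single 2-D translation for the pixels inside one mask by minimising the linearised squared brightness difference (one Lucas-Kanade-style least-squares step). The translation-only motion model, the synthetic images and every name here are assumptions; the slides do not give the parametric model actually used.

```python
import math


def estimate_translation(img0, img1, mask_pixels):
    """One least-squares step for a translation (dx, dy) such that
    img1(x, y) is approximately img0(x - dx, y - dy) on the masked pixels."""
    sxx = sxy = syy = sxr = syr = 0.0
    for r, c in mask_pixels:
        ix = 0.5 * (img0[r][c + 1] - img0[r][c - 1])   # horizontal gradient
        iy = 0.5 * (img0[r + 1][c] - img0[r - 1][c])   # vertical gradient
        res = img1[r][c] - img0[r][c]                  # temporal difference
        sxx += ix * ix
        sxy += ix * iy
        syy += iy * iy
        sxr += ix * res
        syr += iy * res
    det = sxx * syy - sxy * sxy
    # Solve the 2x2 normal equations A d = -b.
    dx = -(syy * sxr - sxy * syr) / det
    dy = -(sxx * syr - sxy * sxr) / det
    return dx, dy


def make_image(shift_x, shift_y, size=20):
    """Smooth synthetic texture, shifted by a sub-pixel amount."""
    return [[math.sin(0.3 * (c - shift_x)) + math.cos(0.25 * (r - shift_y))
             for c in range(size)] for r in range(size)]


frame_t = make_image(0.0, 0.0)
frame_t1 = make_image(0.2, -0.1)   # ground-truth displacement (0.2, -0.1)
segment_mask = [(r, c) for r in range(1, 19) for c in range(1, 19)]
est_dx, est_dy = estimate_translation(frame_t, frame_t1, segment_mask)
```

A single step is only accurate for small displacements; richer parametric models (for example affine) follow the same pattern with more unknowns.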

Pose from pixel displacement. The problem is posed in a state-space formulation and linearised using a Taylor series; the pose is then estimated iteratively.
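
The linearise-and-iterate pattern can be sketched on a toy problem. The example below assumes a planar two-link chain (not the authors' body model) whose elbow and hand positions are observed after a small motion, and recovers the joint angles by repeatedly linearising the forward model with a first-order Taylor expansion and solving the resulting least-squares problem (Gauss-Newton). Link lengths, angles and function names are hypothetical.

```python
import math


def forward(theta, l1=1.0, l2=0.8):
    """2-D positions of the elbow and hand of a planar two-link chain."""
    t1, t2 = theta
    ex, ey = l1 * math.cos(t1), l1 * math.sin(t1)
    hx, hy = ex + l2 * math.cos(t1 + t2), ey + l2 * math.sin(t1 + t2)
    return [ex, ey, hx, hy]


def estimate_pose(theta, observed, iters=10, eps=1e-6):
    """Iteratively refine joint angles so that forward(theta) matches `observed`."""
    theta = list(theta)
    for _ in range(iters):
        f0 = forward(theta)
        resid = [o - f for o, f in zip(observed, f0)]
        # Finite-difference Jacobian: one column per joint angle.
        cols = []
        for k in range(2):
            bumped = list(theta)
            bumped[k] += eps
            cols.append([(b - a) / eps for a, b in zip(f0, forward(bumped))])
        # Normal equations (J^T J) delta = J^T resid, solved in closed form.
        a11 = sum(v * v for v in cols[0])
        a12 = sum(u * v for u, v in zip(cols[0], cols[1]))
        a22 = sum(v * v for v in cols[1])
        b1 = sum(u * r for u, r in zip(cols[0], resid))
        b2 = sum(u * r for u, r in zip(cols[1], resid))
        det = a11 * a22 - a12 * a12
        theta[0] += (a22 * b1 - a12 * b2) / det
        theta[1] += (a11 * b2 - a12 * b1) / det
    return theta


pose_t = (0.30, 0.50)                 # pose at time t
true_pose_t1 = (0.40, 0.35)           # unknown pose at time t+1
observed_t1 = forward(true_pose_t1)   # positions after the pixel displacement
estimated_t1 = estimate_pose(pose_t, observed_t1)
```

Because each step relies on a local linear approximation, the estimate is only reliable when the pose change between frames is small, which is the situation at 30 fps.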

Combine spatial cues. Combine multiple spatial cues into a single “spatial energy function”. Compute the pose energy as a function of dx, dy and Φ.

Minimise 3-D pose energy. Given multiple views and a 3-D pose: compute the 2-D pose for the ith image; compute Ei for the ith camera using the 2-D pose; form the 3-D pose energy E = E1 + E2 + ... + En; compute the minimum-energy pose using optimisation.
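
The structure of this step (project to each view, evaluate a per-camera energy, sum, minimise) can be sketched as follows. The example is a toy under stated assumptions: the "pose" is a single 3-D point, each camera is an orthographic projection that keeps two coordinate axes, the per-camera energy is a squared 2-D distance, and the optimiser is a simple pattern search. None of these choices come from the slides.

```python
def project(point, axes):
    """Toy orthographic camera: keep the two listed coordinate axes."""
    return tuple(point[a] for a in axes)


def camera_energy(point, axes, observed):
    """E_i: squared 2-D distance between the projection and the observation."""
    return sum((p - o) ** 2 for p, o in zip(project(point, axes), observed))


def total_energy(point, cameras):
    """E = E_1 + E_2 + ... + E_n over all cameras."""
    return sum(camera_energy(point, axes, obs) for axes, obs in cameras)


def minimise(point, cameras, step=0.5, tol=1e-6):
    """Pattern search: try +/- step on each coordinate, halve the step when stuck."""
    point = list(point)
    while step > tol:
        improved = False
        for k in range(3):
            for delta in (step, -step):
                trial = list(point)
                trial[k] += delta
                if total_energy(trial, cameras) < total_energy(point, cameras):
                    point = trial
                    improved = True
        if not improved:
            step *= 0.5
    return point


true_point = (0.3, -1.2, 2.0)
views = [(0, 1), (1, 2), (0, 2)]   # three hypothetical cameras
cameras = [(axes, project(true_point, axes)) for axes in views]
best_point = minimise((0.0, 0.0, 0.0), cameras)
```

Summing the per-camera terms lets views that see a body part clearly compensate for views where it is occluded or ambiguous; in practice a gradient-based optimiser would replace the pattern search.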

Tracking results
