Real-Time Human Pose Recognition in Parts from Single Depth Images • Speaker: DengLei, I-VisionGroup
Outline • Introduction • Data • Algorithm • Experiments
Introduction: Human Body Tracking • Applications • Games, HCI, security, telepresence, health care, etc. • Depth camera • Kinect • Must handle the full range of body shapes, sizes, and motions • Existing systems • Fast tracking but slow re-initialization • Our system • Per-frame initialization • Complements tracking that uses temporal and kinematic coherence • Supports initialization and recovery from failure
Demo • Final video
General Description • Goals • Fast • Robust • Steps • Single depth image => dense probabilistic body part labeling, with parts defined near skeletal joints • Reproject the labeled pixels into 3D and generate confidence-weighted 3D joint proposals • Strategy • Per-pixel classification, each pixel evaluated independently • Synthetic depth images for training • Deep randomized decision forest classifier trained on the order of 10^5 samples • Simple depth comparison features with 3D translation invariance, suited to the GPU • Mean shift for 3D joint proposals
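The reprojection step can be sketched as follows. This is a minimal illustration assuming a pinhole camera model; the function name `backproject` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical and do not come from the slides.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Map every pixel (u, v) with depth z (meters) to a 3D camera-space point.

    Assumes a pinhole model; fx, fy, cx, cy are illustrative intrinsics.
    Returns an array of shape (height, width, 3) holding (X, Y, Z).
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # u: column index, v: row index
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

# Usage: a flat surface 2 m from the camera.
depth = np.full((4, 4), 2.0)
points = backproject(depth, fx=2.0, fy=2.0, cx=1.0, cy=1.0)
```

Each labeled pixel then contributes a 3D point to the joint proposal stage, weighted by its class probability.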
Testing Environment • Xbox 360 GPU • About 5 ms per frame • About one order of magnitude faster than existing approaches • Evaluated on both real and synthetic test data
Contributions • Main contributions • Treat pose estimation as object recognition, using a body part representation with parts localized near joints • Low computational cost and high accuracy • Insights • Synthetic depth data is a good proxy for real data • Scaling up the training set with synthetic data is important
Outline • Introduction • Data • Algorithm • Experiments
Data Description • Lack of training data • Computer graphics for color images is hampered by the color and texture variability caused by clothing, hair, and skin • Limitations of mocap • Depth imaging • Kinect depth camera: 640x480 at 30 fps, error on the order of 10^-2 m • Works in low light • Invariant to color and texture • Resolves silhouette ambiguities in pose • Simplifies background subtraction • Realistic depth images of people can be synthesized, so a large dataset can be built cheaply
Motion Capture Data • Capture real data • 500k frames • No need to record (added synthetically) • Rotation about the vertical axis • Left-right mirroring • Scene position and camera pose • Body shape and size • Furthest-neighbor clustering • Removes redundant, near-duplicate poses • Finally • 100k-pose subset in which no two poses are closer than 5 cm
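The pose subsampling can be sketched as a greedy furthest-point selection. This is an illustration, not the speaker's code: it assumes a non-empty array of poses stored as joint positions in meters, and it takes the distance between two poses to be the largest per-joint Euclidean distance. The function names are hypothetical.

```python
import numpy as np

def pose_distance(p, poses):
    """Largest per-joint Euclidean distance. p: (J, 3), poses: (N, J, 3) -> (N,)."""
    return np.linalg.norm(poses - p, axis=-1).max(axis=-1)

def furthest_neighbor_subset(poses, min_dist=0.05, max_size=None):
    """Greedily pick poses so that every selected pair is at least min_dist apart."""
    n = len(poses)
    limit = n if max_size is None else min(n, max_size)
    selected = [0]  # arbitrary seed pose
    nearest = pose_distance(poses[0], poses)  # distance to the closest selected pose
    while len(selected) < limit:
        idx = int(np.argmax(nearest))  # pose furthest from the current subset
        if nearest[idx] < min_dist:
            break  # every remaining pose is redundant
        selected.append(idx)
        nearest = np.minimum(nearest, pose_distance(poses[idx], poses))
    return selected

# Usage: four single-joint poses along the x axis (meters).
xs = np.array([0.0, 0.01, 0.10, 0.11])
poses = np.zeros((4, 1, 3))
poses[:, 0, 0] = xs
subset = furthest_neighbor_subset(poses, min_dist=0.05)
```

Stopping when the furthest remaining pose is within `min_dist` of the subset is what enforces the 5 cm spacing.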
Generating Synthetic Data • Randomized rendering pipeline • Goal: realism and variety • Computer graphics (CG) rendering + random parameters • 15 base body meshes spanning a range of shapes and sizes • Camera pose and noise, clothing and hair style, etc. • Comparison with real data
Outline • Introduction • Data • Algorithm • Experiments
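Body Part Inference • Labeling • Key contribution: body part representation • Color-coded parts that cover the joints and fill the gaps between them • 31 body parts: small enough to localize joints accurately, but not so numerous that classifier capacity is wasted • Can be adjusted to the application • Features • Simple depth comparisons • At most 5 arithmetic operations • At most 3 image pixel reads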
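The slide does not spell the feature out; the underlying paper (Shotton et al.) uses a depth comparison of the form f(I, x) = d(x + u/d(x)) - d(x + v/d(x)), where d is the depth at a pixel and u, v are 2D offsets. Dividing the offsets by the depth at x makes the response independent of distance to the camera. The sketch below follows that form; the `BACKGROUND` value, the rounding of probe positions, and the function names are assumptions made for illustration.

```python
import numpy as np

BACKGROUND = 1e6  # assumed large depth for background and off-image probes

def probe(depth, pos):
    """Read the depth at a (row, col) position, or BACKGROUND if off-image."""
    r, c = int(round(pos[0])), int(round(pos[1]))
    if 0 <= r < depth.shape[0] and 0 <= c < depth.shape[1]:
        return float(depth[r, c])
    return BACKGROUND

def depth_feature(depth, x, u, v):
    """Depth comparison feature at pixel x = (row, col) with offsets u and v.

    Three image reads: the depth at x and one probe for each offset.
    """
    d = float(depth[x])  # read 1: depth at the pixel, used to scale the offsets
    a = probe(depth, (x[0] + u[0] / d, x[1] + u[1] / d))  # read 2
    b = probe(depth, (x[0] + v[0] / d, x[1] + v[1] / d))  # read 3
    return a - b

# Usage: a flat 2 m surface with one pixel pushed back to 3 m.
depth = np.full((5, 5), 2.0)
depth[2, 4] = 3.0
response = depth_feature(depth, (2, 2), u=(0, 4), v=(0, 0))
off_image = depth_feature(depth, (2, 2), u=(0, 20), v=(0, 0))
```

A single feature is a weak signal; the decision forest combines many of them.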
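Random Forest • Classification • Training • Each tree is trained on a different set of synthesized images, with 2000 pixels sampled from each • Randomly propose a set of splitting candidates (feature parameters and threshold) • Partition the examples into left and right subsets • Choose the candidate with the largest information gain • Recursion condition • Gain is large enough and depth is below the maximum • Recurse on the left and right subsets • 3 trees, depth 20, 10^6 images: about 1 day on a 1000-core cluster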
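Choosing the split with the largest gain can be sketched as below. This is a minimal, unoptimized illustration that assumes the feature responses are precomputed into a matrix and that labels are non-negative integers; `entropy` and `best_split` are hypothetical names, and the real system evaluates candidates on a distributed cluster.

```python
import numpy as np

def entropy(labels, n_classes):
    """Shannon entropy (bits) of a label array; 0 for an empty set."""
    if len(labels) == 0:
        return 0.0
    p = np.bincount(labels, minlength=n_classes) / len(labels)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def best_split(features, labels, thresholds, n_classes):
    """Return (gain, feature_index, threshold) with the largest information gain.

    features: (N, F) responses of F candidate features on N pixels.
    Examples with response < threshold go left, the rest go right.
    """
    base = entropy(labels, n_classes)
    best = (0.0, None, None)
    for f in range(features.shape[1]):
        for tau in thresholds:
            left = features[:, f] < tau
            n_left = int(left.sum())
            n_right = len(labels) - n_left
            if n_left == 0 or n_right == 0:
                continue  # degenerate split, nothing gained
            children = (n_left * entropy(labels[left], n_classes)
                        + n_right * entropy(labels[~left], n_classes)) / len(labels)
            gain = base - children
            if gain > best[0]:
                best = (gain, f, tau)
    return best

# Usage: feature 0 separates the two classes, feature 1 is constant.
features = np.array([[0.0, 5.0], [1.0, 5.0], [2.0, 5.0], [3.0, 5.0]])
labels = np.array([0, 0, 1, 1])
gain, feature_index, threshold = best_split(features, labels, [0.5, 1.5, 2.5], n_classes=2)
```

A tree builder would call `best_split` at each node and recurse on the two subsets while the gain stays above a minimum and the depth is below the limit.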
Joint Proposals • Mean shift with a Gaussian kernel • Discards outliers • Smooth • Learned probability threshold per part • Mean shift starts from pixels of part c above the threshold • Parameter selection • Grid search on 5000 images
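Mode finding with a Gaussian kernel can be sketched as follows. This is an illustration under stated assumptions: `points` are the reprojected 3D pixels of one body part, `weights` are per-pixel confidences, and the start point lies close enough to the data that the kernel weights do not all underflow to zero. The function name and default parameters are hypothetical.

```python
import numpy as np

def mean_shift_mode(points, weights, start, bandwidth, n_iter=50, tol=1e-6):
    """Follow weighted Gaussian-kernel mean shift from `start` to a density mode.

    points: (N, 3) positions, weights: (N,) non-negative confidences.
    """
    x = np.asarray(start, dtype=float)
    for _ in range(n_iter):
        # Kernel weight of each point relative to the current estimate.
        k = weights * np.exp(-np.sum(((points - x) / bandwidth) ** 2, axis=1))
        new_x = (k[:, None] * points).sum(axis=0) / k.sum()
        if np.linalg.norm(new_x - x) < tol:
            return new_x
        x = new_x
    return x

# Usage: four points around (0, 0, 2) plus one distant outlier.
points = np.array([
    [-0.01, 0.0, 2.0],
    [0.01, 0.0, 2.0],
    [0.0, -0.01, 2.0],
    [0.0, 0.01, 2.0],
    [1.0, 1.0, 3.0],
])
weights = np.ones(len(points))
mode = mean_shift_mode(points, weights, start=[0.005, 0.0, 2.0], bandwidth=0.05)
```

The outlier receives a negligible kernel weight, so it does not pull the estimate away from the cluster.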
Outline • Introduction • Data • Algorithm • Experiments
Experiments • Parameters • 3 trees, depth 20, 300k images per tree, 2000 pixels per image • 2000 candidate features, 50 candidate thresholds per feature