
Kinect Case Study


Presentation Transcript


  1. Kinect Case Study CSE P 576 Larry Zitnick (larryz@microsoft.com)

  2. Motorized base

  3. http://www.youtube.com/watch?v=dTKlNGSH9Po&feature=related

  4. Depth http://www.youtube.com/watch?v=inim0xWiR0o http://www.youtube.com/watch?v=7TGF30-5KuQ&feature=related

  5. Questions • Why a dot pattern? • Why a laser? • Why only one IR camera? • Is the dot pattern random? • Why is heat a problem? • How is it calibrated? • Why isn’t depth computed everywhere? • Would it work outside?

  6. Pose recognition • Research in pose recognition has been ongoing for 20+ years. • Many assumptions: multiple cameras, manual initialization, controlled/simple backgrounds.

  7. Model-Based Estimation of 3D Human Motion, Ioannis Kakadiaris and Dimitris Metaxas, PAMI 2000

  8. Tracking People by Learning Their Appearance, Deva Ramanan, David A. Forsyth, and Andrew Zisserman, PAMI 2007

  9. Kinect • Why does depth help?

  10. Algorithm design Shotton et al. proposed two main steps: 1. Find body parts. 2. Compute joint positions. Real-Time Human Pose Recognition in Parts from Single Depth Images, Jamie Shotton, Andrew Fitzgibbon, Mat Cook, Toby Sharp, Mark Finocchio, Richard Moore, Alex Kipman, and Andrew Blake, CVPR 2011.

  11. Finding body parts • What should we use for a feature? • What should we use for a classifier?

  12. Finding body parts • What should we use for a feature? • Difference in depth • What should we use for a classifier? • Random Decision Forests

  13. Features
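
  The features on this slide are the depth-comparison features from the Shotton et al. paper: each feature probes the depth map at two offsets around the pixel, with the offsets scaled by the inverse depth at that pixel so the response is roughly invariant to how far the person stands from the camera. A minimal sketch, assuming a NumPy depth map and a hypothetical "background" constant for probes that fall off the image:

```python
import numpy as np

BACKGROUND_DEPTH = 1e6  # hypothetical "very far" value for probes that fall off the image

def depth_feature(depth, x, u, v):
    """Depth-comparison feature in the style of Shotton et al.:
    f = d(x + u / d(x)) - d(x + v / d(x)).
    `depth` is an H x W depth map, `x` is a (row, col) pixel, and `u`, `v`
    are pixel offsets scaled by 1 / depth(x) for rough depth invariance."""
    d_x = depth[x[0], x[1]]

    def probe(offset):
        r = int(round(x[0] + offset[0] / d_x))
        c = int(round(x[1] + offset[1] / d_x))
        if 0 <= r < depth.shape[0] and 0 <= c < depth.shape[1]:
            return depth[r, c]
        return BACKGROUND_DEPTH  # off-image probes read as background

    return probe(u) - probe(v)
```

  Such features are extremely cheap (two lookups and a subtraction), which is why millions of them can be evaluated per frame on the console.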

  14. Classification Learning: Randomly choose a set of thresholds and features for splits. Pick the threshold and feature that provide the largest information gain. Recurse until a target accuracy or the maximum tree depth is reached.
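
  As a rough illustration of the split selection described above, the sketch below scores candidate (feature, threshold) pairs by information gain over body-part labels; the array layout and the way thresholds are sampled are assumptions made for the example, not the exact training procedure.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a set of body-part labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_split(responses, labels, n_thresholds=50):
    """Pick the (feature, threshold) pair with the largest information gain.
    `responses` is an (n_pixels, n_candidate_features) array of precomputed
    feature values; `labels` holds the body-part label of each pixel."""
    parent = entropy(labels)
    best_feature, best_threshold, best_gain = None, None, -np.inf
    for f in range(responses.shape[1]):
        r = responses[:, f]
        # sample candidate thresholds from the observed feature responses
        for t in np.random.choice(r, size=min(n_thresholds, len(r)), replace=False):
            left, right = labels[r < t], labels[r >= t]
            if len(left) == 0 or len(right) == 0:
                continue
            gain = parent - (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
            if gain > best_gain:
                best_feature, best_threshold, best_gain = f, t, gain
    return best_feature, best_threshold, best_gain
```

  Training recurses on the left and right pixel sets with fresh candidate features at every node.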

  15. Implementation details • 3 trees (depth 20) (why so few?) • 300k unique training images per tree. • 2000 candidate features and 50 thresholds per split. • One day of training on a 1000-core cluster. • Why RDFs and not AdaBoost, SVMs, etc.?

  16. Synthetic data

  17. Synthetic training/testing

  18. Real test

  19. Results

  20. Joint estimation • Apply mean-shift clustering to the labeled pixels. (why mean shift?) • “Push back” each mode to lie at the center of the part.
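
  A minimal mean-shift sketch follows: the labeled pixels (as 3D points) are shifted toward the weighted mean of their neighbours under a Gaussian kernel until they settle on density modes, which become the joint proposals. The bandwidth and merge tolerance here are illustrative values, not the ones used in the Kinect pipeline. Mean shift fits this step because it finds modes without fixing the number of clusters in advance and is fairly robust to mislabeled outlier pixels.

```python
import numpy as np

def mean_shift_modes(points, bandwidth=0.065, iters=20, merge_tol=0.01):
    """Tiny mean-shift sketch over an (N, 3) array of 3D points (metres).
    Each point climbs toward the weighted mean of all points under a
    Gaussian kernel; points that converge to the same place are merged."""
    shifted = points.astype(float).copy()
    for _ in range(iters):
        for i in range(len(shifted)):
            d2 = np.sum((points - shifted[i]) ** 2, axis=1)
            w = np.exp(-d2 / (2 * bandwidth ** 2))      # Gaussian kernel weights
            shifted[i] = (w[:, None] * points).sum(axis=0) / w.sum()
    modes = []
    for p in shifted:
        # merge modes that ended up (almost) in the same place
        if not any(np.linalg.norm(p - m) < merge_tol for m in modes):
            modes.append(p)
    return np.array(modes)
```

  The "push back" on the slide then moves each surviving mode from the body surface toward the interior so it sits at the part's center rather than on the skin.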

  21. Results

  22. Failures • Why would the system fail?

  23. Video • http://research.microsoft.com/pubs/145347/CVPR%202011%20-%20Final%20Video.mp4 Story about the making of Kinect: http://www.wired.co.uk/magazine/archive/2010/11/features/the-game-changer
