Random Forest and Graph Cut based segmentation of human limbs. Nadezhda Zlateva, IICT-BAS. 7 Sept. 2011.
Outline
• Human Pose Recognition
• Case Study
• Randomized Decision Tree
• Random Forest
• Experimental results with RF
• Graph Cut
• Experimental results with GC
• Application to hand classification
• Conclusion
• References
Human Pose Recognition
Recognition via:
• conventional intensity cameras
• depth cameras
Frame-to-frame point tracking is slow to re-initialize.
Pose recognition in parts [1] Shotton et al., 11:
• Body part segmentation: per-pixel classification
• 3D skeletal joint estimation
Case Study
• Upper limb segmentation for hand gesture recognition
• Applications:
  • Sign language interpretation
  • Medical environments
    • Robotic medical assistants [Purdue University]
    • CT & MRI review in sterile environments [Sunnybrook Hospital, Toronto]
Binary Decision Tree: Basics
[Figure: binary tree with nodes numbered 1 to 17. A feature vector v enters at the root; split nodes route it left (<) or right (≥); leaf nodes assign a category c.]
DT over depth images: Training
• Feature vector: pixel x = [x, y, z]^T of depth image I
• Split function: depth comparison features fθ as a function of x [1] Shotton, 11
• dI(x): depth at pixel x
• Combination of weak but computationally efficient features
[Figure: example features θ1 and θ2]
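The depth comparison feature can be sketched as follows, assuming the form given in [1]: fθ(I, x) = dI(x + u/dI(x)) - dI(x + v/dI(x)), where θ = (u, v) is a pair of pixel offsets. The helper names, the background constant, and the toy depth map are illustrative assumptions, not taken from the slides.

```python
def depth_at(depth, x, y, background=1e6):
    """Depth at pixel (x, y); probes that fall off the image read as far background."""
    h, w = len(depth), len(depth[0])
    if 0 <= y < h and 0 <= x < w:
        return depth[y][x]
    return background


def depth_feature(depth, x, y, u, v):
    """f_theta(I, x) = d_I(x + u/d_I(x)) - d_I(x + v/d_I(x)), with theta = (u, v).

    Dividing the offsets by the depth at x makes the probe pattern
    shrink for far pixels and grow for near ones (depth invariance).
    """
    d = depth_at(depth, x, y)
    ux, uy = x + int(round(u[0] / d)), y + int(round(u[1] / d))
    vx, vy = x + int(round(v[0] / d)), y + int(round(v[1] / d))
    return depth_at(depth, ux, uy) - depth_at(depth, vx, vy)


# Toy 4x4 depth map (hypothetical): a near object (1.0) on a far wall (2.0).
depth = [
    [2.0, 2.0, 2.0, 2.0],
    [2.0, 1.0, 1.0, 2.0],
    [2.0, 1.0, 1.0, 2.0],
    [2.0, 2.0, 2.0, 2.0],
]

# Probe one pixel to the right and one to the left of pixel (1, 1).
f = depth_feature(depth, 1, 1, u=(1, 0), v=(-1, 0))
```

Here the right probe lands on the object and the left probe on the wall, so the feature is negative; a single such comparison is weak, which is why many are combined in a tree.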
Randomized DT: Training
• Randomly select a set of split candidates ϕ = (θ, τ) for tree t, where θ are the feature parameters and τ is the set of split thresholds for each θ.
• Define the set of training pixels Q = {(I, x)} over all training images for tree t. Q is the set of pixels at the root node.
• Find the best split candidate ϕ* at node n: the one with the largest information gain from splitting Q into Qleft and Qright.
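Split selection by information gain can be sketched as below. This is a minimal illustration assuming Shannon entropy over class labels and the convention that feature values below τ go left and the rest go right; the function names and sample values are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())


def information_gain(samples, tau):
    """Gain from splitting (feature_value, label) pairs at threshold tau."""
    left = [c for f, c in samples if f < tau]
    right = [c for f, c in samples if f >= tau]
    n = len(samples)
    parent = entropy([c for _, c in samples])
    return parent - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)


def best_threshold(samples, thresholds):
    """Pick the candidate threshold with the largest information gain."""
    return max(thresholds, key=lambda tau: information_gain(samples, tau))


# Feature responses of four training pixels for one candidate theta (toy values).
samples = [(-1.0, "hand"), (-0.5, "hand"), (0.4, "arm"), (0.9, "arm")]
best = best_threshold(samples, [-0.8, 0.0, 0.7])
gain = information_gain(samples, best)
```

In a full trainer the same search would also run over every candidate θ, keeping the (θ, τ) pair with the highest gain.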
Randomized DT: Training
• Recurse on Qleft(ϕ*) and Qright(ϕ*) until a stop condition is reached:
  • Maximum depth
  • Minimum information gain
  • Minimum number of node pixels
• Estimate Pt(c|I,x) at each leaf node over body part labels c, using a normalized histogram
Note: a single tree is
• dependent on the choice of parameters
• prone to over-fitting
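The stop conditions and the leaf estimate can be sketched as follows. The default maximum depth of 15 matches the experiments reported later in the slides; the minimum gain and minimum pixel count are placeholder values chosen for illustration, and the function names are hypothetical.

```python
from collections import Counter


def should_stop(depth, n_pixels, best_gain,
                max_depth=15, min_gain=0.01, min_pixels=20):
    """True when any stop condition holds (min_gain and min_pixels are assumed values)."""
    return depth >= max_depth or n_pixels < min_pixels or best_gain < min_gain


def leaf_distribution(labels, classes):
    """Normalized label histogram, used as P_t(c | I, x) at a leaf.

    Assumes the leaf received at least one training pixel.
    """
    counts = Counter(labels)
    total = len(labels)
    return {c: counts[c] / total for c in classes}


# Labels of the training pixels that reached one leaf (toy values).
dist = leaf_distribution(["hand", "hand", "arm", "hand"],
                         ["hand", "arm", "background"])
```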
Random Forest
Forest: an ensemble of T decision trees [3] Breiman 01, [1] Shotton et al. 11
• Divide the training (depth) images into T subsets, a unique subset for each tree t
• Train each tree
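Random Forest: Classification
• Classification: pixel x is passed down every tree t1 … tT, each of which outputs a label c
[Figure: pixel x descending trees t1 … tT, each ending in a label c]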
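One common way to combine the trees, and the one described in [1], is to average the per-tree leaf distributions Pt(c|I,x) and take the most probable label. The sketch below assumes that rule; the slide's own formula did not survive extraction, and the function names and probabilities are illustrative.

```python
def forest_posterior(tree_posteriors):
    """Average the per-tree distributions P_t(c | I, x) over all T trees."""
    T = len(tree_posteriors)
    classes = tree_posteriors[0].keys()
    return {c: sum(p[c] for p in tree_posteriors) / T for c in classes}


def classify(tree_posteriors):
    """Label with the highest averaged probability."""
    avg = forest_posterior(tree_posteriors)
    return max(avg, key=avg.get)


# Leaf distributions reached by one pixel in T = 3 trees (toy values).
per_tree = [
    {"hand": 0.7, "arm": 0.3},
    {"hand": 0.4, "arm": 0.6},
    {"hand": 0.6, "arm": 0.4},
]
label = classify(per_tree)
```

The second tree alone would say "arm", but the average over the ensemble favors "hand", which is the robustness effect an ensemble is meant to provide.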
Random Forest: Toy demo [2] Shotton et al. 09
Random Forest: Summary
• Improves generalization to new data
• Ensemble of trees gives robustness
• Good for multi-class problems
• Resistant to over-fitting
• Fast training on large data sets
• Efficient classifier
RF: Experiments and results
• Ground truth: 500 labeled (upper limb) depth images (640x480)
• Number of trees: T = 3
• Tree depth: 15
• Split candidates: |θ| = 100, |τ| = 20 for each θ
• Random pixels per image: 1000
• 5-fold cross validation: 100 test images and 400 training images per fold, i.e. about 130 training images per tree
Table 1. Average per class accuracy with RF classification
RF: Experiments and results
[Figure panels: Ground truth & training; Per pixel classification]
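The slides do not define "average per class accuracy". A plausible reading, assumed here, is the mean over classes of the fraction of each class's pixels that are labeled correctly, so that small classes such as hands count as much as large ones. The function name and the toy labels are hypothetical.

```python
def average_per_class_accuracy(true_labels, predicted_labels):
    """Mean over classes of (correct pixels of class c) / (all pixels of class c)."""
    correct, total = {}, {}
    for t, p in zip(true_labels, predicted_labels):
        total[t] = total.get(t, 0) + 1
        correct[t] = correct.get(t, 0) + (1 if t == p else 0)
    per_class = {c: correct[c] / total[c] for c in total}
    return sum(per_class.values()) / len(per_class), per_class


# Six test pixels (toy values): hand is 1/2 correct, arm is 3/4 correct.
truth = ["hand", "hand", "arm", "arm", "arm", "arm"]
pred = ["hand", "arm", "arm", "arm", "arm", "hand"]
avg, per_class = average_per_class_accuracy(truth, pred)
```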
Segmentation by Graph Cut: Motivation
RF classification results:
• Fuzzy body part boundaries
• Left/right uncertainty
Subsequent hand sign recognition requires cleaner hand region segmentation.
Graph Cut framework:
• Energy minimization framework
• Binary and multi-label image segmentation
• Combines local and contextual information
Pixel labeling problem [4] Boykov et al. 01
Given:
• Pixels
• Assignment cost U (unary potential)
• Separation cost B (boundary potential), defined over pairs of neighboring pixels
Find: labels that minimize the total of assignment and separation costs
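The quantity being minimized can be sketched as below. The exact energy on the slide was lost in extraction; this example assumes the common form E(L) = Σ_p U_p(L_p) + w · Σ_(p,q) [L_p ≠ L_q], i.e. a constant separation cost for every neighboring pair with different labels. The function name, weight, and cost values are illustrative.

```python
def labeling_energy(labels, unary, neighbors, boundary_weight=1.0):
    """Assignment costs plus a constant penalty per label change between neighbors."""
    assignment = sum(unary[p][labels[p]] for p in range(len(labels)))
    separation = sum(boundary_weight for p, q in neighbors if labels[p] != labels[q])
    return assignment + separation


# Three pixels in a row (toy costs); the middle one weakly prefers "arm".
unary = [
    {"hand": 0.1, "arm": 2.0},
    {"hand": 1.2, "arm": 0.9},
    {"hand": 0.2, "arm": 1.8},
]
neighbors = [(0, 1), (1, 2)]

noisy = labeling_energy(["hand", "arm", "hand"], unary, neighbors)
smooth = labeling_energy(["hand", "hand", "hand"], unary, neighbors)
```

Labeling each pixel by its cheapest unary cost gives the noisy result, yet the smooth labeling has lower total energy because it pays no separation cost.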
Graph Cut: Binary case
• Image as a directed graph G(V, E)
• t-link: assignment cost
• n-link: separation cost
• Energy minimization problem = min s-t cut on G = max-flow
Theorem: In a graph G, the maximum possible source-to-sink flow equals the capacity of the minimum cut in G. [L. R. Foulds, Graph Theory Applications, 1992 Springer-Verlag New York Inc., 247-248]
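The binary case can be illustrated on a three-pixel row with a plain Edmonds-Karp max-flow, which is a swapped-in textbook algorithm rather than the solver used in the talk. Assumed conventions: the source stands for "hand" and the sink for "background"; the t-link s→p carries the cost of labeling p as background and p→t the cost of labeling it as hand, so a cut pays exactly the assignment costs of the resulting labeling. All costs are toy integers.

```python
from collections import deque


def max_flow_min_cut(capacity, s, t):
    """Edmonds-Karp max-flow. Returns (flow value, nodes on the source side of a min cut)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            a = queue.popleft()
            for b in range(n):
                if residual[a][b] > 0 and parent[b] == -1:
                    parent[b] = a
                    queue.append(b)
        if parent[t] == -1:
            break  # no path left: nodes reached by this search form the source side
        # Find the bottleneck capacity along the path, then push that much flow.
        push = float("inf")
        b = t
        while b != s:
            push = min(push, residual[parent[b]][b])
            b = parent[b]
        b = t
        while b != s:
            residual[parent[b]][b] -= push
            residual[b][parent[b]] += push
            b = parent[b]
        flow += push
    return flow, {i for i in range(n) if parent[i] != -1}


# Nodes: 0 = source ("hand"), 1..3 = pixels, 4 = sink ("background").
S, T = 0, 4
cap = [[0] * 5 for _ in range(5)]
cost_hand = {1: 1, 2: 6, 3: 1}        # paid if the pixel ends up on the source side
cost_background = {1: 9, 2: 5, 3: 9}  # paid if the pixel ends up on the sink side
for p in (1, 2, 3):
    cap[S][p] = cost_background[p]    # t-link cut when p is labeled background
    cap[p][T] = cost_hand[p]          # t-link cut when p is labeled hand
for p, q in ((1, 2), (2, 3)):
    cap[p][q] = cap[q][p] = 3         # n-links: separation cost

flow, source_side = max_flow_min_cut(cap, S, T)
labels = {p: ("hand" if p in source_side else "background") for p in (1, 2, 3)}
```

Pixel 2 on its own is cheaper as background (5 versus 6), but cutting both n-links would add 6, so the minimum cut labels all three pixels as hand at a total cost equal to the max-flow value.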
Graph Cut: Multi-label case
• Energy = cut cost
• Suboptimal approximation of the minimum energy
Graph Cut: Potentials [5] Boykov et al. 06
[Equations: energy function, unary potential, boundary potential. Annotated terms: prior constraints, importance weight, probability given by RF.]
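The exact potentials on this slide were lost in extraction. A minimal sketch of one common choice, assumed here, uses the negative log of the RF class probability as the unary potential and a contrast-sensitive term on the depth difference between neighbors as the boundary potential. The function names, eps, and sigma are illustrative and not taken from the talk.

```python
import math


def unary_potential(rf_probability, eps=1e-6):
    """Assignment cost: low when the RF is confident in the label (assumed -log form)."""
    return -math.log(max(rf_probability, eps))


def boundary_potential(depth_p, depth_q, sigma=0.05):
    """Separation cost: high between pixels of similar depth, low across depth edges."""
    return math.exp(-((depth_p - depth_q) ** 2) / (2.0 * sigma ** 2))


confident = unary_potential(0.9)
unlikely = unary_potential(0.1)
same_surface = boundary_potential(1.00, 1.01)
depth_edge = boundary_potential(1.00, 1.50)
```

With these choices a label change is cheap where the depth jumps, which encourages segment boundaries to follow object edges; an importance weight on the boundary term would trade this off against the RF probabilities.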
Graph Cut: Results Spatial Coherence:
Graph Cut: Results
[Figure panels: RF classifications; GC segmentation]
RF & GC for hands
• Ground truth: 63 frames, 500 random pixels, |Omax| = 45
• Random Forest: 58.5% per class accuracy
• Graph Cut: 70.9% per class accuracy
Conclusion
• RF: strong classifier
• RF + GC over depth maps: good object segmentation
Future Work
• Increase available data
• Improve pixel label inference
• Estimate upper limb/hand joints
• Recognize finger configuration
References
[1] Shotton, J., A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, A. Blake. Real-time Human Pose Recognition in Parts from a Single Depth Image. CVPR, 2011.
[2] Shotton, J. Boosting and Random Forest for Visual Recognition. ICCV Tutorial, 2009. http://www.iis.ee.ic.ac.uk/~tkkim/iccv09_tutorial
[3] Breiman, L. Random forests. Mach. Learning, 45(1):5–32, 2001. http://www.stat.berkeley.edu/~breiman/RandomForests
[4] Boykov, Y., and M. P. Jolly. Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In Proc. IEEE Int. Conf. on Computer Vision, 2001.
[5] Boykov, Y., and G. Funka-Lea. Graph cuts and efficient N-D image segmentation. IJCV, 70:109–131, 2006.