Shape-Based Human Detection and Segmentation via Hierarchical Part-Template Matching

Zhe Lin, Member, IEEE, and Larry S. Davis, Fellow, IEEE. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, APRIL 2010.

Presentation Transcript


  1. Shape-Based Human Detection and Segmentation via Hierarchical Part-Template Matching • Zhe Lin, Member, IEEE • Larry S. Davis, Fellow, IEEE • IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, APRIL 2010

  2. Overview • Introduction • Previous Work • Proposed Approach • Hierarchical Part-Template Matching • Pose-Adaptive Descriptors • Combining With Calibration And Background Subtraction • Experiment Result • Conclusion

  4. Introduction • Robust human tracking and identification depend heavily on reliable human detection and segmentation. • The problem remains challenging because of variations in body posture, illumination, occlusion, and viewpoint. • Goal: Develop a robust and efficient approach to human detection and segmentation. • Method: Shape-based, part-template matching

  6. Previous Work • Shape Feature extraction schemes • Model human shapes globally [1],[2],[3] • Model shapes using sparse local features [9],[10],[11] • Learning Perspective • Generative approach – tree-based data structure [6],[7],[8] • Discriminative approach – using SVMs as the test classifiers [3] • Surveillance scenarios • Motion blob information [35],[36]

  8. Proposed Approach • A hierarchical part-template matching approach combined with discriminative learning.

  10. Hierarchical Part-Template Matching • Generating the part-template tree model • Synthesizing global shape models • Generating parts by decomposition • Constructing an initial tree model using parts • Learning the part-template tree • Hierarchical part-template matching

  11. Synthesizing Global Shape Models • The articulation of the human body is modeled with six regions • Head, torso, pair of upper legs, pair of lower legs • The parameters of these six regions are quantized into {3,2,3,3,3,3} levels
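
As a rough illustration of what the quantization implies, the sketch below enumerates every combination of quantized parameter indices. The slide gives only the level counts {3,2,3,3,3,3}; which count belongs to which region, and the idea that every combination yields one global shape model, are assumptions made here for illustration. The name `enumerate_shape_models` is hypothetical.

```python
from itertools import product

# Quantization levels per articulation parameter, as listed on the slide.
# The mapping of each entry to a specific body region is not stated there.
LEVELS = (3, 2, 3, 3, 3, 3)


def enumerate_shape_models(levels=LEVELS):
    """Return every combination of quantized parameter indices (hypothetical helper)."""
    return list(product(*(range(n) for n in levels)))


models = enumerate_shape_models()
print(len(models))  # 486, i.e. 3 * 2 * 3 * 3 * 3 * 3
```

Under these assumptions the full Cartesian product contains 486 index tuples.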

  12. Generating Parts by Decomposition • Binarize (a) to obtain (b), then extract the boundaries of the silhouettes to get (c). • Silhouettes are decomposed into three parts (head-torso, upper legs, and lower legs). • The parameters of the silhouettes are denoted by θj and consist of an index and a location.

  13. Constructing an Initial Tree Model Using Parts • A part-template tree is constructed by placing the decomposed part regions or fragments into a tree. • The four layers L0 to L3 denote the root, head-torso, upper legs, and lower legs, respectively. • The tree consists of 186 part-templates (6 ht models, 18 ul models, and 162 ll models). • A much larger set improves performance only slightly. • A fast hierarchical shape matching scheme is applied.

  14. Constructing an Initial Tree Model Using Parts

  15. Learning the Part-Template Tree • The tree doesn’t contain any prior statistics from real human silhouettes. • The learning is performed by matching the tree to a set of real human silhouette images. • The goal is to explicitly estimate branching probability distributions (conditional probability distributions).

  16. Learning the Part-Template Tree • Learning method: • Each training silhouette is passed through the tree from the root to estimate the matching scores and find the optimal path. • Based on the set of optimal paths, a branching probability distribution is estimated for each node. • Each node contains a binary image of the part-template, its sample point coordinates, and a branching probability.
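
The learning step above can be sketched as counting how often each child follows each parent on the optimal paths and normalizing the counts. The slide says only that a branching distribution is estimated from the set of paths; the relative-frequency estimate, the function name `estimate_branching`, and the node labels are assumptions made here for illustration.

```python
from collections import Counter, defaultdict


def estimate_branching(paths):
    """Estimate P(child | parent) by relative frequency over optimal paths.

    Each path is a sequence of node ids from the root to a leaf.
    This estimator is an illustrative assumption, not taken from the slides.
    """
    counts = defaultdict(Counter)
    for path in paths:
        for parent, child in zip(path, path[1:]):
            counts[parent][child] += 1
    return {
        parent: {child: n / sum(children.values()) for child, n in children.items()}
        for parent, children in counts.items()
    }


# Hypothetical optimal paths found for four training silhouettes.
paths = [
    ("root", "ht0", "ul1", "ll4"),
    ("root", "ht0", "ul1", "ll5"),
    ("root", "ht1", "ul3", "ll9"),
    ("root", "ht0", "ul2", "ll7"),
]
probs = estimate_branching(paths)
print(probs["root"])  # {'ht0': 0.75, 'ht1': 0.25}
```

Children that never appear on a training path get no entry here; a real system would likely need smoothing so that unseen branches keep a nonzero probability.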

  17. Hierarchical Part-Template Matching • The matching model is similar to the one used for tree learning. • The overall matching score for a detection window is modeled simply as the sum of the scores of all nodes along the path. • The score of a node is the product of its part-template matching score and the probability of the node. • The matching method is similar to Chamfer matching [6]. • The matching score of a sample point on the contour is measured by edge-orientation matching to find the optimal human pose. [6] D.M. Gavrila and V. Philomin, “Real-Time Object Detection for SMART Vehicles,” Proc. IEEE
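
A minimal sketch of the path score described above: each node contributes its part-template matching score times its branching probability, and the window score is the sum along the chosen path. The `Node` class, the `best_path` function, and all numbers are hypothetical. The search below is exhaustive for clarity, whereas the slides mention a fast hierarchical matching scheme whose details are not given here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    prob: float = 1.0  # branching probability of reaching this node from its parent
    children: list = field(default_factory=list)


def best_path(node, match_score):
    """Return (score, path) maximizing the sum of prob * match_score below `node`."""
    best_score, best_names = 0.0, []
    for child in node.children:
        sub_score, sub_names = best_path(child, match_score)
        total = child.prob * match_score(child.name) + sub_score
        if not best_names or total > best_score:
            best_score, best_names = total, [child.name] + sub_names
    return best_score, best_names


# Hypothetical two-layer tree and per-template matching scores.
root = Node("root", children=[
    Node("ht0", 0.75, [Node("ul0", 0.6), Node("ul1", 0.4)]),
    Node("ht1", 0.25, [Node("ul2", 1.0)]),
])
scores = {"ht0": 0.8, "ht1": 0.9, "ul0": 0.5, "ul1": 0.9, "ul2": 0.7}

score, path = best_path(root, lambda name: scores[name])
print(path)  # ['ht0', 'ul1']
```

In this toy example "ht1" has the higher raw matching score, but its low branching probability makes the "ht0" path win, which is the effect of weighting by the learned probabilities.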

  19. Pose-Adaptive Descriptors • Introduces a pose-adaptive feature computation method for detecting humans in images using an SVM. • Candidate detection windows are obtained in a manner similar to the HOG descriptor approach [3]. • Given a candidate detection window, hierarchical part-template matching is performed to estimate the optimal pose. • After the pose is estimated, the block features closest to each pose contour point are collected. [3] N. Dalal and B. Triggs, “Histograms of Oriented Gradients for Human Detection,” Proc. IEEE Conf.

  20. Pose-Adaptive Descriptors

  21. Low-Level Features • Similar to [3] • Given an image, calculate gradient magnitudes |G| and edge orientations O • Divide the image into nonoverlapping 8x8 cells, each represented by a histogram of edge orientations.
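
The low-level feature step can be sketched in plain Python as follows. Several details are assumptions rather than facts from the slides: nine orientation bins (suggested by the bin width π/9 on the next slide), weighting each vote by gradient magnitude in the style of HOG, central differences for the gradient, and using the gradient direction modulo π as the orientation. The function name `cell_orientation_histograms` is hypothetical.

```python
import math


def cell_orientation_histograms(image, cell=8, bins=9):
    """Return per-cell histograms of unsigned orientation, weighted by |G|.

    `image` is a list of equal-length rows of grey values.
    """
    height, width = len(image), len(image[0])
    rows, cols = height // cell, width // cell
    hist = [[[0.0] * bins for _ in range(cols)] for _ in range(rows)]
    for y in range(rows * cell):
        for x in range(cols * cell):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)              # |G|
            orientation = math.atan2(gy, gx) % math.pi  # O folded into [0, pi)
            # Bin index O / (pi / bins), clipped to guard against rounding at pi.
            b = min(int(orientation / (math.pi / bins)), bins - 1)
            hist[y // cell][x // cell][b] += magnitude
    return hist


# A 16x24 horizontal ramp: every gradient points along +x, so all votes land in bin 0.
ramp = [[float(x) for x in range(24)] for _ in range(16)]
hist = cell_orientation_histograms(ramp)
print(len(hist), len(hist[0]), len(hist[0][0]))  # 2 3 9
```

A 16x24 image yields a 2x3 grid of cells, each with a 9-bin histogram.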

  22. Pose Inference on the Low-Level Features • An optimal tree path is estimated based on the matching score. • Within the matching score, the part-template score is measured as an average of gradient magnitudes. • Matching score: Eq. (1), where B(t) = [O(t)/(π/9)] and h is the orientation histogram. • The average score of the part-template: Eq. (2).

  23. Representation Using Pose-Adaptive Descriptors • The global shape models are represented as a set of boundary points with corresponding edge orientations.

  25. Scene-to-Camera Calibration • To obtain a mapping between head points and foot points in the image, estimate the homography between the head plane and the foot plane in the image. • The head point is then ph = f(pf), where pf is an arbitrary foot point.
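
The foot-to-head mapping can be sketched as applying a 3x3 homography in homogeneous coordinates. The matrix `H` below is a made-up example, not a calibration from the paper, and `apply_homography` is a hypothetical helper name. Estimating the homography from point correspondences is outside the scope of this sketch.

```python
def apply_homography(H, point):
    """Map an image point (x, y) through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    return (u / w, v / w)


# Hypothetical foot-to-head homography: the head sits above the foot by an
# amount that grows with the foot's image row.
H = [
    [1.0, 0.0, 0.0],
    [0.0, 0.5, 10.0],
    [0.0, 0.0, 1.0],
]
foot = (100.0, 200.0)
head = apply_homography(H, foot)
print(head)  # (100.0, 110.0)
```

With such a mapping, the expected image height of a person standing at any foot point follows from the distance between the foot point and its mapped head point.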

  26. Combining With Background Subtraction • Find foot regions Rfoot = {x | γx ≥ ξ}. • Use part-template matching to find regions that may be legs. • Given the estimated human vertical axis vx and an adaptive rectangular window W(x,(w0,h0)), obtain the human detection. • Obtain the human segmentation.
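
The foot-region rule can be sketched as a thresholded foreground ratio. The slide does not define the quantity compared against ξ; the sketch assumes it is the proportion of foreground pixels in a window around x, and it uses a fixed-size centered window instead of the adaptive window W(x,(w0,h0)). Both choices, and the name `foot_candidates`, are assumptions for illustration.

```python
def foot_candidates(mask, window=(3, 3), threshold=0.5):
    """Return pixels (x, y) whose windowed foreground ratio is at least `threshold`.

    `mask` is a binary foreground image as a list of rows of 0/1 values.
    The window is clipped at the image border.
    """
    height, width = len(mask), len(mask[0])
    half_w, half_h = window[0] // 2, window[1] // 2
    candidates = []
    for y in range(height):
        for x in range(width):
            ys = range(max(0, y - half_h), min(height, y + half_h + 1))
            xs = range(max(0, x - half_w), min(width, x + half_w + 1))
            total = len(ys) * len(xs)
            foreground = sum(mask[j][i] for j in ys for i in xs)
            if foreground / total >= threshold:
                candidates.append((x, y))
    return candidates


# Toy foreground mask with a 2x2 blob in the middle.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
region = foot_candidates(mask, window=(3, 3), threshold=0.4)
print(region)  # [(1, 1), (2, 1), (1, 2), (2, 2)]
```

Because the test is a ratio over a neighborhood rather than a per-pixel check, isolated noise pixels in the background-subtraction mask tend to be rejected.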

  27. Combining With Calibration and Background Subtraction

  29. Experiment Result • Results of the human detector are presented on two public pedestrian data sets (INRIA and MIT-CBCL). • Results of the multiple occluded human detector are presented on three crowded image and video data sets. • The method is compared with other approaches using DET curves.

  30. Experiment of Detection Result

  31. Experiment of Detection Result • Better performance than HOG-SVM. • Detects humans and also segments their poses. • Can be further improved, since the model can be extended to cover more poses or articulations. • Successfully detects difficult poses that the HOG-based detector misses.

  32. Experiment of Detection Result

  33. Experiment of Detection Result

  34. Experiment of Segmentation Result • The pose model combined with the probabilistic hierarchical part-template matching algorithm gives very accurate segmentation on the MIT-CBCL and INRIA data sets.

  35. Experiment Without Subtraction

  36. Experiment Without Subtraction

  37. Experiment With Subtraction • Data sets • CAVIAR benchmark data set • Munich Airport data set collected by Siemens Corporate Research • Good results are obtained even with poor and inaccurate background subtraction.

  38. Experiment With Subtraction

  39. Experiment With Subtraction

  41. Conclusion • A hierarchical part-template matching approach is employed to match human shapes with images, detecting and segmenting humans simultaneously. • Many of the misdetections are due to pose estimation failures. • Future work • Investigating the addition of color and texture statistics to the local contextual descriptor to improve detection and segmentation performance.
