Articulated Human Detection Student: Yao-Sheng Wang Advisor: Prof. Sheng-Jyh Wang Department of Electronics Engineering National Chiao Tung University Hsinchu, Taiwan Vision Lab 2012
Outline Introduction Related Works Idea Proposed Method Experimental Results Conclusion Reference
Outline • Introduction • Motivation • Challenge • Representative Works • Potential Problems • Target • Related Works • Idea • Proposed Method • Experimental Results • Conclusion • Reference
Motivation • Why do we care about human detection? • We are human beings! • Wide range of applications: • Automotive safety • Surveillance systems • Indoor care • Crime alerts • Human-computer interfaces, etc.
Challenge • What makes human detection so difficult? • Illumination conditions • Cluttered backgrounds • Changes of viewpoint • Occlusion • Differences in clothing • Diversity of human appearance • Pose variation
Challenge • Progress in machine learning technology • Makes it possible to handle more general and complicated cases. • Definition: • “Articulated Human Detection”.
Representative works (I) [P. Felzenszwalb, D. McAllester, and D. Ramanan. A discriminatively trained, multi-scale, deformable part model. In CVPR, 2008.] • Deformable Part Model • Root filter (mask). • Part filter (mask). • Penalty function.
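The three ingredients listed above (root filter, part filters, penalty function) can be sketched as a single scoring function. This is a simplified illustration, not the cited paper's implementation: it ignores the multi-scale feature pyramid and the bias term, and the names `dpm_score`, `toy_part_response` and `max_disp` are hypothetical.

```python
def dpm_score(root_score, parts, max_disp=2):
    """Score one root placement of a simplified deformable part model.

    root_score: response of the root filter at this placement.
    parts: list of (response, anchor, weights) where
        response(x, y) returns the part filter response at (x, y),
        anchor is the ideal part position (ax, ay),
        weights are (wx, wy, wxx, wyy) for the penalty
        wx*dx + wy*dy + wxx*dx^2 + wyy*dy^2.
    max_disp: hypothetical limit on the displacement searched per axis.
    """
    total = root_score
    for response, (ax, ay), (wx, wy, wxx, wyy) in parts:
        best = float("-inf")
        # Each part picks the displacement that best trades off
        # filter response against the deformation penalty.
        for dx in range(-max_disp, max_disp + 1):
            for dy in range(-max_disp, max_disp + 1):
                penalty = wx * dx + wy * dy + wxx * dx * dx + wyy * dy * dy
                best = max(best, response(ax + dx, ay + dy) - penalty)
        total += best
    return total


def toy_part_response(x, y):
    # Toy response map: the part "fires" only at (3, 4).
    return 5.0 if (x, y) == (3, 4) else 0.0


# The anchor is one pixel left of the peak, so the part pays a penalty of 1.
score = dpm_score(1.0, [(toy_part_response, (2, 4), (0.0, 0.0, 1.0, 1.0))])
```

In this toy case the part moves one pixel to reach the peak, so the score is the root response plus the part response minus the quadratic penalty.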
Representative works (II) [Lubomir Bourdev, Jitendra Malik. Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations. In ICCV, 2009.] • Poselet:
Potential Problems • Problems: • System complexity increases with the complexity of human poses. • More detectors are needed. • Exhaustive search. • Sliding-window method + image pyramid. • Both problems lead to speeds that are unacceptable for real-world applications.
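To make the cost of exhaustive search concrete, the sketch below counts how many windows a sliding-window scan over an image pyramid evaluates, and multiplies that by the number of detectors. The window size, stride and pyramid scale step are assumed values chosen for illustration, not figures from the thesis.

```python
def count_windows(img_w, img_h, win_w=64, win_h=128, stride=8, scale_step=1.2):
    """Count windows visited by a sliding-window scan over an image pyramid.

    All default parameters are illustrative assumptions.
    """
    total = 0
    scale = 1.0
    while True:
        # Size of the current pyramid level.
        w = int(img_w / scale)
        h = int(img_h / scale)
        if w < win_w or h < win_h:
            break  # the window no longer fits in this level
        nx = (w - win_w) // stride + 1
        ny = (h - win_h) // stride + 1
        total += nx * ny
        scale *= scale_step
    return total


def total_evaluations(n_detectors, img_w, img_h):
    # Every detector (e.g. one per pose or viewpoint) rescans every window.
    return n_detectors * count_windows(img_w, img_h)
```

Both factors multiply: adding detectors to cover more poses scales the total linearly, on top of a window count that already grows with image area.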
Target • Target of the thesis: • Propose a detection scheme that achieves acceptable detection speed while handling the high intra-class variation caused by changes of pose and viewpoint.
Related works • Better features: • Cheap to compute while still capturing crucial information. Ex: HOG. • Better classifiers: • Linear classifiers. • Ex: AdaBoost, linear SVM and random forests. • Better prior knowledge: • Ex: Information about the ground plane.
Related works [P. Felzenszwalb, R. Girshick, D. McAllester. Cascade Object Detection with Deformable Part Models. In CVPR, 2010.] • Cascades: • Cascade the part filters to reduce the search region.
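The cascade idea can be sketched as early rejection on a running score: stages are evaluated one at a time, and a hypothesis is dropped as soon as its partial score falls below that stage's threshold. This is a generic illustration under assumed names (`cascade_evaluate`, `make_stage`) and made-up thresholds, not the procedure of the cited paper.

```python
def cascade_evaluate(stages, thresholds):
    """Evaluate stages lazily, rejecting as soon as the partial score is too low.

    stages: list of zero-argument callables, each returning one stage score.
    thresholds: one pruning threshold per stage (illustrative values).
    Returns (score or None if rejected, number of stages actually evaluated).
    """
    total = 0.0
    for used, (stage, threshold) in enumerate(zip(stages, thresholds), start=1):
        total += stage()
        if total < threshold:
            return None, used  # rejected early; later stages are never run
    return total, len(stages)


calls = []


def make_stage(value):
    # Hypothetical stage that records when it is actually evaluated.
    def stage():
        calls.append(value)
        return value
    return stage


rejected = cascade_evaluate(
    [make_stage(-2.0), make_stage(1.0), make_stage(1.0)], [-1.0, 0.0, 0.0]
)
calls_after_reject = list(calls)
accepted = cascade_evaluate([make_stage(1.0), make_stage(0.5)], [0.0, 1.0])
```

The first hypothesis is discarded after a single stage, which is where the speed-up comes from: most background locations never reach the expensive later filters.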
Related works • Discard unpromising hypotheses. • Class-dependent: • Branch and bound. (CVPR, 2008) • Class-independent: • What is an object? (CVPR, 2010) • Closed boundary, distinct appearance or saliency. • Segmentation as selective search. (ICCV, 2011)
Related works [P. Dollár, S. Belongie, P. Perona. The fastest pedestrian detector in the west. In BMVC, 2010.] [R. Benenson, M. Mathias, R. Timofte, and L. Van Gool. Pedestrian detection at 100 frames per second. In CVPR, 2012.] • Feature response approximation: • Feature approximation in testing step. • Feature approximation in training step.
Idea • Recall the first problem: • System complexity increases with the complexity of human poses (including variation of viewpoint). • How can we break the link between the complexity of the system and that of human poses? • Choose stable features or body parts for detection.
Idea Better prior knowledge:
Idea • Recall the second problem: • Exhaustive search. • “Sliding Window” + “Image Pyramid”. • How can we reduce the search region? • Detect the feature common to these parts. • Use the cumulative characteristic of that feature to handle variation of scale.
Idea • Common feature • Each body part consists of a combination of two edge segments. • Cumulative characteristic • Fixed-size edge detector + combination.
Comparison • Previous works focus on reducing the search region. • Specifically, they target “Exhaustive Search”. • Our method starts by breaking the link between the complexity of the system and that of the poses. It then uses the common feature and its cumulative characteristic to cut down the search space.
System block • Bottom-up system:
Fast Part detection • Steps: • Detection of edge candidates. • Production of part candidates. • Refinement of part candidates.
Detection of edge candidates • Detection and combination of segments (9 orientations).
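One plausible reading of the nine orientations is a quantization of undirected segment orientation into nine 20-degree bins over [0°, 180°). The sketch below assumes that reading; the slides do not state the bin layout, and `orientation_bin` is a hypothetical helper.

```python
import math


def orientation_bin(dx, dy, n_bins=9):
    """Quantize the orientation of a segment with direction (dx, dy).

    Assumes undirected orientations: a segment and its reverse share a bin,
    so angles are folded into [0, 180) degrees before binning.
    """
    angle = math.degrees(math.atan2(dy, dx)) % 180.0
    # The final modulo guards against angle landing exactly on 180.0
    # through floating-point rounding.
    return int(angle // (180.0 / n_bins)) % n_bins
```

With nine bins each bin spans 20 degrees, so a horizontal segment falls in bin 0 and a vertical one in bin 4, regardless of the direction in which the segment is traversed.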
Production of part candidates Neighbor orientation consideration • Constraints on combination of edges. • Orientation, length ratio and color symmetry.
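The three constraints above can be sketched as a pairwise test on edge segments. Everything concrete here is an assumption made for illustration: the `Segment` fields, the thresholds, and the reading of "color symmetry" as similar mean colors next to the two edges.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    orientation_bin: int  # quantized orientation, 0..n_bins-1
    length: float         # segment length in pixels, must be > 0
    side_color: tuple     # mean (R, G, B) sampled beside the edge (assumed feature)


def can_pair(a, b, n_bins=9, max_bin_diff=1, min_length_ratio=0.5, max_color_dist=40.0):
    """Return True if two edge segments may form one part candidate.

    All thresholds are hypothetical defaults, not values from the thesis.
    """
    # Orientation: same or neighboring bin, with wrap-around (bin 8 is next to bin 0).
    diff = abs(a.orientation_bin - b.orientation_bin)
    diff = min(diff, n_bins - diff)
    if diff > max_bin_diff:
        return False
    # Length ratio: the two sides of a limb should have comparable length.
    if min(a.length, b.length) / max(a.length, b.length) < min_length_ratio:
        return False
    # Color symmetry: colors beside the two edges should be similar.
    return math.dist(a.side_color, b.side_color) <= max_color_dist


left = Segment(3, 50.0, (200, 150, 120))
right = Segment(4, 45.0, (205, 148, 118))
```

Here `can_pair(left, right)` passes all three checks; changing any one attribute far enough (orientation, length or color) makes the pair fail.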
Refinement of part candidates • Feature = [Length, Orientation, HOG features] • HOG features + random forest training [Figure: decision tree splitting on feature134, feature400, feature2, feature33]
Part combination • Problem: • No information about the class of each limb, because of low image resolution or variation in hand gestures, shoe appearance, etc. • Another step is needed to refine the combinations. • What information is left? • Head-shoulder or head-torso.
Part combination • Can we estimate the position and orientation of the head-torso from the structure of the current combinations?
Part combination • Problem: • How do we select the body parts that belong to a specific human from the many part candidates? • Too many possibilities for exhaustive search.
Part combination • Clues for reducing the number of possible combinations. • Center distance, length ratio or width ratio between two parts. • Combinations with more than four parts.
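The pairwise clues (center distance, length ratio, width ratio) can be sketched as a filter over all candidate pairs. The part representation and the thresholds below are assumptions for illustration only; the slides do not give concrete values.

```python
import math
from itertools import combinations


def plausible_pairs(parts, max_center_factor=1.5, min_length_ratio=0.5, min_width_ratio=0.5):
    """Keep only index pairs of parts that satisfy simple physical constraints.

    parts: list of (cx, cy, length, width) tuples (hypothetical representation).
    Thresholds are illustrative defaults.
    """
    kept = []
    for (i, a), (j, b) in combinations(enumerate(parts), 2):
        ax, ay, a_len, a_wid = a
        bx, by, b_len, b_wid = b
        # Center distance: connected parts cannot be far apart relative to their size.
        if math.hypot(ax - bx, ay - by) > max_center_factor * max(a_len, b_len):
            continue
        # Length ratio and width ratio: connected limbs have comparable dimensions.
        if min(a_len, b_len) / max(a_len, b_len) < min_length_ratio:
            continue
        if min(a_wid, b_wid) / max(a_wid, b_wid) < min_width_ratio:
            continue
        kept.append((i, j))
    return kept


parts = [
    (0.0, 0.0, 40.0, 10.0),      # 0: reference part
    (0.0, 45.0, 38.0, 9.0),      # 1: close to part 0 and of similar size
    (300.0, 300.0, 40.0, 10.0),  # 2: too far from the others
    (5.0, 5.0, 10.0, 10.0),      # 3: close, but much shorter
]
pairs = plausible_pairs(parts)
```

Of the six possible pairs among four candidates, only one survives the three checks, which illustrates how cheap geometric tests shrink the combination space before any classifier runs.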
Part combination • Conclusions from the clues on the previous slide. • Too complicated to combine the parts for the whole body at once. • Start from low-level combinations of parts to bring out the benefits of physical constraints. • Break the problem into two levels. • Low-level combination. • High-level combination.
Low-level combination • How far can low-level combination reach? • 4-part combination = lower body.
Low-level combination • Joint relative positions + random forest • False alarms still exist. [Figure: decision tree splitting on feature134, feature400, feature2, feature33]
High-level combination • Combination of the arms, legs, lower bodies and the single parts left uncombined by the low-level combination step. • Upper bound on the number of combinations:
Combination refinement Pose prediction. Detection with DPM detector.
Pose prediction • Feature: • Relative size ratios and positions between low-level combinations, plus the structure of each low-level combination. • Random forest.
Detection with DPM detector • Use the DPM detector to cover the intra-class variation. • Model:
Usage of head-shoulder information • Much stronger than information of limbs. • Head-shoulder to head-torso. • Start from head-torso to combine limbs back.
System illustration • Edge Candidates → Part Candidates → Part Detector → Parts → Low-Level Part Combine → Low-Level Combination → High-Level Part Combine → High-Level Combination → Pose Prediction and Head-Torso Detection → Result of Detection