
CNN-Based Action Recognition Using Adaptive Depth Motion Maps

This research introduces a method for action recognition utilizing Adaptive Multiscale Depth Motion Maps (AM-DMMs) and Stable Joint Distance Maps (SJDMs) generated from depth data. By leveraging the advantages of 3D structural information and complementary modalities, the proposed CNN-based approach enhances recognition performance. Through input preprocessing, network training, and class score fusion, the model captures spatio-temporal information efficiently. Experiments demonstrate the effectiveness of the method in recognizing actions by integrating motion cues and spatial relationships with high accuracy.


Presentation Transcript


  1. CNN-based Action Recognition Using Adaptive Multiscale Depth Motion Maps and Stable Joint Distance Maps • GlobalSIP 2018, Nov. 27, 2018 • Junyou He, Hailun Xia, Chunyan Feng, Yunfei Chu • Beijing University of Posts and Telecommunications

  2. OUTLINE • Motivations • The Proposed Method: Adaptive Multiscale Depth Motion Maps (AM-DMMs); Stable Joint Distance Maps (SJDMs); Input Preprocessing; Network Training & Class Score Fusion • Experiment Results • Conclusions • CNN-BASED ACTION RECOGNITION USING ADAPTIVE MULTISCALE DEPTH MOTION MAPS AND STABLE JOINT DISTANCE MAPS

  3. Motivations • Depth maps: provide 3D structural information and are insensitive to variations in lighting, but contain significant flicker noise • Skeleton data: more robust to noise, but not always reliable • Each modality captures a certain kind of information that is likely to be complementary to the other • Thus, integrating the information from depth and skeleton data is expected to improve recognition performance

  4. Motivations • Spatio-temporal information is the key to action recognition • Handcrafted features (SIFT, color histogram, edge direction, ...): require domain knowledge, and are shallow and dataset-dependent • RNN-based methods: difficult to memorize the entire sequence information and difficult to extract high-level features • Thus, we propose a compact and effective CNN-based method to capture the spatio-temporal information

  5. The Proposed Method • Generate AM-DMMs • Generate SJDMs • Input Preprocessing • Network Training & Class Score Fusion

  6. Adaptive Multiscale Depth Motion Maps (AM-DMMs) • Goal: capture more details of shape and motion information and cope with speed variations in actions • DMMs: suffer from loss of temporal information • AM-DMMs: capture the detailed motion cues and cope with speed variations

  7. Adaptive Multiscale Depth Motion Maps (AM-DMMs) • Figure: AM-DMMs generated from a sample video of the action "Swipe left" on three views • Two quantities are defined: the motion energy E(i) of the i-th frame, and the DMM of a depth video sequence with N frames • Their definitions use a function that returns the number of non-zero elements in a binary map, the frame index i, and the projected map of each frame under a given projection view
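
The quantities named on this slide can be illustrated with a minimal NumPy sketch. The slide's own equations are not preserved in the transcript, so this follows the commonly used DMM definition (accumulated absolute differences between consecutive projected maps) and assumes that the motion energy is a cumulative count of non-zero elements in thresholded difference maps. The function names and the threshold `eps` are hypothetical, and the adaptive multiscale segmentation itself is not shown.

```python
import numpy as np

def depth_motion_map(maps):
    """Accumulate absolute differences between consecutive projected maps.

    maps: array of shape (N, H, W), the projected maps of N frames
          under one projection view.
    Returns an (H, W) map (conventional DMM definition, assumed here).
    """
    maps = np.asarray(maps, dtype=np.float64)
    return np.abs(np.diff(maps, axis=0)).sum(axis=0)

def motion_energy(maps, eps=0.0):
    """Cumulative count of non-zero elements in binary difference maps.

    eps is a hypothetical threshold used to binarize each difference map.
    Returns an array of length N - 1; entry i is the energy accumulated
    up to the i-th frame transition (assumed form).
    """
    maps = np.asarray(maps, dtype=np.float64)
    binary = np.abs(np.diff(maps, axis=0)) > eps
    per_step = binary.reshape(binary.shape[0], -1).sum(axis=1)
    return np.cumsum(per_step)

# Tiny example: three 2x2 "projected maps".
frames = np.zeros((3, 2, 2))
frames[1, 0, 0] = 2.0
frames[2, 0, 0] = 2.0
frames[2, 1, 1] = 5.0
dmm = depth_motion_map(frames)
energy = motion_energy(frames)
```

In a multiscale scheme, a cumulative energy curve like `energy` could be used to choose frame ranges that contain comparable amounts of motion, which is one way to cope with speed variations; how the authors do this is not stated in the transcript.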

  8. The Proposed Method • Generate AM-DMMs • Generate SJDMs • Input Preprocessing • Network Training & Class Score Fusion

  9. Stable Joint Distance Maps (SJDMs) • To avoid excessive noise, three reference joints that are stable in most actions are used to compute the relative distances of the other joints • For each frame t, the Euclidean distance is computed between each of the other joints (indexed by joint number) and one of the three stable joints

  10. Stable Joint Distance Maps (SJDMs) • The distances to different stable joints contain different spatial relationships and useful structural information about the skeleton • One SJDM is formed for each stable joint from the corresponding distances
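
The SJDM construction described on the two slides above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the skeleton is assumed to be an array of shape (T, J, 3), the stable joint indices are hypothetical placeholders (the transcript does not say which joints are used), and the reference joint's own zero-distance row is dropped from each map.

```python
import numpy as np

def stable_joint_distance_maps(skeleton, stable_joints):
    """Build one joint-distance map per stable (reference) joint.

    skeleton: array of shape (T, J, 3) with 3D joint positions per frame.
    stable_joints: indices of the reference joints (hypothetical choice).
    Returns a list of arrays of shape (J - 1, T): rows are the other
    joints, columns are frames, values are Euclidean distances.
    """
    skeleton = np.asarray(skeleton, dtype=np.float64)
    maps = []
    for k in stable_joints:
        ref = skeleton[:, k:k + 1, :]                  # (T, 1, 3)
        dist = np.linalg.norm(skeleton - ref, axis=2)  # (T, J)
        dist = np.delete(dist, k, axis=1)              # drop the reference joint itself
        maps.append(dist.T)                            # (J - 1, T)
    return maps

# Tiny example: 2 frames, 3 joints, joint 0 used as the reference.
skeleton = np.array([
    [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 1.0]],
    [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0], [1.0, 0.0, 0.0]],
])
sjdms = stable_joint_distance_maps(skeleton, stable_joints=[0])
```

With three stable joints, the same call yields three maps per skeleton sequence, matching the "three SJDMs" mentioned in the conclusions.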

  11. The Proposed Method • Generate AM-DMMs • Generate SJDMs • Input Preprocessing • Network Training & Class Score Fusion

  12. Input Preprocessing • The maps are resized to make them compatible with the pre-trained CNN model and to solve the variable-length problem • HSV color coding highlights the differences in texture and edges • Figure: Sample color-coded AM-DMMs and SJDMs generated by the proposed method on the UTD-MHAD dataset
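
A minimal sketch of the two preprocessing steps, using only NumPy. The transcript does not give the target size, the interpolation method, or the exact color mapping, so everything here is an assumption made for illustration: nearest-neighbor resizing, a hue range from red (smallest value) to blue (largest value) with full saturation and value, and the helper names themselves.

```python
import numpy as np

def resize_nearest(m, out_h, out_w):
    """Nearest-neighbor resize of a 2D map to (out_h, out_w)."""
    rows = (np.arange(out_h) * m.shape[0] / out_h).astype(int)
    cols = (np.arange(out_w) * m.shape[1] / out_w).astype(int)
    return m[rows][:, cols]

def hue_to_rgb(h):
    """Convert hue in [0, 1] to RGB with saturation = value = 1."""
    h6 = h * 6.0
    r = np.clip(np.abs(h6 - 3.0) - 1.0, 0.0, 1.0)
    g = np.clip(2.0 - np.abs(h6 - 2.0), 0.0, 1.0)
    b = np.clip(2.0 - np.abs(h6 - 4.0), 0.0, 1.0)
    return np.stack([r, g, b], axis=-1)

def preprocess(m, out_h, out_w):
    """Resize a map, then color-code it as an 8-bit RGB image."""
    m = resize_nearest(np.asarray(m, dtype=np.float64), out_h, out_w)
    lo, hi = m.min(), m.max()
    norm = (m - lo) / (hi - lo) if hi > lo else np.zeros_like(m)
    hue = norm * (2.0 / 3.0)  # assumed mapping: red (low) to blue (high)
    return (hue_to_rgb(hue) * 255.0).astype(np.uint8)

# Tiny example: a 2x2 map resized to 4x4 and color-coded.
img = preprocess(np.array([[0.0, 1.0], [2.0, 4.0]]), 4, 4)
```

Because every map is brought to the same fixed size, sequences of different lengths produce inputs of identical shape, which is what lets a pre-trained image CNN be reused.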

  13. The Proposed Method • Generate AM-DMMs • Generate SJDMs • Input Preprocessing • Network Training & Class Score Fusion

  14. Network Training and Class Score Fusion • A multi-channel CNN is adopted to exploit the discriminative features • Two fusion methods combine the class scores of the individual networks • Their definitions use the score probability vectors of the networks, element-wise multiplication, the accuracy of each corresponding network, and a function that finds the index of the element with the maximum score
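
The ingredients listed on this slide suggest two late-fusion rules, sketched below. The slide's equations are not preserved, so the exact forms are assumptions: the first multiplies the score vectors element-wise, the second sums the score vectors weighted by each network's accuracy, and both take the index of the maximum fused score. Function names and the example numbers are hypothetical.

```python
import numpy as np

def multiplicative_fusion(scores):
    """Element-wise product of the score vectors, then arg-max.

    scores: array of shape (num_networks, num_classes).
    """
    scores = np.asarray(scores, dtype=np.float64)
    return int(np.argmax(np.prod(scores, axis=0)))

def accuracy_weighted_fusion(scores, accuracies):
    """Accuracy-weighted sum of the score vectors, then arg-max.

    accuracies: one accuracy per network, e.g. measured on held-out
    data (assumed weighting scheme).
    """
    scores = np.asarray(scores, dtype=np.float64)
    weights = np.asarray(accuracies, dtype=np.float64)[:, None]
    return int(np.argmax((weights * scores).sum(axis=0)))

# Tiny example: three networks, three classes (made-up scores).
scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
]
label_mul = multiplicative_fusion(scores)
label_wgt = accuracy_weighted_fusion(scores, accuracies=[0.9, 0.1, 0.9])
```

In this made-up example the two rules disagree: the product favors the class that no network rates poorly, while the weighted sum follows the two networks with higher accuracy.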

  15. Experiments: Dataset • UTD-MHAD is a multimodal action dataset • It contains 27 different actions, each performed by 8 subjects (4 females and 4 males) with up to 4 repetitions

  16. Experiment Results • The effectiveness of different schemes and the results of the individual CNNs and the two fusion methods • Table: Comparison of the different schemes on the UTD-MHAD dataset

  17. Experiment Results • The performance of the proposed method compared with previously reported results on the UTD-MHAD dataset

  18. Experiment Results • Confusion matrix of the proposed method on the UTD-MHAD dataset

  19. Conclusions • An effective method for action recognition using a nine-channel CNN is presented • The fusion of depth and skeleton modalities is proposed to improve the classification accuracy • The proposed AM-DMMs capture more shape cues and details of motion • Each skeleton sequence is transformed into three SJDMs, which describe different spatial relationships between joints

  20. Thanks! Junyou He @BUPT 12211006@bupt.edu.cn
