Gesture recognition using salience detection and concatenated HMMs Ying Yin yingyin@csail.mit.edu Randall Davis davis@csail.mit.edu Massachusetts Institute of Technology
System overview
Depth & RGB images → Hand tracking
Hand tracking + Xsens data → Feature vector sequence
Feature vector sequence → Hand movement segmentation → Feature vector sequence with movement → Gesture spotting & recognition
Hand tracking • Kinect skeleton tracking is less accurate when the hands are close to the body or moving fast • We therefore use both RGB and depth information: • Skin color • Gesture salience (motion and closeness to the observer)
Input to recognizer • Feature vector xt • From the Kinect data and hand tracking: • Relative position of the gesturing hand with respect to the shoulder center, in world coordinates (R3) • From the Xsens unit on the hand: • Linear acceleration (R3) • Angular velocity (R3) • Euler orientation (yaw, pitch, roll) (R3)
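The per-frame feature vector above can be sketched as follows. This is a minimal illustration of assembling the 12-dimensional input; the function name and argument layout are assumptions, not an API from the paper.

```python
import numpy as np

def make_feature_vector(hand_pos, shoulder_pos, accel, ang_vel, euler):
    """Assemble the 12-D feature vector x_t described on the slide.
    hand_pos, shoulder_pos: hand and shoulder-center world coordinates, shape (3,)
    accel, ang_vel, euler: Xsens linear acceleration, angular velocity,
    and (yaw, pitch, roll) orientation, each shape (3,).
    (Illustrative helper; the paper does not specify this interface.)"""
    rel_pos = np.asarray(hand_pos) - np.asarray(shoulder_pos)  # hand relative to shoulder, R^3
    return np.concatenate([rel_pos, accel, ang_vel, euler])    # R^12

# Example frame with made-up sensor readings
x_t = make_feature_vector([0.3, 0.9, 1.2], [0.0, 1.0, 1.2],
                          [0.1, -9.8, 0.0], [0.0, 0.5, 0.0],
                          [10.0, 5.0, 0.0])
assert x_t.shape == (12,)
```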
Hand movement segmentation • Part of gesture spotting • Train Gaussian models for rest and non-rest positions • During recognition, an observation xt is first classified as a rest or non-rest position • xt is classified as a non-rest position if it is more likely under the non-rest Gaussian than under the rest Gaussian
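The rest/non-rest decision can be sketched as a comparison of Gaussian log-likelihoods. The slide's exact decision rule is truncated, so the comparison below is a hedged reading; the model parameters here are placeholders, not trained values.

```python
import numpy as np

def log_gaussian(x, mean, cov):
    """Log density of a multivariate Gaussian N(mean, cov) at x."""
    d = len(mean)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet
                   + diff @ np.linalg.solve(cov, diff))

def is_non_rest(x, rest_model, non_rest_model):
    """Classify x_t as non-rest when it is more likely under the
    non-rest Gaussian than under the rest Gaussian (an assumed
    reading of the slide's truncated condition)."""
    return (log_gaussian(x, *non_rest_model)
            > log_gaussian(x, *rest_model))

# Placeholder models: a tight rest cluster near the origin and a
# broader non-rest cluster away from it (illustrative, not trained).
rest = (np.zeros(3), 0.01 * np.eye(3))
non_rest = (np.array([0.3, 0.4, 0.2]), 0.05 * np.eye(3))
assert is_non_rest(np.array([0.3, 0.4, 0.2]), rest, non_rest)
assert not is_non_rest(np.zeros(3), rest, non_rest)
```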
Continuous gesture models: Rest → Pre-stroke → Nucleus → Post-stroke → Rest / End
Bakis model for nucleus phase • 6 hidden states per nucleus phase in the final model • Emission probability: mixture of Gaussians with 6 mixture components • Left-to-right topology: start → s1 → s2 → s3 → s4 → s5 → s6, with initial probability p(s1) and termination probability p(END|s6)
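The left-to-right (Bakis) topology can be sketched as a transition matrix in which each state may stay, advance, or skip a bounded number of states ahead. The uniform probabilities and the skip width below are assumptions for illustration; in the paper these transition probabilities are learned with EM.

```python
import numpy as np

def bakis_transitions(n_states=6, max_jump=2):
    """Build a left-to-right (Bakis) transition matrix: each state may
    stay, move forward, or skip up to max_jump states ahead, uniformly
    over the allowed moves (placeholder values, not trained ones).
    An extra final column holds the termination probability p(END|s)."""
    A = np.zeros((n_states, n_states + 1))          # last column = END
    for i in range(n_states):
        targets = list(range(i, min(i + max_jump + 1, n_states)))
        if i + max_jump >= n_states:                # late states may terminate
            targets.append(n_states)                # END pseudo-state
        A[i, targets] = 1.0 / len(targets)
    return A

A = bakis_transitions()
assert np.allclose(A.sum(axis=1), 1.0)              # each row is a distribution
assert A[0].nonzero()[0].tolist() == [0, 1, 2]      # no backward transitions
```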
Concatenated HMMs • Train an HMM for each phase of each gesture • Model the termination probability for each hidden state s as p(END|s) • Estimate parameters with EM
Concatenated HMMs • After training, concatenate the HMMs for the individual phases to form one HMM per gesture • Compute the transition probability from each state of the previous phase to the states of the next phase • Ensure that each state's outgoing probabilities (including termination) still sum to 1
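Concatenation can be sketched as a block-wise combination of the per-phase transition matrices, routing each phase's termination mass p(END|s) into the next phase's initial distribution so that every row still sums to 1. This is a sketch of the idea on the slide, not the authors' exact construction.

```python
import numpy as np

def concat_hmms(phases):
    """Concatenate per-phase HMMs into one gesture HMM.
    Each phase is (A, pi): A has shape (n, n+1) with p(END|s) in the
    last column; pi is the (n,) initial distribution.  Phase k's END
    mass is redistributed over phase k+1's states via pi_{k+1}, so the
    combined matrix remains row-stochastic.  (Illustrative sketch.)"""
    n_total = sum(A.shape[0] for A, _ in phases)
    A_full = np.zeros((n_total, n_total + 1))       # keep one global END column
    offset = 0
    for k, (A, pi) in enumerate(phases):
        n = A.shape[0]
        A_full[offset:offset + n, offset:offset + n] = A[:, :n]
        if k + 1 < len(phases):
            nxt_pi = phases[k + 1][1]
            # route p(END|s) of this phase into the next phase's states
            A_full[offset:offset + n,
                   offset + n:offset + n + len(nxt_pi)] = np.outer(A[:, n], nxt_pi)
        else:
            A_full[offset:offset + n, -1] = A[:, n]  # true gesture termination
        offset += n
    return A_full

# Two toy one-state phases with made-up probabilities
A1, pi1 = np.array([[0.5, 0.5]]), np.array([1.0])
A2, pi2 = np.array([[0.7, 0.3]]), np.array([1.0])
A_full = concat_hmms([(A1, pi1), (A2, pi2)])
assert np.allclose(A_full.sum(axis=1), 1.0)
```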
Gesture spotting & recognition • Detect rest vs. non-rest segments • Find the concatenated HMM that gives the highest probability • Find the most probable hidden state sequence using Viterbi decoding • Assign hidden states to their corresponding phases • Identify segments without a nucleus phase
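The last two steps, mapping decoded states back to phases and checking for a nucleus, can be sketched as below. The state-to-phase mapping follows the concatenation order; the names, sizes, and decoded sequence are illustrative, not taken from the paper.

```python
def states_to_phases(state_seq, phase_sizes, phase_names):
    """Map each hidden state index of a concatenated HMM back to its
    phase, then check whether a nucleus phase was visited.
    phase_sizes gives the number of states per phase in concatenation
    order.  (Illustrative helper; names are not from the paper.)"""
    bounds, start = [], 0
    for size in phase_sizes:
        bounds.append((start, start + size))
        start += size

    def phase_of(s):
        for name, (lo, hi) in zip(phase_names, bounds):
            if lo <= s < hi:
                return name
        raise ValueError(f"state {s} out of range")

    phases = [phase_of(s) for s in state_seq]
    return phases, "nucleus" in phases

# Assumed Viterbi output over a 3-phase gesture HMM (3 + 6 + 3 states)
phases, has_nucleus = states_to_phases(
    [0, 1, 2, 8, 9, 10],
    phase_sizes=[3, 6, 3],
    phase_names=["pre-stroke", "nucleus", "post-stroke"])
assert has_nucleus          # state 8 falls in the nucleus block (states 3..8)
```

A segment whose decoded states never enter the nucleus block would be flagged as a movement without a nucleus phase, i.e., not a gesture.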
Gesture recognition result • 10 users, 10 gestures, and 3 rest positions • 3-fold average
Gesture recognition result • User-independent training and testing • 3-fold average
Contributions • Employed novel gesture phase differentiation using concatenated HMMs • Used hidden states to • identify movements with no nucleus phase • accurately detect the start and end of the nucleus phase • Improved hand tracking when the hand is close to the body or moving fast, via gesture salience detection