This thesis presents the design and implementation of a gesture recognition system, with applications in kiosks, vehicle control, video gaming, large-screen displays, OS control, and novelty uses. The system is built around MTrack, software that runs on Windows, supports commercial off-the-shelf (COTS) hardware, and uses DirectX. It considers both static and dynamic gestures; the classifier recognizes four fundamental gestures plus variations, for a total of 9 actions. The slides describe a 5-stage architecture and list the processing order as RGB-to-HSV colorspace conversion, image thresholding, CAMSHIFT, microstate assignment, the action engine, macrostate assignment, and the Win32 API. Noise is handled with mathematical morphology operations, and Hu invariant moments supply features that are invariant to scale, rotation, and translation. Classification uses a minimum-distance classifier based on the Mahalanobis distance, computed from the feature vector and the mean vector and covariance matrix of the samples. Macrostates provide contextual analysis and a last chance to correct a decision before an action is taken. Proposed future work covers video filtering (Wiener and Kalman), morphological filtering, trainable data sets, and macrostate improvement.
Design & Implementation of a Gesture Recognition System
Isaac Gerg, B.S. Computer Engineering, The Pennsylvania State University
Necessity • Kiosks • Vehicle Control • Video Gaming • Large Screen • OS Control • Novelty
Types of Gestures • Static Gestures • Dynamic Gestures
MTrack Software Characteristics • Runs on Windows • COTS Hardware Support • Utilizes DirectX
Classifier Characteristics • Recognizes four fundamental gestures plus variations, for a total of 9 actions
System Architecture 5 Stages
System Architecture Stages (in order of processing) • RGB to HSV Colorspace Conversion • Image Thresholding (pdf) • CAMSHIFT • Microstate Assignment • Action Engine • Macrostate Assignment • Win32 API
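The first two stages listed above can be illustrated with a minimal sketch. It assumes the image is a list of rows of 8-bit (r, g, b) tuples and uses a plain hue-range test. The slides mention a pdf-based threshold without giving details, so the range test, the function names, and the threshold values here are illustrative assumptions, not MTrack's actual method.

```python
import colorsys

def rgb_to_hsv_image(rgb_image):
    """Convert rows of (r, g, b) tuples in 0..255 to (h, s, v) tuples in 0..1."""
    return [[colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
             for (r, g, b) in row]
            for row in rgb_image]

def threshold_hue(hsv_image, hue_lo, hue_hi, min_sat=0.2, min_val=0.2):
    """Return a binary mask: 1 where the hue lies in [hue_lo, hue_hi]
    and the pixel is neither too gray nor too dark, else 0.
    All four thresholds are hypothetical tuning values."""
    return [[1 if (hue_lo <= h <= hue_hi and s >= min_sat and v >= min_val) else 0
             for (h, s, v) in row]
            for row in hsv_image]

# Tiny 2x2 test image: a skin-like tone, a blue pixel, a gray pixel.
skin, blue, gray = (220, 170, 140), (20, 40, 200), (128, 128, 128)
image = [[skin, blue],
         [gray, skin]]
mask = threshold_hue(rgb_to_hsv_image(image), hue_lo=0.0, hue_hi=0.1)
```

Working in HSV lets the threshold act mainly on hue, which tends to be less sensitive to brightness changes than raw RGB values.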
Dealing with Noise Mathematical Morphology Operations
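As an illustration of how morphology operations can remove noise from a thresholded image, here is a minimal sketch of binary erosion, dilation, and opening. The slides do not say which operations or structuring element are used, so the 3x3 square element and the choice of opening are assumptions.

```python
NEIGHBORHOOD = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def erode(mask):
    """Binary erosion with a 3x3 square; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                     for dy, dx in NEIGHBORHOOD))
             for x in range(w)]
            for y in range(h)]

def dilate(mask):
    """Binary dilation with a 3x3 square; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                     for dy, dx in NEIGHBORHOOD))
             for x in range(w)]
            for y in range(h)]

def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the element."""
    return dilate(erode(mask))

# A 3x3 blob plus one isolated noise pixel in the corner.
clean = [[1 if 2 <= y <= 4 and 2 <= x <= 4 else 0 for x in range(7)]
         for y in range(7)]
noisy = [row[:] for row in clean]
noisy[0][6] = 1
opened = opening(noisy)
```

In this example the opening removes the isolated pixel and leaves the 3x3 blob unchanged.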
Discriminant • Hu Invariant Moments • Scale, Rotation, and Translation Invariant
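To make the invariance claim concrete, the sketch below computes the first two of Hu's seven invariant moments for a binary mask and compares a shape with a translated copy and a copy rotated by 90 degrees. The function name and test shape are hypothetical; the slides do not state which of the seven moments the system uses.

```python
def hu_first_two(mask):
    """Return (phi1, phi2), the first two Hu moments of a binary mask
    given as a list of rows of 0/1 values."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    m00 = float(len(pts))
    if m00 == 0:
        raise ValueError("mask contains no foreground pixels")
    cx = sum(x for x, _ in pts) / m00  # centroid gives translation invariance
    cy = sum(y for _, y in pts) / m00

    def eta(p, q):
        # Central moment, normalized by area for scale invariance.
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pts)
        return mu / m00 ** (1 + (p + q) / 2.0)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = n20 + n02
    phi2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    return phi1, phi2

# An L-shaped test figure.
shape = [[0, 0, 0, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 0, 0, 0],
         [0, 1, 0, 0, 0],
         [0, 0, 0, 0, 0]]

# Translated copy inside a larger frame.
shifted = [[0] * 8 for _ in range(8)]
for y, row in enumerate(shape):
    for x, v in enumerate(row):
        shifted[y + 3][x + 2] = v

# Copy rotated by 90 degrees.
rotated = [list(r) for r in zip(*shape[::-1])]

reference = hu_first_two(shape)
```

On a pixel grid, scale invariance holds only approximately, so this example checks translation and 90-degree rotation only.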
Classification The need for a Distance Metric.
Classifier • The Mahalanobis Distance • Minimum Distance Classifier • $d^2 = (x_t - m)^\top S^{-1} (x_t - m)$ • $x_t$ = feature vector at time $t$ of unknown class • $m$ = mean vector of samples • $S$ = covariance matrix of samples
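A minimum-distance classifier built on the Mahalanobis distance can be sketched as follows. For brevity the sketch assumes two-dimensional feature vectors so that the covariance matrix can be inverted in closed form; the class names and training samples are hypothetical, and the real feature vector presumably has more components.

```python
def mean_vector(samples):
    """Mean m of a list of 2-D samples."""
    n = float(len(samples))
    return (sum(s[0] for s in samples) / n, sum(s[1] for s in samples) / n)

def covariance_matrix(samples, mean):
    """Unbiased 2x2 sample covariance matrix S."""
    n = len(samples)
    sxx = sum((s[0] - mean[0]) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - mean[1]) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mean[0]) * (s[1] - mean[1]) for s in samples) / (n - 1)
    return ((sxx, sxy), (sxy, syy))

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - m)^T S^-1 (x - m) for 2-D vectors."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # S^-1 = (1 / det) * [[d, -b], [-c, a]]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def train(labelled_samples):
    """Map each class name to its (mean vector, covariance matrix)."""
    model = {}
    for name, samples in labelled_samples.items():
        m = mean_vector(samples)
        model[name] = (m, covariance_matrix(samples, m))
    return model

def classify(x, model):
    """Return the class whose distribution is nearest to x."""
    return min(model, key=lambda name: mahalanobis_sq(x, *model[name]))

# Hypothetical training data for two gesture classes.
model = train({
    "fist": [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2), (0.9, 0.8)],
    "palm": [(3.0, 3.0), (3.3, 2.8), (2.7, 3.1), (3.1, 3.3), (2.9, 2.8)],
})
label = classify((1.1, 1.0), model)
```

Unlike the Euclidean distance, this metric scales each direction by the spread of the class, so a feature with high variance counts for less than a tightly clustered one.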
Micro/Macrostates • Statistical physics paradigm • Last chance to correct before taking action • Provides contextual analysis • Implemented using order statistics
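One way an order statistic could turn per-frame microstates into a steadier macrostate is a sliding median. The slides do not say which order statistic, window length, or state encoding the system uses. The sketch below therefore assumes that each microstate is a 0/1 flag ("gesture seen in this frame") and that the macrostate is the median of the most recent frames; the class name and window size are hypothetical.

```python
from collections import deque

class MacrostateFilter:
    """Sliding-window median over binary microstates (assumed encoding)."""

    def __init__(self, window=5):
        if window < 1 or window % 2 == 0:
            raise ValueError("window must be a positive odd number")
        self.history = deque(maxlen=window)

    def update(self, microstate):
        """Record one per-frame microstate and return the macrostate:
        None until the window is full, then the median of the window."""
        self.history.append(1 if microstate else 0)
        if len(self.history) < self.history.maxlen:
            return None
        ordered = sorted(self.history)       # order statistics of the window
        return ordered[len(ordered) // 2]    # the middle one is the median

# One dropped frame (the third) does not flip the decision.
f = MacrostateFilter(window=5)
macrostates = [f.update(m) for m in [1, 1, 0, 1, 1, 0, 0, 0]]
```

For binary inputs the median amounts to a majority vote, so a single misclassified frame cannot trigger or cancel an action on its own.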
The Future • Video Filtering (Wiener Filtering, Kalman Filtering) • Morphological Filtering • Trainable Data Sets • Macrostate Improvement
References
• http://www.galactic.com/Algorithms/discrim_mahaldist.htm
• J. Flusser and T. Suk, "Affine Moment Invariants: A New Tool for Character Recognition," Pattern Recognition Letters, Vol. 15, pp. 433-436, Apr. 1994.
• G. R. Bradski, "Computer Vision Face Tracking For Use In A Perceptual User Interface," Intel Technology Journal, 1998(2).