CSC 2528: Handshapes and Movements: Multiple-channel ASL recognition

Christian Vogler and Dimitris Metaxas (presented by Christopher Collins)
University of Toronto Computer Science

Overview: Part II
• Introduction to ASL recognition
• Challenges of ASL recognition
• Related work
• Modelling
• Phoneme-based modelling
• Independent Channels
• Handshape
• Parallel Hidden Markov Models
• Experiments
• Conclusions and Future Work

ASL Recognition: Introduction
• Computer interaction is still mainly keyboard/mouse
• requires literacy in a written language or an agreed-upon standard written form of ASL (e.g. sign-writing)
• difficult for many people who are deaf

ASL Recognition: Challenges
• More difficult than speech recognition due to:
• simultaneous events
• inflections
• phonology poorly understood, no agreed standard

Challenges of Simultaneity

Related Work
• C. Vogler and D. Metaxas. Parallel Hidden Markov Models for ASL Recognition (1999).
• G. Fang et al. Signer-independent continuous sign language recognition based on SRN/HMM (2001).
• R.-H. Liang and M. Ouhyoung. A real-time continuous gesture recognition system for sign language (1998).

Overview
• HMM-based approach to ASL recognition
• parallel HMMs for different channels
• channels are left and right handshape and movement
• uses the movement-hold phonology

Movement-Hold Example

Handshape Modelling
• Most previous work uses joint and abduction angles as features (low-level)
• Also experiment with a measure of the openness of a finger (high-level)
• height and width of quadrilateral
• MPJ angle
• abduction angles

Extensions to HMM
• A regular HMM models one process evolving over time
• To model parallel, possibly interacting processes with a regular HMM, events must evolve in lockstep
• Earlier work by Vogler and Metaxas explains the development of the parallel HMM model
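To illustrate why the lockstep requirement is costly, here is a minimal sketch (not taken from the paper) that enumerates the joint states a single regular HMM would need to cover two channels at once. The state names `M1`, `H1`, etc. and the function `lockstep_states` are hypothetical, chosen only for illustration.

```python
from itertools import product


def lockstep_states(channel_states):
    """Enumerate the joint states a single HMM needs when all channels
    are folded into one process (one state per combination)."""
    return list(product(*channel_states))


# Hypothetical per-channel state sets (illustrative names only).
movement_states = ["M1", "M2", "M3"]
handshape_states = ["H1", "H2"]

joint = lockstep_states([movement_states, handshape_states])

# One combined HMM: the state count is the product of the channel sizes.
joint_count = len(joint)
# Separate per-channel HMMs: the state count is the sum of the channel sizes.
parallel_count = len(movement_states) + len(handshape_states)

print(joint_count, parallel_count)
```

With more channels or larger state sets the product grows much faster than the sum, which is one way to read the motivation for keeping the channels in separate models.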

Factorial HMM

Coupled HMM

Parallel HMM

Combination of Processes
• Using independence assumption, combine path probabilities (from each channel, with states representing the same sign sequence) by multiplying them. Choose the most probable state sequence.
• Time is polynomial in number of states, linear in number of parallel processes
More info: C. Vogler and D. Metaxas, Parallel Hidden Markov Models for ASL Recognition; Proc. Int. Conf. on Comp. Vis., Greece, 1999.
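The combination step can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' decoder: it assumes each channel has already produced a best-path log-probability for every candidate sign sequence, and it multiplies the channel probabilities by summing their logs. The function name `combine_channels`, the sign glosses, and the probability values are all made up for the example.

```python
import math


def combine_channels(channel_log_probs):
    """Combine per-channel path scores under the independence assumption.

    channel_log_probs: list of dicts, one per channel, mapping a candidate
    sign sequence (tuple of glosses) to the log-probability of the best
    path for that sequence in that channel.
    Returns (best_sequence, combined_log_prob).
    """
    # Only sequences scored by every channel can be combined.
    common = set(channel_log_probs[0])
    for scores in channel_log_probs[1:]:
        common &= set(scores)
    if not common:
        raise ValueError("no sign sequence is scored by all channels")

    # Multiplying probabilities is the same as summing log-probabilities.
    combined = {
        seq: sum(scores[seq] for scores in channel_log_probs)
        for seq in common
    }
    best = max(combined, key=combined.get)
    return best, combined[best]


# Made-up scores for two channels (illustrative values only).
movement = {
    ("I", "WANT", "BOOK"): math.log(0.02),
    ("I", "WANT", "CAR"): math.log(0.03),
}
handshape = {
    ("I", "WANT", "BOOK"): math.log(0.05),
    ("I", "WANT", "CAR"): math.log(0.01),
}

best_seq, best_score = combine_channels([movement, handshape])
print(best_seq, math.exp(best_score))
```

In this toy case the movement channel alone prefers the second sequence, but the combined score prefers the first. Working in log space avoids numerical underflow when many small probabilities are multiplied, and the cost of the combination itself grows linearly with the number of channels.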

Experiments
• Compare handshape models (joint angles vs. quadrilateral) for handshape recognition task
• Compare PaHMM model with various channel combinations against single hand movement channel (naïve baseline?)
• Vocabulary of 22 signs, 400 training sentences of length 2-7 signs, and 99 test sentences
• Omitted left-hand handshape?

Choice of Handshape Model
• Measure correctly recognized handshape (recognizing signs with handshape alone not possible)
• Quadrilateral feature vector results in better (and more consistent) recognition accuracy

Experimental Results
H = correct, D = deletion, S = substitution, I = insertion, N = number
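The slide does not state how these counts are turned into a score. A common convention in speech recognition tools such as HTK, assumed here rather than confirmed by the slides, is H = N - D - S, percent correct = H / N, and accuracy = (H - I) / N. The sketch below applies that convention; the function name `recognition_scores` is hypothetical and the counts are invented, not results from the experiments.

```python
def recognition_scores(n, d, s, i):
    """Compute recognition scores from error counts.

    n: number of reference signs, d: deletions, s: substitutions,
    i: insertions. Assumes the HTK-style convention H = N - D - S.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    h = n - d - s  # correctly recognized signs
    return {
        "H": h,
        "correct": h / n,         # ignores insertions
        "accuracy": (h - i) / n,  # penalizes insertions as well
    }


# Invented counts for illustration only.
scores = recognition_scores(n=100, d=3, s=5, i=2)
print(scores)
```

Accuracy can be lower than percent correct because insertions are spurious signs that do not replace any reference sign but still count as errors.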

Conclusions
• Handshape information is important in ASL recognition
• Parallel HMM a promising model for multi-channel data

Future Work
• Training/Test data from native signers
• Include facial expressions
• Use of relative spatial information (classifiers)
• Larger vocabulary
• Incorporation of language modelling to improve recognition, such as n-gram or parsing
