Sign Language Recognition and Translation: A multi-disciplined approach from the field of artificial intelligence
Presentation by Becky Sue Parton
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Is it time for the Signing Robot to take over now?? Please!!
Sign Language – Important Concepts
• Signed languages are not international. ASL (American Sign Language) is used in the United States; BSL (British Sign Language) is used in England; etc.
• Receptive refers to "reading" signs; expressive refers to "rendering" signs.
• ASL is not a "visual" form of English! It is another language, complete with its own grammar and rules.
• When people refer to "sign language", they may mean:
  • Fingerspelling – a visual alphabet used to spell proper nouns, etc.
  • Signed Exact English – "sign-supported speech" which uses ASL signs and invented signs in English word order. Signed English is not a language.
  • CASE (Conceptually Accurate Signed English) – Signed English, but with idioms and other phrases signed to illustrate the meaning of the words. Ex.: You are "bugging" me (sign "bother").
  • ASL – the true language of the Deaf in the United States.
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Fingerspelling Hands – A Brief History
Fingerspelling hands were created to assist Deaf/Blind individuals by replicating the hand-on-hand interpreting of the manual alphabet that a human would provide.
• 1977 – Southwest Research Institute: first robotic hand; not all letters formed properly.
• 1985 – Dexter: developed at Stanford University; modifiable finger positions.
• 1988 – Dexter II: 4 letters per second; smaller and more reliable motions.
• 1992 – Gallaudet's Hand: fluid movements; ability to be connected to a TTY.
Robotics
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Fingerspelling Hands – Continued
• 1994 – RALPH (Robotic ALPHabet)
  • Developed by the Rehabilitation Research and Development Center (David Jaffe)
  • Fourth-generation computer-controlled electro-mechanical fingerspelling hand
  • Improved mechanical system (elimination of the pulleys)
  • Compact and faster
  • Accepts input from a variety of sources, including modified caption systems
  • More natural fingers
  • Menu-driven user interface which allows finger positions to be edited
Robotics
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Capture Gloves
Capture gloves are used as an input device to "see" the movement of the hands. The purpose is to capture data about the signs so they can be deciphered.
• Patterson's Glove
  • Developed in 2002 by 17-year-old Ryan Patterson
  • Simple design using a leather golf glove with 10 sensors, a small circuit board containing a microcontroller, an analog-to-digital converter, and a radio-frequency transmitter
  • Works by "sensing the hand movements of the sign language alphabet then wirelessly transmitting the data to a portable device that displays the text on-screen"
  • Must train the glove (like voice recognition), but it is a quick process
  • User can customize a hand movement to mean a particular word
Virtual Reality
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Capture Gloves – Continued
• Kramer's Talking Glove, the "CyberGlove"
  • Developed by James Kramer, founder of Virtual Technologies (bought out by Immersion 3D)
  • "Lightweight, flexible glove with sensors which accurately and repeatably measure the position and movement of the fingers"
  • Design – 18 or 22 sensors; serial cable connected to the host computer; must train to define the vocabulary
  • Originally used to input fingerspelling only, then whole signs; recognized letters/signs using a prototyping algorithm at first, then a neural network
  • Cost approximately $6,000 (the "GesturePlus" software was sold separately at $3,500)
• VPL Data Glove
  • Sensors are fiber-optic transducers which measure the finger flex angles, etc.
Virtual Reality
Sign Language Recognition and Translation – Becky Sue Parton - 2002
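The prototype-matching idea mentioned for the CyberGlove can be pictured with a small sketch. This is a hypothetical illustration only (the sensor counts, letters, and numbers are invented, not taken from any of the glove systems above): each letter gets an averaged "prototype" sensor vector collected during training, and a new reading is classified by its nearest prototype.

```python
import numpy as np

# Hypothetical sketch: classify a fingerspelled letter from glove sensor
# readings by comparing against stored "prototype" readings collected during
# a quick training pass, in the spirit of the prototype approach noted above.

def train_prototypes(samples):
    """samples: dict mapping letter -> list of sensor vectors recorded while
    the user held that letter. Returns one averaged prototype per letter."""
    return {letter: np.mean(np.asarray(vectors), axis=0)
            for letter, vectors in samples.items()}

def classify(reading, prototypes):
    """Return the letter whose prototype is closest (Euclidean distance)
    to the current sensor reading (18 or 22 values on a real glove)."""
    reading = np.asarray(reading)
    return min(prototypes, key=lambda L: np.linalg.norm(reading - prototypes[L]))

# Toy usage with 3 fake sensors instead of 18/22:
training = {"A": [[0.9, 0.8, 0.1], [0.85, 0.82, 0.12]],
            "B": [[0.1, 0.1, 0.9], [0.12, 0.08, 0.88]]}
protos = train_prototypes(training)
print(classify([0.88, 0.79, 0.15], protos))  # -> "A"
```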
Camera-based Capture Devices
Camera-based capture devices have the same purpose as the virtual reality glove systems – to collect data about how signs are formed so that they can be "recognized" and analyzed by the computer.
• 1992 – Davis and Shah did research using a camera focused on a hand that had markings on the tip of each finger.
• 1994 – Dorner and Hagen used a camera along with special gloves with rings of color around each joint of the signer.
• 1995 – Thad Starner's camera saw a signer wearing two different colored gloves, which helped reference their position.
• 1996 – Thad Starner then did research showing a camera could capture signs with natural hands (no gloves). The camera could sit on a desk or be mounted in a cap that the Deaf person wears.
Computer Vision
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Comparing Capture Methods
• Gloves – Advantages
  • Data is very accurate
  • Not much computing power required
• Gloves – Disadvantages
  • Encumbers the user
  • Doesn't recognize facial gestures
• Cameras – Advantages
  • User does not have to wear anything
• Cameras – Disadvantages
  • Complex computations must be performed on the images
  • Sensitive to lighting, distance, etc.
  • Costly
  • Hard to see the fingers
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Sign Recognition Software
Capture gloves or cameras are like the mouse or keyboard on a computer. They are input devices – software must take the movements captured and make sense of them.
• 1993 – Early system designed in Canada. Capture was done using a VPL Data Glove. "It was connected to a DECtalk speech synthesizer via five neural networks to implement a hand-gesture to speech system." This system recognized a hand-shape "root word" and then added an ending to the word based on which of six directions the hands moved. (These "signs" were basically gestures, not ASL vocabulary.) The network was trained using backpropagation. The five networks were: strobe time, root word, ending, rate, and stress.
• 1999 – University of Zurich used the CyberGlove to capture signs. The Stuttgart Neural Network Simulator (freeware) was used to design and train the neural networks. The user must "map a set of angular measurements as delivered by the data glove to a set of pre-defined hand gestures". Only individual gestures were used.
Neural Networks
Sign Language Recognition and Translation – Becky Sue Parton - 2002
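To make the Zurich-style mapping from glove angle measurements to pre-defined gestures concrete, here is a hedged sketch using a small feed-forward network trained with backpropagation. The original work used the Stuttgart Neural Network Simulator; scikit-learn and the random training data below are stand-ins for illustration only.

```python
from sklearn.neural_network import MLPClassifier
import numpy as np

# Hypothetical sketch of the neural-network step described above: map glove
# angle measurements to pre-defined gesture labels. Everything here (sample
# counts, 18 joint angles, 5 gesture classes) is invented for illustration.

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 18))   # fake angular measurements
y = rng.integers(0, 5, size=200)            # in practice: labels from a signer

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000)
clf.fit(X, y)                               # trained with backpropagation

new_reading = rng.uniform(0.0, 1.0, size=(1, 18))
print("predicted gesture id:", clf.predict(new_reading)[0])
```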
Sign Recognition Software
HMMs (Hidden Markov Models) are a statistical approach used prominently in speech and handwriting recognition, and they are becoming popular in sign language recognition as well.
• MIT – captured the data with a camera system and then analyzed it with HMMs.
• Christopher Lee – did research using a CyberGlove to capture the data and then analyzed it with HMMs. Using interactive software, the computer could learn new gestures.
• Liang and Ouhyoung – in 1997, were able to recognize continuous Taiwanese Sign Language using a capture glove and HMMs.
Hidden Markov Models
Sign Language Recognition and Translation – Becky Sue Parton - 2002
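A minimal, self-contained sketch of the HMM machinery behind these systems: a Viterbi decoder finds the most likely hidden-state sequence for a sequence of observations. The states, probabilities, and observations below are invented for illustration; real recognizers typically train one HMM per sign over continuous glove or camera features.

```python
import math

# Toy discrete-observation HMM with two hidden states and a Viterbi decoder.
states = ["hand_up", "hand_down"]
start_p = {"hand_up": 0.6, "hand_down": 0.4}
trans_p = {"hand_up":   {"hand_up": 0.7, "hand_down": 0.3},
           "hand_down": {"hand_up": 0.4, "hand_down": 0.6}}
emit_p = {"hand_up":   {"high": 0.8, "low": 0.2},
          "hand_down": {"high": 0.1, "low": 0.9}}

def viterbi(observations):
    """Return the most likely hidden state sequence for the observations."""
    # Each layer maps state -> (log probability, best path ending in state).
    V = [{s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), [s])
          for s in states}]
    for obs in observations[1:]:
        layer = {}
        for s in states:
            best_prev, (best_lp, best_path) = max(
                ((p, V[-1][p]) for p in states),
                key=lambda kv: kv[1][0] + math.log(trans_p[kv[0]][s]))
            lp = best_lp + math.log(trans_p[best_prev][s]) + math.log(emit_p[s][obs])
            layer[s] = (lp, best_path + [s])
        V.append(layer)
    return max(V[-1].values(), key=lambda v: v[0])[1]

print(viterbi(["high", "high", "low", "low"]))
```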
Sign Recognition Software
G.R.A.S.P. (Glove-based Recognition of Auslan using Simple Processing)
• Developed by Mohammed Waleed Kadous in Australia in 1996.
• Uses a PowerGlove from Nintendo (a low-end device) to capture the data.
• Uses machine learning (instance-based and decision trees).
• Recognizes isolated signs only.
• Learns by example rather than trying to match signs to a dictionary, because signs vary between different people.
Machine Learning
Sign Language Recognition and Translation – Becky Sue Parton - 2002
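As a rough illustration of the two learning strategies G.R.A.S.P. is described as using, the sketch below trains an instance-based (nearest-neighbour) classifier and a decision tree on invented feature vectors. It is not the G.R.A.S.P. code; PowerGlove features and Auslan sign labels are replaced by random data.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
import numpy as np

# Invented stand-in data: 300 examples of 8 summarized glove features,
# labelled with 10 isolated-sign classes.
rng = np.random.default_rng(1)
X = rng.uniform(size=(300, 8))
y = rng.integers(0, 10, size=300)

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)    # learn by example
tree = DecisionTreeClassifier(max_depth=6).fit(X, y)   # learn split rules

sample = rng.uniform(size=(1, 8))
print(knn.predict(sample), tree.predict(sample))
```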
Rendering Sign Language
VRML (Virtual Reality Modeling Language) "is a standard language for describing 3D computer graphics models on the www." It can be used to render ASL fingerspelling.
• Su (University of Maryland) and Furuta (Texas A&M) have developed a working web site to illustrate this technique: http://www.csdl.tamu.edu/~su/asl
• SASL (South African Sign Language) is another example, created in 2002.
VRML
Sign Language Recognition and Translation – Becky Sue Parton - 2002
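As a toy illustration only (not the Su/Furuta or SASL implementation), the sketch below assembles a minimal VRML 2.0 scene in which each finger is a cylinder bent by a per-letter joint angle; the letter-to-angle table is invented.

```python
# Hypothetical sketch: generate a tiny VRML 2.0 scene for a fingerspelled
# letter by bending five finger "cylinders" according to an invented table.

LETTER_ANGLES = {               # radians of flexion per finger (made up)
    "A": [1.4, 1.4, 1.4, 1.4, 0.2],
    "B": [0.0, 0.0, 0.0, 0.0, 1.2],
}

def finger_node(index, angle):
    return (f"Transform {{\n"
            f"  translation {index * 0.3:.2f} 0 0\n"
            f"  rotation 1 0 0 {angle:.2f}\n"
            f"  children [ Shape {{ geometry Cylinder {{ radius 0.08 height 0.8 }} }} ]\n"
            f"}}\n")

def letter_to_vrml(letter):
    body = "".join(finger_node(i, a) for i, a in enumerate(LETTER_ANGLES[letter]))
    return "#VRML V2.0 utf8\n" + body

print(letter_to_vrml("A"))
```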
Rendering Sign Language
Avatars (also called synthesized signers or personal digital signers) are "virtual people" – 3D animated images. Here, avatars are designed to sign words and sentences in place of a human signer.
• Vcom3D Software – An avatar reads books for Deaf children. Presentation is in CASE.
• Different avatars are available, including a cyber-lizard.
• The technology is also being used to sign the web-based activities for two "Kids Network" units.
• A research study at the Florida School for the Deaf and Blind showed a jump in comprehension of a story from 17% to 67% after the students saw it signed.
www.vcom3d.com
3D Animation
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Rendering Sign Language
Tessa (Text and Sign Support Assistant)
• What is Tessa? – A project in the United Kingdom, designed to be a post office assistant which combines speech recognition technology and virtual human animation.
• How does she work? – The postal employee voices a statement and the software chooses the matching motion file. The inputs are pre-defined phrases and the output is pre-defined BSL.
• How was the system designed? – Native signers were captured using a virtual reality "suit" (sensors on the hands, body, and face). Signs for use by the avatar were then analyzed using the HamNoSys coding system and designed with animation software. (Coding systems will be discussed later.)
• Where can I see Tessa? – She was on display at the Science Museum in London in 2001. A demo of the capture process and the field trial is at www.visicast.co.uk
3D Animation
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Rendering Sign Language – Simon
• What is Simon? – A project in the United Kingdom which uses an avatar, Simon, to transliterate printed text (television captions) into sign language. The output is Signed English, not BSL, but the future goal is actual translation into BSL.
• What are the benefits of Simon?
  • Reduced cost for "interpreting" television programs. Live human signers are expensive and can be hard to find.
  • Ability to be turned on or off like closed captions, so that hearing viewers don't see the "interpreting bubble". "The system could generate Deaf signing at the viewer end of the broadcast chain." Different avatars could be selected by the user.
• What is the rationale behind Simon? – "Closed captions are not as effective for a deaf person as subtitles in a foreign movie are for a hearing person, since the closed captions are not in the Deaf person's native language."
• How does Simon work? – Signs (the motion-capture data) were acquired from expert signers using a CyberGlove (for hand movements), an Ascension Motion Star wireless magnetic body suit (for upper torso, arm, and head positions), and an optical Facetrak device (for facial expression and lip position). When a word needs to be transliterated, it is looked up in the word dictionary, which supplies the accompanying physical movement, facial expressions, and body positions, stored as motion-capture data (not images or video).
3D Animation
Sign Language Recognition and Translation – Becky Sue Parton - 2002
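The dictionary-lookup step described for Simon can be sketched as follows. The dictionary entries, file names, and fingerspelling fallback below are assumptions for illustration, not Simon's actual data format.

```python
# Hypothetical sketch: transliterate a caption into a playlist of stored
# motion-capture clips, one per word, in English word order (Signed English).
SIGN_DICTIONARY = {
    "weather": "mocap/weather.bvh",
    "tomorrow": "mocap/tomorrow.bvh",
    "rain": "mocap/rain.bvh",
}

def transliterate(caption):
    """Return the motion-capture clips to play, fingerspelling any word
    that is missing from the dictionary."""
    playlist = []
    for word in caption.lower().split():
        clip = SIGN_DICTIONARY.get(word)
        playlist.append(clip if clip else f"fingerspell:{word}")
    return playlist

print(transliterate("Tomorrow weather rain"))
# ['mocap/tomorrow.bvh', 'mocap/weather.bvh', 'mocap/rain.bvh']
```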
Translating – The Notation Phase
A notation system is a way to code the features of sign language. Typically, notation systems try to represent: hand configuration, hand orientation, the relation between hands, the direction of the hands' motion, and additional parameters. By knowing the features, a sign can then be recreated from the data, and the components can be analyzed much as English grammar can.
• Informal Glossing – ex. DANCE = "rVpitsd moves side to side abv lputso" (right "V" palm in fingertips down moving side to side above left palm up fingertips out)
• HamNoSys System – Developed as a scientific/research tool in 1989. Consists of 200 symbols covering the parameters discussed above. Transcriptions are precise but long and cumbersome to decipher. Facial expressions can be written. (Used in the Tessa project.)
• Stokoe System – Created by William Stokoe, the father of ASL linguistics, to show that the components of ASL fit together to form a linguistic structure like that of spoken language. There are 55 symbols covering the parameters.
• Szczepankowski System – Another way to transcribe signs, used in the THETOS project. It can be incomplete, inexact, and highly intuitive at times.
• SignWriting – Invented by Valerie Sutton in 1974 but has only recently gained popularity. It is a way to record the movements of any signed language. It contains over 600 symbols which can describe all of the parameters mentioned above.
Natural Language Processing
Sign Language Recognition and Translation – Becky Sue Parton - 2002
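One way to picture what every notation system above tries to encode is a small data structure holding the sign parameters. The class below is a hypothetical convenience, not part of any of the systems named; its field values spell out the informal gloss given for DANCE.

```python
from dataclasses import dataclass

# Illustrative container for the parameters a notation system records:
# handshape, orientation, location, and movement.
@dataclass
class SignDescription:
    gloss: str
    dominant_handshape: str
    dominant_orientation: str
    location: str
    movement: str

DANCE = SignDescription(
    gloss="DANCE",
    dominant_handshape="V",
    dominant_orientation="palm in, fingertips down",
    location="above left palm (palm up, fingertips out)",
    movement="moves side to side",
)
print(DANCE)
```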
Translating – The Notation Phase Continued
[The original slide showed a figure comparing the various notation systems side by side.]
• SignStream Project – SignStream is a multimedia database tool designed to facilitate video-based linguistic research. It contains digitized video data together with a linguistically annotated representation of that data. It was created in 1997. A Mac demo (30-day trial) can be downloaded at http://web.bu.edu/ASLLRP/SignStream
Natural Language Processing
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Translating between Sign Languages & Oral Languages
THETOS (Text into Sign Language Automatic Translator for Polish)
• What does automatic translation mean? – "It is a translation process in which one puts to the input a text which consists of words not supplemented by any hints, on output one gets a text of equivalent content, under the form of a sequence of signs, and the transformation of the input text to the output one is done without man's interferences. It acquires input data in the form of a text file, provides its full linguistic analysis (morphological, syntactic, and semantic) and finally produces the output in form of an animated sequence."
• Where is THETOS used? – It is used in Poland, in medical settings only.
• What animation technique is used? – OpenX (the options were VRML, DirectX, or OpenX).
• How is the animation sequence rendered? – After the translation process, the gestographic notation allows the animation to be created. A word does not have a matching sign per se – it has characteristics which, put together, result in a sign. Therefore, translation into different sign languages is relatively easy.
Natural Language Processing
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Translating – THETOS Continued
The following abbreviated example shows how NLP works in the THETOS project. (Note: a demo can be downloaded at sun.iinf.polsl.gliwice.pl/sign/)
• Input: Napisz swoje nazwisko (Write your surname)
• As the result of parsing we get: Predicate = napisac (to write); Subject = ty, hidden (you); Object = swoje nazwisko (your surname)
• The intermediary form is textual, but its syntax is typical of the sign language: Ty pisac nazwisko twoje (You write surname your)
• The gestographic representation (Szczepankowski's notation) is:
  PZ:21kpg#III<! – ty (you)
  PE:23k }/LBk:13k # P:III\V<-" – pisac (write)
  P1:25tg+ # II<-" – nazwisko (surname)
  PT:35k # III!" – twoje (your)
• The resulting animation sequence was shown as rendered frames in the original slide.
Natural Language Processing
Sign Language Recognition and Translation – Becky Sue Parton - 2002
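A hedged sketch of this pipeline in code: parse the sentence into roles, reorder it into sign-language word order, then look up a gestographic code per word. The parser is a stand-in that only handles this one example; the codes are copied from the list above, and the real system performs full morphological, syntactic, and semantic analysis.

```python
# Hypothetical THETOS-style pipeline for the single worked example above.
GESTOGRAPHIC = {                 # codes copied from the example above
    "ty": "PZ:21kpg#III<!",
    "pisac": 'PE:23k }/LBk:13k # P:III\\V<-"',
    "nazwisko": 'P1:25tg+ # II<-"',
    "twoje": 'PT:35k # III!"',
}

def parse(sentence):
    # Toy stand-in for linguistic analysis of "Napisz swoje nazwisko".
    return {"predicate": "pisac", "subject": "ty", "object": ["nazwisko", "twoje"]}

def to_sign_order(roles):
    # Intermediate textual form with sign-language word order:
    # "Ty pisac nazwisko twoje" (You write surname your).
    return [roles["subject"], roles["predicate"], *roles["object"]]

def to_gestographic(words):
    return [(w, GESTOGRAPHIC[w]) for w in words]

print(to_gestographic(to_sign_order(parse("Napisz swoje nazwisko"))))
```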
Translating
Paula – DePaul University's Digital Synthetic Interpreter
• The most current research – started in March 1998 by Karen Alkoby (a Deaf graduate student), still in progress, and being done in the USA!
• The goal of the Paula project is to translate English to ASL. A portable system is desired.
• A pilot test covering only the fingerspelling portion was done at the Deaf Expo and was successful.
• One of the first applications of Paula will be airport security. The research team also wants to use her to replace closed captions (much like Simon in the UK).
Natural Language Processing
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Translating – Paula Continued
What is unique about Paula? One of the first steps in the project was to create a database of signs to be used as the lexical database for the translator. Most researchers (as we've seen in this presentation) capture the signs through the use of a data glove. The Paula project, however, took a different approach, because motion-capture data is often inaccurate and is recorded as numerical data that is hard to modify. Paula uses an animation software package that has been customized for sign transcription. The signs are built by selecting hand shapes, positions, etc. from a friendly menu-driven interface.
asl.cs.depaul.edu gives up-to-date progress information.
Natural Language Processing
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Other Combination Projects
Another project working with Auslan (Australian Sign Language) is a tutorial system for learning signs. The system uses a 3D avatar to render signs. The software has a "Sign Editing Interface" where new signs can be designed. Its focus is to help teach Auslan to students. Work is currently being done to add a grammar parser so that translation will be possible.
The University of Pennsylvania started a project entitled "TEAM" (Translation from English to ASL by Machine), but it appears to have been discontinued. Their system used a Lexicalized Tree Adjoining Grammar (TAG) based approach for the NLP step. They also wanted to translate back from ASL to English.
The Chinese are also developing a system for both GTS (Gesture to Spoken language) and STG (Spoken to Gesture language). They are using a hybrid approach which includes data gloves, cameras, and 3D virtual humans. The project, called "HandTalker" and started in 2001, has a goal of supporting continuous signing with no domain limitation.
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Educational Applications
• SignWriting
  • Currently a person must manually transcribe ASL into SignWriting. (Free software for producing the symbols is available at www.signwriting.org.) However, with the research in NLP using other notation systems, it may become possible to complete the process automatically into this format as well.
  • The use of SignWriting is controversial and not within the scope of this paper to discuss; however, it does help students pay attention to how they sign, and the result is the study of their own language's grammar and form. According to the SignWriting Literacy Project, "SignWriting seems to also help Deaf kids learn to read English words faster."
• Sign Dictionaries
  • A student could wear the virtual reality glove and sign a word to see its English equivalent or its definition. Currently, in order to look up a sign in a dictionary, one must know the English gloss. ASL vocabulary tests could be conducted as well.
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Educational Applications Continued
ICICLE Project (Interactive Computer Identification and Correction of Language Errors)
• What is the ICICLE project? – "A current project designed to provide writing assistance for second language learners of English, specifically American Sign Language natives. The system will analyze written English texts from Deaf individuals, identifying possible errors and generating tutorial text tailored to each writer's level of language competence and particular learning strengths."
• What are the main parts of ICICLE? – The first module is called "Error Identification". It is here, through NLP, that the analysis of the writing is performed. The system is more advanced than a "grammar check" because it is tailored to understand the errors that are common in Deaf writing. The second module is called "Response Generation". It is during this phase that intelligent tutoring advice is given. Currently instruction is in English, but the goal is to use an avatar to respond in ASL.
Intelligent Computer-Aided Instruction
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Educational Apps – ICICLE Continued
Special thanks to Dr. Kathleen McCoy and Dr. Lisa Michaud!
• NLP aspects of the system (a rough sketch of one piece follows below):
  • TRAINS text parser
  • COMLEX Syntax – a lexicon which contains 38,000 different syntactic head words plus grammar rules
  • Mal-rules – typical errors made by Deaf learners
  • SLALOM – Steps of Language Acquisition in a Layered Organization Model (knowledge base)
Natural Language Processing
Sign Language Recognition and Translation – Becky Sue Parton - 2002
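As a rough sketch of what a "mal-rule" style check might look like, the code below flags two patterns commonly cited as ASL-influenced English (a missing copula and a missing determiner). ICICLE's real Error Identification module uses a full parser, the COMLEX lexicon, and a learner model; these regexes and word lists are invented purely for illustration.

```python
import re

# Hypothetical mal-rule check: scan text for two illustrative error patterns.
MAL_RULES = [
    (re.compile(r"\b(I|he|she|they|we|you)\s+(happy|sad|tired|ready)\b", re.I),
     "possible missing copula (e.g. 'she happy' -> 'she is happy')"),
    (re.compile(r"\bbuy\s+(car|book|house)\b", re.I),
     "possible missing determiner (e.g. 'buy car' -> 'buy a car')"),
]

def identify_errors(text):
    findings = []
    for pattern, advice in MAL_RULES:
        for match in pattern.finditer(text):
            findings.append((match.group(0), advice))
    return findings

for span, advice in identify_errors("Yesterday she happy because she buy car."):
    print(f"'{span}': {advice}")
```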
Educational Applications Continued
Authorware gives users (students and instructors) the ability to create their own avatar translations for inclusion in videos, television programs, web sites, CD-ROMs, etc.
• Vsign – Software developed originally for Netherlands Sign Language, but it is a versatile program that allows the user to create animated sequences in any sign language. The user must construct the animations from scratch. It is freeware! Download at www.vsign.nl/EN/vsignEN.htm (April 29, 2002 release date.)
• Sign Smith Studio – The world's first commercial product for creating animated sign language and publishing it (VRML). Release date of December 2002 at a cost of approximately $4,000. Many benefits over the Vsign software, including:
  • Dictionary of signs – the user can choose from 2,000 signs.
  • The software will transcribe English sentences automatically to Signed English, and it has the capability to let the user translate sentences to ASL.
  • Features non-linear editing tools and will resolve words with multiple meanings. 12 different avatars to choose from!
  • Five different tracks allow the user to select not only the sign, but also facial expression, mouth movement, etc.!
Avatar Authorware
Sign Language Recognition and Translation – Becky Sue Parton - 2002
Future Research – Obstacles & Possibilities
• Obstacles
  • Analogies, poetry, music lyrics, and other vague concepts are still difficult for computers.
  • The barriers in receptive sign language processing, which are slowly coming down, are continuous sign strings and signer variability.
• Possibilities
  • Deaf children could have "pen pals" in other countries if the software could translate from one sign language to another. (Richard Bowden is currently working on BSL to ASL.)
  • Relay Texas (and similar systems in other states) could be entirely replaced, so that Deaf persons could have private phone conversations with hearing people while using their native language.
  • Tests could be translated from English to ASL so that teachers can evaluate what content Deaf students understand apart from their understanding of English.
  • Deaf students could design their own projects using the authorware described previously.
  • Research could be conducted to confirm that students understand the avatars as well as (or better than) human interpreters.
  • Another research topic could be to see whether having notes in SignWriting helps Deaf students remember and deliver their presentations better than note cards in English.
Sign Language Recognition and Translation – Becky Sue Parton - 2002
The End
• But if you still want to learn more:
  • Check out commtechlab.msu.edu/sites/aslweb/browse.htm to learn some sign language (human clips), just for fun.
  • Attend the 5th International Workshop on Gesture and Sign Language based Human-Computer Interaction, April 15-17, 2003, in Italy!
  • Ask me for the complete list of references.
Sign Language Recognition and Translation – Becky Sue Parton - 2002