How to integrate automatic speech recognition (ASR) into CALL applications
Helmer Strik
Department of Linguistics, Centre for Language and Speech Technology (CLST)
Radboud University Nijmegen, The Netherlands
Overview
• Introduction
• ASR: automatic speech recognition
• ASR-based tutoring
• ASR-based CALL
• ASR-based literacy training
• Conclusions
LESLLA, Antwerpen, 24-11-2008
Introduction
• Students who receive 1-on-1 instruction perform as well as the top two percent of students who receive traditional classroom instruction [Bloom 1984]
• A human tutor for every student is not feasible → computer tutors
• For language learning: CALL (computer-assisted language learning)
• Many text-based CALL systems
• Include speech → speech-based CALL system
Speech inside
• Many applications with ‘speech’:
  • Screen readers
  • Reading pen
  • Mobile phone: photo + OCR + TTS
• Some also (useful) for CALL
Speech inside (cont’d)
• Many applications with ‘speech’: screen readers, reading pen, etc.
• Some also (useful) for CALL
• However, usually the learner can
  • only listen (TTS: text-to-speech)
  • or also speak, but …
    • there is no assessment, or
    • the learner has to carry out the assessment, e.g. by comparing with examples
• → Use ASR / speech technology
• Is it feasible?
ASR: automatic speech recognition
• What is ASR?
  • Speech-to-text conversion
• Applications:
  • Dictation
  • Command and control
  • Spoken dialogue systems (information)
  • etc.
• ASR is not flawless, and it will probably never be
  • esp. for non-native speech
  • Note: even human listeners do not recognize speech flawlessly!
Speech Recognition (figure; surviving labels: cgn2-s, vb, mii, nn)
ASR-based tutoring
• ITS: Intelligent Tutoring Systems
• Spoken dialogue system for learning
• Subject matter: math, physics, etc.
• Examples:
  • ITSPOKE, Univ. of Pittsburgh, Litman et al. Topic: physics
  • SCoT, Stanford Univ., Peters et al. Topic (SCoT-DC): shipboard damage control
• Communicate with speech
  • the subject matter doesn’t have to be speech
ASR-based CALL
• The subject matter is speech (language)
• Late 1990s:
  • 1998: STiLL, Marholmen (Sweden); 1st time the CALL and Speech communities met
  • 1999: Special Issue of CALICO, ‘Tutors that Listen’, focusing on ASR (mainly ‘discrete ASR’)
ASR-based literacy training
• What has been done?
• Reading tutors (the learner reads, not the PC):
  • Listen, CMU, Pittsburgh; Mostow et al. (1994)
  • STAR system, UK; Russell et al. (1996)
  • SPACE, KU Leuven; Van hamme, Duchateau, et al.
  • … and many others
• FtL: Foundations to Literacy, Boulder; Cole, Wise, et al.
ASR-based literacy training
• Foundations to Literacy
  • Interactive Books
    • Teach fluent reading & comprehension
  • Foundational Skills Tutors
    • Teach underlying reading skills
    • Phonics
ASR-based literacy training (cont’d)
• What has been done?
• Reading tutors:
  • Listen, CMU, Pittsburgh; Mostow et al. (1994)
  • STAR system, UK; Russell et al. (1996)
  • SPACE, KU Leuven; Van hamme, Duchateau, et al.
  • … and many others
• FtL: Foundations to Literacy, Boulder; Cole, Wise, et al.
• Mostly for children
• And for adults?
  • What is needed?
  • What is possible, and what is not?
  • …
ASR-based CALL
• ASR is not flawless, and it will probably never be
  • esp. for non-native speech
• Be aware of what is (not) possible with ASR technology
• Problematic issues and possible solutions:
  • Noise, esp. background speech → minimize, use headsets
  • Disfluencies → minimize, improve automatic handling
  • Non-native pronunciation
    • Recognizing utterances → utterance verification
    • Detecting pronunciation errors → classifiers
ASR-based CALL
• Our research:
  • Non-natives
    • Assessment of oral proficiency
    • Dutch-CAPT – pronunciation
    • ASR / UV – Utterance Verification
    • PED – Pronunciation Error Detection
    • DISCO – pronunciation, morphology, syntax
    • TST-AAP
  • People with speech disability, for training & as communication aid (AAC)
    • ASR for dysarthric speech
    • EST: E-learning based Speech Therapy
ASR-based CALL
• Project Dutch-CAPT
  • (Computer Assisted Pronunciation Training)
ASR-based CALL (cont’d)
• Project Dutch-CAPT
  • (CAPT: Computer Assisted Pronunciation Training)
• Experimental group: used the Dutch-CAPT system
• 2 control groups: didn’t use Dutch-CAPT
• The reduction in the number of pronunciation errors was significantly larger for the experimental group.
• Training: 4 weeks x 1 session of 30–60 minutes
ASR-based CALL (cont’d)
• ASR is not flawless, and it will probably never be
  • esp. for non-native speech
• Be aware of what is (not) possible with ASR technology
• Problematic issues and possible solutions:
  • Noise, esp. background speech → minimize, use headsets
  • Disfluencies → minimize, improve automatic handling
  • Non-native pronunciation
    • Recognizing utterances → utterance verification
    • Detecting pronunciation errors → classifiers
• Mix of expertise needed:
  • ASR technology, language acquisition, pedagogy, design, …
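The slides name utterance verification as the way to deal with non-native speech but do not say how it is computed. A minimal sketch of one common pattern follows, assuming the recognizer can return a log-likelihood for the prompted utterance and one for an unconstrained filler model. The class `UtteranceScores`, its fields, and the threshold value are hypothetical illustrations, not part of any system named in the talk.

```python
from dataclasses import dataclass


@dataclass
class UtteranceScores:
    """Hypothetical scores an ASR engine might return for one recording."""
    target_loglik: float  # log-likelihood of the expected (prompted) utterance
    filler_loglik: float  # log-likelihood under an unconstrained filler model
    n_frames: int         # number of acoustic frames in the recording


def confidence(scores: UtteranceScores) -> float:
    """Per-frame log-likelihood ratio; higher means closer to the target."""
    if scores.n_frames <= 0:
        raise ValueError("n_frames must be positive")
    return (scores.target_loglik - scores.filler_loglik) / scores.n_frames


def verify_utterance(scores: UtteranceScores, threshold: float = -0.5) -> bool:
    """Accept the utterance if its confidence reaches the threshold.

    The threshold is an assumed value; in practice it would be tuned on
    recordings of the target learner group.
    """
    return confidence(scores) >= threshold


# Usage: a recording close to the prompt, and one far from it.
close = UtteranceScores(target_loglik=-1000.0, filler_loglik=-980.0, n_frames=100)
far = UtteranceScores(target_loglik=-1200.0, filler_loglik=-1000.0, n_frames=100)
print(verify_utterance(close), verify_utterance(far))
```

Normalizing by the number of frames keeps the score comparable across short and long utterances, which matters when learners speak at very different rates.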
ASR-based literacy training
• Demonstration project TST-AAP
  • Existing course
  • Add speech technology:
    • Detect whether words & sounds were pronounced (correctly)
ASR-based literacy training
• Listening; PC: produces speech
  • Text-To-Speech (TTS); quality good enough?
  • Recorded speech, concatenation
• Speaking; PC: recognizes speech
  • Phonics (see FtL)
  • PC: Recognize words, utterances: CMs (confidence measures) for Utt. Ver. (utterance verification)
  • PC: Recognize sounds: CMs for Phon. Ver. (contrasts)
• Reading (reading tutors)
  • PC: Recognize words, utterances
  • PC: Pointer in the text (‘track’ the reader)
  • PC: Help when encountering problems
  • PC: Change tempo → read faster
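The ‘pointer in the text’ idea for reading tutors can be illustrated with a small sketch. It assumes the recognizer delivers a stream of recognized words and that the tutor only needs to decide how far the reader has come. The function names and the `lookahead` parameter are hypothetical; real reading tutors use more robust alignment than this.

```python
def normalize(word: str) -> str:
    """Lowercase a word and drop punctuation so 'The' matches 'the,'."""
    return "".join(ch for ch in word.lower() if ch.isalnum())


def track_reader(text_words, recognized_words, lookahead=2):
    """Return the pointer position after each recognized word.

    The pointer is the index of the next word the learner is expected to
    read. A recognized word moves the pointer if it matches the expected
    word or one of the next `lookahead` words (the learner skipped a word).
    Anything else (hesitations, misrecognitions) leaves the pointer alone.
    """
    pointer = 0
    positions = []
    for rec in recognized_words:
        window = text_words[pointer:pointer + 1 + lookahead]
        for offset, expected in enumerate(window):
            if normalize(expected) == normalize(rec):
                pointer += offset + 1
                break
        positions.append(pointer)
    return positions


# Usage: the learner hesitates ("um") and skips the word "sat".
text = "The cat sat on the mat".split()
heard = ["the", "cat", "um", "on", "the", "mat"]
print(track_reader(text, heard))
```

A pointer that stops advancing for several recognized words is one possible trigger for the ‘help when encountering problems’ step.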
ASR-based CALL
• Advantages of using speech (vs. writing)
  • Self-explanation
  • Extra information:
    • Prosody (stress, accent)
    • Emotions
    • Confidence
• Other useful techniques:
  • VTH
Conclusions
• ASR is not flawless
• ASR-based tutoring is possible (restricted domain)
  • general topics; ITS: ITSPOKE, SCoT
  • CALL; many systems: non-natives, disabled, etc.
  • Literacy training
    • So far mainly for children
    • And for adults!?
• Needed
  • Mix of expertise: technology, language acquisition, pedagogy, design, …
  • Improved ASR, speech technology
  • Projects, funds
Questions?
• Why are there so few ASR-based CALL / literacy applications for adults?
• What are, in this context, important differences between children & adults?
• What is needed?
  • Listening; PC: produces speech
  • Speaking; PC: recognizes speech
    • Phonics
  • Reading (reading tutors)
  • What else?
THE END