Language Resources and CALL

Presentation Transcript


  1. Language Resources and CALL Applications • Helmer Strik (1), Jozef Colpaert (2), Joost van Doremalen (1), and Catia Cucchiarini (1) • (1) Centre for Language and Speech Technology (CLST), Dept. of Linguistics, Radboud Univ. Nijmegen, The Netherlands • (2) Linguapolis, University of Antwerp, Antwerp, Belgium

  2. Language Resources and CALL • This presentation: the relation between language resources and CALL systems • CALL: Computer Assisted Language Learning • Focus here: the DISCO project (Development and Integration of Speech technology into COurseware for language learning) • LREC 2010, Malta, 22-05-2010

  3. Overview • A short introduction to DISCO • Resources used to develop a CALL system • Resources obtained during development of a CALL system • Resources obtained using a CALL system • Conclusions • Dr. Spraak (Dr. Speech)

  4. A short introduction to DISCO • DISCO project: develop a prototype of a CALL system that can give feedback on spoken utterances • Levels: • pronunciation (of sounds) • grammar (syntax & morphology)

  7. Syntax exercise

  8. Morphology exercise

  9. Pronunciation exercise – with feedback

  10. Menu: conversation environment report, learner is listening to own speech in complete conversation

  11. Menu: conversation environment report, learner is reviewing pronunciation mistakes by listening to own speech

  12. Menu: remediation environment, overall scores for phonemes, learner can start remediation by clicking on a phoneme

  13. Menu: remediation environment, pronunciation exercise

  14. Menu: remediation environment, learner is reviewing progress

  15. Characters in DISCO

  16. ASR-based CALL • ASR: Automatic Speech Recognition • standard ASR: from (native) speech to words

  17. ASR: Automatic Speech Recognition • [Diagram] Input: speech signal => Decoder (using Acoustic Models, Lexicon, Language Model) => Output: words (W1 W2 W3 W4)
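
The slide only shows the block diagram. As a reading aid, the components can be related through the standard statistical formulation of ASR; the notation is an assumption made here and does not come from the slides ($X$ is the sequence of acoustic observations, $W$ a candidate word sequence). The decoder searches for the word sequence that maximizes the product of an acoustic term, supplied by the acoustic models and the lexicon, and a prior term, supplied by the language model:

```latex
\hat{W} = \arg\max_{W} \; P(X \mid W)\, P(W)
```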

  18. ASR-based CALL • ASR: Automatic Speech Recognition • standard ASR: from (native) speech to words • ASR for CALL, 2 phases: • Phase 1. Content (what has been said), tolerant: recognize words despite non-native variation • Phase 2. Form (how it has been said), strict: error detection, find deviations from native …
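
The split into a tolerant and a strict phase can be illustrated with a toy sketch. It is not the DISCO implementation: it works on phone strings instead of audio, the phone symbols are made up and simplified, and the names `EXPECTED`, `recognize_content`, `detect_deviations`, and the `min_ratio` threshold are hypothetical. The learner input mirrors the pronunciation example from slide 26 ("Ik lop nar guis" for "Ik loop naar huis").

```python
from difflib import SequenceMatcher

# Hypothetical expected responses with made-up, simplified phone strings.
EXPECTED = {
    "ik loop naar huis": "i k l oo p n aa r h ui s".split(),
    "ik ga naar huis": "i k g aa n aa r h ui s".split(),
}


def recognize_content(observed, expected=EXPECTED, min_ratio=0.6):
    """Phase 1 (tolerant): pick the closest expected utterance, or None."""
    best_text, best_ratio = None, 0.0
    for text, canonical in expected.items():
        ratio = SequenceMatcher(None, canonical, observed, autojunk=False).ratio()
        if ratio > best_ratio:
            best_text, best_ratio = text, ratio
    # Accept imperfect matches as long as they are similar enough.
    return best_text if best_ratio >= min_ratio else None


def detect_deviations(canonical, observed):
    """Phase 2 (strict): list every difference from the canonical phones."""
    matcher = SequenceMatcher(None, canonical, observed, autojunk=False)
    return [
        (tag, canonical[i1:i2], observed[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


observed = "i k l o p n a r g ui s".split()
utterance = recognize_content(observed)
if utterance is not None:
    for deviation in detect_deviations(EXPECTED[utterance], observed):
        print(deviation)
```

Phase 1 deliberately accepts an imperfect match so that the intended utterance is still identified; phase 2 then reports each mismatch against that utterance, which is where feedback would start.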

  19. Resources used to develop a CALL system (1) • More general, native resources: • ASR toolkit – e.g. SPRAAK [from Stevin] • Corpus with native speech – e.g. Spoken Dutch Corpus (CGN) [from TST-Centrale] • Native lexicon – e.g. e-Lex [from TST-Centrale]

  20. Resources used to develop a CALL system (2) • More specific, non-native resources (often not available) to develop / improve the 2 phases: • Phases 1 + 2. Corpora with non-native speech: JASMIN [from Stevin]; CITO, Triest, Dutch-CAPT • Phase 1. Word recognition (content): resources and information to model non-native 'behavior', in order to improve: • Acoustic Models: mainly by training on non-native audio (from speech corpora) • Lexicon & Language Model: data-driven (from non-native audio) or knowledge-based (from the literature, etc.)
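
One way to picture the knowledge-based route for the lexicon is to expand each native pronunciation with substitution rules. The sketch below is an assumption-laden illustration, not the method used in DISCO: the rules in `RULES`, the phone symbols, and the function name `expand_variants` are all hypothetical, and the toy entries are not taken from e-Lex.

```python
from itertools import product

# Hypothetical substitution rules: canonical phone -> possible non-native realizations.
RULES = {"oo": ["o"], "aa": ["a"], "h": ["g"]}


def expand_variants(canonical, rules):
    """Return the canonical pronunciation first, then every rule-derived variant."""
    options = [[phone] + rules.get(phone, []) for phone in canonical]
    return [list(variant) for variant in product(*options)]


# Toy native lexicon with made-up phone symbols.
native_lexicon = {
    "loop": ["l", "oo", "p"],
    "huis": ["h", "ui", "s"],
}

# Lexicon for tolerant (phase 1) recognition: each word maps to several variants.
nonnative_lexicon = {
    word: expand_variants(phones, RULES) for word, phones in native_lexicon.items()
}
print(nonnative_lexicon["loop"])
```

In a data-driven setup the same table could instead be filled with variants observed in non-native speech corpora.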

  21. Resources used to develop a CALL system (3) • More specific, non-native resources (often not available) to develop / improve the 2 phases: • Phase 2. Error detection (classifiers), strict: • A. Decide which errors to address: criteria + selection => inventory; data-driven and/or knowledge-based • B. Develop classifiers: train and test; data-driven • A & B. Data-driven => resources needed: annotations for audio • Levels: • Pronunciation: sounds [& prosody, not in DISCO] • Grammar: syntax & morphology
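
Why annotations are needed for step B can be shown with a deliberately simple classifier. The sketch assumes that each sound already has a numeric quality score (lower means worse) and a human annotation saying whether it is an error; the slides do not specify the classifier type, and all numbers below are invented for illustration. `accuracy` and `fit_threshold` are hypothetical helper names.

```python
def accuracy(scores, is_error, threshold):
    """Fraction of items where 'score below threshold' agrees with the annotation."""
    hits = sum((score < threshold) == label for score, label in zip(scores, is_error))
    return hits / len(scores)


def fit_threshold(scores, is_error):
    """Train: pick the cut-off that best separates annotated errors from correct sounds.

    Needs at least two distinct scores.
    """
    ordered = sorted(set(scores))
    candidates = [(low + high) / 2 for low, high in zip(ordered, ordered[1:])]
    return max(candidates, key=lambda t: accuracy(scores, is_error, t))


# Invented annotated data: True means the annotator marked the sound as an error.
train_scores = [0.20, 0.35, 0.40, 0.70, 0.80, 0.90]
train_labels = [True, True, True, False, False, False]
test_scores = [0.30, 0.60, 0.85]
test_labels = [True, False, False]

threshold = fit_threshold(train_scores, train_labels)
print("threshold:", threshold)
print("test accuracy:", accuracy(test_scores, test_labels, threshold))
```

Both the training step and the evaluation step consume the annotations, which is why annotated non-native audio is listed as a required resource.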

  22. Resources obtained during development • Blueprint of the design • Content: • specifications for exercises and feedback strategies • a list of predicted correct and incorrect utterances • Modules for the 2 phases: 1. word recognition, 2. error detection • The CALL system itself: the whole system, a prototype with content

  23. Resources obtained using a CALL system • Audio recordings • Log-files: user + system 'behavior' • Videos
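
The slides do not describe the log format. As a rough idea of what a log entry for user and system 'behavior' might contain, the sketch below stores one event per JSON line; every field name, the `LogEntry` class, and the sample values are illustrative assumptions.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class LogEntry:
    # All field names are assumptions made for this example.
    timestamp: str                    # ISO 8601 time of the event
    learner_id: str                   # pseudonymous learner identifier
    exercise: str                     # e.g. "pronunciation", "morphology", "syntax"
    actor: str                        # "user" or "system"
    event: str                        # what happened
    audio_file: Optional[str] = None  # recording linked to the event, if any


def to_json_line(entry):
    """Serialize one entry as a single JSON line."""
    return json.dumps(asdict(entry), ensure_ascii=False)


def from_json_line(line):
    """Rebuild an entry from a JSON line."""
    return LogEntry(**json.loads(line))


# Made-up sample event.
entry = LogEntry(
    timestamp="2010-05-22T10:15:00",
    learner_id="learner-001",
    exercise="pronunciation",
    actor="system",
    event="feedback_shown",
    audio_file="utt_0001.wav",
)
line = to_json_line(entry)
print(line)
```

Keeping a reference to the audio recording in each entry makes it possible to relate the logged behavior to the recorded speech afterwards.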

  24. Conclusions • Language Resources • important role in relation to CALL systems • Language Resources • are needed to develop a CALL system • can be obtained during development of a CALL system • can be obtained using a CALL system • Language Resources obtained give rise to new opportunities: • research • system development

  25. THE END • Website DISCO • lands.let.ru.nl/~strik/research/DISCO/

  26. Stevin project DISCO • Training speaking proficiency: • pronunciation, morphology, syntax • Correct • Example: Ik loop naar huis ("I walk home") • Errors • Pronunciation: Ik lop nar guis • Morphology: Ik lopen naar huis • Syntax: Ik naar huis lopen • Detecting errors automatically • by means of speech technology
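
The grammar examples on this slide fit the "list of predicted correct and incorrect utterances" mentioned on slide 22. A minimal sketch of such a list, assuming the recognizer has already produced a word string, is a plain lookup table; `PREDICTED`, `normalize`, and `classify` are hypothetical names, and pronunciation errors are left out because they need a sound-level comparison instead of a word-level one.

```python
# Predicted responses for one exercise, taken from the examples on the slide.
PREDICTED = {
    "ik loop naar huis": "correct",
    "ik lopen naar huis": "morphology",
    "ik naar huis lopen": "syntax",
}


def normalize(utterance):
    """Lowercase and collapse whitespace so lookups ignore casing and spacing."""
    return " ".join(utterance.lower().split())


def classify(recognized_words):
    """Return the label of a predicted utterance, or 'unpredicted' if it is not listed."""
    return PREDICTED.get(normalize(recognized_words), "unpredicted")


for response in ["Ik loop naar huis", "Ik lopen naar huis", "Ik naar huis lopen"]:
    print(response, "->", classify(response))
```

Responses outside the list come back as "unpredicted", so a system built this way would need a fallback such as asking the learner to try again.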

  27. [System architecture diagram] Components: DisplayLogic, FeedbackGeneration, ErrorDetection, Grading, PromptGenerator, Segmentation, Words, ASR
