Can speech technology be useful for people with dysarthria? Speech technology & pathology

Presentation Transcript


  1. Can speech technology be useful for people with dysarthria? Speech technology & pathology. Helmer Strik, Language & Speech, Dept. of Linguistics, Radboud University Nijmegen

  2. Outline • Speech technology & pathology • Applications: existing, possible • In practice • Target groups • Speech technology & dysarthria • Introduction • Speech recognition for dysarthric speech • Conclusions SPACE symposium

  3. Applications • AAC (Augmentative & Alternative Communication): • Improve communication • Interactive tools: • Training, reading, listening • Assessment: • Diagnosis, monitoring • Therapy

  4. AAC • Speaking problems • Speech generation • Speech manipulation • Speech recognition (of handicapped) + output (text, speech, talking head, etc.) • Hearing problems • Hearing aids, cochlear implants, etc. • Speech recognition (of others) + output (text, sign language, talking head, etc.)

  5. ASR & output channel (diagram labels: text, ASR, speech synthesis)

  6. Interactive tools • Speech generation • Reading tools: screen readers, reading pen, text processors, etc. • Writing tools: word prediction, TTS, (dedicated) spell checking • Analysis, manipulation, training • Delayed Auditory Feedback (DAF) and Frequency Altered Feedback (FAF), for stutterers • CAFET: Computer-Aided Fluency Establishment Training • CAPT: Computer Assisted Pronunciation Training

  7. Delayed Auditory Feedback (DAF), Frequency Altered Feedback (FAF)

  8. Assessment, therapy • Assessment: diagnosis, monitoring • Therapy • Clinical setting, with expert • Speech analysis + visualization, categorization, etc. • IBM SpeechViewer • … • Research

  9. Applications • The number of applications differs per technology (from most to fewest): • speech generation • speech analysis, manipulation, etc. • speech recognition

  10. In practice • Many existing applications • Many more are possible • However, relatively little use • Why?

  11. In practice • However, relatively little use. Why? • Needed: • Tailor-made, flexible applications • Tailor-made: taking into account the capabilities & desires of the user + environment • Flexible: the capabilities & desires often change • More user tests & adequacy evaluation, instead of technology improvement & performance evaluation

  12. Target groups • International Classification of Functioning, Disability and Health (ICF): • Mental functions: aphasia, dyslexia, mental disabilities • Sensory functions: blindness, deafness, both • Voice & speech functions: dysarthria, anarthria, mutism, stuttering • Motor functions: dyspraxia, apraxia, RSI / UEMSD (Upper Extremity Musculoskeletal Disorders)

  13. Speech technology & dysarthria • Dysarthria: speech disorder caused by dysfunction of nerves and muscles • Many different kinds of dysarthria

  14. Can speech technology be useful for people with dysarthria? • Yes! • AAC • Interactive tools • Assessment • Therapy

  15. Can speech technology be useful for people with dysarthria? • Speech generation • Prefer voice similar to their (old) voice • Preferably: own voice • AAC • Manipulation • Speech recognition + output channel • Pronunciation training: Speech recognition, analysis, feedback, etc.

  16. Speech technology & dysarthria: ASR for dysarthric speech • Questions: • How well can dysarthric speech be recognized by a standard (“non-dysarthric”) speech recognizer? • Will the recognition results improve if we train the recognizer on speech of dysarthric speakers?

  17. Experimental setup: Speakers • Dysarthric: 2 Dutch males, DYS1 & DYS2 • Reference: 2 Dutch males, REF1 & REF2 • Total duration of the speech material (minutes) • DYS2: speaks more slowly

  18. Experimental setup: Speech tasks • All four speakers read the same list of items, consisting of four different tasks: • 1. NUM: numbers 0-12 spoken in isolation • 2. PFU: the 50 most Frequent Utterances from Polyphone • 3. PMS: 130 Plomp-Mimpen Sentences (semantically unpredictable) • 4. PRS: 10 Phonetically Rich Sentences

  19. Experimental setup: Speech tasks • Number of utterances & words per task • The NUM and PRS tasks were both read three times.

  20. Experimental setup: Speech recognizer • General specifications • Standard phone-based recognizer • 37 context-independent phones • 3-state HMMs • 14 cepstral coefficients + deltas, from a mel filter bank covering 350-3400 Hz • 16 ms Hamming window, 10 ms step
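
The windowing step on this slide can be illustrated with a minimal sketch. It only covers framing with a 16 ms Hamming window at a 10 ms step, plus a simple delta; the mel filter bank and cepstral transform are left out. The 8 kHz sampling rate is an assumption (typical for telephone speech, not stated on the slides), and the function names and the two-frame delta formula are hypothetical choices for illustration, not the recognizer's actual front end.

```python
import math


def hamming(n):
    """Return a Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(samples, sample_rate=8000, window_ms=16, step_ms=10):
    """Cut a signal into overlapping, Hamming-windowed frames.

    sample_rate=8000 is an assumption; the slides only give the
    window length (16 ms) and the step (10 ms).
    """
    win_len = sample_rate * window_ms // 1000  # 128 samples at 8 kHz
    step = sample_rate * step_ms // 1000       # 80 samples at 8 kHz
    window = hamming(win_len)
    frames = []
    for start in range(0, len(samples) - win_len + 1, step):
        chunk = samples[start:start + win_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames


def deltas(features):
    """Simple delta: half the difference between the next and previous
    frame, with the first and last frame repeated at the edges."""
    last = len(features) - 1
    out = []
    for t in range(len(features)):
        nxt = features[min(t + 1, last)]
        prv = features[max(t - 1, 0)]
        out.append([(a - b) / 2.0 for a, b in zip(nxt, prv)])
    return out
```

With these settings consecutive frames overlap by 6 ms, so every stretch of speech contributes to at least one frame.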

  21. Experimental setup: Experiments • Lexicon & language model (uni- and bigram) • Based on all words in 4 tasks • Task specific & same for all speakers • Perplexity
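
As a rough illustration of what the perplexity of a bigram language model measures, here is a minimal sketch. The add-one smoothing, the sentence boundary markers, and all function names are assumptions made for the example; the slides do not say how the language models were estimated.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their histories over tokenized sentences."""
    history_counts = Counter()
    bigram_counts = Counter()
    for words in sentences:
        tokens = ["<s>"] + list(words) + ["</s>"]
        history_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens[:-1], tokens[1:]))
    return history_counts, bigram_counts


def perplexity(sentences, history_counts, bigram_counts, vocab_size):
    """Perplexity of the sentences under an add-one smoothed bigram model.

    vocab_size is the number of word types that can follow a history
    (including the end marker); add-one smoothing is an assumption.
    """
    log_prob = 0.0
    n_predictions = 0
    for words in sentences:
        tokens = ["<s>"] + list(words) + ["</s>"]
        for prev, cur in zip(tokens[:-1], tokens[1:]):
            p = (bigram_counts[(prev, cur)] + 1) / (history_counts[prev] + vocab_size)
            log_prob += math.log2(p)
            n_predictions += 1
    return 2 ** (-log_prob / n_predictions)
```

A lower perplexity means the model leaves the recognizer fewer plausible next words, which is what a task-specific model built only from the words in the task achieves.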

  22. Experimental setup: Speaker Independent & Dependent • SI: Speaker Independent training material • Polyphone (Dutch telephone database with 5000+ speakers) • 4022 connected digit strings • 3702 Polyphone most frequent items • 20,110 phonetically rich sentences • SD: Speaker Dependent training material • Speakers' own speech

  23. Speaker Independent (SI): Results • Word Error Rates (WERs) for SI recognition
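
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions + deletions + insertions) between the recognizer output and the reference transcript, divided by the number of reference words. A minimal sketch follows; the function name is hypothetical and this is the standard textbook definition, not necessarily the exact scoring tool used in the study.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 100% when the recognizer outputs many more words than were spoken.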

  24. Speaker Independent (SI): Conclusions • REF better than DYS • DYS1 better than DYS2 in short utterances because of speaking rate (table 1) • Results for DYS quite reasonable (especially for sentences) because of tight language model

  25. Speaker Dependent (SD) • Models (also) trained on speech of the speakers themselves • Jackknife procedure: a semi-randomly selected subset is the test set; the rest is the training set
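
A jackknife-style split of one speaker's utterances could look like the sketch below. The number of folds, the fixed seed, and the rotation through all folds are assumptions for illustration; the slides only state that the test set is selected semi-randomly and the remainder is used for training.

```python
import random


def jackknife_splits(utterances, n_folds=5, seed=0):
    """Yield (train, test) pairs; each utterance is tested exactly once.

    n_folds and seed are hypothetical parameters, not values from the study.
    """
    items = list(utterances)
    random.Random(seed).shuffle(items)  # reproducible pseudo-random order
    folds = [items[i::n_folds] for i in range(n_folds)]
    for k, test in enumerate(folds):
        train = [u for j, fold in enumerate(folds) if j != k for u in fold]
        yield train, test
```

Rotating the held-out part lets every utterance contribute to the test results without ever being recognized by a model that was trained on it.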

  26. Speaker Dependent (SD): Results • Word Error Rates (WERs) for the whole test set • for different numbers of Gaussians (2^N)

  27. Speaker Dependent (SD): Results • Word Error Rates (WERs) for SD recognition

  28. Speaker Dependent (SD): Results • Word Error Rates (WERs) • for SD / SI recognition

  29. Speaker Dependent (SD): Conclusions • For REF, SD results are equal to or worse than SI results (trade-off: own models, but less training material) • For DYS, SD results are much better than SI results • DYS2 better than DYS1, almost as good as REF

  30. Conclusions: ASR for dysarthric speech • Results for DYS2 are remarkable • SI: High WERs, esp. for NUM & PFU • SD: sometimes better than REF • Low speaking rate! • Automatic recognition of dysarthric speech is possible. Better results with: • Lower speaking rate • Speaker dependent models • Even better: also speaker dependent lexicon

  31. Conclusions: ST & pathology • Applications: • Many already exist • Many more are possible • Needed: • Tailor-made, flexible applications • User tests, adequacy evaluation

  32. References • http://lands.let.ru.nl/TSpublic/strik/pres/ • p97-SPACE.ppt • E. Sanders, M. Ruiter, L. Beijer, H. Strik (2002) Automatic recognition of Dutch dysarthric speech: A pilot study. ICSLP-2002, Denver, USA, pp. 661-664. • T. Rietveld & I. Stolte (2005) Taal- en spraaktechnologie en communicatieve beperkingen

  33. END
