
STARDUST PROJECT – Speech Recognition for People with Severe Dysarthria

Mark Parker, Specialist Speech and Language Therapist


Presentation Transcript


  1. STARDUST PROJECT – Speech Recognition for People with Severe Dysarthria • Mark Parker, Specialist Speech and Language Therapist

  2. Project Team • DoH • NEAT • University of Sheffield • Barnsley District General Hospital • Prof P Enderby/ M Parker – Clinical Speech Therapy • Prof P Green/ Dr Athanassios Hatzis – Computer Sciences • Prof M Hawley/ Dr Simon Brownsall – Medical Physics

  3. What is Dysarthria? • A neurological motor speech impairment characterised by slow, weak, imprecise and/or uncoordinated movements of the speech musculature • May be congenital or acquired • 170 per 100,000 (Emerson & Enderby 1995)

  4. Severity Rating • Typically based on ‘intelligibility’ • ‘…the extent a listener understands the speech produced…’ (Yorkston et al, 1999) • Not a pure measure – interaction of events • Mild 70-90% • Moderate 40-70% • Severe 10-40%
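
The bands on this slide can be written as a small lookup. This is only a sketch: the slide gives overlapping boundaries (40% and 70% each appear in two bands) and says nothing about scores above 90% or below 10%, so the boundary handling and the function name `severity_band` are assumptions made for illustration.

```python
def severity_band(intelligibility_pct: float) -> str:
    """Map an intelligibility score (0-100 %) to the bands listed on the slide."""
    if not 0 <= intelligibility_pct <= 100:
        raise ValueError("intelligibility must be a percentage between 0 and 100")
    # Assumption: a score on a shared boundary (40 or 70) goes to the milder band.
    if 70 <= intelligibility_pct <= 90:
        return "mild"
    if 40 <= intelligibility_pct < 70:
        return "moderate"
    if 10 <= intelligibility_pct < 40:
        return "severe"
    # The slide does not name a band above 90 % or below 10 %.
    return "outside the listed bands"
```

Because intelligibility is "not a pure measure", a score near a boundary should be read as approximate rather than as a hard classification.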

  5. Aim • Voice recognition system (VRS) used to access other technology • Many people with severe dysarthria also have an associated severe physical disability • ECA operated with switching systems • Switching is slow, laborious and dependent on positioning • VRS to supplement or replace switching

  6. Background • Voice recognition systems • Commercially available packages: mobile phones, word-processing packages (e.g. Dragon Dictate) • Continuous vs discrete • Normal speech: with recognition training, >90% recognition rates can be achieved (Rose and Galdo, 1999) • Dysarthric speech: mild dysarthria gives 10-15% lower recognition rates (Ferrier, 1992) • Rates decline rapidly as speech deteriorates: 30-40% for single words (Thomas-Stonell, 1998), which is functionally useless

  7. Intelligibility vs Consistency • Difference between machine recognition and human perception • ‘Normal’ speech may be 100% intelligible, with only a narrow band of variation across time (consistency) • ‘Severe’ dysarthria may be completely unintelligible but may show consistency of key elements (or not)

  8. Development of the system • 10-12 volunteers - severe dysarthria and physical disability • Speech <30% intelligibility rating • Video/DAT recording/computer sampling • Assessing for the range of phonetic contrasts that can be achieved

  9. Development of a system (2) • Discrete system - the number of contrasts that can be achieved will determine the number of commands that the VRS can handle • Don’t need intelligibility - need consistency • Determine what word/sound/phonetic contrast will represent what command
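
One way to make "need consistency, not intelligibility" concrete is to compare how much repetitions of the same word differ from each other with how much two different words differ. The sketch below uses dynamic time warping (DTW) over feature frames for that comparison. It is an illustrative assumption, not the consistency measure used in the project (the slides do not specify one), and the function names are hypothetical. Each token is assumed to be a NumPy array of shape (frames, feature dimensions).

```python
from itertools import combinations

import numpy as np


def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Length-normalised DTW distance between two (frames x dims) sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # frame-to-frame distance
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m] / (n + m))


def within_word_distance(tokens) -> float:
    """Mean DTW distance over all pairs of repetitions of one word (needs >= 2 tokens)."""
    pairs = list(combinations(tokens, 2))
    return sum(dtw_distance(x, y) for x, y in pairs) / len(pairs)


def between_word_distance(tokens_a, tokens_b) -> float:
    """Mean DTW distance between repetitions of two different words."""
    total = sum(dtw_distance(x, y) for x in tokens_a for y in tokens_b)
    return total / (len(tokens_a) * len(tokens_b))


def separation_ratio(tokens_a, tokens_b) -> float:
    """Between-word distance divided by average within-word distance."""
    within = (within_word_distance(tokens_a) + within_word_distance(tokens_b)) / 2
    if within == 0:
        return float("inf")  # perfectly repeatable tokens
    return between_word_distance(tokens_a, tokens_b) / within
```

Under this sketch, a ratio well above 1 suggests that two utterances form a usable contrast for a discrete recogniser even if neither is intelligible to a listener. A ratio near 1 suggests the speaker's productions of the two words overlap too much to map them to separate commands.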

  10. Development of a system (3) • Train the VRS - neural networks and hidden Markov modelling • Speech consistency training • Implement the system

  11. Current position • Software development – sophisticated recording and data logging facility to be combined with ‘consistency’ measure and spectrography package • Developing ‘user friendliness’ and possibility of ‘remote’ usage • Identifying & recording EC commands • ‘Labelling’ the sample • Attempting to define measures of baseline consistency at an ‘acoustic’ level • Experimenting with recognition accuracy of commercially available product - Sicare

  12. Labelling • Breaking an utterance into component parts • To establish the extent of variance over time
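
Once an utterance has been labelled, variance over time can be summarised per component. The slide does not say which property of each component is compared, so the sketch below assumes segment duration and reports the coefficient of variation across repetitions. The function name and the data layout (one dictionary of label to duration in milliseconds per repetition) are hypothetical.

```python
from statistics import mean, stdev


def segment_duration_variation(repetitions):
    """Coefficient of variation of each labelled segment's duration.

    repetitions: list of dicts mapping segment label -> duration in ms,
    one dict per repetition of the same utterance (at least two repetitions,
    all with the same labels and non-zero mean durations).
    """
    if len(repetitions) < 2:
        raise ValueError("need at least two repetitions to estimate variation")
    variation = {}
    for label in repetitions[0]:
        durations = [rep[label] for rep in repetitions]
        variation[label] = stdev(durations) / mean(durations)
    return variation
```

A segment with a low value is produced with similar timing on each attempt. A high value flags a component that varies across repetitions and may be a poor basis for a command.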

  13. Sicare testing • Recognition rates consistent with previous research • Begins to illustrate the points at which a recogniser becomes ‘confused’ • May illustrate the areas where distinction has to be made • May start to illustrate some of the key acoustic factors that are crucial in dysarthric speech and VR • A non-adapted commercial product is functionally useless for this population
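
The points at which a recogniser becomes ‘confused’ can be tabulated from test trials. The sketch below is a generic tally, not the project's logging tool or anything specific to Sicare: it assumes each trial is recorded as a pair of (spoken command, recognised command), and the function name is hypothetical.

```python
from collections import Counter


def recognition_summary(trials):
    """Return (recognition rate, confusions sorted by frequency).

    trials: iterable of (spoken_command, recognised_command) pairs.
    """
    trials = list(trials)
    if not trials:
        raise ValueError("no trials supplied")
    correct = sum(1 for spoken, recognised in trials if spoken == recognised)
    # Count each (spoken, recognised) mismatch to see which commands collide.
    confusions = Counter(
        (spoken, recognised) for spoken, recognised in trials if spoken != recognised
    )
    return correct / len(trials), confusions.most_common()
```

The most frequent pairs in the second return value indicate which commands need a more distinct word, sound or phonetic contrast.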

  14. Subsidiary Questions • Is dysarthric speech consistent? • Does the underlying acoustic/soundwave pattern contain consistent differences in contrasts that are not perceptually distinguishable? • Can consistency be trained in the absence of intelligibility? • Does increasing consistency increase intelligibility?

  15. Normal speech “alarm” 1&2

  16. Normal speech “alarm” 2

  17. Normal speech “television”

  18. Dysarthric speech “television”
