Facilitating Use of Speech Recognition Software Summarized By: Vivianne Cardenas EME 2040-Fall 2008
Abstract • This study examined three interventions (1. physiological, 2. behavioural, 3. pragmatic) designed to facilitate the use of speech recognition software. • There were 15 adult participants with dysarthria associated with a variety of aetiological conditions. • The conditions included cerebral palsy, Parkinson’s disease, and motor neuron disease. • Participants demonstrated systematic improvement in their dictation rates, regardless of treatment order.
Introduction • Voice or speech recognition software provides access for people with physical disabilities. • 13 out of 20 people with dysarthria associated with cerebral palsy achieved accuracy rates of 80% to 100% during the training tutorial of a speech recognition system. • It took no more than four 1-hour sessions. • Factors were identified that distinguished successful from unsuccessful users. • These parameters reflected the ability to co-ordinate respiratory, phonatory, and articulatory function.
Introduction Continuation • Measures of vocabulary size, nonverbal problem-solving skills, and reading competency did not predict success in using the software. • Tasks emphasizing laryngeal and respiratory co-ordination were more difficult. • These tasks were difficult for people whose dysarthria fit within the mixed categories of the Frenchay Dysarthria Assessment. • Overall, this subgroup was unsuccessful with the software. • Additional practice improved participants’ dictation rates. • For dysarthric speakers, voice recognition success was characterized by steep increases in correct recognition.
Introduction Continuation • More gradual increases were observed over subsequent sessions. • Improvement after the second session was attributed to changes in performance rather than to a software training effect. • Non-disabled participants showed no gain in voice recognition success after the second dictation. • Most participants in the study showed improvement, but their dictation rate stayed slow compared to normal users. • Initial training time with speech recognition software was greater for speakers with cerebral palsy (3.5 hours of training). • Normal speakers require 1 hour of training.
Introduction Continuation • Research suggests that a reasonable proportion of speakers with dysarthria can use the software successfully. • Efficiency of use is limited. • Ongoing improvement is associated with performance changes within individual users. • Treatment can effect optimal use. • Behavioural Treatment: focuses on modifying a behaviour that an assessment shows to be inadequate. • Physiological Treatment: focuses on improving the range, strength, and speed of musculature that has been identified as impaired during assessment.
Introduction Continuation • This musculature is critical for the specific movements of speech production. • Pragmatic Treatment: involves manipulating factors that may influence speech intelligibility but do not directly control it. • Aims of the Project: 2 main aims • First: to test the proposition that people with upper motor neuron type dysarthria are less likely to be successful. • Second: to determine what type of speech practice constitutes the most efficient method of becoming skilled at using the software.
Method • Participants • 16 Australians (12 men and 4 women) participated. • All but one spoke the Australian dialect of English; the exception had a Sri Lankan background. • English was the main and preferred language of all participants. • Age range: 18 to 81 years, with an average age of 53 years. • One individual withdrew because of progression of his medical condition. • 5 completed tertiary education, 6 completed secondary, 1 reached primary standard, and 4 remain undisclosed.
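Method Continuation • Table I. Participant characteristics.

| No. | Sex | Age | MD | HL | PPVT | Ravens | WI | FDA-TS | FDA-I | DP | Train |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | M | 29 | CP | normal | 98 | 104 | 98 | 238 | 8.0 | UMN | 2 |
| 2 | M | 48 | CP | mild | 106 | 73 | 113 | 173 | 5.7 | | 3 |
| 3 | M | 77 | P | severe | 92 | 70 | 95 | 198 | 6.7 | | 3 |
| 4 | M | 64 | P | mod | 117 | 74 | 129 | 243 | 9.0 | LMN | 2 |
| 5 | M | 59 | P | severe | 123 | 87 | 117 | 228 | 8.7 | | 2 |
| 6 | M | 26 | CP | normal | — | 68 | — | 195 | 8.7 | UMN | 4 |
| 7 | M | 66 | P | mild | 99 | 68 | 123 | 110 | 2.3 | EP | 4 |
| 8 | M | 81 | P | mod | 102 | 71 | 128 | 197 | 9.0 | LMN | 3 |
| 9 | M | 46 | MND | mod | 93 | 104 | 122 | 140 | 1.3 | ULMN | 4 |
| 10 | M | 68 | P | severe | 99 | 70 | 103 | 218 | 8.7 | | 2 |
| 11 | M | 18 | CP | normal | 109 | 92 | 89 | 211 | 8.0 | | 3 |
| 12 | F | 70 | P | mod | 113 | 85 | 114 | 231 | 8.7 | EP | 2 |
| 13 | F | 71 | MND | mod | 102 | 83 | 123 | 154 | 3.3 | ULMN | 3 |
| 14 | F | 53 | F | mod | — | 68 | — | 185 | 6.3 | | 3 |
| 15 | F | 31 | CP | mod | 61 | 67 | 75 | 216 | 8.7 | UMN | 4 |
| M | | 53 | | | 101 | 80 | 109 | 196 | 6.8 | | 2.9 |
| SD | | 20 | | | 14 | 13 | 16 | 38 | 2.5 | | 0.8 |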
Method Continuation • Notes to Table I: MD = medical diagnosis (CP = cerebral palsy, P = Parkinson’s disease, MND = motor neuron disease, F = Fybromyosis). HL = hearing loss. WI = Word Identification subtest from the Woodcock. FDA-TS = Frenchay Dysarthria Assessment total score. FDA-I = mean of Frenchay Dysarthria Assessment intelligibility ratings. PPVT, Ravens and WI test scores are standard scores (M = 100, SD = 15). DP = dysarthria profile based on the FDA (UMN = upper motor neuron lesion, LMN = lower motor neuron lesion, ULMN = mixed upper and lower motor neuron lesion, EP = extrapyramidal lesion; where the column is blank the profile was unclear). Train = number of 1-hour sessions required to reach success in training PowerSecretary speech recognition software.
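As a quick check on the summary rows of Table I, the sketch below recomputes the mean and standard deviation of the Train column and shows how a standard score (M = 100, SD = 15) maps to a z-score. It is only an illustration: the helper names `summarize` and `to_z` are hypothetical, and the use of the sample standard deviation is an assumption, since the slides do not say which formula the authors used (for these data both the sample and population formulas round to 0.8).

```python
from statistics import mean, stdev

# Train column from Table I: number of 1-hour sessions each of the 15
# participants needed to reach success in PowerSecretary training.
train_sessions = [2, 3, 3, 2, 2, 4, 4, 3, 4, 2, 3, 2, 3, 3, 4]


def summarize(values):
    """Return (mean, sample SD), each rounded to one decimal place.

    Assumption: sample SD (n - 1 denominator); the source does not specify.
    """
    return round(mean(values), 1), round(stdev(values), 1)


def to_z(standard_score, m=100.0, sd=15.0):
    """Convert a standard score (M = 100, SD = 15) to a z-score."""
    return (standard_score - m) / sd


train_mean, train_sd = summarize(train_sessions)
print(train_mean, train_sd)  # 2.9 0.8, matching the M and SD rows of Table I
print(round(to_z(98), 2))    # participant 1, PPVT = 98 -> -0.13
```

Read this way, a PPVT score of 98 sits about 0.13 standard deviations below the normative mean, while participant 15's score of 61 sits 2.6 standard deviations below it.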
Method Continuation • Instrumentation • Speech recognition software used: PowerSecretary Power Edition, installed on a Power Macintosh 7500/100 computer. • It uses discrete (word-by-word) input and has a large vocabulary. • Behavioural Sessions • An Apple condenser microphone was connected to a Power Macintosh 7500/100 computer. • The computer was used to record the speech samples and to provide visual feedback during behavioural treatment sessions.
Method Continuation • Physiological Sessions • PowerLab hardware and software, installed on a Macintosh G3 computer, were used for visual biofeedback during the treatment. • General Procedure • The project had 4 main stages; to progress through the stages, participants attended two 1-hour sessions each week. • Sessions were carried out by 4 final-year speech pathology students under the supervision of an experienced speech pathologist.
Method Continuation • Initial Screening and Clinical Assessment • First stage: involved screening, during which standardized tests were given and a voice and speech assessment was completed. • All speech samples were recorded using a Sony ECM-44B electret condenser lapel microphone attached to a cassette recorder. • The microphone was clipped to the participant’s shirt 15 cm below the mouth. • Assessments were carried out in a sound-attenuated audiology booth to reduce the background noise level.
Method Continuation • PowerSecretary Software Training • Second stage: involved initial PowerSecretary software training. • Participants pronounced a large set of key training words and phrases for speaker adaptation purposes. • Therapy/dictation sessions • Third stage: involved combined speech therapy and dictation sessions for the successful participants. • During each session participants received 30 minutes of treatment followed by 30 minutes of dictation.
Method Continuation • There were 15 sessions overall (divided into 3 blocks of 5 sessions). • The dictation session allowed 8 minutes for the following tasks: letter application, letter to editor, veterinarian memo, and doctor memo. • Post-therapy clinical assessment • Final stage: after all sessions were completed, participants underwent a clinical assessment of speech and voice, using the same tasks as stage 1. • Treatment Descriptions • Behavioural Treatment: aimed to facilitate production of utterances with pauses between single words.
Method Continuation • It started with the repetition of one- to four-syllable words, followed by compound word and phrase repetition tasks. • Physiological Treatment: techniques were designed to encourage the participant to be relaxed and maintain good posture when sitting at a computer. • They also aimed to increase breath control and volume of inspiration. • PowerLab was used to display recordings of pulse rate, chest expansion, and vowel prolongation. • Pragmatic Treatment
Results • Initial Training • All 15 participants completed the initial training phase of the project. • Participants achieved 100% in the training vocabulary. • 3 participants showed signs of having an upper motor neuron lesion. • 2 were consistent with a lower motor neuron lesion. • 2 showed signs of a mixed upper and lower motor neuron lesion. • 2 were consistent with an extrapyramidal lesion.
Results Continuation • It was difficult to match the remaining 6 participants to a specific diagnostic profile. • Effect of therapy on dictation skills • Refer to Tables I and II. • Post-treatment measures • Initial and final assessments of speech production and voice were compared to see whether participants had modified their speech and voicing abilities. • Results: significant increase in vowel prolongation following the treatment programmes.
Results Continuation • There was also a significant increase in the percentage of voicing measure.
Discussion • All participants were successful to some degree in using the PowerSecretary software. • A relationship was observed between severity of dysarthria and efficiency in training the software. • Physiological Treatment: focused on improving breath support and elongating phonation using objective feedback from the PowerLab. • Pragmatic Treatment: yielded a better dictation response compared to the behavioural programme. • Behavioural Treatment: involved spectrographic feedback of speech productions.
Citation • Kathryn Hird and Neville W. Hennessey. 2007. Facilitating use of speech recognition software for people with disabilities: A comparison of three treatments. 211-226.