Building a corpus of pathological speech • Gwen Van Nuffelen, Marc De Bodt, Catherine Middag, Jean-Pierre Martens
Dutch Corpus of Pathological and Normal Speech • Dysarthria: disturbed muscular control due to damage to the nervous system • weak, slow, imprecise, uncoordinated movements
Dutch Corpus of Pathological and Normal Speech • TL (total laryngectomy): surgical removal of the larynx and separation of the trachea from the mouth, nose, and esophagus • Speech after TL: TE (tracheoesophageal), E (esophageal), electrolarynx (Servox) • PL (partial laryngectomy): partial removal of laryngeal structures, vocal folds
Speakers • native speakers of Dutch • adequate language, cognitive, visual and hearing* abilities
Recordings • Natural, quiet environment ~ clinical setting • No sound-treated box • Mini-disc (Sony, MZ-R700) • Microphone • Sony (mouth-microphone distance: 30 cm) • Shure headset • Transferred to a notebook as wave files (mono, 44 kHz) • 16 kHz
Dutch Intelligibility Assessment (DIA) • Intelligibility at phoneme level • 50 consonant-vowel-consonant (CVC) words • 3 subtests: • A: initial consonants (19 words) • B: final consonants (15 words) • C: medial vowels/diphthongs (16 words) • Balanced mix of existing and non-existing (well pronounceable) words • Large pool of test items: 25 lists per subtest, so 25 × 25 × 25 = 15,625 different tests
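The size of the test pool can be checked with a short sketch. The subtest sizes and the 25 lists per subtest come from the slide; the function names and the label style (subtest letter plus list number) are assumptions made for illustration, not part of the DIA itself.

```python
import random

# Figures from the slide: three subtests, 25 word lists available for each.
WORDS_PER_SUBTEST = {"A": 19, "B": 15, "C": 16}
LISTS_PER_SUBTEST = 25


def count_possible_tests():
    """Number of distinct tests when one list is picked per subtest."""
    return LISTS_PER_SUBTEST ** len(WORDS_PER_SUBTEST)


def draw_test(rng):
    """Pick one list per subtest; labels such as 'A10' are a hypothetical format."""
    return {
        subtest: f"{subtest}{rng.randint(1, LISTS_PER_SUBTEST)}"
        for subtest in WORDS_PER_SUBTEST
    }


if __name__ == "__main__":
    print(sum(WORDS_PER_SUBTEST.values()))  # 50 words per test
    print(count_possible_tests())           # 15625 possible tests
    print(draw_test(random.Random()))       # one randomly assembled test
```

Drawing a fresh combination for each session is one plausible way to use such a pool, since it makes it unlikely that a listener sees the same test twice.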
DIA • 16-year-old girl, stroke, dysarthria, PI: 40% • 79-year-old male, TL, TE speech, PI: 68%
top DIA List A10 Intelligibility: percentage of phonemes correctly understood
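The definition above (percentage of phonemes correctly understood) can be sketched as follows. This is a minimal illustration that assumes one target phoneme per test word and a simple exact-match comparison; the function name and the example phonemes are hypothetical, and the actual DIA scoring procedure may differ in detail.

```python
def phoneme_intelligibility(targets, perceived):
    """Percentage of target phonemes that the listener identified correctly.

    targets   -- intended target phoneme for each test word
    perceived -- phoneme the listener reported for each test word
    """
    if len(targets) != len(perceived):
        raise ValueError("targets and perceived must have the same length")
    if not targets:
        raise ValueError("at least one test item is required")
    correct = sum(t == p for t, p in zip(targets, perceived))
    return 100.0 * correct / len(targets)


if __name__ == "__main__":
    # Hypothetical mini-example: 5 items, 2 identified correctly.
    targets = ["t", "p", "k", "s", "m"]
    perceived = ["t", "b", "k", "z", "n"]
    print(phoneme_intelligibility(targets, perceived))  # 40.0
```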
Annotations DIA • Praat • 2 tiers • Tier 1: target word • Tier 2: fixed frame + perceived phoneme • . VC • CV. • C.C • Orthographic transcriptions
List A Target phoneme: initial consonant Fixed frame: . V C
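The fixed-frame idea can be illustrated with a small sketch: the target position of a CVC word is replaced by a dot, and the listener's perceived phoneme is filled in at that position. The mapping of subtests to positions follows the slides (A: initial, B: final, C: medial); the function names and the example word are assumptions for illustration and do not reproduce the Praat annotation format.

```python
# Position of the target phoneme within a CVC word, per subtest (from the slides).
TARGET_POSITION = {"A": 0, "B": 2, "C": 1}


def fixed_frame(phonemes, subtest):
    """Return the CVC word with its target phoneme replaced by a dot."""
    if len(phonemes) != 3:
        raise ValueError("expected exactly three phonemes (C, V, C)")
    pos = TARGET_POSITION[subtest]
    return " ".join("." if i == pos else p for i, p in enumerate(phonemes))


def fill_frame(phonemes, subtest, perceived):
    """Return the utterance as perceived: fixed frame plus perceived phoneme."""
    if len(phonemes) != 3:
        raise ValueError("expected exactly three phonemes (C, V, C)")
    pos = TARGET_POSITION[subtest]
    return " ".join(perceived if i == pos else p for i, p in enumerate(phonemes))


if __name__ == "__main__":
    word = ("t", "o", "p")                # hypothetical List A item
    print(fixed_frame(word, "A"))         # . o p
    print(fill_frame(word, "A", "k"))     # k o p
```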
Articulation assessment • Children • Insufficient reading skills • Logo-Art (Baarda et al., 2001) • Picture naming test • Annotations: • Orthographic • Tier 1: target • Tier 2: perceived utterance (no fixed frame)
Sentences • Motor Speech Profile (Kay Elemetrics) • ‘Wil je liever de thee of de borrel?’ • ‘Na nieuwjaar was hij weeral hier’ • N = 211 • Orthographic transcriptions • Tier 1 – tier 2; no word boundaries • man, no speech pathology • 18-year-old male, congenital dysarthria
Text Marloes and Text • Text ‘Papa en Marloes’ • standardized text • balanced representation of Dutch phonemes • often used in clinical practice • Text • different texts with the same reading level • orthographic transcriptions • 2 tiers • boundaries between sentences
(Semi-)spontaneous speech • Spontaneous • Semi-spontaneous: randomly selected sequence of pictures • No annotations available
Future • Gradually increase the number of samples • DIA validation SPACE intelligibility assessment • DIA sentence level: > 200 control speakers, 3 × 6 sentences + annotations + pathological samples