Voice source characteristics in speaker segregation Patti Adank
Aim of the project: • to establish whether voice source characteristics of speakers can help listeners attend to a target speaker in a multi-speaker situation • Some speaker-related characteristics have already been found to be helpful: • Darwin et al. (2003): F0 (pitch) and vocal tract length (VTL) differences between concurrent speakers help listeners attend to the target speaker
Speaker-related differences that aid listeners (Darwin et al., 2003): • F0 difference (if > 2 semitones) • vocal tract length (VTL) difference (if VTL ratio > 1.08) • effects of F0 and VTL are superadditive • Speaker-related differences that might aid listeners: • style of speech • voice quality: creaky voice, roughness, breathiness • My experiments: • establish the possible relevance of one acoustic aspect of creaky voice: jitter
Literature: • McAdams (1989): natural jitter present in a speaker’s voice may be helpful for listeners • Ellis (1993): a computational model can segregate simultaneously presented vowels using jitter differences alone
How could jitter help listeners? • Auditory Scene Analysis • primitive segregation cues: bottom-up, involuntary listening • schema-driven segregation cues (Bregman, 1990): top-down, voluntary/effortful listening
Pitch = primitive segregation cue (Scheffers, 1983; Assmann & Summerfield, 1990; etc.) + schema-driven segregation cue (Darwin et al., 2003)
Hypotheses: • 0. jitter does not aid the auditory system • 1. jitter is only a primitive segregation cue • 2. jitter is a primitive cue AND schema-driven cue • 3. jitter is only a schema-driven segregation cue
Experiments: • 1. a double-vowel experiment with pitch as the experimental factor, to replicate earlier results for pitch as a primitive cue • 2. a double-vowel experiment with jitter as the experimental factor, to establish whether jitter is a primitive cue • 3. an experiment like that of Darwin et al. (2003), with pitch and jitter as factors, to establish whether jitter is a schema-driven cue
Experiment 1: • Double-vowel experiment to test the pitch effect • Synthetic vowels (Klatt 1990): AH, EE, ER, OO, OR, each 200 milliseconds long • Five versions of each vowel: 100 Hz, +1/4 semitone (st), +1/2 st, +1 st, +2 st
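The semitone offsets above map onto F0 values through the standard equal-tempered relation f = f0 · 2^(st/12). The following is a minimal sketch of that conversion, not the synthesis procedure used in the experiment; the names `BASE_F0_HZ`, `SEMITONE_STEPS` and `shift_by_semitones` are illustrative and do not come from the slides.

```python
# Convert the semitone offsets listed on the slide into F0 values in Hz.
# Names below are illustrative assumptions, not taken from the original study.
BASE_F0_HZ = 100.0
SEMITONE_STEPS = [0.0, 0.25, 0.5, 1.0, 2.0]


def shift_by_semitones(f0_hz, semitones):
    """Return f0_hz raised by the given number of equal-tempered semitones."""
    return f0_hz * 2.0 ** (semitones / 12.0)


f0_values = [shift_by_semitones(BASE_F0_HZ, st) for st in SEMITONE_STEPS]

for st, f0 in zip(SEMITONE_STEPS, f0_values):
    print(f"+{st:g} st -> {f0:.2f} Hz")
```

Under this relation the five versions sit at roughly 100, 101.5, 102.9, 105.9 and 112.2 Hz, so the largest step (+2 st) is a difference of about 12 Hz from the baseline.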
Experiment 2: • Double-vowel experiment to test the jitter effect • Synthetic vowels (altered version of Klatt 1990): AH, EE, ER, OO, OR, each 200 milliseconds long • Five versions of each vowel, all at 100 Hz: no jitter, ±1%, ±2%, ±4%, ±8%
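The slides do not say how the jitter percentages were applied inside the synthesizer. As a rough illustration only, the sketch below assumes that each glottal period deviates from the nominal 10 ms period (100 Hz) by a random amount drawn uniformly from ±p%; the uniform distribution and the names `jittered_periods` and `JITTER_LEVELS` are assumptions made for this example.

```python
import random

BASE_F0_HZ = 100.0
JITTER_LEVELS = [0.0, 0.01, 0.02, 0.04, 0.08]  # proportion of the nominal period
DURATION_S = 0.2  # 200 ms vowels, as on the slide


def jittered_periods(f0_hz, jitter, duration_s, rng):
    """Return glottal period lengths (seconds) covering at least duration_s.

    Assumption: each period is the nominal period scaled by a factor drawn
    uniformly from [1 - jitter, 1 + jitter].
    """
    nominal = 1.0 / f0_hz
    periods = []
    total = 0.0
    while total < duration_s:
        period = nominal * (1.0 + rng.uniform(-jitter, jitter))
        periods.append(period)
        total += period
    return periods


rng = random.Random(0)  # fixed seed so the sketch is repeatable
for level in JITTER_LEVELS:
    periods = jittered_periods(BASE_F0_HZ, level, DURATION_S, rng)
    f0_track = [1.0 / p for p in periods]
    print(f"jitter ±{level:.0%}: {len(periods)} periods, "
          f"F0 range {min(f0_track):.1f}-{max(f0_track):.1f} Hz")
```

With this definition, ±8% jitter lets the instantaneous F0 wander between about 92.6 and 108.7 Hz around the 100 Hz mean, while the average pitch stays close to that of the unjittered vowel.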
Procedure (1 & 2): • 7 listeners (5 British-English, 2 bilingual) • categorization pre-test (45 stimuli) • experiment 1 (or 2): • presentation of a double vowel (125 combinations) • listener selects one of 15 response options
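The slides do not spell out where the counts 125 and 15 come from. One reading consistent with both numbers, offered here as an assumption, is that each stimulus pairs a baseline vowel with a second vowel in one of its five versions (5 × 5 × 5 = 125), and that the response options are the unordered vowel pairs, including identical pairs (15). The sketch below simply enumerates both sets under that reading.

```python
from itertools import combinations_with_replacement, product

VOWELS = ["AH", "EE", "ER", "OO", "OR"]
N_VERSIONS = 5  # five pitch (or jitter) versions per vowel

# Assumption: the first vowel is the baseline version and the second vowel
# appears in any of its five versions, giving 5 * 5 * 5 stimuli.
stimuli = list(product(VOWELS, VOWELS, range(N_VERSIONS)))

# Assumption: a response is an unordered pair of vowel labels, and the same
# vowel may be chosen twice (e.g. "AH" + "AH").
responses = list(combinations_with_replacement(VOWELS, 2))

print(len(stimuli), len(responses))
```

Running this prints `125 15`, matching the counts on the slide; other designs could produce the same totals, so the enumeration should be read as a plausibility check rather than a description of the actual stimulus list.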
Hypotheses: • 0. jitter does not aid the auditory system • 1. jitter is only a primitive segregation cue • 2. jitter is a primitive cue AND schema-driven cue • 3. jitter is only a schema-driven segregation cue • 4. jitter is a primitive segregation cue if there is also a pitch difference.
Is there still hope for jitter? • Next experiment: test whether jitter is a schema-driven cue • Setup as in Darwin et al. (2003): • 2 sentences from the same speaker presented simultaneously • listener attends to the target sentence • listener reports the target words • jitter and pitch of the sentences are varied