Auditory Objects of Attention. Chris Darwin University of Sussex. With thanks to : Rob Hukin (RA) Nick Hill (DPhil) Gustav Kuhn (3° year proj) MRC. Need for sound segregation. Ears receive mixture of sounds
Auditory Objects of Attention • Chris Darwin, University of Sussex • With thanks to: • Rob Hukin (RA) • Nick Hill (DPhil) • Gustav Kuhn (3rd-year project) • MRC
Need for sound segregation • Ears receive a mixture of sounds • We hear each sound source as having its own appropriate timbre, pitch and location • Stored information about sounds (e.g. acoustic/phonetic relations) concerns a single source
Mechanisms of segregation • Primitive grouping mechanisms based on general heuristics • Schema-based mechanisms based on specific knowledge.
A Paradox • We can attend to sounds coming from a particular direction • everyday experience • Auditory RTs faster to cued side (Spence & Driver, 1994) • Interaural time differences (ITDs) are the main cue to the location of a complex sound (Wightman & Kistler, 1992).
A Paradox (continued) • On the other hand, ITDs are ineffective at grouping together sounds from a single sound source (Culling & Summerfield, 1995; Darwin & Hukin, 1995)
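Coincidence detection and ITD • [Figure: coincidence detectors in the MSO receiving input from the left and right cochleae in frequency channels at 200, 500, 1000 and 2000 Hz; channels carry ITDs of +600 µs or -600 µs; vowels labelled "AR" and "EE"]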
Plan • Test the Culling & Summerfield result with more natural sounds • Show evidence that grouping occurs before across-frequency ITD is calculated • Show that ITD can be a very powerful sequential grouping cue
ILD condition • Carrier sentence: "Hello, you'll hear the sound X now" • [Figure: left-ear and right-ear signals, one with the 600-Hz component and one with no 600-Hz component] • Target vowel: /ɪ/ or /ɛ/
Phase Ambiguity • 500 Hz: period = 2 ms • For a pure tone, "L lags by 1.5 ms" is indistinguishable from "L leads by 0.5 ms" • Cross-correlation peaks at +0.5 ms and -1.5 ms • The auditory system weights toward the peak closest to zero
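The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `correlation_peaks` and `nearest_zero`, the ±3 ms search window, and the sign convention (positive lag = left ear leads, as in the "+0.5 ms and -1.5 ms" on the slide) are choices made here, not part of the original talk. For a pure tone, the interaural cross-correlation repeats every period, so peaks fall at the true delay plus any whole number of periods.

```python
def correlation_peaks(freq_hz, itd_ms, max_lag_ms=3.0):
    """Lags (ms) at which the cross-correlation of a pure tone peaks.

    Peaks sit at the true ITD plus any integer number of periods.
    Sign convention (an assumption here): positive = left ear leads.
    """
    period_ms = 1000.0 / freq_hz
    k_max = int(max_lag_ms // period_ms) + 1
    candidates = [itd_ms + k * period_ms for k in range(-k_max, k_max + 1)]
    return sorted(round(lag, 6) for lag in candidates if abs(lag) <= max_lag_ms)


def nearest_zero(peaks):
    """Pick the peak closest to zero lag (the weighting described on the slide)."""
    return min(peaks, key=abs)


# 500-Hz tone whose left-ear signal actually lags by 1.5 ms (ITD = -1.5 ms).
peaks = correlation_peaks(500.0, -1.5)
chosen = nearest_zero(peaks)
print(peaks, chosen)
```

With these inputs the peak nearest zero is the "left leads by 0.5 ms" interpretation, which is why a narrowband 500-Hz sound can be heard on the wrong side.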
Disambiguating phase-ambiguity • Narrowband noise at 500 Hz with ITD of 1.5 ms (3/4 cycle) heard at lagging side. • Increasing noise bandwidth changes location to the leading side. • Explained by across-frequency consistency of ITD. • (Jeffress, Trahiotis & Stern)
Resolving phase ambiguity • Cross-correlation peaks for noise delayed in one ear by 1.5 ms (left ear actually lags by 1.5 ms) • 500 Hz (period = 2 ms): L lags by 1.5 ms or L leads by 0.5 ms? • 300 Hz (period = 3.3 ms): L lags by 1.5 ms or L leads by 1.8 ms? • [Figure: cross-correlation peaks plotted as frequency of auditory filter (200 to 800 Hz) against delay of cross-correlator (-2.5 to 3.5 ms), with the actual delay marked]
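Across-frequency consistency can be illustrated with a small sketch. It is not the model of Jeffress, Trahiotis or Stern; it simply lists the candidate peaks in each frequency channel and keeps the delays that appear in every channel. The function names, the ±4 ms window and the 0.01 ms tolerance are assumptions made for the example. The sign convention follows the figure axis (positive = left ear lags).

```python
def channel_peaks(freq_hz, itd_ms, max_lag_ms=4.0):
    """Candidate delays (ms) for one frequency channel: true ITD plus whole periods."""
    period_ms = 1000.0 / freq_hz
    k_max = int(max_lag_ms / period_ms) + 1
    lags = (itd_ms + k * period_ms for k in range(-k_max, k_max + 1))
    return [lag for lag in lags if abs(lag) <= max_lag_ms]


def consistent_lags(freqs_hz, itd_ms, tol_ms=0.01):
    """Delays that show up as a peak in every channel (within tol_ms)."""
    per_channel = [channel_peaks(f, itd_ms) for f in freqs_hz]
    first, rest = per_channel[0], per_channel[1:]
    return [
        lag
        for lag in first
        if all(any(abs(lag - other) <= tol_ms for other in ch) for ch in rest)
    ]


# One narrow band at 500 Hz: several delays are equally good candidates.
print(consistent_lags([500.0], 1.5))
# Adding a 300-Hz channel leaves only the actual 1.5-ms delay.
print(consistent_lags([300.0, 500.0], 1.5))
```

A single channel leaves the delay ambiguous, while two or more channels with different periods agree only at the actual delay, which is the pattern the slide describes for widening the noise bandwidth.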
Segregation by onset-time • [Figure: two panels, Synchronous and Asynchronous; frequency (200 to 800 Hz) against duration (0 to 400 ms), with an 80-ms marker in the asynchronous panel] • ITD: ±1.5 ms (3/4 cycle at 500 Hz)
Segregated tone changes location • [Figure: pointer IID (dB, -20 to +20, L to R) against onset asynchrony (0, 20, 40, 80 ms) for Complex and Pure tones]
Segregation by mistuning • [Figure: two panels, In tune and Mistuned; frequency (200 to 800 Hz) against duration (0 to 400 ms)]
Interim Summary • ITD ineffective for simultaneous segregation • Integration of ITD across frequency influenced by grouping cues • Question: Can attention be directed on the basis of ITD to grouped objects?
Attending to one sentence • "Could you please write the word dog down now" • …dog... • "You'll also hear the sound bird this time"
Continuity of Fo vs ITD • Fo differences: 0, 1, 2, 4 semitones • ITD differences: ± 45, 91, 181 µs • Normal: Fo & ITD work together • Switched: Fo & ITD opposed
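As a reminder of what the Fo steps mean in frequency terms, a semitone difference of n corresponds to a frequency ratio of 2^(n/12). The short sketch below applies this to the semitone differences listed on the slide. The 100-Hz base Fo is a hypothetical value chosen only for illustration; the slide does not state the Fo values used.

```python
def semitone_ratio(n_semitones):
    """Frequency ratio for a pitch difference of n equal-tempered semitones."""
    return 2.0 ** (n_semitones / 12.0)


BASE_FO_HZ = 100.0  # hypothetical base Fo, not taken from the slides

for n in (0, 1, 2, 4):
    upper_hz = BASE_FO_HZ * semitone_ratio(n)
    print(f"{n} semitones: ratio {semitone_ratio(n):.4f}, upper Fo {upper_hz:.1f} Hz")
```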
Summary • ITD ineffective for simultaneous grouping • ITD provides good spatial separation for grouped objects • Monotone pitch contours ineffective for source continuity
New questions • Reverberation? • Natural prosody? • Talker differences?
Vocal tract change • Me (m) • Shorter vocal tract (higher formants) • Higher pitch • Both (→ f)
Effect of reverberation on relative strength of ITD, prosody and vocal tract • [Figure: change in % correct by ITD when opposed by prosody (0 to 100) for RT60 = 0 and RT60 = 0.5 s, in conditions Fo together, Fo original, Fo apart, and Fo original + VT; ITD = ±91 µs] • Vocal tract good against reverb
Shadowing sentences • "Jemma felt stiff and tired after 3 hours in the hot and stuffy room and she would have liked || …to go outdoors for a breath of fresh air" • "We had spent our entire time from Cairo to Luxor in a tiny bus with no proper windows and really wanted || …the air conditioning to be switched on" • …liked the air conditioning...
Shadowing results • [Figure: switches (against ITD) in shadowing (%), 0 to 50, for Normal and Swapped conditions with Same VT and Different VT; cue labels +ITD, ±Prosody, ±Vocal Tract; ITD = ±91 µs; significance markers p<0.002, p<0.05, p<0.05]
Summary • ITD no good for simultaneous grouping • …but great for locating grouped objects • ITD messed up by reverberation • Prosody and speaker characteristics less messed up by reverberation