Explore speech enhancement techniques for better perception by individuals with hearing loss. Includes innovative healthcare instrumentation and industrial applications. Investigate binaural dichotic presentation and its benefits on speech perception.
SPEECH PROCESSING FOR BINAURAL HEARING AIDS Dr P. C. Pandey EE Dept., IIT Bombay Feb’03
R&D activities in SPI Lab, EE Dept, IIT Bombay • Speech & hearing • Healthcare instrumentation • Impedance cardiography • Industrial instrumentation
Speech & hearing • Speech processing for improving perception by persons with sensori-neural hearing loss: - Consonantal enhancement (with Prof SD Agashe) - Binaural dichotic presentation • Vocal tract shape estimation for speech training of deaf children • Speech synthesis and study of phonemic features using HNM • Cancellation of background noise in alaryngeal speech using spectral subtraction
Healthcare instrumentation • Low cost diagnostic audiometer • Impedance glottograph for voice pitch • Impedance cardiograph for sports medicine. • Intravenous drip rate indicator • Communicator for children with cerebral palsy (with Prof GG Ray) • Non-invasive ultrasonic thermometry system (with Prof T Anjaneyulu) • Myoelectric hand (with Prof SR Devasahayam & R Lal)
Impedance cardiography • Signal processing for improving the estimation of stroke volume from impedance cardiogram
Industrial instrumentation • Noninvasive measurement of single-phase fluid flow using ultrasonic cross-correlation technique (with Prof T Anjaneyulu) • Online measurement of dielectric dissipation factor for condition monitoring of high voltage insulation (with Prof SV Kulkarni)
Speech Processing for Binaural Hearing Aids
Hearing system: Outer ear → Middle ear → Inner ear → Cochlear nerve → Brain
Hearing impairments • Conductive • Sensorineural • Central • Functional
Sensory aids for the hearing impaired • Hearing aids • Cochlear prosthesis • Visual & tactile aids
Causes of sensorineural loss • Loss of sensory hair cells in cochlea • Degeneration of auditory nerve fibers
Characteristics of sensorineural loss • Frequency-dependent shifts in hearing thresholds • Reduced dynamic range, loudness recruitment • Poor frequency selectivity & increased spectral masking • Reduced temporal resolution & increased temporal masking
Effects of increased spectral masking • Smearing of spectral peaks and valleys due to broader auditory filters • Reduction of internal spectral contrast • Reduced discrimination of consonantal place feature
Effects of increased temporal masking • Forward and backward masking of weak segments by strong ones • Reduced ability to discriminate sub-phonemic segments like noise bursts, voice-onset-time, and formant transitions
Speech processing for dichotic presentation for binaural hearing aids to reduce the effects of masking • Masking takes place at the peripheral level of the auditory system • Information from the two ears gets integrated at higher levels in the perception process • Binaural dichotic presentation for persons with bilateral residual hearing: - Speech signal split in a complementary form - Signal components likely to mask each other presented to different ears - Information integrated at higher levels, for better speech perception
Binaural dichotic presentation schemes • Spectral splitting: filtering by 2 complementary comb filters; better place reception • Temporal splitting: gating by 2 complementary fading functions; better duration reception • Combined splitting: processing by 2 time-varying comb filters. All the sensory cells of the basilar membrane get periodic relaxation from stimulation: better perception of consonantal duration, place, and other features
Temporal splitting with trapezoidal fading
[Figure: block diagram in which the input s(n) is gated by the fading functions w1(n) and w2(n) to give s1(n) and s2(n); plots of w1(n) and w2(n) against n, with durations marked N, L, and M]
Temporal splitting of the signal for dichotic presentation using w1(n) and w2(n): inter-aural fading with trapezoidal transition and inter-aural overlap • Inter-aural switching period = 20 ms • Duty cycle = 70% • Transition durations = 0, 1, 2, 3 ms
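The gating described on this slide can be sketched as follows. This is a minimal illustration, not the implementation used in the study. Assumptions not stated on the slide: a 10 kHz sampling rate, a second window that is the first one delayed by half a switching period, and a 70% on-time that includes both ramps. The function names are hypothetical.

```python
def trapezoid_window(n_period, n_on, n_trans):
    """One period of a trapezoidal gate: rise, flat top, fall, then silence."""
    if 2 * n_trans > n_on or n_on > n_period:
        raise ValueError("ramps must fit in the on-time, on-time in the period")
    w = []
    for n in range(n_period):
        if n < n_trans:                      # rising ramp
            w.append((n + 1) / (n_trans + 1))
        elif n < n_on - n_trans:             # flat top
            w.append(1.0)
        elif n < n_on:                       # falling ramp
            w.append((n_on - n) / (n_trans + 1))
        else:                                # gate closed
            w.append(0.0)
    return w


def temporal_split(s, fs, period_ms=20.0, duty=0.7, trans_ms=1.0):
    """Split s(n) into s1(n), s2(n) with two half-period-offset gates."""
    n_period = round(fs * period_ms / 1000)
    n_on = round(duty * n_period)
    n_trans = round(fs * trans_ms / 1000)
    w = trapezoid_window(n_period, n_on, n_trans)
    half = n_period // 2                     # assumed offset between the ears
    s1 = [x * w[n % n_period] for n, x in enumerate(s)]
    s2 = [x * w[(n + half) % n_period] for n, x in enumerate(s)]
    return s1, s2


# Example: a constant input makes the two gates directly visible.
s1, s2 = temporal_split([1.0] * 400, fs=10000)
```

With a 70% duty cycle the two gates overlap around each switch, so at every instant at least one ear receives the signal at full level.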
Investigations with spectral splitting • Auditory filter bandwidth based comb filters: 18 bands over 5 kHz, 256-coefficient linear phase filters, designed using frequency sampling technique • Listening tests with hearing impaired subjects: improvement in response time, recognition scores, & reception of place feature • Better results with perceptually balanced filters: 1 dB ripple, 30 dB attenuation, 4-6 dB crossover • Filters with personalized frequency response: overall improvement, but not particularly for place
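A pair of complementary comb filters can be sketched with a single pass of the frequency sampling technique. This is only an illustration under stated assumptions: the sampling rate (10 kHz) and the equal-width band edges are placeholders, since the slide does not list the auditory-filter band edges, and no transition samples or crossover gains are tuned here. The function names are hypothetical.

```python
import math


def comb_amplitudes(band_edges_hz, fs, n_taps):
    """Target amplitudes at the frequency samples: alternate bands per ear."""
    a1, a2 = [], []
    for k in range(n_taps // 2):
        f = k * fs / n_taps
        band = sum(1 for edge in band_edges_hz[1:] if f >= edge)
        a1.append(1.0 if band % 2 == 0 else 0.0)   # even bands: filter 1
        a2.append(1.0 - a1[-1])                    # odd bands: filter 2
    return a1, a2


def frequency_sampling_fir(amplitudes, n_taps):
    """Even-length linear phase FIR whose response hits the given samples."""
    alpha = (n_taps - 1) / 2                       # group delay in samples
    h = []
    for n in range(n_taps):
        acc = amplitudes[0]
        for k in range(1, n_taps // 2):
            acc += 2 * amplitudes[k] * math.cos(2 * math.pi * k * (n - alpha) / n_taps)
        h.append(acc / n_taps)
    return h


def magnitude_at(h, f_hz, fs):
    """Magnitude of the filter's frequency response at f_hz."""
    w = 2 * math.pi * f_hz / fs
    re = sum(c * math.cos(w * n) for n, c in enumerate(h))
    im = -sum(c * math.sin(w * n) for n, c in enumerate(h))
    return math.hypot(re, im)


FS = 10000                                         # assumed sampling rate
N_TAPS = 256
EDGES = [i * 5000 / 18 for i in range(19)]         # placeholder band edges
A1, A2 = comb_amplitudes(EDGES, FS, N_TAPS)
h1 = frequency_sampling_fir(A1, N_TAPS)            # filter for one ear
h2 = frequency_sampling_fir(A2, N_TAPS)            # complementary filter
```

A single pass like this matches the targets only at the sampled frequencies. Meeting ripple, attenuation, and crossover figures such as those quoted on the slide would need further adjustment of the samples near the band edges.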
Combined splitting with time-varying filters
[Figure: the input s(n) feeds time-varying comb filter 1 (output s1(n)) and time-varying comb filter 2 (output s2(n)); each filter steps through a set of m filter coefficient sets, shown as magnitude versus frequency]
Sweep cycle duration = 20 ms. With m shiftings, each pair of comb filters processes for 20/m ms.
[Figure: intensity (dB, 0 to -40) over frequency (0-5 kHz) and time (0-30 ms), panels (a) and (b)] An idealized representation of the magnitude response of the pair of time-varying comb filters using 4 shiftings for the (a) left ear (b) right ear.
[Figure: magnitude (dB) versus normalized frequency]
Time-varying comb filters • Set of linear phase 256-coefficient FIR filters with pre-calculated coefficients (designed by iterative use of the frequency sampling technique) • Comb filter responses optimized for minimum perceived spectral distortion: low passband ripple, high stopband attenuation, and inter-band crossover gains adjusted for loudness balance • Passband ripple < 1 dB • Stopband attenuation > 30 dB • Gain at inter-band crossovers: -4 to -6 dB • Sweep cycle duration: 20 ms • Number of shiftings: 2, 4, 8, 16
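The switching between pre-calculated coefficient sets can be sketched as below. With a 20 ms sweep cycle and m shiftings, each set is active for 20/m ms (10, 5, 2.5, and 1.25 ms for m = 2, 4, 8, 16). Assumptions for illustration only: a 10 kHz sampling rate, and a second-ear filter that uses the set offset by m/2 from the first. The function names and the toy coefficient sets are hypothetical.

```python
def dwell_ms(m, sweep_ms=20.0):
    """Time for which each coefficient set is active within one sweep."""
    return sweep_ms / m


def filter_schedule(n_samples, fs, m, sweep_ms=20.0):
    """Index of the coefficient set used at each sample, for both ears."""
    cycle = fs * sweep_ms / 1000               # samples per sweep cycle
    left, right = [], []
    for n in range(n_samples):
        step = int(n * m / cycle) % m
        left.append(step)
        right.append((step + m // 2) % m)      # assumed half-cycle offset
    return left, right


def time_varying_fir(x, coeff_sets, schedule):
    """FIR filtering where the coefficient set is chosen per output sample."""
    y = []
    for n in range(len(x)):
        h = coeff_sets[schedule[n]]
        y.append(sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0))
    return y


# Toy example: one-tap "filters" with gains 1..4 make the switching visible.
left, right = filter_schedule(400, fs=10000, m=4)
toy_sets = [[1.0], [2.0], [3.0], [4.0]]
y_left = time_varying_fir([1.0] * 400, toy_sets, left)
```

In a real system `coeff_sets` would hold the m pre-calculated 256-coefficient comb filters rather than single gains.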
Listening tests for evaluation of the schemes
Test material: closed set of 12 VCV syllables, formed with consonants /p, t, k, b, d, g, m, n, s, z, f, v/ and vowel /a/
Subjects & listening condition • Normal hearing subjects with loss simulated by Gaussian noise with short-time (~10 ms) SNRs of 6 : -15 dB, at MCL (70–75 dB SPL) • Hearing impaired subjects with bilateral sensorineural loss, at MCL
Performance measurement • Response time statistics • Stimulus-response confusion matrix • Recognition scores • Relative information transmission of consonantal features
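Two of the performance measures can be computed from a stimulus-response confusion matrix as sketched below. The slide does not give the formula, so this assumes a common definition of relative information transmission: the mutual information between stimulus and response divided by the stimulus entropy, in the style of Miller and Nicely's analysis. The function names and the example matrix are hypothetical, not data from the study.

```python
import math


def recognition_score(conf):
    """Fraction of presentations identified correctly (matrix diagonal)."""
    total = sum(sum(row) for row in conf)
    return sum(conf[i][i] for i in range(len(conf))) / total


def relative_info_transmitted(conf):
    """Mutual information of stimulus and response over stimulus entropy."""
    total = sum(sum(row) for row in conf)
    p_stim = [sum(row) / total for row in conf]
    p_resp = [sum(row[j] for row in conf) / total for j in range(len(conf[0]))]
    transmitted = 0.0
    for i, row in enumerate(conf):
        for j, count in enumerate(row):
            if count:
                p = count / total
                transmitted += p * math.log2(p / (p_stim[i] * p_resp[j]))
    entropy = -sum(p * math.log2(p) for p in p_stim if p)
    return transmitted / entropy


def collapse_by_feature(conf, groups):
    """Merge rows and columns that share a feature value (e.g. voicing)."""
    k = max(groups) + 1
    out = [[0] * k for _ in range(k)]
    for i, row in enumerate(conf):
        for j, count in enumerate(row):
            out[groups[i]][groups[j]] += count
    return out


# Hypothetical 4-consonant example /p, t, b, d/; voicing groups: 0 = unvoiced.
example = [[8, 2, 0, 0],
           [3, 7, 0, 0],
           [0, 0, 9, 1],
           [0, 0, 2, 8]]
voicing = collapse_by_feature(example, [0, 0, 1, 1])
```

In this made-up matrix every error stays within the same voicing class, so the voicing feature is transmitted completely even though the recognition score is below 100%.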
Listening test set-up
[Figure: a PC with PCL-208 D/A ports outputs s1(t) and s2(t), each through a lowpass filter and audio amplifier, to the subject in an acoustically isolated chamber; the subject terminal connects to the PC over RS232C]
Conclusions • All three schemes improve response time, recognition scores, & relative information transmission, both overall and for various speech features. • Extent of improvement with a scheme is related to the nature of the loss: - Severe high frequency hearing loss: max. improvement with temporal splitting (17.9%) - Symmetrical low frequency hearing loss and symmetrical sloping high frequency hearing loss: max. improvement with spectral splitting (17.5%) & combined splitting with 8 shiftings (20.5%) - Asymmetrical high frequency loss: temporal splitting (7.6%) & combined splitting (7.6%) (contd.)
• Spectral splitting more effective in reducing perceptual load. • Overall max. improvement in recognition scores with combined splitting with 8 shiftings. • Temporal splitting mainly improved perception of the duration feature. • Spectral splitting mainly improved perception of the place feature. • Combined splitting with 8 shiftings improved perception of both duration and place. • Reception of the relatively robust consonantal features (voicing, manner, and nasality) not adversely affected by splitting. • Personalized filter response gives additional improvement.
Next • Listening tests with a larger number of subjects to establish the relationship between processing parameters & nature of loss. • Individualized multi-band compression. • Implementation of the processing schemes as part of wearable hearing aids, with personalized parameter setting. • Effect of binaural dichotic listening on non-speech signals & source localization to be investigated. • Investigations combining consonant enhancement with dichotic presentation.