This paper investigates the effectiveness of multi-band frequency compression for reducing spectral masking and improving speech perception in individuals with sensorineural hearing loss. The scheme divides the speech spectrum into bands and compresses the spectral samples towards the band center. Evaluation using the modified rhyme test showed maximum improvement in recognition scores for a compression factor of 0.6.
DSP 2009 (Santorini, Greece, 5-7 July 2009), Session: S4P, Paper: S4P.1
Multi-Band Frequency Compression for Sensorineural Hearing Impairment
P. N. Kulkarni (1), P. C. Pandey (2), D. S. Jangamashetti (3)
(1, 2) IIT Bombay, India; (3) Basaveshwar Engg. College, Bagalkot, Kar., India
<{pnkulkarni,pcpandey}@ee.iitb.ac.in, dsj1869@rediffmail.com>
ABSTRACT
Sensorineural hearing loss is associated with widening of the auditory filters, leading to increased spectral masking and degraded speech perception. Multi-band frequency compression can be used to reduce the effect of spectral masking. The speech spectrum is divided into a number of bands, and the spectral samples in each band are compressed towards the band center by a constant compression factor. In the present study, we investigated the effectiveness of the scheme in improving speech perception for different compression factors. Evaluation of the scheme using the modified rhyme test showed maximum improvement in recognition scores for a compression factor of 0.6: about 17 % for the normal-hearing subjects under simulated hearing loss, and 6 – 21 % for the subjects with moderate to severe sensorineural hearing loss.
1. INTRODUCTION
Sensorineural hearing loss
▪ Widening of auditory filters, resulting in increased spectral masking and degradation in speech perception.
Multi-band frequency compression for reducing the effect of increased spectral masking
▪ Spectrum divided into a number of bands, and spectral components in each band compressed towards the band center to enhance the spectral contrasts.
Earlier work
Critical band based compression [Yasu et al., 2002, 2004]
▪ Magnitude spectrum compressed towards the center of each critical band and combined with the unaltered phase spectrum (segmentation with Hamming window, STFT, spectral modification, and overlap-add synthesis).
▪ Moderate improvement in the VCV recognition score for hearing-impaired subjects (unprocessed 35.4%, processed 38.3%).
Objective of the investigation
▪ To select the most appropriate combination of segmentation, bandwidth, frequency mapping, and compression factor for analysis-synthesis.
2. SIGNAL PROCESSING
Processing chain: input speech → segmentation → spectral analysis and modification → resynthesis → processed speech
Segmentation
▪ Fixed-frame: 20 ms frames with 50% overlap.
▪ Pitch-synchronous: frames of two local pitch periods, aligned to glottal closure instants (GCIs), with one pitch period overlap.
Spectral analysis and modification
▪ Zero padding, 1024-point FFT.
▪ Compression of complex spectral samples in a set of predefined bands towards the band center by a fixed compression factor (CF).
Resynthesis
▪ IFFT and overlap-add.
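The fixed-frame analysis-synthesis chain above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the sampling rate (`fs=10000`), the periodic Hann window, the function name `analysis_synthesis`, and the `modify` hook are all assumptions introduced here; the slides specify only the 20 ms frames, 50% overlap, zero-padded 1024-point FFT, and IFFT with overlap-add.

```python
import numpy as np

def analysis_synthesis(x, fs=10000, frame_ms=20.0, nfft=1024, modify=None):
    """Fixed-frame STFT analysis-synthesis with 50% overlap (illustrative sketch).

    fs, the Hann window, and the `modify` hook are assumptions, not from the slides.
    `modify` receives the one-sided complex spectrum of a frame and returns a
    spectrum of the same length.
    """
    x = np.asarray(x, dtype=float)
    frame_len = int(round(fs * frame_ms / 1000.0))  # 200 samples at 10 kHz
    hop = frame_len // 2                            # 50% overlap (frame_len assumed even)
    n = np.arange(frame_len)
    # Periodic Hann window: copies shifted by `hop` sum to exactly 1.
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len)

    # Pad so that every input sample is covered by two overlapping frames.
    padded = np.concatenate([np.zeros(hop), x, np.zeros(frame_len + hop)])
    y = np.zeros(len(padded))
    for start in range(0, len(padded) - frame_len + 1, hop):
        frame = padded[start:start + frame_len] * window
        spectrum = np.fft.rfft(frame, n=nfft)       # zero-padded 1024-point FFT
        if modify is not None:
            spectrum = modify(spectrum)             # e.g. multi-band compression
        out = np.fft.irfft(spectrum, n=nfft)
        # Keep only the original frame span; any energy spread beyond it is dropped.
        y[start:start + frame_len] += out[:frame_len]
    return y[hop:hop + len(x)]
```

With `modify=None` the chain should reproduce the input, which is a useful sanity check before plugging in a spectral modification.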
Factors affecting quality & intelligibility
▪ Segmentation ▪ Bandwidth ▪ Frequency mapping ▪ Comp. factor
Bandwidth
▪ Constant bandwidth (no. of bands: 2 – 18)
▪ 1/3 octave bandwidth
▪ Auditory critical bandwidth (ACB)
(Figure: BW = ACB, comp. factor = 0.6)
Frequency mapping
▪ Sample-to-sample
▪ Superimposition of spectral samples
▪ Spectral segment
Spectral segment mapping
Output spectral sample = weighted sum of the complex spectral samples in the input frequency segment [a, b] corresponding to the output sample k'.
m, n: first and last FFT indices in the segment [a, b].
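One plausible reading of spectral segment mapping is sketched below. The slides do not give the weights, so this sketch assumes that each FFT bin occupies a unit-wide cell and that the weight of an input bin is the fraction of its cell lying inside [a, b]; the function name `compress_bands` and the `(lo, hi)` band-edge representation are hypothetical.

```python
import numpy as np

def compress_bands(X, bands, cf):
    """Compress spectral samples towards each band center (illustrative sketch).

    X     : complex spectrum (e.g. the one-sided FFT of one frame)
    bands : list of (lo, hi) inclusive FFT-index pairs (hypothetical representation)
    cf    : compression factor, 0 < cf <= 1
    Weights are assumed to be the overlap of each input bin's unit cell with [a, b].
    """
    X = np.asarray(X)
    Y = np.zeros(len(X), dtype=complex)
    for lo, hi in bands:
        center = 0.5 * (lo + hi)
        for k in range(lo, hi + 1):
            # Input segment [a, b] that maps onto the cell of output sample k.
            a = center + (k - 0.5 - center) / cf
            b = center + (k + 0.5 - center) / cf
            a = max(a, lo - 0.5)                 # clip the segment to the band
            b = min(b, hi + 0.5)
            if b <= a:
                continue                         # k lies outside the compressed region
            m = int(np.floor(a + 0.5))           # first FFT index in [a, b]
            n = int(np.ceil(b - 0.5))            # last FFT index in [a, b]
            for j in range(m, n + 1):
                overlap = min(b, j + 0.5) - max(a, j - 0.5)
                if overlap > 0:
                    Y[k] += overlap * X[j]
    return Y
```

Under these assumed weights, every input sample contributes a total weight of one, so the sum of the complex samples in each band is preserved, and `cf = 1` leaves the spectrum unchanged. Applying the function to a one-sided spectrum (as returned by `numpy.fft.rfft`) keeps the resynthesized frame real-valued.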
Processing example
▪ /aka/: (a) unprocessed, (b) processed (fixed-frame segmentation, spectral segment mapping, ACB, CF = 0.6).
▪ Harmonic structure in voiced segments and randomness in unvoiced segments approximately preserved.
MOS tests (normal-hearing subjects)
▪ Highest scores for pitch-synchronous segmentation, ACB, and spectral segment mapping [Kulkarni et al., Int. J. Speech Tech., vol. 10, pp. 219-227, 2007].
3. LISTENING TESTS
Modified Rhyme Test (MRT) for quantitative evaluation of speech intelligibility
▪ 300 CVC words, presented in six test lists with a carrier phrase.
▪ Automated test procedure for randomized presentation and recording of response and response time.
Experiment I
▪ 6 normal-hearing subjects with simulated hearing loss: broadband masking noise added to the processed speech, with SNR constant on a short-time (10 ms) basis.
▪ SNR: ∞, 6, 3, 0, -3, -6, -9, -12, and -15 dB.
▪ 10,800 presentations per subject (300 words × 4 comp. factors × 9 SNR).
Experiment II
▪ 11 subjects with moderate to severe sensorineural hearing loss (without using their hearing aids).
▪ 1,200 presentations per subject (300 words × 4 comp. factors).
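The masking condition in Experiment I, noise added at an SNR held constant over 10 ms segments, might be implemented along the following lines. This is a hedged sketch: white Gaussian noise, the sampling rate `fs=10000`, non-overlapping segments, and the function name `add_noise_short_time_snr` are assumptions; the slides state only that the noise is broadband and that the SNR is constant on a 10 ms basis.

```python
import numpy as np

def add_noise_short_time_snr(speech, snr_db, fs=10000, seg_ms=10.0, rng=None):
    """Add noise so that every 10 ms segment has the same SNR (illustrative sketch).

    White Gaussian noise, fs, and non-overlapping segments are assumptions.
    Returns (noisy_speech, scaled_noise).
    """
    speech = np.asarray(speech, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    seg_len = int(round(fs * seg_ms / 1000.0))
    noise = rng.standard_normal(len(speech))
    scaled = np.zeros(len(speech))
    for start in range(0, len(speech), seg_len):
        seg = slice(start, start + seg_len)
        p_speech = np.mean(speech[seg] ** 2)
        p_noise = np.mean(noise[seg] ** 2)
        if p_speech > 0 and p_noise > 0:
            # Scale the noise so that p_speech / p_scaled_noise = 10^(snr_db / 10).
            gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
            scaled[seg] = gain * noise[seg]
    return speech + scaled, scaled
```

Segments with zero speech power receive no noise in this sketch, so the noise envelope follows the speech envelope.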
4. RESULTS
Exp. I: Recognition scores (avg. across 6 normal-hearing subjects)
▪ Processing improved recognition scores for SNR < 0 dB.
▪ Best improvements observed for CF = 0.6 (p < 0.001).
▪ Avg. improvement of 17 % in recognition score for SNR < -6 dB.
▪ SNR advantage of 6 dB at about 60% recognition score.
Exp. II: Recognition scores (for 11 hearing-impaired subjects)
5. CONCLUSION
For both groups of subjects, max. improvement in recognition scores for CF = 0.6
● Normal-hearing subjects
▪ Avg. improvement of 17 % in recognition score for SNR < -6 dB.
▪ SNR advantage of 6 dB.
● Hearing-impaired subjects
▪ Improvement in recognition score: 6 – 21 %.