200 likes | 341 Views
ICA 2010 : 20th Int. Congress on Acoustics, 23-27 August 2010, Sydney, Australia [Mon, 23rd Aug, R.101, Physiological Acoustics 2, 17:00] Simulation of Increased Masking in Sensorineural Hearing Loss for a Preliminary Evaluation of Speech Processing Schemes D. S. Jangamashetti A. N. Cheeran
E N D
ICA 2010 : 20th Int. Congress on Acoustics, 23-27 August 2010, Sydney, Australia [Mon, 23rd Aug, R.101, Physiological Acoustics 2, 17:00] Simulation of Increased Masking in Sensorineural Hearing Loss for a Preliminary Evaluation of Speech Processing Schemes D. S. Jangamashetti A. N. Cheeran P. N. Kulkarni P. C. Pandey dsj1869@rediffmail.com, ancheeran@vjti.org.in, {pnkulkarni,pcpandey}@ee.iitb.ac.in, http://www.ee.iitb.ac.in/~spilab IIT Bombay, India
OUTLINE • Introduction • Listening tests • Results & discussion • Conclusion
Intro.1/5 • INTRODUCTION • Sensorineural hearing loss • ▪ Increased hearing thresholds • ▪ Reduced dynamic range of hearing & loudness recruitment • ▪ Increased temporal & spectral masking • Speech processing for reducing the effects of increased spectral masking • ▪ Spectral contrast enhancement • ▪ Multi–band frequency compression • ▪ Dichotic binaural presentation
Intro.2/5 Evaluation of speech processing schemes ▪Listening tests on hearing-impaired S’s, for the different combinations of processing parameters: time consuming, tedious, may cause fatigue. ▪Preliminary evaluation for assessing the effects of processing parameters through listening tests conducted on normal-hearing S’s, with simulation of specific characteristics of hearing loss.
Intro. 3/5 Some earlier studies on simulation of hearing loss Villchur (1974):Loudness recruitment simulated by 3–band dynamic range expansion with different ratios.Tested on 4 S’s with unilateral loss (processed stimuli to the normal ear, unprocessed stimuli to the impaired ear). Leek et al. (1987):Elevated hearing thresholds simulated by addition of road-band noise. ter Keurs et al. (1992):Reduced freq. resolution simulated by smoothening of short time spectral envelope by convolving with a Gaussian shaped filter followed by spectral smearing. Tested by presenting processed stimuli to the normal-ear and unprocessed stimuli to the impaired ear.
Intro. 4/5 Dubno & Schaefer (1992): Addition of Spectrally shaped broad-band noise. Comaprision of scores by h.i. S’s & the scores for noise-masked n.h. S’s. Moore & Glasberg (1993): Speech signal split into 13 freq. bands, envelope in each band processed to simulate loudness recruitment. Nejime & Moore (1997): Loudness recruitment & reduced frequency selectivity simulated by filtering the speech signal into different bands, raising the temporal envelop of the filtered signal to a power greater than one followed by smearing of short-time power spectrum.
Intro. 5/5 Present study Effect of increased masking simulated by adding broad-band noise, band-limited to speech frequency range, at a specific SNR with respect to short-time (10 ms) energy of the signal (no noise during silence). Evaluation by conducting consonant recognition tests on n.h. with different levels of masking noise and S’s with moderate sensorineural loss.
List. tests 1/3 2 LISTENING TESTS Consonant recognition tests Phonetically balanced (PB) words Modified rhyme test (MRT) with CVC monosyllabic words VCV utterances with vowel /a/ Presentation level: most comfortable level (MCL) of individual S’s. Performance measures: recognition score (RS), response time (RT)
List. tests 2/3 Phonetically balanced (PB) words ▪Three sets, each having 50 - 60 words of approximately same intensity. ▪7 n.h. S’s, masking noise with SNR of 3, 0, -3, -6, -9 dB. ▪13 S’s with moderate bilateral loss. Modified rhyme test (MRT) ▪300 words, embedded in a carrier phrase “would you write …”, presented randomly in six test lists of 50 words. ▪6 n.h. S’s, masking noise with SNR of 6, 3, 0, -3, -6, -9, -12, -15 dB. ▪11 S’s with moderate bilateral loss.
List. tests 3/3 VCV utterances ▪ 12 consonants /p,b,t,d,k,g,f,v,s,z,m,n / in VCV context with vowel /a/. ▪ 5 n.h. S’s, masking noise with SNR of 6, 3, 0, -3, -6, -9, -12, -15 dB. ▪ 5 h.i. S’s. ▪ Calculation of relative transmission of information for different features.
Results 1/5 3 RESULTS I. PB test results Mean (7 n. h. S’s) response time & recognition score RT increased from 2.09 s for no-noise to 2.83 s at -9 dB SNR. RS decreased from 99.8 % for no-noise to 23.9 % at -9 dB SNR . Hearing-impaired subjects RT: 2.1 – 6.6 s, mean: 3.05 s. RS: 20.6 –90.1 %, mean 62.7 %.
Results 2/5 II. MRT Results Mean (6 n. h. S’s) response time & recognition score RT increased from 2.64 s for no-noise to 3.45 s at -15 dB SNR. RS decreased from 97.1 % for no-noise to 45.3 % at -15 dB SNR. Hearing-impaired subjects RT: 3.47 – 4.10 s, mean: 3.80 s. RS: 50.3 – 67.3 %, mean: 60.8 % .
Results 3/5 III. VCV test results Mean (5 n. h. S’s) response time (RT), recognition score, (RS) and rel. information transmitted for overall and feature groupings
Results 4/5 RS vs. SNR for three types of test material ▪ Scores for PB words lower than those for VCV and MRT. ▪ Matching of mean RS of h.i. S’s and n.h. S’s PB words: at SNR = -3 dB. MRT & VCV: at SNR = -9 dB.
Results 5/5 Equivalent SNR for VCV test (SNR for matching the n.h. score to the the h.i. score) • Effects of masking on the reception of features • Maximal on place and duration. • Moderate on nasality and manner. • Negligible on voicing.
Concl. 1/1 • CONCLUSION • Addition of broad-band noise with constant SNR on short-time (10 ms) basis, simulated the effect of increased temporal & spectral masking. • Simulation may be useful in preliminary evaluation and optimizing the processing parameters in developing speech processing schemes for improving speech perception by persons with sensorineural hearing loss.
D. S. Jangamashetti, A. N. Cheeran, P. N. Kulkarni, P. C. Pandey, “Simulation of increased masking in sensorineural hearing loss for a preliminary evaluation of speech processing schemes”, Proc. 20th International Congress on Acoustics ( ICA 2010), 23-27 August 2010, Sydney, Australia. Abstract -- Sensorineural loss is characterized by increased hearing threshold, reduction in the dynamic range of hearing and loudness recruitment, and increased temporal and spectral masking, resulting in degraded speech perception. Several techniques including spectral contrast enhancement, multi-band frequency compression, and dichotic binaural presentation have been investigated for reducing the adverse effects of increased masking. Assessment of speech processing techniques and optimization of processing parameters involves listening tests on hearing-impaired listeners. These tests are time consuming and may cause a fatigue, particularly in elderly subjects. A simulation of hearing loss, by processing the speech signal through a model of the loss characteristics, is useful in conducting the listening tests on normal-hearing subjects, for a preliminary evaluation of the schemes and particularly for selecting the processing parameters. The present study used addition of broad-band noise, band-limited to speech frequency range, at a specific SNR with respect to short-time (10 ms) energy of the signal. Different levels of loss were simulated by varying the SNR. In this simulation, no noise gets added during silence segments. Listening tests to assess the loss simulation were conducted using three types of test material: vowel-consonant-vowel (VCV) utterances, phonetically balanced word lists, and modified rhyme test. Recognition score from subject responses was used as a measure of speech intelligibility and response time was used as a measure of load on the perception process. For all the three test materials, decrease in the recognition scores and increase in response times for normal- hearing subjects showed the same pattern as the corresponding results for subjects with moderate-to-severe sensorineural loss. A relative information transmission analysis of the stimulus-response confusion matrices for VCV utterances showed that the simulated loss did not affect reception of voicing and nasality features and it had maximum adverse effect on the reception of place and duration features, indicating that the addition of broadband noise with constant SNR with respect to short-time signal energy simulated an increased spectral and temporal masking.
REFERENCES 1. B. C. J. Moore, An Introduction to the Psychology of Hearing. 4th ed. (Academic, London, 1997). 2. J. M. Pickett, The Acoustics of Speech Communication: Fundamentals, Speech Perception Theory, and Technology, (Allyn Bacon, Boston, Mass., 1999). 3. B. R. Glasberg and B. C. J. Moore, “Auditory filter shapes in subjects with unilateral and bilateral cochlear impairments,” J. Acoust. Soc. Am. 79, 1020 – 1033 (1986). 4. J. R. Dubno and A. B. Schaefer, “Comparison of frequency selectivity and consonant recognition among hearing-impaired and masked normal-hearing listeners,” J. Acoust. Soc. Am. 91 Pt.1., 2110–2121 (1992). 5. H. T. Bunnel, “On enhancement of spectral contrast in speach for hearing-impaired listeners,” J. Acoust. Soc. Am. 88, 2546-2556 (1990). 6. T. Baer, B. C. J. Moore, and S. Gatehouse, “Spectral contrast enhancement of speech in noise for listeners with sensorineural hearing impairment: effects on intelligibility, quality, and response times,” J. Rehabil. Res. Dev. 30, 49-72 (1993). 7 I. Cohen, “Speech spectral modeling and spectral enhancement based on autoregressive conditional heteroscedasticity models,” Signal Processing. 86, 698-709 (2006). 8 K. Yasu, K. Kobayashi, K. Shinohara, M. Hishitani, T. Arai, and Y. Murahara, “Frequency compression of critical band for digital hearing aids,” Proc. China-Japan Joint Conf. Acoustics. 159 – 162 (2002). 9 P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti, D. “Multi-band frequency compression for reducing the effects of spectral masking,” Int. J. Speech Tech. 10, 219-227 (2007). 10 P. E. Lyregaard, “Frequency selectivity and speech intelligibility in noise,” Scand. Audiol. Suppl. 15, 113 – 122 (1982). 11 T. Lunner, S. Arlinger, J. Hellgren “8-channel digital filter bank for hearing aid use: preliminary results in monaural, diotic, and dichotic modes,” Scand. Audiol. Suppl. 38, 75 – 81 (1993). 12 D. S. Chaudhari, and P. C. Pandey, “Dichotic presentation of speech signal with critical band filtering for improving speech perception,” Proc. IEEE ICASSP, 3601 – 3604 (1998). 13 A. Murase, F. Nakajima, S. Sakamoto, Y. Suzuki, and T. Kawase, T. “Effect and sound localization with dichotic-listening hearing aids,” Proc. 18th Int. Congress Acoust. (ICA), Kyoto, Japan, II-1519 – 1522 (2004). 14 E. Villchur, “Simulation of the effect of recruitment on loudness relationships in speech,” J. Acoust. Soc. Am. 56, 1601-1611 (1974). 15 E. Villchur, “Electronic models to simulate the effect of sensory distortions on speech perception by the deaf,” J. Acoust. Soc. Am. 62, 665-674 (1977).
16 B. C. J. Moore, and B. R. Glasberg. “Simulation of the effects of loudness recruitment and threshold elevation on the intelligibility of speech in quiet and in a background of speech,” J. Acoust. Soc. Am. 94, 2050–2062 (1993). 17 M. ter Keurs, J. M. Festen, and R. Plomp, “Effect of spectral envelope smearing on speech reception. I,” J. Acoust. Soc. Am. 91, 2872 – 2880 (1992). 18 Y. Nejime, and B. C. J. Moore, “Simulation of the effect of threshold elevation and loudness recruitment combined with reduced frequency selectivity on the intelligibility of speech in noise,” J. Acoust. Soc. Am. 102, 603 – 615 (1997). 19 L. E. Humes, B. Espinoza-Varas, and C. S. Watson, “Modelling sensorineural hearing loss-I. Model and retrospective evaluation,” J. Acoust. Soc. Am. 83, 188 – 202 (1988). 20 M. R. Leek, M. F. Dorman, and Q. Summerfield, “Minimum spectral contrast for vowel identification by normal-hearing and hearing-impaired listeners,” J. Acoust. Soc. Am. 81, 148 – 154 (1987). 21 D. A. Nelson, S. J. Chargo, J. G. Kopun, and R. L. Freyman, “Effect of forward-masked psychophysical tuning curves in quiet and noise,” J. Acoust. Soc. Am. 88, 2143 – 2151 (1990). 22 W. Jesteadt. Modeling Sensorineural Hearing Loss. (Lawrence Erlbaum, Mahwah, New Jersey, 1997). 23 S. Gordon-Salant and P. J. Fitzgibbons, “Temporal factors and speech recognition performance in young and elderly listeners,” J. Speech Hear. Res. 36, 1276–1285 (1993). 24 G. A. Miller, and P. E. Nicely, “An analysis of perceptual confusions among some English consonants,” J. Acoust. Soc. Am. 72, 338 – 352 (1955). 25 W. D. Voiers, “Evaluating processed speech using the diagnostic rhyme test,” Speech Tech. 55, 30 – 39 (1983). 26 A. S. House, C. E. Williams, M. H. L. Hecker, and K. D. Kryter, “Articulation testing methods: consonantal differentiation with closed-response set,” J. Acoust. Soc. Am. 37, 158 – 166 (1965). 27 K. D. Kryter, and E. C. Whitman, “Some comparisons between rhyme and PB-word intelligibility tests,” J. Acoust. Soc. Am.37, 1146 (1965). 28 S. Gatehouse, and J. Gordon, J. “Response times to speech stimuli as measures of benefit from amplification,” Br. J. Audiol. 24, 63 – 68 (1990). 29 F. Apoux, O. Crouzet, and C. Lorenzi, “Temporal envelope expansion of speech in noise for normal-hearing and hearing-impaired listeners: Effects on identification performance and response times,” Hear. Res. 153, 123 – 131 (2001). 30 A. N. Cheeran, Speech processing with dichotic presentation for binaural hearing aids for moderate bilateral sensorineural loss, Ph.D. Thesis, School of Biosciences and Bioengineering, Indian Institute of Technology Bombay, India, (2005). 31 D. S. Jangamashetti, Binaural dichotic presentation to reduce the effects of temporal and spectral masking due to sensorineural hearing loss, Ph.D. Thesis, Dept. of Elect. Engg., Indian Institute of Technology Bombay, India (2003).