Towards a Cohort-Selective Frequency-Compression Hearing Aid • Marie Roch¤, Richard R. Hurtig¥, Jing Lui¤, and Tong Huang¤
Sensorineural Hearing Loss • Most common type of hearing loss • Affects > 20 million in the US alone • Caused by physiological problems in the cochlea
Traditional Hearing Aids • Amplification of frequency bands • Amplitude compression • Works best in situations with high SNR
Problems With Traditional Methods • Simple amplification is insufficient • Individuals with severe hearing loss cannot perceive formants (Figure: the utterance “Where were you while we were away”, Harrington and Cassidy 1999, p. 110)
Preserving the formants • Frequency domain compression [Turner & Hurtig 1999] permits preservation of formants
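To make the idea concrete, here is a minimal sketch of proportional frequency lowering on a single frame. It is not the Turner & Hurtig algorithm, whose details the slides do not give; it assumes NumPy, and the function names `compress_spectrum` and `peak_hz` and the `factor` parameter are illustrative. Because every frequency is scaled by the same factor, the ratios between spectral peaks such as formants are unchanged.

```python
import numpy as np

def compress_spectrum(frame, factor):
    """Lower every frequency in `frame` by `factor` (0 < factor <= 1).

    Illustrative only: output bin k reads the input spectrum at bin k / factor,
    so a component at f Hz moves to factor * f Hz.
    """
    spectrum = np.fft.rfft(frame)
    n_bins = spectrum.shape[0]
    out_bins = np.arange(n_bins)
    src_bins = out_bins / factor              # source position for each output bin
    valid = src_bins <= n_bins - 1            # sources beyond Nyquist stay zero
    real = np.interp(src_bins[valid], out_bins, spectrum.real)
    imag = np.interp(src_bins[valid], out_bins, spectrum.imag)
    compressed = np.zeros(n_bins, dtype=complex)
    compressed[valid] = real + 1j * imag
    return np.fft.irfft(compressed, n=len(frame))

def peak_hz(frame, sample_rate):
    """Frequency (Hz) of the strongest spectral component in `frame`."""
    magnitudes = np.abs(np.fft.rfft(frame))
    return np.argmax(magnitudes) * sample_rate / len(frame)

# A 2000 Hz tone lowered with factor 0.5 (an arbitrary value chosen for the demo).
sample_rate = 16000
tone = np.sin(2 * np.pi * 2000 * np.arange(512) / sample_rate)
lowered = compress_spectrum(tone, 0.5)
```

A real system would process overlapping windowed frames and resynthesize with overlap-add; the single-frame version above only illustrates the frequency mapping.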
Effectiveness • Clinical study of 15 hearing-impaired listeners showed improvement that differed by talker group • female talkers: 45% improvement • male talkers: 20% improvement (Figure panels: Female talker, uncompressed; Female talker, compressed)
Challenges • Not all voices require the same level of compression • Single setting leads to inappropriate levels of compression
Adaptive thresholds • Decision-based control mechanism • Establish cohorts and compress according to cohort class. • Some possible cohorts: • Phonological units • Pitch • Speaker “gender”
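The control idea on this slide can be sketched as a lookup from cohort label to compression setting. The slides give no numeric settings, so the factors below are arbitrary placeholders, and the names `COHORT_FACTORS` and `select_factor` are hypothetical.

```python
# Placeholder compression factors per cohort (NOT values from the study).
COHORT_FACTORS = {
    "female": 0.6,
    "male": 0.8,
    "child": 0.5,
}

def select_factor(cohort, default=1.0):
    """Return the compression factor for `cohort`.

    Unknown cohorts fall back to `default` (1.0 means no compression).
    """
    return COHORT_FACTORS.get(cohort, default)
```

The classifier's decision would be passed to `select_factor`, and the returned value would drive the frequency-compression stage.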
Gender-based classifier • Selected “gender” for first study. • Female, Male, Child • Classifier output more stable than with phonological approaches. • Broad support in the literature for the ability of both humans and machines to do this.
Classifier • Gaussian mixture models • Features extracted from 25 ms windows shifted every 10 ms • Energy • 12 Mel-filtered cepstral coefficients (MFCC) • Time-derivatives of Energy & MFCC
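The front end and scoring described above can be sketched as follows, assuming NumPy. The framing uses the 25 ms window and 10 ms shift from the slide; MFCC extraction is omitted, the time derivative is approximated with a simple gradient, and the GMMs are assumed to have diagonal covariances with parameters already trained. All function names are illustrative, not the authors' code.

```python
import numpy as np

def frame_signal(signal, sample_rate, win_ms=25.0, hop_ms=10.0):
    """Slice `signal` into overlapping frames (assumes len(signal) >= one window)."""
    win = int(round(sample_rate * win_ms / 1000.0))
    hop = int(round(sample_rate * hop_ms / 1000.0))
    n_frames = 1 + (len(signal) - win) // hop
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def log_energy(frames):
    """Log energy of each frame; the small constant avoids log(0)."""
    return np.log(np.sum(frames ** 2, axis=1) + 1e-10)

def delta(features):
    """Approximate time derivative along the frame axis."""
    return np.gradient(features, axis=0)

def gmm_log_likelihood(x, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    x: (T, D); weights: (K,); means and variances: (K, D).
    """
    diff = x[:, None, :] - means[None, :, :]
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_weighted = log_comp + np.log(weights)
    peak = log_weighted.max(axis=1, keepdims=True)   # log-sum-exp for stability
    return (peak + np.log(np.sum(np.exp(log_weighted - peak),
                                 axis=1, keepdims=True)))[:, 0]

def classify_frames(x, models):
    """Label each frame with the cohort whose GMM scores it highest.

    models: dict mapping cohort name -> (weights, means, variances).
    """
    names = list(models)
    scores = np.stack([gmm_log_likelihood(x, *models[n]) for n in names], axis=1)
    return [names[i] for i in np.argmax(scores, axis=1)]
```

Per-frame decisions are noisy, which is one reason a system might pool scores over several frames or smooth the label sequence afterwards.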
LDC SPIDRE Corpus • Conversational telephone speech • Band-limited 8 kHz • Mu-law encoded • Endpointed with the NIST/Kubala endpointer • Train: single sides of same-gender phone calls, 25 male & female • Test: 87 annotated cross-gender phone calls, about 7 hours of calls (~5 min. each)
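Mu-law encoding compresses amplitude logarithmically, so it has to be expanded before feature extraction. The sketch below uses the continuous mu-law formula with mu = 255, which is an assumption about this corpus; it is not a byte-level G.711 decoder, and the function names are illustrative.

```python
import numpy as np

MU = 255.0  # assumed companding constant

def mu_law_compress(x, mu=MU):
    """Map linear samples in [-1, 1] to companded values in [-1, 1]."""
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress: recover linear samples from companded values."""
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu
```

Expanding a compressed signal returns the original samples up to floating-point error; quantization to 8 bits, which real telephone data includes, is not modeled here.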
Error Analysis • Many errors occurred in fricatives, whose high-frequency energy is largely cut off by the telephone bandwidth (Figure label: telephone bandwidth)
Evaluation on TIMIT • 630 speakers, clean speech, 16 kHz corpus • Train: 25 male, 25 female • Test: 413 male, 167 female (Figure panels: SPIDRE, TIMIT)
Median Smoothing (SPIDRE) (Figure label: median smoothed)
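A minimal sketch of median smoothing, assuming it is applied to the sequence of frame-level decisions coded as integers (for example 0 = male, 1 = female). The slides do not state the window length or whether labels or scores were smoothed, so `window=5` and the function name `median_smooth` are assumptions.

```python
def median_smooth(labels, window=5):
    """Replace each label with the median of its neighborhood.

    `window` must be odd; near the ends of the sequence the neighborhood
    is truncated to the labels that exist.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    n = len(labels)
    smoothed = []
    for i in range(n):
        start = max(0, i - half)
        stop = min(n, i + half + 1)
        neighborhood = sorted(labels[start:stop])
        smoothed.append(neighborhood[len(neighborhood) // 2])
    return smoothed

# Isolated flips in the decision stream are removed.
decisions = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1]
print(median_smooth(decisions, window=3))  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
```

For two classes and an odd window this is equivalent to a majority vote over the neighborhood; with three or more cohorts a mode filter would be the more natural choice, since integer codes for classes have no meaningful order.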
Conclusions & Future Work • Classifier-based control systems • feasible • can be applied to other signal enhancement algorithms • need not be limited to the cohorts presented today (e.g. auditory scene analysis)