
Ecophon Seminar, Sept. 2006


Presentation Transcript


    1. Presentation to be given at the 5th Ecophon International Acoustician Seminar, 27-28 September 2006, Helsingborg, Sweden.

    2. Past: STI (1971, 1980), Envelope Spectrum (1972), MTF (1975), RASTI (1979), prediction by ray-tracing (1981), IEC 60268-16 (1988, 1998). Present: Revised STI (1992, 1999, 2002), STIPA (2001), IEC 60268-16 (2003), ISO 9921 (2003). New and future: binaural STI, improved measurement of STI from a speech signal, non-native speech, vocoders.
        The past, present and future of the STI are documented by a short reference list of important publications covering the three time intervals. We selected 21 publications for this purpose; some recent publications are included for looking into the future.
        [1] Houtgast, T., Steeneken, H.J.M. (1971). Evaluation of speech transmission channels by using artificial signals. Acustica 25, 355-367.
        [2] Houtgast, T., Steeneken, H.J.M. (1972). Envelope spectrum and intelligibility of speech in enclosures. Proc. 1972 Conference on Speech Communication and Processing, April 1972, 392-395.
        [3] Steeneken, H.J.M., Houtgast, T. (1973). Intelligibility in telecommunication derived from physical measurements. Rapports et Textes, Symposium Intelligibilité de la Parole, Liege, November 12-15, 1973, 316-324.
        [4] Houtgast, T., Steeneken, H.J.M. (1973). The modulation transfer function in room acoustics as a predictor of speech intelligibility. Acustica 28, 66-73.
        [5] Steeneken, H.J.M., Houtgast, T. (1975). MTF as a physical measure of the quality of communication channels. Textes des Conferences de Colloque nr. 1: "l'Acoustique dans les Telecommunications", Paris 1975, 351-359.
        [6] Houtgast, T., Steeneken, H.J.M. (1977). Speech intelligibility in rooms; reverberation curve or modulation transfer function? Proc. 9th International Congress on Acoustics, Madrid, July 1977, 92.
        [7] Steeneken, H.J.M., Houtgast, T. (1979). Measuring ISO-intelligibility contours in auditoria. Proc. 3rd Symp. of F.A.S.E. on Building Acoustics, Dubrovnik, September 1979, 85-88. (RASTI)
        [8] Steeneken, H.J.M., Houtgast, T. (1980). A physical method for measuring speech-transmission quality. J. Acoust. Soc. Am. 67, 318-326.
        [9] Rietschote, H.F., Houtgast, T., Steeneken, H.J.M. (1981). Predicting speech intelligibility in rooms from the modulation transfer function. IV. A ray-tracing computer model. Acustica 49, 245-252.
        [10] Steeneken, H.J.M., Houtgast, T. (1983). The temporal envelope spectrum and its significance in room acoustics. Proc. 11th ICA, Paris, Vol. 7, 85-88.
        [11] Houtgast, T., Steeneken, H.J.M. (1984). A multi-language evaluation of the RASTI-method for estimating speech intelligibility in auditoria. Acustica 54, 185-199.
        [12] Houtgast, T., Steeneken, H.J.M. (1985). The modulation transfer function in room acoustics. Brüel & Kjær Technical Review (1985) 3-12.
        [13] Steeneken, H.J.M., Houtgast, T. (1985). A tool for evaluating auditoria. Brüel & Kjær Technical Review (1985) 13-39.
        [14] Steeneken, H.J.M., Houtgast, T. (1999). Mutual dependence of the octave-band weights in predicting speech intelligibility. Speech Communication 28, 109-123.
        [15] Steeneken, H.J.M., Houtgast, T. (2002). Validation of the revised STIr method. Speech Communication 38, 413-425.
        [16] Steeneken, H.J.M., Houtgast, T. (2002). Phoneme-group specific octave-band weights in predicting speech intelligibility. Speech Communication 38, 399-411.
        [17] Wijngaarden, S.J. van, Steeneken, H.J.M., Houtgast, T., Bronkhorst, A.W. (2002). Using the Speech Transmission Index to predict the intelligibility of non-native speech. J. Acoust. Soc. Am. 111, 2366.
        [18] Drullman, R., Wijngaarden, S.J. van (2002). New directions for a speech-based speech transmission index. J. Acoust. Soc. Am. 111(4), 1906-1916.
        [19] Wijngaarden, S.J. van, Verhave, J.A. (2006). Recent advances in STI measuring techniques. Proc. IOA, Vol. 28, Pt. 6.
        [20] Wijngaarden, S.J. van, Drullman, R. (2006). Development of a binaural Speech Transmission Index (STI). J. Acoust. Soc. Am. 119(5), 3442.
        [21] Gils, B.J.M.C. van, Wijngaarden, S.J. van (2005). Objective measurement of speech transmission quality of vocoders by means of the Speech Transmission Index. NATO symposium RTO-MP-HFM-123, "New directions for improving audio effectiveness", Amersfoort, the Netherlands.

    3. Signal-to-noise ratio!
        In general, a reduction of speech intelligibility is related to a reduction of the signal-to-noise ratio. Hence speaking louder, increasing the directivity factor of the listener, or a well-designed public address system might help.

    4. Assessment methods
        Subjective assessment with subjects (speakers and listeners): representative, limited reproducibility, no diagnostics, laborious. Objective assessment based on physical properties (measurements): reproducible, diagnostic, fast. Objective methods allow prediction of system performance: design tool.
        The intelligibility of a speech communication channel can be assessed by subjective and objective test methods. Subjective evaluation: a speaker transmits representative speech material (such as sentences or selected test words) through the channel under test. A listener at the receiving side writes down the test words or, for sentences, scores his/her impression of the intelligibility. For a representative test at least 4 speakers and 4 listeners should be used. Objective assessment: objective methods determine various physical properties of the channel under test and predict a score related to the intelligibility.

    5. Speech and noise
        The graph shows the average speech spectrum and the spectrum of a (white) noise. The signal-to-noise ratio (SNR) can be determined for 7 octave bands. The graph shows a positive SNR for frequency bands up to 2 kHz and a negative SNR for the 4 kHz and 8 kHz bands. Each SNR value is converted to an index (0-1) such that an SNR of -15 dB relates to an index of “0” and an SNR of +15 dB relates to an index of “1”. As each octave band makes a different contribution to intelligibility, a weighted summation is applied to derive the final STI. So far the method is valid for speech combined with a stationary noise.
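The conversion from SNR to index can be sketched in a few lines of Python. This is only an illustration: the function names are made up, the SNR values are invented to resemble the graph described above, and equal band weights are used as a placeholder because the transcript does not give the actual weights.

```python
def snr_to_index(snr_db, lo=-15.0, hi=15.0):
    """Map an SNR in dB linearly onto 0..1, clipping outside [lo, hi]."""
    clipped = max(lo, min(hi, snr_db))
    return (clipped - lo) / (hi - lo)


def weighted_index(snr_per_band, weights):
    """Weighted sum of the per-band indices; weights are assumed to sum to 1."""
    return sum(w * snr_to_index(s) for s, w in zip(snr_per_band, weights))


# Hypothetical SNRs (dB) for the 7 octave bands 125 Hz .. 8 kHz:
# non-negative up to 2 kHz, negative at 4 kHz and 8 kHz.
example_snr = [12.0, 9.0, 6.0, 3.0, 0.0, -6.0, -12.0]
equal_weights = [1.0 / 7.0] * 7  # placeholder weights, not the real ones

example_index = weighted_index(example_snr, equal_weights)
```

With the 30 dB range mapped onto 0-1, every 3 dB of SNR inside the range changes a band index by 0.1.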

    6. Speech envelope function and envelope spectrum
        The envelope function of a speech sample shows how well the speech signal is preserved. Each syllable is represented by a peak in the envelope function. The envelope function is unique for each sequence of words. However, if we determine the frequency spectrum of the envelope function we get a more general description, which is a reproducible measure for longer speech tokens (one minute). In the lower graph (B) the envelope spectrum is given for the envelope function (A) of the 250 Hz octave band. The spectrum is referred to the average band level.

    7. Spatial line frequency equivalent to MTF
        Temporal distortions (echoes, reverberation, and automatic gain control) reduce the intelligibility of speech. For example, reverberation masks fast fluctuations in the speech signal. This is similar to poor spatial resolution in a visual display. The spatial line frequency (number of lines per cm) varies for the vertical lines in the centre of the graph. If a camera is out of focus (or the bandwidth of a TV channel is limited) the lines with the highest spatial frequency merge, and a gray area rather than distinct separate lines is displayed. This degradation as a function of the spatial line frequency is called the Modulation Transfer Function. A similar approach can be used for temporal distortions in room acoustics (see next slide).

    8. Distortion of speech envelope
        Here we see the effect of two types of distortion on the envelope function and envelope spectrum. The upper graph shows the original envelope function and envelope spectrum.
        The speech signal in graph A was masked by reverberation. Fast fluctuations are smeared, slow fluctuations remain (compared with the original). The envelope spectrum shows a decrease for the fast fluctuations, that is, at the higher modulation frequencies. If we subtract the envelope spectrum from the original (differences shown by the vertical lines), we get the MTF for this type of reverberation. The reduction is given by the formula on the slide.
        The speech signal in graph B was masked by noise. The fluctuations in the envelope are not disturbed, but the average band level is increased (see bottom of envelope function); consequently the envelope spectrum (given with respect to the average band level) decreases. The difference is independent of the modulation frequency. The reduction is again given by the formula on the slide.
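The formulae themselves are not reproduced in this transcript. The expressions usually quoted in the STI literature are m(F) = 1/sqrt(1 + (2πF·T/13.8)²) for an ideal exponential decay with reverberation time T, and m = 1/(1 + 10^(-SNR/10)) for stationary noise. The sketch below assumes these are the ones the slide refers to; the function names are illustrative.

```python
import math


def mtf_reverberation(mod_freq_hz, rt60_s):
    """Modulation reduction for an ideal exponential decay with reverberation time rt60_s."""
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * mod_freq_hz * rt60_s / 13.8) ** 2)


def mtf_noise(snr_db):
    """Modulation reduction caused by stationary noise; independent of modulation frequency."""
    return 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))


# Reverberation acts as a low-pass filter on the envelope (T = 2 s is a made-up value):
slow = mtf_reverberation(0.63, 2.0)   # slow fluctuations are largely preserved
fast = mtf_reverberation(12.5, 2.0)   # fast fluctuations are smeared

# Noise lowers all modulation frequencies by the same factor:
m_at_0_db = mtf_noise(0.0)            # equal speech and noise intensity
```

The first function depends on the modulation frequency and the second does not, which matches the two behaviours described for graphs A and B.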

    9. Dynamic measurement of SNR
        An artificial test signal is used to measure the MTF. The reasons include increasing the measurement accuracy, reducing the measuring time, and obtaining diagnostic information. The test signal consists of a noise carrier that is intensity-modulated (100%, m=1) at modulation frequency F. After transmission through a noisy channel the modulation index is reduced by the noise. This reduction is a measure of the SNR. Measuring m as a function of the modulation frequency F yields the MTF.
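A minimal sketch of this measurement principle follows. It works directly on an idealised intensity envelope rather than on an audio waveform, treats the noise as a constant intensity, and estimates m from the Fourier component at F. The function names, sample rate and levels are all assumptions made for the example.

```python
import math


def modulation_index(intensity, mod_freq_hz, sample_rate_hz):
    """Estimate m at mod_freq_hz from an intensity envelope.

    Uses m = 2 * |sum I(t) exp(-j 2 pi F t)| / sum I(t), which is exact when the
    envelope spans a whole number of modulation periods.
    """
    re = im = total = 0.0
    for n, value in enumerate(intensity):
        phase = 2.0 * math.pi * mod_freq_hz * n / sample_rate_hz
        re += value * math.cos(phase)
        im -= value * math.sin(phase)
        total += value
    return 2.0 * math.hypot(re, im) / total


def apparent_snr_db(m):
    """SNR (dB) that would explain a modulation index m (0 < m < 1)."""
    return 10.0 * math.log10(m / (1.0 - m))


# Idealised test: 100 % intensity-modulated signal (m = 1) at F = 2 Hz,
# plus stationary noise of constant intensity. All values are made up.
fs, F, seconds = 200, 2.0, 5
signal = [1.0 + math.cos(2.0 * math.pi * F * n / fs) for n in range(fs * seconds)]
noise_intensity = 1.0                      # same mean intensity as the signal: SNR = 0 dB
received = [s + noise_intensity for s in signal]

m_clean = modulation_index(signal, F, fs)      # close to 1.0
m_noisy = modulation_index(received, F, fs)    # close to 0.5
snr_estimate = apparent_snr_db(m_noisy)        # close to 0 dB
```

Repeating the estimate for each modulation frequency F gives one row of the MTF.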

    10. Matrix for seven MTFs
        For a full STI measurement the MTF has to be determined for 7 octave bands (125 Hz – 8 kHz) and for 14 modulation frequencies (0.63 Hz – 12.5 Hz). Under the orange buttons a sample of a test signal for 1, 3, and 10 Hz is given. If you listen to this signal in a reverberant environment you will notice a decrease of the fluctuations at the higher modulation frequencies.
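The layout of this matrix can be written down as follows. The sketch assumes one-third-octave spacing of the modulation frequencies, which reproduces 14 values between 0.63 Hz and 12.5 Hz; the variable names are illustrative.

```python
# Octave-band centre frequencies (Hz), 125 Hz to 8 kHz.
OCTAVE_BANDS_HZ = [125 * 2 ** k for k in range(7)]

# 14 modulation frequencies at one-third-octave spacing (assumed); the nominal
# values run 0.63, 0.8, 1, 1.25, ... 10, 12.5 Hz.
MODULATION_FREQS_HZ = [10 ** (n / 10) for n in range(-2, 12)]

# Empty 7 x 14 matrix: one row per octave band, one column per modulation frequency.
mtf_matrix = [[None] * len(MODULATION_FREQS_HZ) for _ in OCTAVE_BANDS_HZ]
```

A full measurement therefore fills 7 x 14 = 98 cells with modulation indices.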

    11. Calculation of STI
        The STI is derived from the data given in the matrix of the previous graph.

    12. Calculation scheme
        The first step is to correct the measured modulation indices m_{k,F} for masking by adjacent frequency bands and for the reception threshold. The corrected indices are converted to a corresponding SNR value. This SNR value is limited to the range -15 dB to +15 dB and then converted to a transmission index (TI_{k,F}). For each octave band the mean TI is calculated over the 14 modulation frequencies (0.63 – 12.5 Hz). This results in seven Modulation Transfer Indices (MTI_k). The STI is obtained as a weighted combination (α_k, β_k) of the seven MTIs.
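The scheme can be sketched as below, under several stated assumptions. The masking and threshold corrections of the first step are omitted, so the input is taken to be already corrected. The combination rule STI = Σ α_k·MTI_k − Σ β_k·sqrt(MTI_k·MTI_{k+1}) and the male-speech weights are quoted from memory of IEC 60268-16 and should be checked against the standard. The function names are illustrative.

```python
import math

# Assumed male-speech weights (alpha: 7 bands, beta: 6 adjacent-band pairs).
# Check these against IEC 60268-16 before relying on them.
ALPHA = [0.085, 0.127, 0.230, 0.233, 0.309, 0.224, 0.173]
BETA = [0.085, 0.078, 0.065, 0.011, 0.047, 0.095]


def transmission_index(m):
    """Convert one modulation index to a transmission index (0..1)."""
    m = min(max(m, 1e-6), 1.0 - 1e-6)          # keep the logarithm finite
    snr_db = 10.0 * math.log10(m / (1.0 - m))  # apparent SNR
    snr_db = max(-15.0, min(15.0, snr_db))     # limit to -15 .. +15 dB
    return (snr_db + 15.0) / 30.0


def sti_from_matrix(m_matrix, alpha=ALPHA, beta=BETA):
    """STI from a 7 x 14 matrix of (already corrected) modulation indices."""
    mti = [sum(transmission_index(m) for m in row) / len(row) for row in m_matrix]
    weighted = sum(a * x for a, x in zip(alpha, mti))
    redundancy = sum(b * math.sqrt(mti[k] * mti[k + 1]) for k, b in enumerate(beta))
    return weighted - redundancy


half = sti_from_matrix([[0.5] * 14 for _ in range(7)])    # every cell at 0 dB apparent SNR
ideal = sti_from_matrix([[1.0] * 14 for _ in range(7)])   # undistorted channel
```

With these weights the α values sum to 1.381 and the β values to 0.381, so a channel with all MTI_k equal to 1 yields an STI of 1.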

    13. Frequency weightings (various studies)
        The octave weighting function (α_k) depends on the type of speech that has to be predicted. For vowels the mid-frequency range (500-2000 Hz) is important, while for consonants the higher frequencies (2000-4000 Hz) make a larger contribution to intelligibility. This additive model is used for AI and SII and was improved (1992) for the present STI algorithm.

    14. Redundancy with image detection
        Fifty percent of the car presented in this graph is masked. Nevertheless, recognition of the type of car (Mercedes) is easy. Obviously gaps in the presentation do not make recognition impossible. This may be due to the continuity of the shape of the car; hence some redundancy is available to improve recognition in a masking condition. With speech a similar effect exists for the low and higher frequency parts of the speech spectrum. A gap in the frequency transfer does not always reduce the intelligibility. Therefore a revised frequency weighting function was introduced in 1992.

    15. Frequency weightings (CVC words)
        The frequency weighting function (α, solid line) is given together with a redundancy correction (β, dotted line). The functions are different for male and female speech. Notice that the frequency range is different for female speech, as no energy is observed in the 125 Hz octave band.

    16. Relation noise and band-pass limiting
        This graph gives the relation between the predicted STIr (the subscript “r” denotes the redundancy correction, which is the standardized procedure) and the CVC-word score. For this assessment 18 different communication channels (combinations of band-pass limiting and four types of noise) were investigated. The small variance around the best-fitting curve (s.d. = 4.4% around the third-order polynomial) shows the predictive power of the STI for this type of distortion. A similar relation is obtained for female speech.

    17. Listening test with four subjects
        The listening tests for the assessment were performed with 4 male and 4 female talkers and two listening panels of 4 listeners. Hence for each gender 4 x 8 = 32 talker-listener pairs were obtained. The picture shows a listening panel in action. The listeners type the test words that they have heard. The responses are processed automatically.

    18. The test words consist of a consonant-vowel-consonant (CVC) combination. Lists of 51 words are used. Each list is based on an equally balanced selection of 17 initial consonants, 15 vowels, and 11 final consonants. The test words are embedded in a carrier phrase. Carrier phrases are used to get the listener’s attention, to control the vocal effort of the talker, and (if required) to induce temporal distortion during the presentation of the test word (echo, reverberation). A few examples are given.

    19. Relation AGC and echoes
        Example of the relation between STI and CVC-word score for temporal distortion (echoes, automatic gain control).

    20. Qualification of STI (Acustica, 1984)
        The STI was assessed for various languages in an international round-robin test (1984, 8 participants). A qualification scale was also derived from these experiments, and five quality intervals were determined. These are now included in various international standards (ISO 9921, IEC 60268-16).
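The transcript does not list the interval boundaries. The sketch below assumes the commonly quoted scale (bad below 0.30, poor 0.30-0.45, fair 0.45-0.60, good 0.60-0.75, excellent from 0.75); verify the limits against the standards before use. The function name is illustrative.

```python
def sti_qualification(sti):
    """Return the quality label for an STI value between 0 and 1."""
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI must lie between 0 and 1")
    # Upper bounds of the first four intervals (assumed, see the note above).
    for upper, label in ((0.30, "bad"), (0.45, "poor"), (0.60, "fair"), (0.75, "good")):
        if sti < upper:
            return label
    return "excellent"
```

For example, `sti_qualification(0.5)` falls in the third interval and returns `"fair"`.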

    21. Performance cabin public address
        Example of the performance of a public address system in a wide-body aircraft.

    22. Iso-STI contours
        Iso-STI contours for an auditorium with no background noise. Notice the loudspeaker next to the speaker. Notice also the three marks A, B, and C, which are discussed in the next slide.

    23. Effective gain of a PA-system
        The STI calculation scheme allows artificial noise to be introduced (by correcting the measured m-value with respect to the measured test signal level). Hence, for a condition measured without background noise, the effect of noise (with any spectral shape and level, at octave resolution) can be accounted for. In this graph the STI is given for three positions (A, B, C), noise levels up to 70 dB, and no public address (solid line). The conditions with the public address system switched on show a higher performance in noise, but also (at position C) a decrease of the STI at low noise levels. Evidently the PA system introduces additional reverberation or an echo at this position.
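One way to sketch this correction is to scale each noise-free modulation index by the stationary-noise factor 1/(1 + 10^(-SNR/10)), with the SNR taken per octave band from the measured test signal level and the noise level to be simulated. That this is exactly the correction used for the graph is an assumption; the function name and all numbers below are hypothetical.

```python
def add_noise_to_m(m_measured, signal_level_db, noise_level_db):
    """Scale a noise-free modulation index for an assumed stationary noise level.

    Both levels refer to the same octave band, at the listener position.
    """
    snr_db = signal_level_db - noise_level_db
    return m_measured / (1.0 + 10.0 ** (-snr_db / 10.0))


# Hypothetical band: m = 0.8 measured in quiet, test signal at 65 dB.
m_quiet = 0.8
m_at_55 = add_noise_to_m(m_quiet, 65.0, 55.0)   # noise 10 dB below the signal
m_at_65 = add_noise_to_m(m_quiet, 65.0, 65.0)   # noise equal to the signal
```

Applying the function to every cell of the matrix, with the appropriate band levels, lets one measurement in quiet be reused for any number of noise scenarios.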

    24. Field measurement
        Example of an STI measurement with speech as test signal. The envelope spectrum is determined for the original speech signal (close to the talker) and after transmission. The modulation transfer is obtained from the difference between the two spectra. A comparison with a direct MTF measurement is also given.

    25. Full STI (STI-14) and limited modulation frequency range (STI-3)
        The full STI requires 7 octave bands, 14 modulation frequencies, and a specific (speech-like) modulation for octave bands not under test. With this full STI measurement a wide range of distortions can be assessed accurately (band-pass limiting, noise, non-linear distortion, and temporal distortion). STI-3 offers a limited resolution in the time domain (temporal), but the measuring time is reduced.

    26. RASTI (Room Acoustics STI, 1979)
        In the late 1970s a specific version (based on the simple 6502 microprocessor) was developed for room acoustics (see ref 7). We called it RASTI (Room Acoustical STI). The reduction to two octave bands was valid for person-to-person communication. Hence PA systems and specific (non-contiguous) noise spectra are not covered by the application of RASTI. The present state of the art has made it possible to extend the power of a fast screening system, such as STI-PA (2001).

    27. STI-PA (STI Public Address, 2001). STI-PA has a higher resolution in the frequency domain and covers the full range of the MTF (in six frequency bands). Therefore it is more suitable for room acoustical assessments. However, for non-linear distortion the full STI (STI-14) is advised.

    28. Future. Topics: binaural STI; improvement using speech as test signal; non-native talkers and listeners. Recent developments include binaural hearing, more advanced measurements with speech as the test signal, the correction for the effect of non-native talkers and listeners, and the use of STI for vocoders.

    29. Future: Binaural STI (I). Use an artificial head and perform simultaneous measurements on both ears; select the highest performance (best-ear selection); use a cross-correlation approach for 500, 1000, and 2000 Hz. Classically, binaural effects were not included in the STI. A rule of thumb (3 dB corresponds to 0.1 STI) was used for the improvement from directional hearing. A new method has been proposed: use the best-ear response of a dummy head and apply a cross-correlogram model (see publication 20, sheet 2).

    30. Future: Binaural STI (II). CVC-word scores (7 subjects) as a function of the binaural STI as well as the monaural STI. Test conditions were selected to be difficult for the standard STI; they include anechoic conditions (1-14), a cathedral environment (15-21), a classroom (22-32), and a listening room (33-39). The standard "monaural" CVC vs. STI reference curve is also given.

    31. Future: improvement of speech as test signal. The use of speech as a test signal was developed in the past (see reference 10). Recent developments in signal processing technology make it possible to extend the scope of using speech. The graph represents the relation between STI and CVC scores for various vocoders. A good relation is obtained; however, a correction of 0.3 STI should be subtracted (see reference 21).

    32. Future: non-native talkers and listeners. The solid line represents the relation between speech intelligibility and STI for native listeners. The dotted line represents a similar relation for non-native listeners. The arrows indicate the conversion procedure from an STI value for a native listener to an STI value related to the same intelligibility but for a non-native listener. (As a rule of thumb, a non-native listener requires a 4 dB better SNR in order to perceive the same intelligibility.)
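
As a rough illustration of the 4 dB rule of thumb, the sketch below converts an SNR offset into an STI offset. It assumes only that the STI scales a 30 dB range of apparent signal-to-noise ratio (-15 to +15 dB) linearly onto 0-1. It does not reproduce the curves or the conversion procedure shown on the slide, and the function names are hypothetical.

```python
STI_SNR_RANGE_DB = 30.0  # apparent SNR from -15 dB to +15 dB maps onto STI 0..1


def snr_offset_to_sti(offset_db):
    """STI change corresponding to an SNR change, on the linear part of the scale."""
    return offset_db / STI_SNR_RANGE_DB


def required_sti_for_non_native(native_sti, extra_snr_db=4.0):
    """STI a non-native listener would need for the same intelligibility.

    Assumption: the 4 dB rule of thumb is applied as a constant shift and
    the result is capped at 1.0.
    """
    return min(1.0, native_sti + snr_offset_to_sti(extra_snr_db))


# Assumed example: a condition rated STI 0.50 for native listeners.
non_native_target = required_sti_for_non_native(0.50)
```

On this linear scale a 4 dB offset is about 0.13 STI, so a condition at 0.50 for native listeners would need roughly 0.63 for non-native listeners under the stated assumption.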

    33. Conclusions. STI predicts speech intelligibility for many types of distortion: noise, band-pass limiting, non-linearities, temporal distortion, vocoding, non-native speech, and binaural hearing. Improvements (in measuring methods or scope) of the STI will not change the qualification ranges. STI is an internationally standardized method (ISO 9921, IEC 60268-16) and is used in many national standards. Review of standards is an ongoing process. In the near future the IEC standard will be reviewed and new developments added.

    34. Questions?
