Some Aspects of Wideband Speech in Enterprise Telephony
Eric J. Diethorn (ejd@avaya.com) with Gary W. Elko (gwe@avaya.com) and Joseph L. Hall (jhall01@avaya.com)
Avaya, Inc., Avaya Labs, Research
233 Mt. Airy Road, Basking Ridge, New Jersey 07920 USA
2nd Workshop on Wideband Speech Quality - June 2005

Outline
• Physical acoustics
• Echo
• Voice coders
• Conferencing
• Wideband speech & intelligibility
• Hallway demonstration: Avaya SIP Softphone

Some introductory thoughts
• Wideband speech telephony will instantaneously raise the bar of end-user expectation, at least for some applications.
  • Skype
• We have standards for the reproduction of wideband speech, but is wider-band good enough?
  • Maybe [150, 5000] Hz is good enough?
• With greater bandwidth comes a greater range of potential artifacts that the acoustical-signal-processing engineer must address.
  • Low-frequency acoustic echo, earpiece hiss, speech-coder distortion, arbitration of multiple sampling rates.
• The preferences of end users are uncertain.
  • Speech-bandwidth policies (buddy lists, profiles)?
  • Suppose I have a physiological speech impediment. Do I want it emphasized?

Physical acoustics
• The physical design of terminal acoustics must change to render wideband speech.
• Acoustical signal processing changes, too.

Loudspeakers & enclosures
[Figure: Frequency response, traditional narrowband speakerphone, 80 dB-SPL at 50 cm. Axes: sound pressure level (dB) vs. frequency (Hz).]

Loudspeakers & enclosures
[Figure: Total harmonic distortion, traditional narrowband speakerphone, 80 dB-SPL at 50 cm. Axes: THD at harmonics (%) vs. frequency (Hz).]
• High distortion at the low-frequency end of the wideband-speech spectrum.
• Acoustic echo control is difficult, if not impossible, without acoustical modifications.
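
For readers unfamiliar with the quantity on the vertical axis, the usual definition of total harmonic distortion can be sketched as follows. This is the textbook amplitude-ratio definition, not necessarily the exact measurement procedure behind the slide; the function name and the example amplitudes are hypothetical.

```python
import math

def thd_percent(fundamental_amplitude, harmonic_amplitudes):
    """Total harmonic distortion, in percent.

    Textbook definition (an assumption about what the slide plots):
    RMS sum of the harmonic amplitudes divided by the amplitude of
    the fundamental.
    """
    harmonic_rms_sum = math.sqrt(sum(h * h for h in harmonic_amplitudes))
    return 100.0 * harmonic_rms_sum / fundamental_amplitude

# Hypothetical example: 2nd and 3rd harmonics at 3% and 4% of the fundamental.
example_thd = thd_percent(1.0, [0.03, 0.04])
```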

Earpieces
[Figure: Frequency response, wideband handset. Axes: sound pressure level (dB) vs. frequency (Hz).]
• In order to satisfy wideband standards, acoustical modifications are necessary to extend the low-frequency response of most earpiece designs.
• This is particularly challenging for physical arrangements in which the earpiece is held to the ear with little pressure.

Microphones
• Most low-cost electret microphones used today have a frequency response that is practically flat beyond the range of wideband speech; they are “wideband ready.”
• Multiple-microphone arrangements (arrays) can be exploited to reduce the level of ambient noise at frequencies not present in traditional narrowband telephony.
  • Low-frequency rumble.
  • High-frequency hiss.
• Short-time spectral modification methods of noise reduction can help, but artifacts from such processing are more perceptible in the wider speech band.

Microphones
• Omnidirectional microphone (traditional)
  • Good pick-up of talkers in all directions.
  • But picks up ambient noise from all directions.
• Directional microphone
  • Reduces off-axis noise.
  • Reduces reverberation of talker’s voice.
  • Reduces coupling from speakerphone (helping AEC).
  • But talkers off axis can’t be heard well.
[Figure labels: “Front of phone” (two diagrams)]
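
The trade-off between the two microphone types can be illustrated with the textbook first-order pattern r(θ) = a + (1 - a) cos θ, where a = 1 gives an omnidirectional response and a = 0.5 a cardioid. This is a generic idealized model chosen for illustration; the slides do not state which directional pattern the phone uses, and the function names below are hypothetical.

```python
import math

def first_order_gain(theta_deg, a):
    """Linear gain of an ideal first-order microphone at angle theta.

    theta_deg = 0 is the front of the phone (on axis).
    a = 1.0 -> omnidirectional, a = 0.5 -> cardioid (assumed model).
    """
    theta = math.radians(theta_deg)
    return abs(a + (1.0 - a) * math.cos(theta))

def gain_db(gain, floor_db=-60.0):
    """Convert a linear gain to dB, clamped at floor_db for nulls."""
    if gain <= 0.0:
        return floor_db
    return max(20.0 * math.log10(gain), floor_db)

# Compare pick-up at the front, side, and rear for both patterns.
for angle in (0, 90, 180):
    omni = gain_db(first_order_gain(angle, 1.0))
    cardioid = gain_db(first_order_gain(angle, 0.5))
    print(f"{angle:3d} deg  omni {omni:6.1f} dB  cardioid {cardioid:6.1f} dB")
```

In this model the cardioid attenuates a source at the side by about 6 dB and nulls a source directly behind, which is the behaviour behind both the noise-reduction benefit and the off-axis talker problem listed above.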

Echo
• Requirements on echo control may change.
• The art of echo control must evolve to meet the challenge of wideband speech.

Requirements on Talker Echo
• Roundtrip, mouth-to-ear echo-loss requirements were measured on populations for narrowband speech. How well do these data apply to wideband speech echo paths?
[Figure: Echo annoyance as a function of roundtrip, mouth-to-ear loss and delay, for narrowband speech. Axes: percent good-or-better vs. acoustic-to-acoustic echo-path loss (dB).]
Source: Transmission Systems for Communications, Bell Telephone Laboratories, Inc., 5th Edition, 1982.

Talker Echo, Continued
• Being strictly digital, wideband-speech network paths do not suffer from analog circuit noises. However, analog and environmental noises still enter calls at the endpoint. Should requirements on talker echo incorporate such (wideband) noise phenomena?
[Figure: Echo annoyance as a function of roundtrip, mouth-to-ear echo-and-noise loss. Long-haul (~1000 mi.) PSTN connection, circa 1980.]
Source: Transmission Systems for Communications, Bell Telephone Laboratories, Inc., 5th Edition, 1982.

Wideband speech coding
• G.722, G.722.1 and G.722.2
  • G.722 is cheap.
  • G.722.1 often comes with video-on-the-enterprise (Polycom).
• Proprietary codecs
  • Silicon solution providers have their favorites. Some are pretty good.
• Linear 16-bit encoding?
  • Speech-transmission bandwidth (bits per second) is becoming a non-issue in the enterprise, at least for wired LANs.
  • Architecturally appealing within the enterprise. Let boundary gateways worry about transcoding.
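
The cost of linear 16-bit encoding is easy to work out. The sketch below assumes 16 kHz sampling (as in the hallway demonstration), 20 ms packets, and 40 bytes of RTP/UDP/IPv4 headers per packet; the packetization interval and the 100 Mbit/s LAN figure are illustrative assumptions, not values from the talk.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Payload bit rate of uncompressed linear PCM, in bit/s."""
    return sample_rate_hz * bits_per_sample * channels

def header_bit_rate(packet_ms=20, header_bytes=40):
    """Bit rate spent on per-packet headers, in bit/s.

    40 bytes = 12 (RTP) + 8 (UDP) + 20 (IPv4); 20 ms packets are an
    assumed, commonly used packetization interval.
    """
    packets_per_second = 1000 / packet_ms
    return header_bytes * 8 * packets_per_second

linear_wideband = pcm_bit_rate(16000, 16)          # 256 kbit/s payload
total = linear_wideband + header_bit_rate()        # payload plus headers
lan_share = total / 100e6                          # share of an assumed 100 Mbit/s LAN

print(f"linear wideband payload: {linear_wideband / 1000:.0f} kbit/s")
print(f"with headers:            {total / 1000:.0f} kbit/s")
print(f"share of 100 Mbit/s LAN: {100 * lan_share:.3f} %")
```

For comparison, G.722 carries wideband speech at up to 64 kbit/s, so linear encoding costs roughly four times the payload rate but remains a small fraction of a wired LAN link.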

Multirate audio conferencing
• Rate arbitration
• Transcoding
• Multirate mixing
• (Artificial) bandwidth extension
[Figure labels: Conference bridge server; wide- and narrow-band speech; IP-1; PSTN narrowband speech; leased WAN (compressed speech, e.g., G.729, G.726); IP-2]
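
Multirate mixing can be sketched as: bring every decoded stream to a common sampling rate, sum, and limit. The example below is a minimal illustration under stated assumptions, not the bridge's actual design: it upsamples an 8 kHz stream to 16 kHz by linear interpolation (a crude stand-in for a proper anti-imaging filter) and hard-limits the sum to the 16-bit range. All function names are hypothetical.

```python
def upsample_by_2(samples):
    """Double the sampling rate by linear interpolation (illustration only).

    A real bridge would use a properly designed interpolation filter.
    """
    out = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.append(s)              # original sample
        out.append((s + nxt) / 2)  # interpolated midpoint
    return out

def mix(streams, limit=32767):
    """Sum equal-rate streams sample by sample and clip to 16-bit range."""
    n = min(len(s) for s in streams)
    mixed = []
    for i in range(n):
        total = sum(s[i] for s in streams)
        mixed.append(max(-limit - 1, min(limit, round(total))))
    return mixed

# Hypothetical example: one narrowband (8 kHz) and one wideband (16 kHz) party.
narrowband = [0, 1000, 2000]
wideband = [100, 100, 100, 100, 100, 100]
conference = mix([upsample_by_2(narrowband), wideband])
```

Note that upsampling alone does not restore the missing 4 to 8 kHz content of the narrowband party; that is the role of the (artificial) bandwidth extension listed above.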

Stereo audio conferencing
Hands-free, wideband-speech communications with stereo echo cancellation
[Block-diagram labels: talker; ROOM 1; ROOM 2; g1, g2; h1, h2; ~h1, ~h2; NL, NL; echo; +, -]
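
The adaptive filters that estimate the echo paths in such a system are commonly driven by a normalized LMS (NLMS) update. The sketch below shows a single-channel NLMS canceller only; it is an assumption that the system on the slide uses NLMS, and a stereo canceller additionally needs one filter per loudspeaker channel plus some means of decorrelating the two channels. Function names, the tap count, and the step size are hypothetical.

```python
import random

def simulate_echo(far_end, echo_path):
    """Convolve the far-end signal with a (toy) echo-path impulse response."""
    out = []
    for n in range(len(far_end)):
        out.append(sum(h * far_end[n - k]
                       for k, h in enumerate(echo_path) if n - k >= 0))
    return out

def nlms_echo_canceller(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Single-channel NLMS echo canceller.

    Returns the final filter weights and the residual (echo-cancelled)
    signal. mu is the step size; eps avoids division by zero.
    """
    w = [0.0] * taps        # current estimate of the echo path
    x_buf = [0.0] * taps    # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        x_buf = [x] + x_buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x_buf))   # estimated echo
        e = d - y                                       # residual echo
        norm = sum(xi * xi for xi in x_buf) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x_buf)]
        residual.append(e)
    return w, residual

# Toy experiment: white-noise far-end signal, 3-tap echo path, no near-end talker.
random.seed(0)
far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
path = [0.5, -0.3, 0.1]
weights, res = nlms_echo_canceller(far, simulate_echo(far, path))
```

In this noise-free toy case the weights converge to the echo path. Real rooms have impulse responses thousands of taps long at 16 kHz sampling, which is part of why wideband echo control is harder.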

Stereo Conferencing
(Placeholder, video demonstration)

Wideband speech & intelligibility
• Siemens: “…wideband transmissions can reduce speech ambiguities by as much as 90 percent, increasing conversational intelligibility and reducing listener fatigue.” (2003 press release)
• Polycom: “For single syllables, 3.3 kHz bandwidth yields an accuracy of only 75 percent, as opposed to over 95 percent with 7 kHz bandwidth.” (2003 white paper)
• Marketing vs. science: both required

Experimental study*
• Similar to the Diagnostic Rhyme Test and Diagnostic Alliteration Test, except we generated our own word pairs.
  • E.g., “tie” & “pie” (“hot” & “hop”).
• Subject hears one of the two, is shown both, and is asked, “Which of these two did you hear?”
• Clean anechoic speech filtered to three bandwidths:
  • [50, 3300], [50, 5000] and [50, 7000] Hz.
• Investigate all nine combinations of three bandwidths and three additive-noise levels (0 dB, +12 dB, +24 dB SNR).
• Reference: G.A. Miller and P.E. Nicely, “An analysis of perceptual confusions among some English consonants,” Lincoln Laboratory, MIT, 1955 (J. Acoust. Soc. Amer., Vol. 27, pp. 338-352).
* For questions concerning aspects of this study, contact Joseph L. Hall, Avaya Research, jhall01@avaya.com
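
The nine test conditions, and the scaling needed to add noise at a target SNR, can be sketched as follows. The slides do not say how SNR was measured (for example, over which band or with what level measure); the broadband RMS definition used here is an assumption, and the function names are hypothetical.

```python
import itertools
import math

BANDWIDTHS_HZ = [(50, 3300), (50, 5000), (50, 7000)]
SNRS_DB = [0, 12, 24]

def conditions():
    """All combinations of bandwidth and additive-noise level."""
    return list(itertools.product(BANDWIDTHS_HZ, SNRS_DB))

def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))

def noise_gain_for_snr(speech, noise, snr_db):
    """Gain to apply to `noise` so that speech RMS over noise RMS equals snr_db.

    Assumes SNR is defined on broadband RMS levels.
    """
    return rms(speech) / (rms(noise) * 10 ** (snr_db / 20))

def add_noise(speech, noise, snr_db):
    """Return speech plus noise scaled to the requested SNR."""
    g = noise_gain_for_snr(speech, noise, snr_db)
    return [s + g * n for s, n in zip(speech, noise)]

for (low_hz, high_hz), snr_db in conditions():
    print(f"[{low_hz}, {high_hz}] Hz at {snr_db:+d} dB SNR")
```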

What do they sound like?
“Seed, feed, seed” at different bandwidths and additive noise levels.

Representative results

Summary of results

Hallway Demonstration: Avaya wideband SIP softphone
• Wideband speech (16 kHz sampling, bandwidth limited by PC sound architecture).
• Voice codecs
  • G.711, G.729, G.726
  • G.722