This workshop discusses the history, current work, and future aspects of speech quality measurements in ITU-T. It focuses on listening quality, wideband speech quality, acoustical interfaces, and audio-visual aspects.
Future work on objective speech quality measurements in ITU-T
Jens Berger
SwissQual AG, Switzerland
jens.berger@swissqual.com
Workshop on Wideband Speech Quality in Terminals and Networks: Assessment and Prediction
8th and 9th June 2004, Mainz, Germany
Structure
• History in ITU-T / CCITT
• Current Work – Listening Quality
• New Aspects in the Coming Study Period of ITU-T
History in ITU-T / CCITT
Before 1990:
• ‘Classical’ measurements of transmission / terminal characteristics
• Early spectral- / cepstral-based measures
1996: P.861 ‘PSQM’
• First psychoacoustically based measure for predicting listening quality
• Only the ‘core model’ (no gain or time alignment)
2001: P.862 ‘PESQ’
• Replaced P.861
• Complete method including time and gain alignment
2004: P.563
• First single-ended model for listening quality (LQ) assessment
Still open:
• Listening quality for wideband speech
• Listening quality at acoustical interfaces
Current Focus – Listening Quality
Main progress within the last few years:
Acceptance of psychoacoustically based measures that model the results of auditory tests on subjective scales
but
• Restricted to listening quality
• Only applicable to narrow-band voice telephony services
• Serving only ‘electrical’ interfaces
Work program starting 2005
Ongoing activities of former Question 9/12:
• Wide-band speech (listening) quality
• Listening quality at acoustical interfaces
and extensions to
• Audio and noise signals over telephone channels
• Audio-visual aspects
• Talking quality and its relation to conversational quality
but
• Restriction to perceptually based models only
Work program starting 2005
Principle concept: Objective measures model specific subjective (auditory) test scenarios and predict their results on a similar scale.
Consequence: A subjective test scenario has to be defined and established before the corresponding objective model can be developed.
Experience: The better the auditory tests are defined and the stricter their requirements are, the more accurate the predicted results will be.
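As a minimal sketch of the principle above: an objective model is usually judged by how closely its output, mapped onto the subjective scale, follows the mean opinion scores (MOS) from the auditory test it models. The example below fits a first-order mapping and reports correlation and RMSE. The function names and all data values are hypothetical and chosen only for illustration; they are not taken from any ITU-T Recommendation or test database, and the first-order mapping is an assumption, not a prescribed procedure.

```python
from math import sqrt


def fit_linear(objective, subjective):
    """Least-squares fit: subjective ~ slope * objective + offset."""
    n = len(objective)
    mean_o = sum(objective) / n
    mean_s = sum(subjective) / n
    var_o = sum((o - mean_o) ** 2 for o in objective)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(objective, subjective))
    slope = cov / var_o
    return slope, mean_s - slope * mean_o


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical per-condition values (illustration only, not measured data):
# raw model outputs and the MOS obtained in the corresponding auditory test.
raw_scores = [1.2, 2.0, 2.9, 3.5, 4.1]
mos = [1.5, 2.2, 3.0, 3.8, 4.3]

# Map the raw scores onto the subjective scale, then compare.
slope, offset = fit_linear(raw_scores, mos)
predicted = [slope * r + offset for r in raw_scores]

print("correlation:", round(pearson(predicted, mos), 3))
print("RMSE after mapping:", round(rmse(predicted, mos), 3))
```

The comparison is only meaningful against the auditory test scenario the model was built for, which is why the subjective test has to be defined first.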
Wide-band speech scenarios
Current status:
• Proposal from BT and KPN to extend P.862 (COM-12/D180, Feb. 2001)
• Several studies from NTT in 2004 analyze the pros and cons of this proposal
• Awaiting advanced solutions for the initial meeting in 2005
Open points:
• Corresponding auditory tests (pure wide-band, or mixed with narrow-band? which ratio? which terminal(s)?)
• What about ‘half-wideband’ (e.g. 200-5000 Hz)?
• Wider influence on terminal characteristics. Is wide-band speech analysis useful without acoustical interfaces?
Acoustical interfaces
Current status:
• Work item P.AAM is currently stopped because of disagreement on the test scenarios to be served by the approach
• Awaiting precise definitions of the scope and the expected applications of the model
Open points:
• Corresponding auditory tests for noise at the listener side
• Handling of handsfree in auditory tests (pure handsfree or mixed with handset)
• Handling of different terminals (mix of monotic and diotic, influence of loss of acoustical coupling to the real environment?)
Talking Quality
Current status:
• Proposal from KPN for a perceptually based model (COM-12/10, Nov. 2000; COM-12/D089, Jan. 2003)
Open points:
• The subjective test procedure is neither defined nor established
• Current proposal does not consider real terminals
• Extension to the acoustical interface is needed
Audio- and noise-signals
Current status:
• New action points
First steps: Music over telephone channels
• Applicability of ITU-R Recommendations?
• Are modifications to speech quality approaches sufficient?
First steps: Noise handling / noise reduction
• Starting point: P.835 describes the subjective test
• Impact on clean speech could be handled by P.862 / P.AAM
• Improvement / degradation of noisy speech could be handled by framing of listening quality models
• Quality of residual noise is an open point; an ‘annoyance model’ is needed
Audio-visual quality assessment
Current status:
• New action point
• Audio-visual quality aspects in telecommunication scenarios (lower bitrate, small image sizes) are interesting for SG12
• Initiating contribution (COM-12/D180, Feb. 2001)
• Cooperation with SG9 and VQEG
• Joint Rapporteur’s Group on Multi-Media Quality Assessment is already established
• Drafting of test plans
Structuring of work
To be discussed in ITU-T:
• Is a Recommendation / Appendix for wideband speech assessment at electrical interfaces only a migration step towards a more complete solution?
• How should the Recommendation(s) for measurements at the acoustical interfaces be structured?
  • Inclusion of wide-band?
  • Separation of handsfree (separate model or only a ‘switch’)?
  • Handling of binaural signals compared to the classical handset?
  • Two / four separate models corresponding to the combinations of interfaces?
    • electrical – electrical
    • acoustical – electrical
    • electrical – acoustical
    • acoustical – acoustical