The Utilization of Subjective Evaluation in the Development of Vocoders
Evaluation Basics • Purpose • Research • Vocoder Development • Vocoder Characterization • Selection • Validation • Types of Conditions of Interest • Baseline • Acoustic Background Noise • Transmission Channel Impairments • Talker Variability • Signal Levels • System Tandems • Digital Circuit Multiplication Systems The Utilization of Subjective Evaluation in the Development of Vocoders 11/2003
Subjective Testing - Control of Variables • Laboratory Factors • Listening Environment; Audio & Electronics • Source/Processed Recording Factors • Speech Material Factors • Linguistic and Phonetic • Talker Factors • Transducer Selection • Audio and Sampled Bandwidth Factors • Acoustic Noise Material and Speech + Noise Method • Listener Factors • Presentation Factors • Blocking, Order and Balance • Audio Level and Sidetone
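The "Blocking, Order and Balance" item above can be illustrated with a small sketch. It assumes a cyclic Latin square is an acceptable way to balance presentation position. The function names and the listener and condition labels are hypothetical, and a real test plan may call for a different design, for example one that also balances carryover between adjacent conditions.

```python
def cyclic_latin_square(conditions):
    """Build n presentation orders from n conditions.

    Each condition appears exactly once in every order and exactly once
    in every serial position across the n orders.
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(listeners, conditions):
    """Give each listener one row of the square, cycling if listeners outnumber rows."""
    square = cyclic_latin_square(conditions)
    return {name: square[i % len(square)] for i, name in enumerate(listeners)}


# Hypothetical condition and listener labels, for illustration only.
conditions = ["reference", "vocoder_A", "vocoder_B", "vocoder_A_in_noise"]
orders = assign_orders(["L1", "L2", "L3", "L4"], conditions)
for listener, order in orders.items():
    print(listener, order)
```

A cyclic square balances serial position only. Each condition is always followed by the same neighbour, so order effects between adjacent conditions are not controlled by this sketch.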
Associated Issues • User Population and Face Validity • Context • Range of Candidate Systems • Reference and Calibration Systems • Listen Only vs. Two-Way Methods • Delay • Asymmetric Transmission Channels • VoIP • Speech Material • Speech Sample Length Relative to Impairment Distribution • Uniqueness, Amount Available • Type
Associated Issues (cont.) • Speech material (by increasing contextual content) • Types • Scripted • Sounds • Words • rhyming, CVC, etc. • Sentences • meaningful, nonsense, semantically anomalous, etc. • Connected sentences • Scripts • Scenario based • Representative of application? • Informational or Familiar • Information flow (balanced? directional?) • Task Based • Open
Performance Characteristics & Test Methodology • Quality • Diagnostic Acceptability Measure - DAM • Voiers ICASSP77 • Category Rating Tests - ACR (MOS); DCR (DMOS); CCR (CMOS) • ITU-T P.800; P.830 • ITU Handbook on Telephonometry • IEEE Recommended Practices for Speech Quality Measures 1969 • Paired Comparison A/B Tests • David, H.A., “The Method of Paired Comparisons,” Oxford • Multi Stimulus Test with Hidden Reference and Anchor - MUSHRA • ITU-R BS.1534-1 • Speech Communication Systems with Noise Suppression Algorithms • ITU-T P.NSA
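For the ACR method listed above, a mean opinion score is the average of listener ratings on a five-point scale. The following is a minimal sketch, assuming integer ratings from 1 to 5 and a normal-approximation 95% interval. The function name and the sample ratings are made up for illustration and do not come from any study cited here.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    Assumes ACR ratings on the integer scale 1 (bad) to 5 (excellent).
    The interval uses a normal approximation, which is an assumption of
    this sketch and is weakest for small listener panels.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to estimate spread")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ACR ratings must be integers from 1 to 5")
    mean = float(statistics.mean(ratings))
    half_width = z * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, half_width


# Illustrative ratings for one test condition (invented values).
ratings = [4, 3, 4, 5, 3, 4, 4, 2]
mos, half = mos_with_ci(ratings)
print(f"MOS = {mos:.2f} +/- {half:.2f}")
```

Reporting the interval alongside the mean makes it easier to judge whether two candidate systems are actually separated by the test.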
Performance Characteristics & Test Methodology • Speaker Recognizability • NRL Speaker Recognition Test (speakers unknown) • Schmidt-Nielsen SCW95, ICASSP96, JASA 1985 • TNO Speaker Recognition Test (speakers known) • Steeneken & Leeuwen 1997 • Language Dependency • SRT-LD • Wijngaarden SCW02, EuroSpeech01, Ph.D. Dissertation 2003 • Conservation of Stress State Characteristics
Performance Characteristics & Test Methodology • Communicability • Conversation Opinion Tests • ITU-T P.800 • Conversational & Third Party Listen Only Tests • ITU-T P.832, P.581 (HATS) • Continuous Quality Evaluation Method - ECQ • ITU-T P.PAC • Arcon Communicability Exercise - ACE • Tardelli ICASSP96, NAS-NRC CHABA Symposium 1995 • TNO Communicability Test • Wijngaarden EuroSpeech01
Performance Characteristics & Test Methodology • Intelligibility • Modified Rhyme Test - MRT • ANSI S3.2-1989; House 1965; Kreul 1968 • Diagnostic Rhyme Test - DRT • ANSI S3.2-1989; Voiers 1973, 1987 • Consonant-Vowel-Consonant Test - CVC (AI Basis) • Fletcher ATT 1920s, JASA 1950; Allen 1994, ICASSP02; Steeneken 1992 • Speech Reception Threshold - SRT • Plomp & Mimpen 1979; Wijngaarden & Steeneken EuroSpeech99 • International Civil Aviation Org. Spelling Alphabet - ICAO • Moser & Dreher 1955; Schmidt-Nielsen NRL R9035 1987, R9174 1988 • INTELTRANS - (CVC, HATS) • CELAR France MOD; J.C. Lafon 1958, 1964, 1968
Intelligibility Measures vs. Information • Webster, 1979 • ANSI S3.5-1969
Evaluation Decisions • Purpose • Types of Conditions • Performance Characteristics of Importance • Choice of Test Methodologies • Development of Test Plan • Selection Criteria if Selection Test
Vocoder Development Issues • Application • Commercial • Strategic • Tactical • Diagnostic Information • Intelligibility • Quality • Communicability
Low-Rate Vocoder for Tactical Use • Harsh Acoustic Noise Environments • Physical and Jamming Channel Issues • LPI / LPD • Intelligibility • Talker Recognizability • Conserve Stress State of Talker • Audio Bandwidth • Delay • Size - Weight - Power
Narrowband Low-Rate Vocoder Intelligibility • Intelligibility results for current low-rate military vocoders in acoustic background noise
Effects of Current Noise Preprocessors • Intelligibility - DRT • Quality - DAM
Road Map to Improved DRT Intelligibility • Inherent Distinctive Features • Jakobson, Fant, and Halle 1952; Miller & Nicely, 1955 • DRT Attributes • Voiers 1973, 1987 • DRT Attributes : Distinctive Features : Acoustic Correlates • Voiers, Benchmark Papers in Acoustics, V11 1977 • Diagnostic Capabilities of the DRT • Cook Book
Inherent Distinctive Features (Jakobson, Fant, and Halle 1952) • Fundamental Source Features • Vocalic / Non-Vocalic • Consonantal / Non-Consonantal • Secondary Consonant Features • Envelope Features • Continuant / Interrupted • Checked / Unchecked • Strident / Mellow • Supplementary Source • Voiced / Voiceless • Resonant Features • Compact / Diffuse • Tonality Features • Grave / Acute • Flat / Plain • Sharp / Plain • Tense / Lax • Supplementary Resonator • Nasal / Oral
DRT Attributes
DRT Attributes : Distinctive Features : Acoustic Correlates
• DRT Attribute : JFH Distinctive Feature : Acoustic Correlates
• Voicing : Voiced/Voiceless : harmonic content, energy concentration at LF, long duration, low peak power
• Nasality : Nasal/Oral : nasal formants in regions of 200, 800 and 2400 Hz
• Sustention : Continuant/Interrupted : gradual onset > 130 msec, low level noise in MF to HF
• Sibilation : Strident/Mellow : sustained HF noise of relatively high intensity
• Compactness : Compact/Diffuse : LF spectral shape, low loci of 2nd and 3rd formants, dynamics of formant transitions
• Graveness : Grave/Acute : HF spectral shape, separation of 2nd and 3rd formants, dynamics of 2nd and 3rd formants
Diagnostic Capabilities of the DRT • Talkers • Male : Female • Attribute State • Present : Absent • Attribute Bias • Sub-Attribute Scores • Characteristic Attribute Profile • Empirical Studies
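The attribute-state and attribute-bias diagnostics listed above can be sketched as follows. This assumes the usual two-alternative guessing correction, score = 100 (R - W) / T, and takes bias as the present-state score minus the absent-state score. The trial tuple layout, the function names and the sample counts are hypothetical and are not taken from any cited DRT implementation.

```python
from collections import defaultdict


def drt_score(right, wrong):
    """Chance-corrected score for a two-alternative test: 100 * (R - W) / T."""
    total = right + wrong
    if total == 0:
        raise ValueError("no trials recorded for this cell")
    return 100.0 * (right - wrong) / total


def attribute_profile(trials):
    """Summarise trials given as (attribute, state, correct) tuples.

    `state` is "present" or "absent"; `correct` is a bool. Returns, per
    attribute, the pooled score, both state scores, and their difference.
    Both states must have at least one trial for every attribute.
    """
    counts = defaultdict(lambda: [0, 0])  # (attribute, state) -> [right, wrong]
    for attribute, state, correct in trials:
        counts[(attribute, state)][0 if correct else 1] += 1

    profile = {}
    for attribute in sorted({key[0] for key in counts}):
        p_right, p_wrong = counts[(attribute, "present")]
        a_right, a_wrong = counts[(attribute, "absent")]
        present = drt_score(p_right, p_wrong)
        absent = drt_score(a_right, a_wrong)
        profile[attribute] = {
            "score": drt_score(p_right + a_right, p_wrong + a_wrong),
            "present": present,
            "absent": absent,
            "bias": present - absent,  # assumed definition: present minus absent
        }
    return profile


# Invented responses: 10 trials per state for a single attribute.
trials = (
    [("voicing", "present", True)] * 9 + [("voicing", "present", False)] * 1
    + [("voicing", "absent", True)] * 7 + [("voicing", "absent", False)] * 3
)
print(attribute_profile(trials))
```

Splitting the score by attribute state is what gives the test its diagnostic value: a large bias suggests the system distorts one state of an attribute more than the other.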
Cook Book for Improved Intelligibility
Pitfalls in Subjective Evaluation • Measured Intelligibility vs. Real World Intelligibility • NAS-NRC CHABA 1989 Symposium: Removal of Noise From Noise-Degraded Speech Signals • Vocoder Tuned to DRT Words • Vocoder based on “scripted word” characteristics that are not applicable to conversational speech • Danger of “self evaluation” by Vocoder Developers • Tardelli, ICASSP96, DAM vs. MOS Study 1996
DAM vs. MOS Study • A Systematic Investigation of the Mean Opinion Score (MOS) and the Diagnostic Acceptability Measure (DAM) for Use in the Selection of Digital Speech Compression Algorithms • ARCON Corp. 1996 • Available in DRAFT form at http://www.arcon.com/dld.html
P.NSA and WHY • ETSI/3GPP AMR-NS 1999 • Exp. 3 MMOS w/ Multi-Dimensional Question: “You will hear speech samples reproduced in a telephone handset. Every sample consists of four short unconnected sentences in a noise environment. Your task is to indicate your opinion of the overall sound quality with respect to any unnatural sound in the sample. Please make your judgement of the sample considering unnatural sound during the complete sample.” • Resulted in Bimodal Decision • P.NSA: Subjective test methodology for evaluating speech communication systems that include noise suppression algorithm • Summary: This document proposes a methodology for evaluating the subjective quality of speech in noise that is particularly appropriate for the evaluation of noise suppression algorithms. The proposed methodology uses separate rating scales to independently estimate the subjective quality of the Speech Signal alone, the Background Noise alone, and Overall Quality. • ITU-T SG12/Q7 SQEG, Primarily Dynastat and FT
INTELTRANS Testbed
DRT Characteristic Attribute Profile
Empirical Study of DRT Attributes vs. SNR • Band Limited Gaussian Noise • Voiers, JASA 1973
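Preparing speech-plus-noise material at a set SNR, as such a study requires, can be sketched as below. This is a simplified illustration that assumes whole-file RMS levels and uses plain (not band-limited) Gaussian noise. The function names are hypothetical, and a real test would normally measure active speech level and filter the noise to the band of interest.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech RMS / noise RMS equals `snr_db`, then add.

    Returns (mixed_samples, noise_gain). Levels are whole-signal RMS,
    which is a simplification of this sketch.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        raise ValueError("noise is silent; cannot scale to a target SNR")
    gain = rms(speech) / (noise_rms * 10.0 ** (snr_db / 20.0))
    return [s + gain * n for s, n in zip(speech, noise)], gain


# Stand-in signals: a sine wave for "speech" and white Gaussian noise.
rng = random.Random(0)
speech = [math.sin(2.0 * math.pi * 200.0 * i / 8000.0) for i in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

for snr_db in (18, 12, 6, 0):
    mixed, gain = mix_at_snr(speech, noise, snr_db)
    achieved = 20.0 * math.log10(rms(speech) / rms([gain * n for n in noise]))
    print(f"target {snr_db:>2} dB, achieved {achieved:.2f} dB")
```

Scaling the noise rather than the speech keeps the speech presentation level constant across conditions, which is one of the level factors a test plan has to control.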
Scripted Material - DRT Word Lists • Voicing: DINT or TINT • Nasality: MOOT or BOOT • Sustention: SHEET or CHEAT • Sibilation: JAB or GAB • Graveness: POT or TOT • Compactness: GHOST or BOAST
Scripted Material - CVC Nonsense Words MIG(RAINE) COS(T) HAYM DIT TOUP(EE) BACH POD(IUM) SEM(I) LAL:PAL REAS(ON) REET:BEET SAYZ:DAYS BOD(Y) KOOM LEP(ER) PONE:BONE HIES DACK:BACK TEEG:LEAGUE MAHL
Problems with CVC Test Implementation • CVC Corpus Balance • Talker by Word by Environment • Word by Distinctive Feature by Lexicon • Regional Dialect Differences • New England • Spoken “COT” = “CAUGHT” • Perception Midwest “CART” = “COT” • Test Design • Uniqueness for Talker by Word by Environment by Process • Balance Across Distinctive Feature by Process • Balance Across Subject by Stimulus • Sufficient Subjects for Reasonable Resolution
Diagnostic Capabilities of INTELTRANS
Diagnostic Capabilities of INTELTRANS (cont.)