Perceptual Wideband Audio Quality Assessments Using PEAQ. Christian Schmidmer, Opticom GmbH, Erlangen, info@opticom.de. 2nd Workshop on Wideband Speech Quality - June 2005
Contents • Quality, definitions • User expectation • Subjective tests • Psychoacoustics • PEAQ • PESQ vs. PEAQ
What is “Quality”? • “Quality is the difference between what we perceive and what we expect.” (from the habilitation thesis of Prof. Ute Jekosch) • “…they are used to phones that sound like a phone.” (Frank Meier, Infineon) • Maybe more important: …is for free.
Differences in Perception of Voice and Audio • Experience, a priori knowledge • Expectation • Cognitive effects • “Error correction” • Different subjective tests require different models
The Problem of Subjective Scales • The range of qualities in the subjective test defines the subjective scale! • [Examples: MP3 @ Intermediate Quality; MP3 @ High Quality]
MOS acc. to P.800 • Standardized listening test procedure acc. to ITU-T P.800ff • Absolute Category Rating (ACR) test, no comparison to a reference signal (original) • “How good does it sound?” • 5-point grading scale (“opinion scale”), Impairment / Grade: Excellent = 5, Good = 4, Fair = 3, Poor = 2, Bad = 1 • Averaging over test subjects: MOS (“Mean Opinion Score”) • Language dependent!
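The averaging step on this slide can be sketched in a few lines. This is only an illustration: the votes below are hypothetical, and the function name `mean_opinion_score` is not taken from P.800.

```python
from statistics import mean

# Hypothetical ACR votes on the 5-point opinion scale
# (5 = Excellent, 4 = Good, 3 = Fair, 2 = Poor, 1 = Bad).
votes = [4, 5, 3, 4, 4, 2, 5, 4]


def mean_opinion_score(scores):
    """Average the ACR votes of all test subjects for one test condition."""
    if not scores:
        raise ValueError("at least one vote is required")
    if any(s not in (1, 2, 3, 4, 5) for s in scores):
        raise ValueError("ACR votes must be integers from 1 to 5")
    return mean(scores)


mos = mean_opinion_score(votes)
print(f"MOS over {len(votes)} subjects: {mos:.3f}")
```

Because the result is a plain average, it inherits everything that shaped the individual votes, which is why the slide stresses that the test is language dependent.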
Subjective Assessment in ITU-R BS.1116 • Standardised assessment procedure for 'small impairments' in audio systems (ITU-R 1994) • Comparison between reference and test signal • Very sensitive to subtle distortions • Double-blind triple-stimulus with hidden reference: Original (known reference), A and B (coded / original or original / coded)
Subjective Assessment in ITU-R BS.1116 • Continuous grading scale with “anchors” • “Subjective Difference Grade” (SDG) • Question: “How different do the files sound?”
Subjective Testing of Intermediate Audio Quality (IAQ) • “MUSHRA”: Multi Stimulus Test with Hidden Reference and Anchors • Developed by EBU working group B/AIM • Targets IAQ • ITU-R BS.1534
MUSHRA Test • Training of Subjects • Subjects can randomly access all types of codecs at similar bitrate • Comparison with CD quality reference • Two low-pass 'anchors' (7 kHz, 3.5 kHz) included
MUSHRA Test • Scoring Phase • Comparison with CD reference, hidden reference included • Two low-pass 'anchors' (7 kHz, 3.5 kHz) included • Subjects can randomly assess all codecs under test of similar bitrate at the same time • Subjects adjust a slider, no score involved • Slider mapped to 0..100
Comparison of Subjective Test Methods
Temporal Masking [Figure: sensation level SL [dB] over time t [ms] around a masker, showing pre-, simultaneous and postmasking] • Premasking: 2-5 ms • Postmasking: 120 ms • Depending on the signal characteristics of the masker
Pitch Scale / Critical Bands A sine tone and a noise of critical bandwidth with the same center frequency and energy density are perceived equally loud.
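The pitch scale built from critical bands is usually expressed in Bark. The slide does not say which frequency-to-Bark mapping is meant, so the sketch below assumes one common closed-form approximation, z = 7 · asinh(f / 650 Hz); the function names are illustrative only.

```python
import math


def hz_to_bark(f_hz):
    """Map a frequency in Hz to the Bark scale (assumed approximation)."""
    return 7.0 * math.asinh(f_hz / 650.0)


def bark_to_hz(z_bark):
    """Inverse mapping of hz_to_bark."""
    return 650.0 * math.sinh(z_bark / 7.0)


# The scale is roughly linear at low frequencies and compresses
# towards high frequencies, like the critical bandwidth itself.
for f in (100, 1000, 3500, 7000):
    print(f"{f:>5} Hz -> {hz_to_bark(f):5.2f} Bark")
```

Under this assumption, equal steps in Bark correspond to ever wider steps in Hz, which is why band grouping on the Bark scale uses fewer, wider bands at high frequencies.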
Threshold in Quiet - Masked Threshold [Figure: threshold in quiet]
PEAQ • ITU-R TG 10/4: Call for proposals (1995) • PEAQ is based on: • PAQM: KPN Research, Netherlands / OPTICOM • NMR: Fraunhofer, Germany / OPTICOM • DIX: TU Berlin / Deutsche Telekom Berkom • POM: CCETT, France • PERCEVAL: CRC, Canada • "Tool box": IRT, Germany • Jan. 1999: released as ITU-R Rec. BS.1387
Intrusive Testing [Figure: A and B connected via Network X and Network Y] Comparison with known stimulus: + Very high accuracy + Black box approach (no knowledge of DUT) - Requires a reference signal • Generates traffic • Alternatively both signals may be captured by the test system!
Two Versions of PEAQ: • PEAQ “Basic”: computational efficiency, realtime performance • PEAQ “Advanced”: highest possible accuracy
Structure of a perceptual measurement tool [Block diagram: the reference (= sent file) and the test signal (= received file) each pass through a perceptual model (from point a to point b); a feature extractor and a cognitive model turn the two model outputs into a MOS (quality measure)]
Perceptual Model, PEAQ “Basic” [Block diagram, from a to b:] Input signal (fs = 48 kHz or 44.1 kHz) and listening level (dB SPL) -> FFT & scaling (2048 points, 42.6 ms / 23.4 Hz) -> outer and middle ear weighting -> grouping into critical bands (¼ Bark) -> + internal noise -> “pitch” spreading -> temporal masking (forward masking) -> excitation (outputs 1 and 2)
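The frame figures quoted for the 2048-point FFT follow directly from the sample rate. A small sketch of that arithmetic (helper names are illustrative, not part of any PEAQ implementation):

```python
FFT_SIZE = 2048  # number of FFT points given on the slide


def frame_duration_ms(fs_hz, n=FFT_SIZE):
    """Length of one FFT frame in milliseconds."""
    return 1000.0 * n / fs_hz


def bin_spacing_hz(fs_hz, n=FFT_SIZE):
    """Frequency distance between adjacent FFT bins in Hz."""
    return fs_hz / n


for fs in (48_000, 44_100):
    print(
        f"fs = {fs} Hz: frame = {frame_duration_ms(fs):.2f} ms, "
        f"bin spacing = {bin_spacing_hz(fs):.2f} Hz"
    )
```

At 48 kHz this gives about 42.67 ms and 23.44 Hz, which matches the slide's 42.6 ms / 23.4 Hz if those values were truncated rather than rounded; at 44.1 kHz the frame is slightly longer and the bins slightly narrower.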
MOVs used in PEAQ “Basic” Version
Perceptual Model, PEAQ “Advanced” [Block diagram; stages shown:] input signal (fs = 48 kHz or 44.1 kHz) and listening level (dB SPL); scaling; outer and middle ear filtering; filterbank (40 auditory bands, subsampling 1:32); “pitch” spreading; backward masking (subsampling 1:6); + internal noise; forward masking; excitation (output 1) • Temporal resolution: 0.66 ms, 4 ms
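The two temporal resolutions on this slide appear to follow from the two subsampling factors. The sketch below checks that reading; it assumes that 0.66 ms belongs to the 1:32 stage and 4 ms to the combined 1:32 and 1:6 stages, which the slide itself does not state explicitly.

```python
def sample_period_ms(fs_hz, subsampling_factor):
    """Time between output samples after subsampling, in milliseconds."""
    return 1000.0 * subsampling_factor / fs_hz


FIRST_STAGE = 32   # filterbank subsampling 1:32 (from the slide)
SECOND_STAGE = 6   # further subsampling 1:6 (from the slide)

for fs in (48_000, 44_100):
    after_filterbank = sample_period_ms(fs, FIRST_STAGE)
    after_both = sample_period_ms(fs, FIRST_STAGE * SECOND_STAGE)
    print(
        f"fs = {fs} Hz: {after_filterbank:.3f} ms after 1:32, "
        f"{after_both:.3f} ms after 1:32 and 1:6"
    )
```

At 48 kHz the two stages give roughly 0.667 ms and exactly 4 ms, consistent with the values on the slide; at 44.1 kHz both periods are somewhat longer.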
PEAQ vs. MUSHRA • EBU Tests of Internet Audio Codecs • Microsoft Windows Media 4 • MPEG-4 AAC (Fraunhofer) • MP3 (Fraunhofer) • QuickTime 4, Music-Codec 2 (QDesign) • RealAudio 5.0 • RealAudio G2 • MPEG-4 TwinVQ (Yamaha)
Constraints of MUSHRA Testing • No absolute scores -> scores depend on the test condition • Low-pass anchors are only one quality dimension -> disturbance of artefacts is another one • Spreading of the scale from best to worst -> what about adding new items to an existing test? • In order to verify PEAQ performance we must adjust the best and worst item (not the anchors!)
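One plausible reading of "adjusting the best and worst item" is a linear rescaling that pins the best and worst objective scores to the subjective scores of the same two items. The slide does not describe the actual procedure, so the sketch below is an assumption throughout: the function name, the data and the convention that a higher objective score means better quality are all hypothetical.

```python
def align_to_subjective_range(objective, subjective_best, subjective_worst):
    """Linearly map objective scores so that the best and worst items
    land on the given subjective scores (assumed procedure)."""
    lowest, highest = min(objective), max(objective)
    if highest == lowest:
        raise ValueError("need at least two distinct objective scores")
    scale = (subjective_best - subjective_worst) / (highest - lowest)
    return [subjective_worst + (x - lowest) * scale for x in objective]


# Hypothetical objective scores for four codecs (higher = better)
# and hypothetical MUSHRA scores of the best and worst codec.
objective_scores = [-3.5, -2.0, -1.0, -0.5]
aligned = align_to_subjective_range(objective_scores, 90.0, 20.0)
for raw, mapped in zip(objective_scores, aligned):
    print(f"objective {raw:+.1f} -> aligned {mapped:.1f}")
```

Only the two end points are fixed by such a mapping; how well the items in between agree with the listening test is then what the comparison actually measures.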
PEAQ vs. MUSHRA (EBU Test)
Results
Final Question: • Can I use PESQ instead of PEAQ? No! • Perception of voice differs from perception of music • PESQ time alignment fails on music • PEAQ and PESQ are modelling different subjective tests
More Information: www.opticom.de OPTICOM Germany info@opticom.de Thank you!