
Formant-Pattern Estimation Guided by Cepstral Compatibility

Frantz Clermont, Philip Harrison, Peter French. J.P. French Associates & University of York, United Kingdom. IAFPA Conference, Plymouth, UK, 23-25 July 2007.


Presentation Transcript


  1. Formant-Pattern Estimation Guided by Cepstral Compatibility. Frantz Clermont, Philip Harrison, Peter French. J.P. French Associates & University of York, United Kingdom. IAFPA Conference, Plymouth, UK, 23-25 July 2007.

  2. A Long-Standing Problem: How “reliable” is the measured formant-pattern?
     • Central to Acoustic Phonetics
     • Crucial to Forensic Phonetics
     • Central to a Major Debate in Speech & Speaker Recognition
     • SHORT ANSWER:
     • STILL NO OBJECTIVE WAY OF ENSURING/CHECKING RELIABILITY OF MEASURED F-PATTERNS
     • APPRECIABLE VARIABILITY AMONGST SOFTWARE PACKAGES (HARRISON, 2004)
     • WAYS FORWARD – WHAT ARE WE TO DO?
     • THROW OUR HANDS IN DESPAIR & SIMPLY USE AVAILABLE TOOLS?
     • STILL HOPE FOR SOME VIABLE SOLUTIONS?

  3. A New Approach: Particulars & Aims
     • Objectivity & Reliability of F-pattern Estimation
     • Compatibility with the observed spectrum
     • A “Smart” Measure of Compatibility
     • We use a related representation: The Cepstrum
     • Efficient approximation to the exact spectrum
     • More readily available from speech signal
     • Contains vocal-tract resonance information
     • Robustness in speech and speaker recognition
     • We propose a Cepstral Analysis-by-Synthesis Method
     • To generate Candidate Cepstra
     • To determine most compatible candidate w.r.t. observed cepstrum

  4. Pathway to Speech Spectrum: Source-Filter MODEL. Linear Prediction (LP): ALL-POLE FILTER, FILTER ORDER M. N.B. “OPTIMUM” M: UNRESOLVED ISSUE!
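The all-pole LP filter named on this slide is commonly fitted with the autocorrelation method and the Levinson-Durbin recursion. The block below is a minimal sketch of that textbook technique, not the authors' implementation; the function names, the synthetic test frame, and the sign convention A(z) = 1 + a1 z^-1 + ... + aM z^-M are assumptions made for illustration.

```python
import math

def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one (already windowed) frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations for an all-pole model of the given order.

    Returns (a, err): a = [1, a1, ..., aM] are the coefficients of
    A(z) = 1 + a1 z^-1 + ... + aM z^-M (assumed convention), and err is the
    final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        # Reflection coefficient for stage m.
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + k_m * prev[m - k]
        a[m] = k_m
        err *= (1.0 - k_m * k_m)
    return a, err

# Usage on synthetic data: a Hamming-windowed damped sinusoid, LP order M = 2.
n = 200
frame = [(0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
         * math.exp(-0.01 * i) * math.sin(0.3 * i) for i in range(n)]
a, err = levinson_durbin(autocorrelation(frame, 2), 2)
```

The order M is a free argument here, which is exactly the parameter the slide flags as unresolved.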

  5. Pathways to Formant Patterns: Prominences of Spectral Shapes [Figure panels: EXACT RAW; EXACT LP]

  6. Pathways to Formant Patterns (cont’d): LP-derived Cepstrum (order M). A Vocal-Tract Parameter; A Fourier-Series Model of EXACT LP-Spectrum; M÷2 POLES {BROAD & NARROW BANDWIDTHS}
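The link between an LP model, its poles, and its cepstrum can be sketched with two standard results: the recursion from LP coefficients to cepstral coefficients, and the closed form c_n = (2/n) exp(-pi n B / fs) cos(2 pi n F / fs) summed over complex-conjugate pole pairs with frequency F and bandwidth B. The code below is an illustrative sketch under those textbook formulas; the function names, the sampling rate, and the example values are assumptions, not taken from the slides.

```python
import math

def lp_to_cepstrum(a, n_ceps):
    """Cepstrum [c1..c_n_ceps] of 1/A(z), with a = [1, a1, ..., aM]
    and A(z) = 1 + sum_k a_k z^-k (assumed sign convention)."""
    order = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        a_n = a[n] if n <= order else 0.0
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - order), n))
        c[n] = -a_n - acc / n
    return c[1:]

def formants_to_cepstrum(poles, fs, n_ceps):
    """Cepstrum [c1..c_n_ceps] synthesised from (frequency_hz, bandwidth_hz)
    pairs, each pair standing for one complex-conjugate pole pair."""
    c = []
    for n in range(1, n_ceps + 1):
        total = 0.0
        for freq, bw in poles:
            radius_n = math.exp(-math.pi * n * bw / fs)  # |pole| ** n
            total += radius_n * math.cos(2.0 * math.pi * n * freq / fs)
        c.append(2.0 * total / n)
    return c

# Usage: one resonance at 500 Hz with an 80 Hz bandwidth, fs = 10 kHz (made-up values).
fs = 10000.0
r = math.exp(-math.pi * 80.0 / fs)
theta = 2.0 * math.pi * 500.0 / fs
a = [1.0, -2.0 * r * math.cos(theta), r * r]   # A(z) for that single pole pair
c_from_lp = lp_to_cepstrum(a, 12)
c_from_poles = formants_to_cepstrum([(500.0, 80.0)], fs, 12)
```

Both routes describe the same all-pole spectrum, so the two cepstra should agree to rounding error; that agreement is what makes it possible to synthesise a cepstrum from any chosen subset of poles.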

  7. The Cepstral Distance (Euclidean): Un-Weighted versus Index-Weighted
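The transcript does not preserve the formulas from this slide. As a hedged illustration, the sketch below assumes the usual forms: an un-weighted Euclidean distance over cepstral coefficients, and an index-weighted variant in which the n-th coefficient difference is multiplied by its index n. The exact weighting used by the authors may differ, and the function name is hypothetical.

```python
import math

def cepstral_distance(c_obs, c_cand, index_weighted=False):
    """Euclidean distance between two cepstra given as [c1, c2, ...].

    With index_weighted=True each difference (c_n - c'_n) is scaled by n,
    an assumed form of the index weighting named on the slide.
    """
    total = 0.0
    for n, (x, y) in enumerate(zip(c_obs, c_cand), start=1):
        w = n if index_weighted else 1
        total += (w * (x - y)) ** 2
    return math.sqrt(total)

# Usage with made-up three-coefficient cepstra.
d_plain = cepstral_distance([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
d_weighted = cepstral_distance([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], index_weighted=True)
```

Index weighting gives higher-order coefficients more influence, which makes the distance more sensitive to fine spectral detail such as peaks than to the overall spectral tilt.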

  8. Cepstral Analysis-by-Synthesis (CAbS): ACOUSTIC SIGNAL → FRAMES → OBSERVED CEPSTRUM (LP CEPSTRUM, order M = 12); LP CEPSTRUM TO POLES (M÷2 = 6) → GENERATE ALL UNIQUE QUARTETS, BINOMIAL (6,4) → CEPSTRUM CONVERSION → CANDIDATE CEPSTRA → CEPSTRAL DISTANCE
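The candidate-generation step on this slide can be sketched as follows: with six pole pairs there are binomial(6,4) = 15 unique quartets, each of which is converted to a candidate cepstrum and compared with the observed cepstrum. The code is a rough sketch, not the authors' implementation. It assumes the poles are already available as (frequency, bandwidth) pairs, uses the closed-form pole-to-cepstrum conversion, and assumes an index-weighted Euclidean distance; all names and numeric values are hypothetical.

```python
import math
from itertools import combinations

def formants_to_cepstrum(poles, fs, n_ceps):
    """Cepstrum [c1..c_n_ceps] from (frequency_hz, bandwidth_hz) pole pairs."""
    return [2.0 / n * sum(math.exp(-math.pi * n * bw / fs)
                          * math.cos(2.0 * math.pi * n * freq / fs)
                          for freq, bw in poles)
            for n in range(1, n_ceps + 1)]

def cepstral_distance(c_obs, c_cand, index_weighted=True):
    """Euclidean distance, optionally with each term scaled by its index n."""
    return math.sqrt(sum(((n if index_weighted else 1) * (x - y)) ** 2
                         for n, (x, y) in enumerate(zip(c_obs, c_cand), start=1)))

def cabs_select(observed, poles, fs, quartet_size=4):
    """Score every unique quartet of poles against the observed cepstrum.

    Returns (best_quartet, best_distance, number_of_candidates).
    """
    scored = []
    for quartet in combinations(poles, quartet_size):
        candidate = formants_to_cepstrum(quartet, fs, len(observed))
        scored.append((cepstral_distance(observed, candidate), quartet))
    best_distance, best_quartet = min(scored, key=lambda item: item[0])
    return best_quartet, best_distance, len(scored)

# Usage with made-up poles: six (frequency, bandwidth) pairs, as for M = 12.
fs = 10000.0
poles = [(300.0, 400.0), (520.0, 60.0), (1450.0, 90.0),
         (2500.0, 120.0), (3400.0, 150.0), (4200.0, 600.0)]
observed = formants_to_cepstrum(poles, fs, 12)  # stand-in for the LP cepstrum
quartet, distance, n_candidates = cabs_select(observed, poles, fs)
```

Exhaustive enumeration is cheap at this size, so no search heuristic is needed; the selected quartet is simply the one whose synthesised cepstrum lies closest to the observed one.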

  9. Some Vowel Data to Test the Approach [Panels: MICROPHONE; TELEPHONE]. Harrison (2004); THIS STUDY (1 speaker) • 2 adult-male, native speakers • 3 contemporaneous tokens

  10. Implementation Methodology. Candidate Values of M (axis ticks shown: 10, 11, 18); Candidate Values of Upper Bound in kHz (axis ticks shown: 2.5, 2.6, 4.5). POLE-GRAM (LP-ORDER M = 10) + CAbS RESULTS SUPERIMPOSED (Upper Bound = 3.5 kHz). CALCULATE AVERAGE INTRA-VOWEL DISPERSION → CLUSTERING QUALITY (CQ)
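The slide names the criterion (average intra-vowel dispersion, reported as a clustering quality in Hz) but the transcript does not give its formula. The sketch below therefore assumes one plausible definition: for each vowel, the mean Euclidean distance of its tokens from the vowel centroid in formant space, averaged over vowels. The function names, the data layout, and the example values are all hypothetical.

```python
import math

def clustering_quality(tokens_by_vowel):
    """Average intra-vowel dispersion in Hz (assumed definition).

    tokens_by_vowel maps a vowel label to a list of formant tuples in Hz.
    Smaller values mean tighter vowel clusters.
    """
    dispersions = []
    for tokens in tokens_by_vowel.values():
        dims = len(tokens[0])
        centroid = [sum(t[d] for t in tokens) / len(tokens) for d in range(dims)]
        distances = [math.dist(t, centroid) for t in tokens]
        dispersions.append(sum(distances) / len(distances))
    return sum(dispersions) / len(dispersions)

def pick_best_setting(results):
    """results maps (lp_order, upper_bound_hz) to tokens_by_vowel;
    returns the setting with the smallest clustering quality."""
    return min(results, key=lambda setting: clustering_quality(results[setting]))

# Usage with made-up F1/F2 values for two analysis settings.
results = {
    (10, 3500): {"a": [(700.0, 1200.0), (710.0, 1200.0)],
                 "i": [(300.0, 2200.0), (300.0, 2220.0)]},
    (12, 4000): {"a": [(690.0, 1150.0), (730.0, 1250.0)],
                 "i": [(280.0, 2150.0), (320.0, 2260.0)]},
}
best = pick_best_setting(results)
```

Scanning every combination of LP order and upper frequency bound and keeping the one with the smallest dispersion is one way to optimise both analysis parameters together.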

  11. Results 1: Optimisation of Analysis Parameters (LP order & Spectral Range). MICROPHONE: Opt. M = 13, [3300 – 4200] Hz, CQ = 27.05 Hz. TELEPHONE: Opt. M = 10, [3300 – 3700] Hz, CQ = 27.80 Hz

  12. Results 2a (MICROPHONE DATA): Min. Distance vs Min. Mean Bandwidth

  13. Results 2b (TELEPHONE DATA): Min. Distance vs Min. Mean Bandwidth

  14. Concluding Summary
     • Introduced a New Approach to F-pattern Estimation
     • Two major schools of parameterisation – the formant & the cepstrum
     • F-patterns estimated by requiring compatibility with observed cepstra
     • Compatibility is achieved using a “smart” measure of compatibility
     • sensitivity to spectral peaks that provide the best spectral match
     • flexibility to select any spectral range
     • Current Results are Promising
     • Compatibility Measure exhibits Robustness
     • Estimated F-patterns are Consistent with Phonetic Expectations
     • LP-Order and Spectral Range are Concurrently Optimised
     • Proposed Approach – A Serious First Step towards
     • Objectivity
     • Reliability
     • A Very Useful Tool

  15. Looking towards the Future
     • More Challenging Data
     • Beyond Steady-State Vowels
     • Wide Range of Speakers
     • Variety of Voice Qualities
     • Differing Recording Conditions
     • Speculations
     • The All-Pole LP-Model is likely to be insufficient for all cases
     • The Approach of Cepstral Compatibility holds the potential of a long survival!
