Analysis of Parameter Importance in Speaker Identity

Ricardo de Córdoba, Juana M. Gutiérrez-Arriola • Speech Technology Group, Departamento de Ingeniería Electrónica, Universidad Politécnica de Madrid • e-mail: cordoba@die.upm.es, jmga@ics.upm.es

Index • Introduction • System description • Parameter extraction • Voice conversion and synthesis • Parameter analysis • Application to a voice quality task • Results • Conclusions

Introduction • [Block diagram: source speaker voice → analysis → parameters; target speaker voice → analysis → parameters; transformation functions computation → transformation functions; voice conversion → converted parameters → synthesis → target speaker speech]

System description

Parameter Extraction I • Glottal parameters:

Parameter Extraction II

Parameter Extraction III

Parameter Extraction IV • We calculate F0, AV, AF, formant frequencies and bandwidths • Pitch marks and formants are manually revised • Only voiced sounds are transformed

Voice Conversion I • Linear transformation functions: • For each pair of source-target units, we compute the transformation coefficients, which are stored in a file
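The exact form of the transformation functions is not preserved in these slides. As a minimal sketch, assuming one linear function per parameter of the form target ≈ a · source + b, fitted by least squares on time-aligned frames of one source-target unit pair, the coefficients could be computed as below. The function names, the F0 values, and the JSON storage layout are all hypothetical.

```python
import json
import numpy as np

def fit_linear_transform(source, target):
    """Least-squares fit of target ≈ a * source + b for one parameter track."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    # Design matrix [x, 1], so the solution is (slope a, offset b).
    design = np.column_stack([source, np.ones_like(source)])
    (a, b), *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(a), float(b)

def apply_linear_transform(source, coeffs):
    """Convert a source parameter track with previously fitted coefficients."""
    a, b = coeffs
    return a * np.asarray(source, dtype=float) + b

# Hypothetical F0 values (Hz) for one time-aligned source-target unit pair.
source_f0 = [110.0, 115.0, 120.0, 125.0]
target_f0 = [200.0, 210.0, 220.0, 230.0]

coeffs = fit_linear_transform(source_f0, target_f0)
converted_f0 = apply_linear_transform(source_f0, coeffs)

# Assumed storage layout: one entry per unit and per parameter.
serialized = json.dumps({"unit_0": {"F0": coeffs}})
```

The sketch assumes both units have the same number of frames; in practice some time alignment would be needed before fitting.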

Synthesis • Formant synthesizer (Klatt) • Concatenation of parameterized units • Prosodic modification by changing the glottal pulse length and the number of glottal pulses • Formant smoothing at unit transitions
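The slides do not say how formants are smoothed at unit transitions. One simple possibility, shown here only as an illustrative assumption, is to replace a few frames on each side of the join with a straight line between the surrounding frames. The function name, the `width` parameter, and the formant values are hypothetical.

```python
import numpy as np

def smooth_transition(left, right, width=2):
    """Concatenate two formant tracks and linearly interpolate across the join.

    `width` frames on each side of the boundary are replaced; both tracks
    must be longer than `width` frames.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    if len(left) <= width or len(right) <= width:
        raise ValueError("tracks must be longer than the smoothing width")
    track = np.concatenate([left, right])
    lo = len(left) - width - 1   # last untouched frame of the left unit
    hi = len(left) + width       # first untouched frame of the right unit
    track[lo:hi + 1] = np.linspace(track[lo], track[hi], hi - lo + 1)
    return track

# Hypothetical F1 tracks (Hz) for two consecutive units.
f1_left = [500.0] * 5
f1_right = [700.0] * 5
f1_smoothed = smooth_transition(f1_left, f1_right, width=2)
```

With these values the 200 Hz jump at the boundary is spread over five 40 Hz steps.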

Parameter Analysis I • 11 speakers (5 female, 6 male) • EUROM1 database in Castilian Spanish • Sentence: “Mi abuelo me animó a estudiar solfeo” (My grandfather encouraged me to study solfa) • Fs = 16 kHz

Parameter Analysis II

Parameter Analysis III • We want to know which parameters are actually relevant for speaker identity • Discriminant functions are the linear combinations of variables that best discriminate between classes • They can be used to rank the variables by their relative contribution to class discrimination • Linear discriminant analysis (LDA) is performed for each phoneme of the sentence (it does not work well for the whole sentence) • The coefficients of the first discriminant function are used to rank the parameters
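The ranking procedure described above can be sketched with a small Fisher LDA written in NumPy. This is an illustration under stated assumptions, not the authors' implementation: the variables are z-scored first so that coefficient magnitudes are comparable (the slides do not say whether standardized coefficients were used), and the data are synthetic frames for three hypothetical speakers in which only F0 differs between speakers.

```python
import numpy as np

def first_discriminant_coefficients(X, y):
    """Coefficients of the first discriminant function on z-scored variables."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Z-score each variable so coefficient magnitudes are comparable (assumption).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    n_features = Z.shape[1]
    grand_mean = Z.mean(axis=0)
    Sw = np.zeros((n_features, n_features))  # within-class scatter
    Sb = np.zeros((n_features, n_features))  # between-class scatter
    for label in np.unique(y):
        Zc = Z[y == label]
        mean_c = Zc.mean(axis=0)
        centered = Zc - mean_c
        Sw += centered.T @ centered
        diff = (mean_c - grand_mean)[:, None]
        Sb += len(Zc) * (diff @ diff.T)
    # Discriminant directions are eigenvectors of Sw^-1 Sb; keep the largest.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    return eigvecs[:, np.argmax(eigvals.real)].real

# Synthetic frames for one phoneme: 3 hypothetical speakers, 40 frames each.
rng = np.random.default_rng(0)
names = ["F0", "OQ", "F1"]
frames, labels = [], []
for speaker, f0_mean in enumerate([110.0, 160.0, 210.0]):
    f0 = rng.normal(f0_mean, 5.0, size=40)   # differs between speakers
    oq = rng.normal(0.5, 0.05, size=40)      # same distribution for all
    f1 = rng.normal(500.0, 30.0, size=40)    # same distribution for all
    frames.append(np.column_stack([f0, oq, f1]))
    labels.extend([speaker] * 40)

coeffs = first_discriminant_coefficients(np.vstack(frames), labels)
# Rank parameters by the absolute value of their coefficient, largest first.
ranking = [names[i] for i in np.argsort(-np.abs(coeffs))]
```

In this synthetic setup F0 comes first in `ranking`, because it is the only variable whose mean changes between speakers.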

Application to a Voice Quality Task • We extracted four sentences from the Brian VOQUAL'03 database: normal, clear, creaky, and relax • We analyzed two phonemes of the sentence “She has left for a great party today” • We wanted to rank parameter importance for discriminating between the four classes • We used the coefficients of the first discriminant function

Results I: Voice Quality Task • [Figure, panels E and A: frame classification for E and A using LDA, plotted on the first two discriminant functions; classes: normal, creaky, clear, relax]

Results II: Voice Quality Task • [Figure, panels E and A: absolute values of the coefficients that multiply each parameter in the first discriminant function]

Results III: Speaker Identity • [Figure: number of times each parameter was the most relevant (top) and the least relevant (bottom) in the first discriminant function]
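The counts summarized in this figure can be obtained by tallying, for each phoneme, which parameter has the largest and the smallest absolute coefficient in the first discriminant function. The sketch below shows the bookkeeping only; the coefficient values and the parameter set are made up for illustration and are not results from this study.

```python
from collections import Counter

# Illustrative |coefficient| values only (NOT data from the study),
# one dictionary per analyzed phoneme.
coefficients_by_phoneme = {
    "a": {"F0": 0.9, "OQ": 0.4, "F1": 0.6, "B1": 0.1},
    "e": {"F0": 0.3, "OQ": 0.8, "F1": 0.5, "B1": 0.2},
    "o": {"F0": 0.7, "OQ": 0.5, "F1": 0.2, "B1": 0.3},
}

most_relevant = Counter()
least_relevant = Counter()
for phoneme, coeffs in coefficients_by_phoneme.items():
    # The parameter with the largest |coefficient| is the most relevant one.
    most_relevant[max(coeffs, key=lambda name: abs(coeffs[name]))] += 1
    # The parameter with the smallest |coefficient| is the least relevant one.
    least_relevant[min(coeffs, key=lambda name: abs(coeffs[name]))] += 1
```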

Conclusions • Parameter importance depends on the type of speech, the gender of the speaker, and the phonemes under study • Results show that F0, the formant frequencies, and the open quotient (OQ) are the most important parameters for speaker classification
