Study analyzing relevant parameters for speaker identity through voice conversion and synthesis, applied to voice quality tasks with conclusions on parameter importance.
Analysis of Parameter Importance in Speaker Identity • Ricardo de Córdoba, Juana M. Gutiérrez-Arriola • Speech Technology Group, Departamento de Ingeniería Electrónica, Universidad Politécnica de Madrid • E-mail: cordoba@die.upm.es, jmga@ics.upm.es
Index • Introduction • System description • Parameter extraction • Voice conversion and synthesis • Parameter analysis • Application to a voice quality task • Results • Conclusions
Introduction • [Figure: block diagram of the voice conversion system. The source and target speaker voices are each analyzed to obtain parameters; transformation functions are computed from the two parameter sets; the source parameters are converted and passed to synthesis, which produces speech with the target speaker's voice.]
Parameter Extraction I • Glottal parameters:
Parameter Extraction IV • We calculate F0, AV, AF, formant frequencies and bandwidths • Pitch marks and formants are manually revised • Only voiced sounds are transformed
Voice conversion I • Linear transformation functions: • For each pair of source-target units we compute the transformation coefficients, which are stored in a file
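The slide does not preserve the form of the transformation functions, so the following is only a minimal sketch. It assumes that each parameter is mapped as target ≈ a · source + b, with a and b fitted by least squares for one time-aligned source-target unit pair. The function names and the F0 values are hypothetical and are not taken from the study.

```python
import numpy as np

def fit_linear_transform(source, target):
    """Least-squares fit of target ~ a * source + b for one parameter track."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    # Design matrix with a column of ones for the offset term b.
    design = np.column_stack([source, np.ones_like(source)])
    (a, b), *_ = np.linalg.lstsq(design, target, rcond=None)
    return a, b

def apply_linear_transform(source, a, b):
    """Convert a source parameter track with the fitted coefficients."""
    return a * np.asarray(source, dtype=float) + b

# Hypothetical F0 tracks (Hz) for one source-target unit pair.
source_f0 = [110.0, 115.0, 120.0, 118.0]
target_f0 = [205.0, 212.5, 220.0, 217.0]

a, b = fit_linear_transform(source_f0, target_f0)
converted_f0 = apply_linear_transform(source_f0, a, b)
```

In a setup like this, the pair (a, b) for each parameter and unit pair would be the coefficients written to the file mentioned on the slide.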
Synthesis • Formant synthesizer (Klatt) • Concatenation of parameterized units • Prosodic modification by changing the glottal pulse length and the number of glottal pulses • Formant smoothing at unit transitions
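As a rough illustration of the prosodic modification step, the sketch below derives a glottal pulse length and a pulse count from a target F0 and a target duration. It assumes that one glottal pulse spans one pitch period and that the sampling rate is 16 kHz; the function name and the example values are hypothetical, and the actual synthesizer may proceed differently.

```python
def glottal_pulse_plan(f0_hz, duration_s, fs_hz=16000):
    """Return (pulse length in samples, number of pulses) for a voiced segment."""
    # One pitch period, expressed in samples, sets the pulse length (and so F0).
    pulse_len = round(fs_hz / f0_hz)
    # The number of periods needed to fill the duration sets the segment length.
    n_pulses = max(1, round(duration_s * f0_hz))
    return pulse_len, n_pulses

# Hypothetical target: 125 Hz for 80 ms.
pulse_len, n_pulses = glottal_pulse_plan(f0_hz=125.0, duration_s=0.08)
```

Under these assumptions, raising F0 shortens each pulse, and keeping the same duration then requires more pulses.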
Parameter Analysis I • 11 speakers (5 female, 6 male) • EUROM1 database in Castilian Spanish • Sentence: “Mi abuelo me animó a estudiar solfeo” (“My grandfather encouraged me to study solfa”) • Sampling frequency Fs = 16 kHz
Parameter Analysis III • We want to know which parameters are actually relevant for speaker identity • Discriminant functions are linear combinations of variables that best discriminate between classes • They can be used to rank the variables by their relative contribution to class discrimination • Linear discriminant analysis (LDA) is performed separately for each phoneme of the sentence (it does not work well for the whole sentence) • The coefficients of the first discriminant function are used to rank the parameters
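The ranking procedure can be sketched with a small Fisher LDA written in NumPy. This is an illustrative sketch, not the authors' implementation: it assumes the variables are scaled to unit overall variance before the analysis so that coefficient magnitudes are comparable, and the synthetic frames, speaker labels, and function names are hypothetical.

```python
import numpy as np

def first_discriminant_coefficients(X, y):
    """Coefficients of the first Fisher discriminant on standardized variables."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Scale each variable to zero mean and unit overall variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    grand_mean = Z.mean(axis=0)
    n_features = Z.shape[1]
    Sw = np.zeros((n_features, n_features))  # within-class scatter
    Sb = np.zeros((n_features, n_features))  # between-class scatter
    for label in np.unique(y):
        Zc = Z[y == label]
        mean_c = Zc.mean(axis=0)
        centered = Zc - mean_c
        Sw += centered.T @ centered
        diff = (mean_c - grand_mean)[:, None]
        Sb += len(Zc) * (diff @ diff.T)
    # The first discriminant is the leading eigenvector of pinv(Sw) @ Sb.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    return eigvecs[:, np.argmax(eigvals.real)].real

def rank_parameters(names, X, y):
    """Order parameter names by absolute first-function coefficient, largest first."""
    coefs = np.abs(first_discriminant_coefficients(X, y))
    return [names[i] for i in np.argsort(coefs)[::-1]]

# Synthetic frames of one phoneme for three hypothetical speakers:
# only F0 differs between speakers, F1 and OQ are speaker-independent noise.
rng = np.random.default_rng(0)
names = ["F0", "F1", "OQ"]
frames, labels = [], []
for speaker, f0_mean in enumerate([110.0, 160.0, 210.0]):
    f0 = rng.normal(f0_mean, 5.0, size=40)
    f1 = rng.normal(500.0, 30.0, size=40)
    oq = rng.normal(0.6, 0.05, size=40)
    frames.append(np.column_stack([f0, f1, oq]))
    labels += [speaker] * 40
X = np.vstack(frames)
y = np.array(labels)

ranking = rank_parameters(names, X, y)
```

Because only F0 separates the synthetic speakers, it should come first in `ranking`; with real data the order would depend on the phoneme and the speakers involved.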
Application to a Voice Quality Task • We extracted four sentences from the Brian VOQUAL'03 database: normal, clear, creaky, and relax • We analyzed two phonemes of the sentence “She has left for a great party today” • We wanted to rank parameter importance for discriminating between the four classes • We use the coefficients of the first discriminant function
Results I: Voice Quality Task • [Figure: frame classification for phonemes E and A using LDA, shown on the first two discriminant functions; classes: normal, creaky, clear, relax]
Results II: Voice Quality Task • [Figure: absolute values of the coefficients that multiply each parameter in the first discriminant function, for phonemes E and A]
Results III: Speaker Identity • [Figure: number of times each parameter has been the most relevant (top) and the least relevant (bottom) in the first discriminant function]
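The counts behind this kind of summary can be obtained by tallying, over all per-phoneme analyses, which parameter is ranked first and which is ranked last. The sketch below uses hypothetical rankings and parameter labels (B1 stands for a formant bandwidth); they illustrate the bookkeeping only and are not results from the study.

```python
from collections import Counter

def tally_extremes(rankings):
    """Count how often each parameter is ranked first and ranked last."""
    most = Counter(ranking[0] for ranking in rankings)
    least = Counter(ranking[-1] for ranking in rankings)
    return most, least

# Hypothetical per-phoneme rankings, most relevant parameter first.
rankings = [
    ["F0", "F1", "OQ", "B1"],
    ["F1", "F0", "B1", "OQ"],
    ["F0", "OQ", "F1", "B1"],
]

most, least = tally_extremes(rankings)
```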
Conclusions • Parameter importance depends on: • the type of speech • the gender of the speaker • the phonemes under study • Results show that F0, formant frequencies and OQ are the most important parameters for speaker classification.