The study analyzes which speech parameters matter most for speaker identity, within a voice conversion system based on transformation functions. It covers the system description, parameter extraction, voice conversion and synthesis, parameter analysis, and an application to a voice quality task, followed by results and conclusions.
Analysis of Parameter Importance in Speaker Identity • Ricardo de Córdoba, Juana M. Gutiérrez-Arriola • Speech Technology Group, Departamento de Ingeniería Electrónica, Universidad Politécnica de Madrid • e-mail: cordoba@die.upm.es, jmga@ics.upm.es
Index • Introduction • System description • Parameter extraction • Voice conversion and synthesis • Parameter analysis • Application to a voice quality task • Results • Conclusions
Introduction • [Block diagram: the source speaker's voice and the target speaker's voice are each analyzed into parameters; transformation functions are computed from the two parameter sets; the source parameters are converted with these functions and synthesized into speech with the target speaker's voice]
Parameter Extraction I • Glottal parameters:
Parameter Extraction IV • We calculate F0, AV, AF, formant frequencies and bandwidths • Pitch marks and formants are manually revised • Only voiced sounds are transformed
Voice conversion I • Linear transformation functions: • For each pair of source-target units, we compute the transformation coefficients, which are stored in a file
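The exact form of the transformation functions is not preserved in this transcript. As a minimal sketch, assuming one affine map `target ≈ a * source + b` per parameter, fitted by least squares over time-aligned frames of a source-target unit pair, the coefficients could be computed and applied as follows. The function names and the F0 values are hypothetical and are not taken from the study.

```python
def fit_linear_transform(source, target):
    """Least-squares fit of target ≈ a * source + b for one parameter track.

    Assumption: `source` and `target` are time-aligned frame values of the
    same parameter (e.g. F0) for one pair of source-target units.
    """
    n = len(source)
    mean_s = sum(source) / n
    mean_t = sum(target) / n
    var_s = sum((s - mean_s) ** 2 for s in source)
    if var_s == 0:
        raise ValueError("source track is constant; slope is undefined")
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(source, target))
    a = cov / var_s
    b = mean_t - a * mean_s
    return a, b


def apply_linear_transform(source, coeffs):
    """Convert source frame values with previously fitted coefficients."""
    a, b = coeffs
    return [a * s + b for s in source]


# Hypothetical F0 values in Hz, for illustration only.
src_f0 = [110.0, 115.0, 120.0, 125.0]
tgt_f0 = [205.0, 215.0, 225.0, 235.0]

coeffs = fit_linear_transform(src_f0, tgt_f0)
converted = apply_linear_transform(src_f0, coeffs)
```

In a system like the one described, one such coefficient pair per parameter and per unit pair would be written to the coefficient file and looked up at conversion time.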
Synthesis • Formant synthesizer (Klatt) • Parameterized units concatenation • Prosodic modification, changing glottal pulse length and the number of glottal pulses • Formant smoothing during unit transitions
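The slides do not say how formants are smoothed at unit transitions. One simple possibility, shown here only as a sketch, is to replace the few frames on each side of the join with a straight line between their outer values. The function name, the window size, and the formant values are assumptions made for illustration.

```python
def smooth_transition(left, right, n_frames=2):
    """Concatenate two formant tracks and linearly interpolate across the join.

    Assumption: the last `n_frames` of `left` and the first `n_frames` of
    `right` are replaced by a straight line between their outer endpoints.
    """
    n = min(n_frames, len(left), len(right))
    track = [float(v) for v in left] + [float(v) for v in right]
    if n == 0:
        return track
    start = len(left) - n        # first frame of the smoothing window
    stop = len(left) + n - 1     # last frame of the smoothing window
    first, last = track[start], track[stop]
    span = stop - start
    for i in range(start, stop + 1):
        track[i] = first + (last - first) * (i - start) / span
    return track


# Hypothetical first-formant values in Hz for two concatenated units.
unit_a = [500.0, 500.0, 500.0, 500.0]
unit_b = [700.0, 700.0, 700.0, 700.0]
smoothed = smooth_transition(unit_a, unit_b, n_frames=2)
```

With these inputs the 200 Hz jump at the join is spread over four frames instead of occurring between two adjacent frames.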
Parameter Analysis I • 11 speakers (5 female, 6 male) • EUROM1 database in Castilian Spanish • Sentence: “Mi abuelo me animó a estudiar solfeo” (My grandfather encouraged me to study solfa) • Fs = 16 kHz
Parameter Analysis III • We want to know which parameters are actually relevant for speaker identity • Discriminant functions are linear combinations of variables that best discriminate between classes • They can be used to rank the variables by their relative contribution to class discrimination • Linear discriminant analysis (LDA) is performed for each phoneme of the sentence (it does not work well for the whole sentence) • The coefficients of the first discriminant function are used to rank the parameters
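The ranking step can be sketched with NumPy as below. This is a textbook formulation of LDA, not necessarily the tool or the coefficient standardization the authors used: each parameter is standardized with its overall standard deviation (an assumption) so that coefficient magnitudes are comparable, and the first discriminant function is taken as the leading eigenvector of `pinv(Sw) @ Sb`. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def rank_parameters_by_lda(X, y, names):
    """Rank parameters by |coefficient| in the first discriminant function.

    X: (n_frames, n_params) array, y: class label per frame,
    names: parameter names in column order.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Assumption: standardize each parameter so coefficients are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    overall_mean = Z.mean(axis=0)
    p = Z.shape[1]
    Sw = np.zeros((p, p))  # within-class scatter
    Sb = np.zeros((p, p))  # between-class scatter
    for label in np.unique(y):
        Zc = Z[y == label]
        mean_c = Zc.mean(axis=0)
        centered = Zc - mean_c
        Sw += centered.T @ centered
        diff = (mean_c - overall_mean)[:, None]
        Sb += len(Zc) * (diff @ diff.T)
    # The first discriminant function is the eigenvector with the largest eigenvalue.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    first = eigvecs[:, np.argmax(eigvals.real)].real
    order = np.argsort(-np.abs(first))
    return [(names[i], float(abs(first[i]))) for i in order]


# Synthetic frames for one phoneme: three speakers who differ only in F0.
rng = np.random.default_rng(0)
labels = np.repeat(["spk1", "spk2", "spk3"], 50)
f0 = np.concatenate([rng.normal(m, 5.0, 50) for m in (100.0, 150.0, 200.0)])
f1 = rng.normal(500.0, 40.0, 150)
oq = rng.normal(0.6, 0.05, 150)
X = np.column_stack([f0, f1, oq])

ranking = rank_parameters_by_lda(X, labels, ["F0", "F1", "OQ"])
```

Because only F0 separates the synthetic speakers, it comes out first in `ranking`; with real data, the function would be run once per phoneme, as the slide describes.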
Application to a Voice Quality Task • We extracted four sentences from the Brian VOQUAL'03 database: normal, clear, creaky, and relax • We analyzed two phonemes of the sentence “She has left for a great party today” • We wanted to rank parameter importance for discriminating between the four classes • We used the coefficients of the first discriminant function
Results I: Voice Quality Task • [Figure, panels E and A: frame classification for the phonemes E and A using LDA, plotted on the first two discriminant functions; classes: normal, creaky, clear, relax]
Results II: Voice Quality Task • [Figure, panels E and A: absolute values of the coefficients that multiply each parameter in the first discriminant function]
Results III: Speaker Identity • [Figure: number of times each parameter was the most relevant (top) and the least relevant (bottom) in the first discriminant function]
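The counts summarized in this figure could be produced by a tally like the one below, assuming the first-function coefficients have already been computed for each phoneme. The function name and the coefficient values are made up for illustration and are not results from the study.

```python
from collections import Counter


def tally_extremes(coeffs_by_phoneme):
    """Count how often each parameter has the largest and the smallest
    absolute coefficient in the first discriminant function."""
    most, least = Counter(), Counter()
    for coeffs in coeffs_by_phoneme.values():
        most[max(coeffs, key=lambda name: abs(coeffs[name]))] += 1
        least[min(coeffs, key=lambda name: abs(coeffs[name]))] += 1
    return most, least


# Hypothetical coefficients per phoneme (illustrative values only).
example = {
    "a": {"F0": 0.9, "F1": -0.4, "OQ": 0.1},
    "e": {"F0": 0.2, "F1": 0.7, "OQ": -0.05},
    "o": {"F0": -0.8, "F1": 0.3, "OQ": 0.5},
}
most, least = tally_extremes(example)
```

Here `most` counts F0 twice and F1 once, while `least` counts OQ twice and F1 once.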
Conclusions • Parameter importance depends on: • the type of speech • the gender of the speaker • the phonemes under study • Results show that F0, formant frequencies and OQ are the most important parameters for speaker classification.