Speaker Verification System using SVM

Jun-Won Suh • Intelligent Electronic Systems • Human and Systems Engineering • Department of Electrical and Computer Engineering

Outline: Summary of the Ph.D. dissertation of Vincent Wan • Speaker verification system • Extracting features • Creating models of speakers • Generative models, discriminative models • Making generative models discriminative • Developing speaker verification using SVMs • Ideas for improving our own system.

Speaker verification system • Authenticates a person’s claimed identity • Can be text-dependent or text-independent • The system models the sound of the client’s voice, which depends on the physical characteristics of the client’s vocal tract • Feature extraction • Enrolment: creates a model of the client’s voice • Pattern matching • Decision theory • [Figure: A generic speaker verification system]

Extracting features • Building models of speakers depends on frequency analysis of the speaker’s voice. • Linear predictive coding (LPC) • LPC assumes that speech can be modelled as the output of a linear filter excited by periodic pulses or random noise. • The LPC coefficients are obtained by minimizing the mean squared prediction error (MSE). • Perceptual linear prediction (PLP) • PLP combines LPC analysis with psychophysical knowledge of the human auditory system. • Example: the human ear has a higher frequency resolution at low frequencies.
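The LPC bullet above can be illustrated with a minimal sketch. It assumes the autocorrelation method solved by the Levinson-Durbin recursion, which is one common way to minimize the mean squared prediction error; the slides do not say which solver the dissertation uses. The function names `autocorrelation` and `lpc` are illustrative only.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation r[0..max_lag] of one finite analysis frame."""
    n = len(signal)
    return [sum(signal[t] * signal[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def lpc(signal, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err) where the predictor is
        x_hat[t] = a[0]*x[t-1] + a[1]*x[t-2] + ... + a[order-1]*x[t-order]
    and err is the remaining prediction error energy.
    Assumes the frame is not all zeros (r[0] > 0).
    """
    r = autocorrelation(signal, order)
    a = []
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for model order i.
        acc = r[i] - sum(a[j] * r[i - 1 - j] for j in range(i - 1))
        k = acc / err
        # Update the lower-order coefficients, then append the new one.
        a = [a[j] - k * a[i - 2 - j] for j in range(i - 1)] + [k]
        err *= (1.0 - k * k)
    return a, err


# Usage: a decaying exponential is predicted almost perfectly by one tap.
frame = [0.9 ** t for t in range(200)]
coeffs, residual = lpc(frame, 2)
```

In practice each frame would be windowed first and the coefficients converted to cepstra or PLP features; those steps are omitted here.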

Creating models of speakers • Generative models • Gaussian Mixture Model (GMM), Hidden Markov Model (HMM) • These models are probability density estimators that attempt to capture all of the fluctuations and variations of the data. • Discriminative models • Polynomial classifiers, Support Vector Machines (SVM) • These models are optimized to minimize the error on a set of training samples. • They draw the boundary between classes and ignore the fluctuations within each class. • Generative versus discriminative models • Generative models estimate the within-class probability densities and do not directly minimize a classification error. • Discriminative models tend to achieve the highest performance in classification tasks.

Making generative models discriminative • GMM-LR/SVM combination • GMM likelihood ratio • Bengio proposed that, because the probability estimates are not perfect, a better version of the decision function can be learned • Bayes decision rule • The input to the SVM is the two-dimensional vector made up of the log likelihoods of the client and world models. • A limitation of these approaches is that discrimination is performed on a frame-by-frame basis.
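As a rough illustration of the two-dimensional SVM input described above, the sketch below computes average log likelihoods under a client and a world GMM and applies a linear decision function to them. It assumes diagonal-covariance GMMs and hand-set weights `a`, `b`, `c`; in the GMM-LR/SVM combination these weights would be learned by the SVM, and training is not shown. With `a = 1`, `b = -1` the function reduces to the usual log likelihood ratio test against a threshold. All names and numbers are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log likelihood of a sequence under a GMM."""
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        top = max(comps)  # log-sum-exp for numerical stability
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total / len(frames)


def linear_decision(ll_client, ll_world, a=1.0, b=-1.0, c=0.0):
    """Linear decision in the 2-D log-likelihood space; accept if > 0."""
    return a * ll_client + b * ll_world + c


# Toy models (made-up parameters, for illustration only).
client = ([1.0], [[0.0, 0.0]], [[1.0, 1.0]])
world = ([0.5, 0.5], [[3.0, 3.0], [-3.0, -3.0]], [[1.0, 1.0], [1.0, 1.0]])

utterance = [[0.1, -0.2], [0.0, 0.3]]
features = (gmm_avg_loglik(utterance, *client),
            gmm_avg_loglik(utterance, *world))
accept = linear_decision(*features) > 0
```

Because the decision is made on one pair of utterance-level scores, the discriminative part only has two numbers to work with, which is the motivation for the richer score-space representation later in the talk.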

Importance of kernels • Early SVMs using polynomial and RBF kernels • The optimization problems required computational resources that were unsustainable. • Clustering algorithms were employed to reduce the amount of training data, at the cost of reduced accuracy. • Frame-level training inputs discard useful speaker classification information. • SVMs using score-space kernels • Variable-length utterances can be classified at the sequence level.

Classifying sequences using score-space kernels • The score-space kernel enables SVMs to classify whole sequences. • A variable-length sequence of input vectors is mapped explicitly onto a single point in a space of fixed dimension. • The score-space is derived from the likelihood score. • The likelihood ratio score-space
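The mapping from a variable-length sequence to a fixed-dimension point can be sketched as follows. This is a simplified illustration, not the dissertation's exact formulation: it assumes each generative model is a single one-dimensional Gaussian with parameters (mean, variance), takes the score vector to be the gradient of the log likelihood ratio with respect to all parameters of both models, and divides by the sequence length so that long and short utterances are comparable. The function names are hypothetical.

```python
def gaussian_score(frames, mean, var):
    """Length-normalized gradient of the sequence log likelihood with
    respect to the mean and variance of a 1-D Gaussian."""
    n = len(frames)
    d_mean = sum((x - mean) / var for x in frames) / n
    d_var = sum(-0.5 / var + (x - mean) ** 2 / (2.0 * var ** 2)
                for x in frames) / n
    return [d_mean, d_var]


def lr_score_vector(frames, client, world):
    """Gradient of log p(X|client) - log p(X|world) with respect to the
    parameters of both models. The world part enters with a minus sign
    because the world log likelihood is subtracted."""
    g_client = gaussian_score(frames, *client)
    g_world = gaussian_score(frames, *world)
    return g_client + [-g for g in g_world]


# Sequences of different lengths map to vectors of the same dimension.
client = (0.0, 1.0)   # (mean, variance), made-up values
world = (1.0, 2.0)
short = lr_score_vector([1.0, -1.0], client, world)
longer = lr_score_vector([0.2, -0.4, 0.9, 1.3, -0.1], client, world)
```

The output has one entry per model parameter (four here), regardless of how many frames the utterance contains. With full GMMs the same idea applies, with derivatives taken for every mixture weight, mean, and variance.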

Computing the score-space vectors • Define the global likelihood of a sequence X = {x_1, …, x_{N_l}}

Computing the score-space vectors • The fixed-length vectors of the likelihood ratio kernel can be expressed as • The final likelihood ratio kernel is • The dimensionality of the score-space is equal to the total number of parameters in the generative models. Hence the SVM can classify complete utterance sequences.
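Once every utterance has a fixed-length score vector, a kernel between two utterances can be computed as an inner product in that space. The sketch below assumes a simple per-dimension standardization estimated from training score vectors before taking the dot product; this diagonal normalization is an assumption made for illustration, and the dissertation's kernel may use a different metric. The names `fit_scaler` and `score_space_kernel` are hypothetical.

```python
import math


def fit_scaler(vectors):
    """Per-dimension mean and standard deviation of training score vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dim)]
    stds = []
    for d in range(dim):
        var = sum((v[d] - means[d]) ** 2 for v in vectors) / n
        # Guard against constant dimensions to avoid division by zero.
        stds.append(math.sqrt(var) if var > 0.0 else 1.0)
    return means, stds


def score_space_kernel(u, v, means, stds):
    """Dot product of two standardized score vectors."""
    return sum(((a - m) / s) * ((b - m) / s)
               for a, b, m, s in zip(u, v, means, stds))


# Usage with two made-up 2-D score vectors.
train = [[0.0, 0.0], [2.0, 4.0]]
means, stds = fit_scaler(train)
k_same = score_space_kernel(train[0], train[0], means, stds)
k_cross = score_space_kernel(train[0], train[1], means, stds)
```

A Gram matrix built from this function over all training utterances could then be passed to any SVM trainer that accepts precomputed kernels.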

Experimental results on PolyVar • The data is noisy. • The data has many more client tests than YOHO.

Conclusion • Add a GMM-LR/SVM model to our verification system • Add a score-space kernel to the SVM • Need to compare the computational requirements of the Fisher and LR kernels.

References • V. Wan, Speaker Verification using Support Vector Machines, University of Sheffield, June 2003 • V. Wan, Building Sequence Kernels for Speaker Verification and Speech Recognition, University of Sheffield • S. Bengio and J. Mariéthoz, Learning the Decision Function for Speaker Verification, IDIAP, 2001
