Designing multiple biometric systems: Measure of ensemble effectiveness


  1. Designing multiple biometric systems: Measure of ensemble effectiveness Allen Tang OPLab @ NTUIM

  2. Agenda Introduction Measures of performance Measures of ensemble effectiveness Combination Rules Experimental Results Conclusion

  3. INTRODUCTION

  4. Introduction Multimodal biometric systems perform better than unimodal ones. They fuse the results of multiple biometric experts. Fusion at the matching-score level is the easiest to implement.

  5. Introduction Which biometric experts shall we choose? How do we evaluate ensemble effectiveness? Which measure gives the best result?

  6. MEASURES OF PERFORMANCE

  7. Measures of performance Notation: E = {E1, …, Ej, …, EN}: a set of N experts; U = {ui}: the set of users; sj: the set of all scores produced by Ej over all users; sij: the score produced by Ej for user ui; fj(ui): the function by which Ej produces sij for ui; th: decision threshold; gen: genuine; imp: impostor.

  8. Measures of performance: Basic False Rejection Rate (FRR) for expert Ej: FRRj(th) = card{sij ∈ sj^gen : sij < th} / card(sj^gen). False Acceptance Rate (FAR) for expert Ej: FARj(th) = card{sij ∈ sj^imp : sij ≥ th} / card(sj^imp).

  9. Measures of performance: Basic p(sj|gen): probability distribution of the scores produced by Ej for genuine users. p(sj|imp): probability distribution of the scores produced by Ej for impostor users. The threshold th changes with the requirements of the application at hand.
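
A minimal Python sketch of these two definitions (assuming higher scores indicate genuine users, and acceptance when sij ≥ th):

```python
import numpy as np

def frr_far(genuine_scores, impostor_scores, th):
    """FRR/FAR of a single expert at threshold th."""
    gen = np.asarray(genuine_scores)
    imp = np.asarray(impostor_scores)
    frr = np.mean(gen < th)   # fraction of genuine users rejected
    far = np.mean(imp >= th)  # fraction of impostors accepted
    return frr, far
```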

  10. Measures of performance Area under the ROC curve (AUC), Equal Error Rate (EER), and the “decidability” index d’.

  11. Measures of performance [figure]

  12. Measures of performance: AUC Estimate the AUC by the Mann-Whitney (WMW) statistic: AUCj = (1 / (n+ · n−)) · Σi Σk I(s_i^gen > s_k^imp). This formulation of the AUC is also called the “probability of correct pair-wise ranking”, as it computes the probability P(sj^gen > sj^imp).

  13. Measures of performance: AUC n+/n−: number of genuine/impostor users; sj^gen: the set of scores produced by Ej for genuine users; sj^imp: the set of scores produced by Ej for impostor users.

  14. Measures of performance: AUC Features of the AUC estimated by the WMW statistic: it is theoretically equivalent to the value obtained by integrating the ROC curve; it attains a more reliable estimate of the AUC in real cases (finite samples); it divides all scores sij into 2 sets: sj^gen & sj^imp.
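
A short sketch of the WMW estimate (the half-credit for tied scores is a common convention, not stated on the slides):

```python
import numpy as np

def auc_wmw(genuine_scores, impostor_scores):
    """AUC of one expert via the Mann-Whitney (WMW) statistic:
    the fraction of (genuine, impostor) score pairs ranked correctly."""
    gen = np.asarray(genuine_scores)[:, None]   # shape (n+, 1)
    imp = np.asarray(impostor_scores)[None, :]  # shape (1, n-)
    correct = (gen > imp).astype(float)
    correct += 0.5 * (gen == imp)               # ties count as half
    return correct.mean()
```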

  15. Measures of performance: EER The EER is the point of the ROC curve where FAR and FRR are equal. The lower the value of the EER, the better the performance of a biometric system.
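
A simple way to approximate the EER is to sweep the observed scores as candidate thresholds and take the point where FAR and FRR are closest (a sketch, not the paper’s procedure):

```python
import numpy as np

def eer(genuine_scores, impostor_scores):
    """Approximate EER: the operating point where |FAR - FRR| is smallest."""
    gen = np.asarray(genuine_scores)
    imp = np.asarray(impostor_scores)
    best_gap, best_eer = np.inf, None
    for th in np.unique(np.concatenate([gen, imp])):
        frr = np.mean(gen < th)
        far = np.mean(imp >= th)
        if abs(far - frr) < best_gap:
            best_gap, best_eer = abs(far - frr), (far + frr) / 2
    return best_eer
```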

  16. Measures of performance: d’ The d’ index in biometrics measures the separability of the distributions of genuine and impostor scores.

  17. Measures of performance: d’ d’ = |μgen − μimp| / √((σ²gen + σ²imp) / 2), where μgen/μimp: mean of the genuine/impostor score distribution; σgen/σimp: standard deviation of the genuine/impostor score distribution. The larger the d’, the better the performance of a biometric system.
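
In code, the index is a direct transcription of the formula:

```python
import numpy as np

def d_prime(genuine_scores, impostor_scores):
    """Decidability index d': separation of the genuine and impostor
    score distributions in units of their pooled spread."""
    mu_g, mu_i = np.mean(genuine_scores), np.mean(impostor_scores)
    var_g, var_i = np.var(genuine_scores), np.var(impostor_scores)
    return abs(mu_g - mu_i) / np.sqrt((var_g + var_i) / 2.0)
```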

  18. MEASURES OF ENSEMBLE EFFECTIVENESS

  19. Measures of ensemble effectiveness Four measures for estimating the effectiveness of an ensemble of biometric experts: AUC, EER, d’, and the Score Dissimilarity (SD) index. The differences in performance among the experts must also be taken into account.

  20. Measures of ensemble effectiveness Generic, weighted and normalized performance measure (pm) formulation: pmδ = μpm · (1 − tanh(σpm)). For the AUC: AUCδ = μAUC · (1 − tanh(σAUC)). The higher the AUC average, the better the performance of an ensemble of experts.

  21. Measures of ensemble effectiveness For the EER: EERδ = μEER · (1 − tanh(σEER)). The lower the EER average, the better the performance of an ensemble of experts. For d’, since its value can be much larger than 1, the normalized D’ = log_b(1 + d’) is used instead of d’, with base b = 10 chosen according to the values of d’ observed in the experiments. Thus D’δ = μD’ · (1 − tanh(σD’)) is used.
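
All four pmδ variants share the same aggregation over the per-expert values of a measure; a sketch:

```python
import numpy as np

def pm_delta(values):
    """Weighted/normalized ensemble measure: mean * (1 - tanh(std)),
    computed over the per-expert values of a performance measure."""
    values = np.asarray(values, dtype=float)
    return values.mean() * (1.0 - np.tanh(values.std()))

def d_prime_delta(d_primes, base=10.0):
    """D' variant: normalize d' with log_b(1 + d') before aggregating."""
    d_norm = np.log(1.0 + np.asarray(d_primes)) / np.log(base)
    return pm_delta(d_norm)
```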

  22. Measures of ensemble effectiveness: SD index The SD index is based on the WMW formulation of the AUC and is designed to measure the improvement in AUC attainable by combining an ensemble of experts. The SD index measures the amount of AUC that can be “recovered” by exploiting the complementarity of the experts.

  23. Measures of ensemble effectiveness: SD index Consider 2 experts E1 & E2 and all possible pairs of genuine and impostor scores (s_i^gen, s_k^imp); divide these pairs into 4 subsets S00, S10, S01, S11, where S11 contains the pairs ranked correctly by both experts, S10 those ranked correctly only by E1, S01 those ranked correctly only by E2, and S00 those ranked correctly by neither.

  24. Measures of ensemble effectiveness: SD index The AUCs of E1 & E2 can be written as AUC1 = (card(S11) + card(S10)) / (n+ · n−) and AUC2 = (card(S11) + card(S01)) / (n+ · n−), where card(Suv) is the cardinality of the subset Suv. The SD index is defined as: SD = (card(S10) + card(S01)) / (n+ · n−).

  25. Measures of ensemble effectiveness: SD index The higher the value of SD, the higher the maximum AUC that could be obtained by the combined scores. The actual increment in AUC, however, depends on the combination method, and high SD values are usually related to low-performance experts. Performance measure formulation for SD: SDδ = μSD · (1 − tanh(σSD)).
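
A sketch of the SD index for two experts (it assumes the score arrays of the two experts are aligned, i.e. gen1[i] and gen2[i] come from the same genuine attempt):

```python
import numpy as np

def sd_index(gen1, imp1, gen2, imp2):
    """SD index: fraction of (genuine, impostor) pairs ranked
    correctly by exactly one of the two experts."""
    ok1 = np.asarray(gen1)[:, None] > np.asarray(imp1)[None, :]
    ok2 = np.asarray(gen2)[:, None] > np.asarray(imp2)[None, :]
    # card(S10)/(n+ * n-) + card(S01)/(n+ * n-)
    return np.mean(ok1 & ~ok2) + np.mean(~ok1 & ok2)
```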

  26. COMBINATION RULES

  27. Combination Rules Combination (fusion) in this work is performed at the score level, as it is the most widely used and flexible combination level. The performance of 4 combination methods is investigated: the mean rule, the product rule, linear combination by LDA, and DSS. LDA & DSS require a training phase to estimate the parameters needed to perform the combination.

  28. Combination Rules: Mean Rule The mean rule is applied directly to the matching scores produced by the set of N experts: s_i^mean = (1/N) · Σj sij.

  29. Combination Rules: Product Rule The product rule is applied directly to the matching scores produced by the set of N experts: s_i^prod = Πj sij.
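
Both rules reduce to one line each; the sketch below assumes the scores have already been normalized to a common range (e.g. [0, 1]), to which the product rule in particular is sensitive:

```python
import numpy as np

def mean_rule(scores):
    """scores: array of shape (n_users, N), one column per expert."""
    return np.asarray(scores).mean(axis=1)

def product_rule(scores):
    return np.asarray(scores).prod(axis=1)
```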

  30. Combination Rules: Linear Combination by LDA Linear discriminant analysis (LDA) can be used to compute the weights of a linear combination of the scores. This rule aims to attain fused scores with minimum within-class variation and maximum between-class variation.

  31. Combination Rules: Linear Combination by LDA The fused score is s_i = Wᵀ · Si, where W: transformation vector computed using a training set; Si: vector of the scores assigned to user ui by all the experts; μgen/μimp: mean vectors of the genuine/impostor score distributions; Sw: within-class scatter matrix.
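
A sketch assuming the standard two-class Fisher solution W = Sw⁻¹ · (μgen − μimp); the scale of W is arbitrary and does not affect the ranking of the fused scores:

```python
import numpy as np

def lda_weights(train_gen, train_imp):
    """Fisher LDA weights for linear score fusion.
    train_gen/train_imp: arrays of shape (n_samples, N_experts)."""
    mu_g = train_gen.mean(axis=0)
    mu_i = train_imp.mean(axis=0)
    # within-class scatter ~ sum of the per-class covariance matrices
    sw = np.cov(train_gen, rowvar=False) + np.cov(train_imp, rowvar=False)
    return np.linalg.solve(sw, mu_g - mu_i)

# fused score for user i: w @ s_i
```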

  32. Combination Rules: DSS Dynamic score selection (DSS) selects one of the scores sij available for each user ui, instead of fusing them into a new score. The ideal selector is based on knowledge of the state of nature of each user: s_i = max_j(sij) if ui is a genuine user, s_i = min_j(sij) if ui is an impostor (5).

  33. Combination Rules: DSS DSS selects the scores according to an estimate of the state of nature of each user; the algorithm is based on a quadratic discriminant classifier (QDC). For the estimation, a vector space is built where the vector components are the scores assigned to the user by the N experts.

  34. Combination Rules: DSS Train a classifier on this vector space using a training set related to genuine and impostor users. Use the classifier to estimate the state of nature of each user. Given the estimated state of nature, select the user’s score according to (5).
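
A sketch of DSS using scikit-learn’s QuadraticDiscriminantAnalysis as a stand-in for the QDC (the slides name the classifier but not a specific implementation):

```python
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

def dss(train_scores, train_labels, test_scores):
    """Dynamic score selection: estimate each user's state of nature,
    then pick the max score if predicted genuine, the min if predicted
    impostor, as in (5).
    train_scores/test_scores: arrays of shape (n_users, N_experts);
    train_labels: 1 for genuine, 0 for impostor."""
    qdc = QuadraticDiscriminantAnalysis()
    qdc.fit(train_scores, train_labels)
    predicted_genuine = qdc.predict(test_scores) == 1
    return np.where(predicted_genuine,
                    test_scores.max(axis=1),
                    test_scores.min(axis=1))
```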

  35. EXPERIMENTAL RESULTS

  36. Experimental Results: Goal Investigate the correlation between the measures of ensemble effectiveness and the final performance achieved by the combined experts, in order to identify the best measure.

  37. Experimental Results: Preparation Score source: 41 experts and 4 DBs from the open category of the 3rd Fingerprint Verification Competition (FVC2004). Number of scores: for each sensor and each expert, a total of 7,750 scores; attempts from genuine/impostor users are 2,800/4,950. For LDA & DSS training, the scores are divided into 4 subsets, each with 700 genuine and 1,238 impostor scores.

  38. Experimental Results: Process Number of expert pairs: 13,120 (41 · 40 · 2 · 4). For each pair, compute the measures of effectiveness based on AUC, EER, d’ and the SD index. Combine the pairs using the 4 combination rules, then compute the resulting values of AUC and EER to assess the performance. A graphical representation of the results of the experiments is used.

  39.–40. Experimental Results: AUCδ plotted against AUC [figures: scatter plots of AUCδ vs. the AUC of the combined experts]

  41. Experimental Results: AUCδ plotted against AUC According to the graphs, AUCδ is not useful, as it shows no clear relationship with the AUC obtained by the combination rules. A high AUCδ attains a high AUC, but lower AUCδ values correspond to AUC values spanning a wide range. A high AUCδ relates to pairs of experts with high performance and similar behavior. The mean rule gives the best AUC.

  42.–43. Experimental Results: AUCδ plotted against EER [figures: scatter plots of AUCδ vs. the EER of the combined experts]

  44. Experimental Results: AUCδ plotted against EER AUCδ is uncorrelated with the EER too: for any value of AUCδ, the EER spans a wide range of values, so the performance of the combination in terms of EER cannot be predicted from AUCδ.

  45.–46. Experimental Results: EERδ plotted against AUC [figures: scatter plots of EERδ vs. the AUC of the combined experts]

  47. Experimental Results: EERδ plotted against AUC The behavior is better than that of AUCδ, but there is still no clear relationship between EERδ and AUC. The mean rule again gives the best result.

  48.–49. Experimental Results: EERδ plotted against EER [figures: scatter plots of EERδ vs. the EER of the combined experts]

  50. Experimental Results: EERδ plotted against EER There is no correlation between EERδ and EER. The graphs of AUCδ against EER and of EERδ against EER show similar results. Thus AUC and EER are not suitable for evaluating combinations of experts, despite being widely used for unimodal biometric systems.
