
Signature with Text-Dependent and Text-Independent Speech for Robust Identity Verification

Signature with Text-Dependent and Text-Independent Speech for Robust Identity Verification. B. Ly-Van*, R. Blouet**, S. Renouard** S. Garcia-Salicetti*, B. Dorizzi*, G. Chollet** * INT, dept EPH, 9 rue Charles Fourier, 91011 EVRY France; **ENST, Lab. CNRS-LTCI, 46 rue Barrault, 75634 Paris


Presentation Transcript


  1. Signature with Text-Dependent and Text-Independent Speech for Robust Identity Verification B. Ly-Van*, R. Blouet**, S. Renouard**, S. Garcia-Salicetti*, B. Dorizzi*, G. Chollet** * INT, dept EPH, 9 rue Charles Fourier, 91011 EVRY France; ** ENST, Lab. CNRS-LTCI, 46 rue Barrault, 75634 Paris Emails: {Bao.Ly_van, Sonia.Salicetti, Bernadette.dorizzi}@int-evry.fr; {Blouet, Renouard, Chollet}@tsi.enst.fr

  2. Overview • Introduction: Why Speech and Signature? • BIOMET database: brief description • Signature data • Speech data • Writer verification • Speaker verification systems • Fusion systems • Results and Conclusions

  3. The BIOMET Database • 5 modalities: hand shape, fingerprints, on-line signatures, talking faces (face and voice) • 131 people: 50% male, 50% female • Data from 68 people used for fusion • Time variability: two sessions 5 months apart • S. Garcia-Salicetti, C. Beumier, G. Chollet, B. Dorizzi, J. Leroux-Les Jardins, J. Lunter, Y. Ni, D. Petrovska-Delacretaz, "BIOMET: a Multimodal Person Authentication Database Including Face, Voice, Fingerprint, Hand and Signature Modalities", 4th International Conference on Audio and Video-Based Biometric Person Authentication, 2003.

  4. Signature capture • Captured on a digitizer at 200 Hz • WACOM Intuos2 A6 • 5 parameters: • Coordinates • Axial pressure • Azimuth (0°-359°) and altitude (0°-90°) • 15 genuine signatures per person • 12 forgeries per person • [Figure: pen azimuth and altitude angles, with azimuth marks at 0°, 90°, 180° and 270°]

  5. Signature modeling • Preprocessing (filtering) • Feature extraction: 12 parameters • Signature model: continuous HMM • 2 states, 3 Gaussians per state • Bagging techniques: 10 models combined into an «aggregated» model (average score) • Training: 10 signatures from one session • Normalized score: |Si(O) - Si*|
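The aggregation and normalization steps on this slide can be sketched as follows. This is only an illustration: each bagged model is represented by a hypothetical scoring callable, and the reference score Si* is passed in as a parameter because the slide does not say how it is obtained.

```python
def aggregated_score(observation, models):
    """Average the scores given by the bagged models.

    `models` is a list of callables, each returning a score for the
    observation (hypothetical stand-ins for the 10 HMMs of the slide).
    """
    return sum(model(observation) for model in models) / len(models)


def normalized_score(observation, models, reference_score):
    """Normalized score |Si(O) - Si*|; `reference_score` plays the role of Si*."""
    return abs(aggregated_score(observation, models) - reference_score)


# Toy usage with two fake "models" instead of ten trained HMMs.
toy_models = [lambda o: o + 1.0, lambda o: o + 3.0]
print(normalized_score(2.0, toy_models, reference_score=5.5))
```

With this convention a smaller normalized score means the signature behaves more like the reference.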

  6. Speech • Two verification systems: • Data: voluntarily degraded • Text-dependent: only sequences of 4 digits among the 10 digits (5 templates per speaker) • Text-independent: sentences extracted from the original data: • client model: trained on digits (15 seconds) and tested on sentences • world model: trained on data from the other 63 people (131 - 68) • Methods: • Text-dependent: DTW (Dynamic Time Warping) • Text-independent: GMM (Gaussian Mixture Model)

  7. Text-dependent (DTW) • DTW computes the spectral distance between two template patterns • [Figure: template speech signal and sample speech signal fed to DTW, which outputs a score]
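A minimal sketch of the DTW distance mentioned on this slide, assuming each signal has already been converted to a sequence of spectral feature vectors. The Euclidean local distance and the three-way step pattern are common textbook choices, not details confirmed by the presentation.

```python
import math


def dtw_distance(template, sample):
    """Accumulated DTW distance between two sequences of feature vectors."""
    n, m = len(template), len(sample)
    inf = float("inf")
    # cost[i][j]: best accumulated distance aligning template[:i] with sample[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(template[i - 1], sample[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # template frame repeated
                cost[i][j - 1],      # sample frame repeated
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# The same pattern spoken more slowly still aligns with zero distance.
fast = [[0.0], [1.0], [2.0]]
slow = [[0.0], [1.0], [1.0], [2.0]]
print(dtw_distance(fast, slow))
```

With several templates per speaker, one plausible way to obtain a single score is to keep the smallest distance over the templates; the slides do not state which rule was used.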

  8. Text-independent (GMM) • [Figure: world data → front-end → GMM modeling → world GMM model; target speaker → front-end → GMM model adaptation → target GMM model]

  9. Baseline GMM method • [Figure: test speech → front-end → hypothesized target GMM model and world GMM model → Λ = LLR score]
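The log-likelihood ratio (LLR) between the target and world models can be sketched as below. The diagonal covariances, the per-frame averaging and the tuple layout `(weights, means, variances)` are assumptions made for the example; the slides do not give these details.

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-density of one feature vector under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_gauss = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
            for x, m, v in zip(frame, mu, var)
        )
        log_terms.append(math.log(w) + log_gauss)
    top = max(log_terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in log_terms))


def llr_score(frames, target_gmm, world_gmm):
    """Average per-frame log-likelihood ratio: target model versus world model."""
    total = sum(
        gmm_log_likelihood(f, *target_gmm) - gmm_log_likelihood(f, *world_gmm)
        for f in frames
    )
    return total / len(frames)


# Toy one-dimensional models with a single Gaussian each.
target = ([1.0], [[0.0]], [[1.0]])
world = ([1.0], [[3.0]], [[1.0]])
print(llr_score([[0.0]], target, world))
```

A positive score means the test frames are better explained by the target model than by the world model.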

  10. Fusion systems • Additive Tree Classifier (ATC) • Boosting techniques on Binary Trees • CART algorithm • Support Vector Machine (SVM) • Linear kernel • Input: • Normalized signature score • Text-dependent LLR score • Text-independent LLR score

  11. Tree-based Approach for score fusion • Goal: finding an optimal partition R = {Rk}, 1 ≤ k ≤ K, of the score space S = (s1, s2, s3) according to an Information Theory criterion • A sub-optimal solution, based on CART: • Best partition: R* = arg min_R C(R) • Score estimation based on P(client|Rk) and P(world|Rk) at each node of a given tree • Use of RealAdaboost to build 50 trees per client and to obtain a robust estimation of P(client|Rk) and P(world|Rk)

  12. Verification based on ATC • A score S = (s1, s2, s3) is presented to the system composed of 50 trees: • each tree outputs a score, based on the region Rk to which S is assigned • the LLR score is computed with P(client|Rk) and P(world|Rk) • an average score is then computed over the 50 scores
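The verification step can be sketched as follows. The data layout is hypothetical: each tree is reduced to a list of leaf regions with axis-aligned bounds (the kind of partition CART produces) and stored posteriors, and the per-tree score is taken to be log(P(client|Rk) / P(world|Rk)), which is one natural reading of "LLR" rather than a formula given on the slide.

```python
import math


def tree_llr(score, tree):
    """Score from one tree: locate the region holding `score`, compare posteriors."""
    for region in tree:
        inside = all(lo <= s < hi for s, (lo, hi) in zip(score, region["bounds"]))
        if inside:
            return math.log(region["p_client"] / region["p_world"])
    raise ValueError("score falls outside every region of the tree")


def atc_score(score, trees):
    """Average of the per-tree scores (50 trees per client in the presentation)."""
    return sum(tree_llr(score, tree) for tree in trees) / len(trees)


# Toy usage: two one-region "trees" covering the whole 3-D score space.
everywhere = [(-math.inf, math.inf)] * 3
trees = [
    [{"bounds": everywhere, "p_client": 0.8, "p_world": 0.2}],
    [{"bounds": everywhere, "p_client": 0.5, "p_world": 0.5}],
]
print(atc_score((0.1, 0.2, 0.3), trees))
```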

  13. SVM principles • Separating hyperplanes H, with the optimal hyperplane Ho • [Figure: input space X mapped by y(X) to the feature space, showing the hyperplanes H and Ho and the output Class(X)]
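For a linear kernel, the trained SVM reduces to a weighted sum of the three input scores plus a bias, and the sign of that sum gives the class. The sketch below only shows this decision step; the weights and bias are made-up values, not parameters from the presentation.

```python
def svm_decision(scores, weights, bias):
    """Signed decision value of a linear SVM: w . s + b."""
    return sum(w * s for w, s in zip(weights, scores)) + bias


def svm_accept(scores, weights, bias):
    """Accept the identity claim when the score vector lies on the client side."""
    return svm_decision(scores, weights, bias) >= 0.0


# Hypothetical parameters for (signature, text-dependent, text-independent) scores.
weights = (1.0, 2.0, -1.0)
bias = -0.5
print(svm_decision((1.0, 0.5, 0.25), weights, bias))
print(svm_accept((1.0, 0.5, 0.25), weights, bias))
```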

  14. Fusion experiments • The 68-person database is split into 2 equal parts: • 34 people: Fusion Learning Base (also used to estimate the thresholds of the unimodal systems with the minimum TE criterion) • 34 people: Fusion Test Base (also used to test the unimodal systems) • Per person: • 5 genuine bimodal values • 12 impostor bimodal values
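Threshold estimation with the minimum TE criterion can be sketched as below. The sketch assumes that higher scores are more client-like, that TE is the pooled error over genuine and impostor trials, and that candidate thresholds are the observed scores; none of these conventions is spelled out on the slide.

```python
def error_rates(genuine, impostor, threshold):
    """Return (TE, FA, FR) as fractions, accepting scores >= threshold."""
    false_accepts = sum(1 for s in impostor if s >= threshold)
    false_rejects = sum(1 for s in genuine if s < threshold)
    fa = false_accepts / len(impostor)
    fr = false_rejects / len(genuine)
    te = (false_accepts + false_rejects) / (len(genuine) + len(impostor))
    return te, fa, fr


def min_te_threshold(genuine, impostor):
    """Lowest observed score that minimizes TE on the learning base."""
    candidates = sorted(set(genuine) | set(impostor))
    return min(candidates, key=lambda t: error_rates(genuine, impostor, t)[0])


# Toy learning-base scores.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.5]
threshold = min_te_threshold(genuine, impostor)
print(threshold, error_rates(genuine, impostor, threshold))
```

The threshold found on the learning base would then be applied unchanged to the test base.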

  15. Fusion Performances

      Condition              Model       TE (%)        FA (%)       FR (%)
      Signature              Signature   11.9 [±2.7]   8.9 [±2.9]   20.1 [±6.0]
      Speech without noise   TI Speech    6.3 [±2.0]   2.0 [±1.4]   16.0 [±5.5]
                             TD Speech   10.3 [±2.6]   7.6 [±2.7]   17.0 [±5.7]
                             ATC          2.8 [±1.4]   1.7 [±1.3]    5.2 [±3.3]
                             SVM          2.7 [±1.4]   1.3 [±1.1]    5.9 [±3.6]
      Speech –10dB noise     TI Speech    8.0 [±2.3]   2.0 [±1.4]   23.2 [±6.4]
                             TD Speech   11.9 [±2.7]   7.8 [±2.7]   22.1 [±6.3]
                             ATC          2.9 [±1.4]   2.5 [±1.6]    3.9 [±2.9]
                             SVM          2.9 [±1.4]   1.9 [±1.4]    5.3 [±3.4]
      Speech 0dB noise       TI Speech   17.0 [±3.1]   6.0 [±2.4]   45.0 [±7.5]
                             TD Speech   16.5 [±3.1]   6.3 [±2.4]   42.0 [±7.4]
                             ATC          6.7 [±2.1]   4.7 [±2.1]   11.2 [±4.8]
                             SVM          5.8 [±2.0]   2.4 [±1.5]   13.6 [±5.2]
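The slides do not say how the bracketed ± values were obtained. As a rough check, the FR intervals look consistent with a 95% normal-approximation confidence interval on a proportion measured over 34 × 5 = 170 genuine trials. The sketch below shows that calculation; the formula and the trial count are assumptions, not the authors' stated method.

```python
import math


def ci_half_width(rate, n_trials, z=1.96):
    """Half-width of a normal-approximation confidence interval for a proportion."""
    return z * math.sqrt(rate * (1.0 - rate) / n_trials)


# FR of the signature system (20.1 %) and of TI speech at 0dB (45.0 %),
# each over an assumed 170 genuine trials.
for fr in (0.201, 0.450):
    print(f"FR = {100 * fr:.1f} %, half-width = {100 * ci_half_width(fr, 170):.2f} %")
```

The same formula does not reproduce every FA and TE interval exactly when 34 × 12 = 408 impostor trials are assumed, so the trial counts behind those columns may differ.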

  16. Conclusions • Equivalent results for ATC and SVM: • role of Boosting (ATC) • Fusion improves performance by a factor of 2 relative to the best unimodal system (in clean or noisy environments) • Other ways of creating noisy environments should be tested (real noise rather than Gaussian white noise) • Fusion performance should also be studied on the 2 speech verification systems alone, since no noise was introduced in the signature modality
