SecurePhone: A Multi-Modal Biometric Verifier for Constrained Devices
Sabah Jassim, University of Buckingham, UK
Outline
• The SecurePhone project
• Fusion approaches to biometric-based identification
• SecurePhone multi-modal biometric verifier
• PDA implementation constraints
• Modalities
• Fusion strategy
• Performance: Match on Host (MoH) and Match on Card (MoC)
• Challenges and potential solutions
• Conclusion
BioSecure + COST 2101 - March 2007
The SecurePhone Project (the SP consortium)
• Aims to produce a prototype of a new mobile communication system that enables biometrically authenticated users to conclude legally binding m-contracts during a mobile phone call, in an easy yet highly dependable and secure way, using a biometric recogniser that fuses face, voice and handwritten signature.
SecurePhone aim 1: secure exchange
• Secure PKI (Public Key Infrastructure): conclude secure m-contracts during a mobile phone call
• secure: private key stored on the SIM card
• user-friendly: intuitive, non-intrusive
• flexible: legally binding text/audio transactions
• dynamic: mobile e-signing "on the fly"
Project aim 2: biometric verification
[Diagram: face, voice and signature each pass through preprocessing and modelling; the three outputs are fused using client and impostor joint-score models; the outcome is either "accept user: release private key" or "reject user".]
• Zero-knowledge authentication.
Implementation constraints
• The PDA's main processor is much slower than a PC's, so verification must be very efficient even when it runs on the PDA.
• The audio-visual sample rate offered by the device applications is inadequate (only 8 kHz audio and 10 fps video).
• This has been improved: current SP sampling and real-time pre-processing run at 22 kHz audio and 20 fps video.
• Only data on the SIM is secure, so the biometric models/templates must be stored and processed on the SIM. Yet the SIM has very limited computational resources and processing support.
• SIM model storage is limited to 40 K, which restricts the system to text-dependent prompts. Note: text-independent prompts or varied text-dependent prompts are more secure, but would require 200-400 K.
• Enrolment should be based on a short session (for acceptability).
Voice verification (SU / GET ENST)
• Fixed 5-digit prompt: conceptually neutral, easily extendable, requires few Gaussians
• 22 kHz sampling
• Online energy-based removal of non-speech frames
• MFCCs with online CMS and first time-difference features: slow to compute, but fixed point is faster than floating point
• Features modelled by a 100-Gaussian GMM pdf, with a UBM for model initialisation and score normalisation
• Training on data from 2 indoor and 2 outdoor recordings from one session; testing on similar data from another session
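The online CMS and first time-difference steps named on this slide can be sketched as follows. This is a minimal illustration, not the SecurePhone code: the function names, the exponential running mean and its smoothing factor `alpha` are assumptions, and the input is taken to be a list of already computed MFCC frames.

```python
def online_cms(frames, alpha=0.995):
    """Cepstral mean subtraction with a running mean updated frame by frame.

    `alpha` is an assumed smoothing factor; the slides do not give one.
    """
    mean = list(frames[0])
    out = []
    for frame in frames:
        # Update the running mean, then subtract it from the current frame.
        mean = [alpha * m + (1.0 - alpha) * x for m, x in zip(mean, frame)]
        out.append([x - m for x, m in zip(frame, mean)])
    return out


def add_deltas(frames):
    """Append first time differences (frame t minus frame t-1) to each frame."""
    out = []
    prev = frames[0]
    for frame in frames:
        delta = [x - p for x, p in zip(frame, prev)]
        out.append(list(frame) + delta)
        prev = frame
    return out
```

Both functions make a single pass and keep only one frame of state, which is the property that makes them usable while the prompt is still being spoken.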
Signature verification (GET INT)
• 2D coordinates (100 Hz) augmented by time-difference features, curvature, etc.: 19 features in total. Note: no pressure or angles are available, since the data comes from the PDA's touch screen, not from a writing pad.
• Shift normalisation, but no rotation or scaling
• Features modelled by a 100-Gaussian GMM pdf; a UBM is used for model initialisation and score normalisation
• Fast to compute
• Training and testing on data from one session
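As a rough illustration of shift normalisation and a time-difference feature on touch-screen coordinates, consider the sketch below. It assumes shift normalisation means subtracting the centroid, and it computes only a velocity feature; the remaining features of the 19 (curvature and so on) are not reproduced, and the function names are hypothetical.

```python
def shift_normalise(points):
    """Translate the signature so that its centroid sits at the origin."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return [(x - cx, y - cy) for x, y in points]


def velocity(points, rate_hz=100.0):
    """First time difference of consecutive samples, in units per second."""
    return [((x1 - x0) * rate_hz, (y1 - y0) * rate_hz)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]
```

Because no rotation or scaling is applied, two signatures that differ only in where they were written on the screen map to the same normalised trace.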
Face wavelet feature representation (BU)
• The Discrete Wavelet Transform (DWT) decomposes an image into a set of frequency subbands with different resolutions.
• At a resolution depth of k, the pyramidal scheme decomposes an image I into 3k + 1 subbands: (LLk, HLk, LHk, HHk, ..., HL1, LH1, HH1). The lowest-pass subband LLk represents the k-level resolution approximation of the image I. The subbands HL1, LH1 and HH1 contain the finest-scale wavelet coefficients; the coefficients get coarser as k increases, with LLk being the coarsest.
• Each subband of a DWT-decomposed face image represents the person's face at a different frequency range and scale (i.e. a distinct stream for face recognition, with varying accuracy rates, that can be fused with the others for improved accuracy).
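The pyramidal scheme and the 3k + 1 subband count can be checked with a small Haar decomposition. This is a sketch under assumptions: it uses an averaging (non-orthonormal) Haar filter, the HL/LH naming follows one common convention, and image sides are assumed divisible by 2^k.

```python
def haar_step(img):
    """One 2D Haar level on a list-of-rows image: returns (LL, HL, LH, HH)."""
    ll, hl, lh, hh = [], [], [], []
    for r in range(0, len(img) - 1, 2):
        rows = ([], [], [], [])
        for c in range(0, len(img[0]) - 1, 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 4.0)  # low-pass in both directions
            rows[1].append((a - b + d - e) / 4.0)  # high-pass horizontally
            rows[2].append((a + b - d - e) / 4.0)  # high-pass vertically
            rows[3].append((a - b - d + e) / 4.0)  # high-pass in both directions
        for band, row in zip((ll, hl, lh, hh), rows):
            band.append(row)
    return ll, hl, lh, hh


def haar_pyramid(img, k):
    """Decompose to depth k; only the LL band is decomposed again."""
    subbands = {}
    approx = img
    for level in range(1, k + 1):
        approx, hl, lh, hh = haar_step(approx)
        subbands["HL%d" % level] = hl
        subbands["LH%d" % level] = lh
        subbands["HH%d" % level] = hh
    subbands["LL%d" % k] = approx
    return subbands
```

Each level halves both sides, so a 160x192 face area decomposed to depth 4 leaves an LL4 subband of 10x12, i.e. 120 coefficients.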
Face verification (BU)
• Static face recognition: 10 grey-scale images selected at random from a video, face area 160x192 pixels
• Histogram equalisation and z-score standardisation of features are applied as a simple, fast illumination normalisation
• Haar wavelet low-low-4 (or low-high) subband used as the feature vector
• Other wavelet filters were tested, but Haar is the fastest to compute
• Features modelled by a GMM pdf with only 4 Gaussians; a UBM is used for model initialisation and score normalisation
• Training on data from 2 indoor and 2 outdoor recordings from one session; testing on similar data from another session
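The two normalisation steps on this slide can be sketched in a few lines. This is an illustrative version only: it uses the textbook CDF-based histogram equalisation on a flat list of 8-bit pixel values and a population standard deviation for the z-score, neither of which the slides specify.

```python
import math


def equalise_histogram(pixels, levels=256):
    """Map grey levels through the normalised cumulative histogram."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:  # single grey level: nothing to spread out
        return list(pixels)
    return [round((cdf[p] - cdf_min) * (levels - 1) / (n - cdf_min))
            for p in pixels]


def z_score(values):
    """Standardise a feature vector to zero mean and unit variance."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n) or 1.0
    return [(v - mean) / std for v in values]
```

Equalisation would be applied to the face image before the wavelet transform and the z-score to the resulting subband coefficients; that ordering is an assumption here.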
Fusion (GET INT)
• For each modality i, the score is the log-likelihood ratio of client (C) to impostor (I): S(i) = log p(Xi|C) - log p(Xi|I)
• Score fusion was tested by two methods:
• Optimal linear weighted sum: Fused-score = Σ w(i) * S(i), with the sum taken over the 3 modalities
• GMM score modelling, i.e. modelling both the client and the impostor joint-score pdfs by diagonal-covariance GMMs: Fused-score = log p(S|C) - log p(S|I)
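Both fusion rules can be written down directly. The sketch below assumes each GMM is given as a list of (weight, means, variances) components; the function names and that data layout are hypothetical, and no trained models from the project are reproduced.

```python
import math


def weighted_sum(scores, weights):
    """Linear fusion: sum of w(i) * S(i) over the modalities."""
    return sum(w * s for w, s in zip(weights, scores))


def diag_gmm_loglik(x, gmm):
    """log p(x) under a diagonal-covariance GMM.

    `gmm` is a list of (weight, means, variances) tuples, one per component.
    """
    logs = []
    for weight, means, variances in gmm:
        ll = math.log(weight)
        for xi, m, v in zip(x, means, variances):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        logs.append(ll)
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))


def gmm_fused_score(scores, client_gmm, impostor_gmm):
    """Fused-score = log p(S|C) - log p(S|I) for the joint score vector S."""
    return diag_gmm_loglik(scores, client_gmm) - diag_gmm_loglik(scores, impostor_gmm)
```

With single-component client and impostor models centred at (2, 2, 2) and (-2, -2, -2) and unit variances, a score vector of (2, 2, 2) gives a fused score of 24 and the midpoint (0, 0, 0) gives 0.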
User verification system
[Screenshot: PDA screen showing the digit prompt "7 9 8 5 1" and a "Press to start/stop speaking" button.]
• User requests the PDA to verify their identity
• PDA requests the user to:
• read the prompt (face in box)
• sign their signature
• Feature processing is applied to each modality [silence removal, histogram equalisation, MFCC or Haar wavelets, online CMS, delta features, etc.]
• For each modality i: S(i) = log p(Xi|C) - log p(Xi|I)
• If S(i) < θ(i) for any i: ask the user to repeat
• Else: fused-score = log p(S|C) - log p(S|I)
• If fused-score > φ: user accepted
• Else: user rejected
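The decision logic of this slide is small enough to state as code. In this sketch the function name, the dictionary layout and the `fuse` callback are assumptions; `thresholds` plays the role of θ(i) and `accept_threshold` the role of φ.

```python
def verify(modality_scores, thresholds, fuse, accept_threshold):
    """Return "repeat", "accept" or "reject" for one verification attempt.

    modality_scores: dict mapping modality name to its score S(i)
    thresholds:      dict mapping modality name to its threshold theta(i)
    fuse:            callable mapping the list of scores to a fused score
    """
    # Any single modality below its own threshold triggers a retry.
    for name, score in modality_scores.items():
        if score < thresholds[name]:
            return "repeat"
    # Scores are passed in sorted name order so the fusion input is stable.
    fused = fuse([modality_scores[name] for name in sorted(modality_scores)])
    return "accept" if fused > accept_threshold else "reject"
```

For example, `verify({"face": 1.0, "voice": 2.0, "signature": 0.5}, {"face": 0.0, "voice": 0.0, "signature": 0.0}, sum, 3.0)` accepts, because the stand-in fusion `sum` yields 3.5.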
Speaking face & forgery (GET ENST)
• Investigated possible attack and forgery scenarios:
• Using synthesised voice and face: difficult to create because of synchronisation problems
• Replay attacks: devised a successful attack that uses the client's voice and face images, but not taken from the same video
• Using a coupled HMM for voice and face greatly reduced the effect of this attack
PDA database (PDAtabase)
• Initial development used many databases [TIMIT(V), CSLU(V), BANCA(V,F), ORL(F), BIOMET(V,F,S), NIST(V)]
• A CSLU/BANCA-like database was then recorded on a Qtek2020 PDA for realistic conditions (sensors, environment)
• 60 English subjects: 24 for the UBM, 18 for g1, 18 for g2. The accept/reject threshold is optimised on g1 and evaluated on g2, and vice versa
• Video (voice + face): 18 prompts (5-digit, 10-digit and phrase); 3 sessions, with 2 inside and 2 outside recordings per session
• Signatures in one session, with 20 expert impostorisations for each
• Virtual couplings of audio-visual data with signature data (independent)
• An automatic test script allows many possible configurations to be tested
• The user just provides executables for feature modelling, score generation and score fusion
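The g1/g2 protocol (tune the threshold on one group, measure errors on the other) can be sketched as below. The selection criterion used here, the threshold where false acceptance and false rejection rates are closest, is an assumption; the slides say only that the threshold is "optimised". Function names are hypothetical.

```python
def error_rates(client_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when a score above `threshold` means accept."""
    far = sum(s > threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s <= threshold for s in client_scores) / len(client_scores)
    return far, frr


def best_threshold(client_scores, impostor_scores):
    """Pick the observed score at which FAR and FRR are closest."""
    candidates = sorted(set(client_scores) | set(impostor_scores))

    def gap(t):
        far, frr = error_rates(client_scores, impostor_scores, t)
        return abs(far - frr)

    return min(candidates, key=gap)


def cross_evaluate(g1, g2):
    """g1, g2: (client_scores, impostor_scores). Returns both (FAR, FRR) pairs."""
    on_g2 = error_rates(g2[0], g2[1], best_threshold(*g1))
    on_g1 = error_rates(g1[0], g1[1], best_threshold(*g2))
    return on_g2, on_g1
```

Reporting errors only on the group that did not set the threshold keeps the figures from being optimistically biased.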
Match on Host (MoH): complementarity of modalities
[Result table for 5-digit and 10-digit prompts in PDAtabase (SPIE 2006); table contents not preserved.]
• Results shown are for the LL subband. Improved results have already been obtained for the LH subband.
Match on Card (MoC)
• Implementation of the MoH system on the SIM card (MoC)
• No problem in terms of storage
• But not feasible because of verification time (matching plus host/SIM communication = one hour)
• The verification time can be reduced by:
• reducing the vector size
• reducing the frame rate
• reducing the number of Gaussians in the client and background models
• Even so, matching time was still not acceptable
MoC bottleneck
• Not in preprocessing, since this is still all done on the PDA, as in the MoH system
• Not in face:
• Although the feature vectors are relatively large,
• there are only a few (10) of them in testing,
• and only 4 Gaussians are needed (client model and UBM)
• The bottleneck is caused by voice and signature data:
• Vectors are relatively small, but there are
• a large number of frames
• and a large number of Gaussians
MoC solution
• Only a drastic measure can solve the problem: globalised features
• Features that represent the whole signature: a single vector of 41 parameters representing correlation and variation in x-y coordinates, velocity and acceleration
• Idea generalised to voice: use the means (cf. Long-Term Average Spectrum) and standard deviations of each vector parameter across all frames
• Works well for signature
• Improvement: use up to four equal subparts of the signature/voice signal
• Implementation: 2 equal subparts
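The voice variant of globalised features, per-parameter means and standard deviations over equal subparts of the signal, might look like the sketch below. The function name, the interleaved output order and the use of the population standard deviation are assumptions; the 41-parameter signature vector is not reproduced.

```python
import math


def globalise(frames, parts=2, with_std=True):
    """Collapse a frame sequence into one fixed-length vector.

    The sequence is cut into `parts` equal subparts; for each subpart and
    each vector parameter, the mean (and optionally the standard deviation)
    across the frames of that subpart is emitted.
    """
    n = len(frames)
    vector = []
    for p in range(parts):
        chunk = frames[p * n // parts:(p + 1) * n // parts]
        for d in range(len(chunk[0])):
            column = [frame[d] for frame in chunk]
            mean = sum(column) / len(column)
            vector.append(mean)
            if with_std:
                var = sum((v - mean) ** 2 for v in column) / len(column)
                vector.append(math.sqrt(var))
    return vector
```

The output length depends only on the frame dimension and the number of subparts, not on the number of frames, so the card has to score a single short vector instead of a long frame sequence.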
MoC-emulated results
[Table: EER (percent) for globalised means (columns 2-5) and for means plus standard deviations (columns 6-9), for voice and signature divided into two equal subparts; table contents not preserved.]
Solving the capacity problem
• Possible options for improving the performance of the SecurePhone:
• Use match-on-server (MoS): raises security and privacy concerns
• Implement the biometric recogniser and encryption on a chip (more costly than the current solution)
• Build a secure PDA with sufficient storage and processing power (a dedicated device that would be more costly and less ubiquitous)
• Split matching (hybrid MoC/MoH): considered but not implemented in the prototype. Initial work is under way and results are encouraging, with promising implications for the security and privacy of biometric data (templates/models) without cryptography.
Conclusion and future work
• Natural, non-intrusive biometrics encourage high user acceptance
• Biometric data never leave the SIM card: high security
• Fusion of multiple streams of a single trait can lead to improved performance (a pilot for face was tested but not implemented in SP)
• MoH is efficient and highly accurate, but vulnerable
• MoC is secure, but efficiency and high accuracy cannot be achieved together
• Future work includes:
• Designing hybrid mixed client-server matching
• Investigating the privacy and security of biometric data using cancellable biometrics, especially for "Match on Server"
• Improving the performance of single modalities through multi-classifier and multi-stream strategies, e.g. face by mixing a larger number of subbands at different depths
Acknowledgement
• Thanks to the EU for funding this research through the SecurePhone (IST-2002-506883) project.