Evaluation of Biometric Identification Systems. Dr. Bill Barrett, CISE Department and US National Biometric Test Center, San Jose State University. Email: wbarrett@email.sjsu.edu
The Biometric Test Center • Funded by several federal agencies • Centered in a disinterested university setting • Provide objective evaluations of commercial biometric instruments • Provide consulting services to sponsors regarding the most effective application of biometric instruments • No funds accepted from vendors • No independent competing research
Summary of Presentation • Three Basic Biometric Operations • Measures of effectiveness - the ROC curve • Comparison Rate Measures • Collection Variables • Evaluation strategies • Testing issues • Some results • Conclusion
The Three Biometric Operations • Enrollment: first time in • Verification: does this credit card belong to this person? • Identification: who is this person, anyway?
COMPARE operation • Yields a DISTANCE measure between candidate C and template T • d = distance(C, T) • d LARGE: C probably is NOT T • d SMALL: C probably IS T • NOTE: the sense is reversed for fingerprint matchers, which report a similarity score (larger means a better match)
Distance Measures • Euclidean • Hamming • Mahalanobis
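As an illustration of the COMPARE step and the three distance measures above, here is a minimal sketch in Python with NumPy. The feature vectors, the covariance matrix, and the 0.35 acceptance threshold are illustrative assumptions, not values from any particular device.

    import numpy as np

    def euclidean(c, t):
        # straight-line distance between candidate and template feature vectors
        return float(np.linalg.norm(c - t))

    def hamming(c, t):
        # fraction of differing bits between two equal-length binary codes
        return float(np.mean(c != t))

    def mahalanobis(c, t, cov):
        # Euclidean distance weighted by the inverse covariance of the template population
        diff = c - t
        return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

    def verify(c, t, threshold=0.35, dist=euclidean):
        # verification decision: accept the claim if the distance falls below the device threshold
        return dist(c, t) < threshold

For matchers that report a similarity score rather than a distance, the comparison in verify() would be reversed.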
Variations on the basic measurement plan • 3 strikes and you’re out • Multiple templates of same person • Template replacement over time • Template averaging • Binning
Binning • Find some way to segment the templates, e.g. • male/female • particular finger • loop vs. whorl vs. arch • May have to include the same template in different bins • Improves search performance, may reduce search accuracy (more false non-matches)
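A minimal sketch of the binning idea in Python: templates are filed under one or more bin labels at enrollment, and a search visits only the candidate's bin. The data structures and function names are illustrative assumptions, not a description of any fielded system.

    from collections import defaultdict

    bins = defaultdict(list)   # bin label -> list of (identity, template)

    def enroll(identity, template, labels):
        # a template whose bin assignment is ambiguous may be filed in several bins
        for label in labels:
            bins[label].append((identity, template))

    def search(candidate, label, distance, threshold):
        # only the candidate's bin is searched, which lowers the penetration rate;
        # a wrong label at enrollment or search time produces a bin error (a missed match)
        best = None
        for identity, template in bins[label]:
            d = distance(candidate, template)
            if d < threshold and (best is None or d < best[1]):
                best = (identity, d)
        return best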
Cross-over Threshold • tc = cross-over threshold • The threshold at which the probability of a false match (impostors, I) equals the probability of a false rejection (authentics, A) • The error rate at this point is the equal error rate
Changing the Device Threshold • td > tc : fewer false rejections (A) but more false matches (I); the bank ATM choice • td < tc : more false rejections (A) but fewer false matches (I); the prison guard choice
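The cross-over point and the two threshold choices above can be reproduced numerically. The sketch below, in Python with NumPy, sweeps a decision threshold across simulated authentic and impostor distance scores and reports the threshold where the false rejection and false match rates are closest; the normal score distributions are assumptions made only so the example runs.

    import numpy as np

    rng = np.random.default_rng(0)
    authentic = rng.normal(0.30, 0.08, 10_000)   # assumed authentic (A) distances
    impostor  = rng.normal(0.60, 0.08, 10_000)   # assumed impostor (I) distances

    thresholds = np.linspace(0.0, 1.0, 501)
    false_rejection = np.array([(authentic >= t).mean() for t in thresholds])  # A rejected
    false_match     = np.array([(impostor  <  t).mean() for t in thresholds])  # I accepted

    i = int(np.argmin(np.abs(false_rejection - false_match)))   # cross-over index
    print(f"t_c ~ {thresholds[i]:.2f}, equal error rate ~ {false_rejection[i]:.2%}")

Moving the device threshold above t_c trades false rejections for false matches (the ATM choice); moving it below t_c does the opposite (the prison guard choice).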
The d-prime Measure • Measures the overall quality of a biometric instrument: the separation between the authentic and impostor score distributions • d' usually in the range of 2 to 10; logarithmic, like the Richter Scale • Assumes normally distributed authentic and impostor scores
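The slide does not give the formula; one common form, stated here as an assumption, divides the separation of the impostor (I) and authentic (A) score means by the pooled standard deviation:

    d' = \frac{\lvert \mu_I - \mu_A \rvert}{\sqrt{(\sigma_I^2 + \sigma_A^2)/2}}

Wider separation of the means, or tighter distributions, gives a larger d' and therefore a lower cross-over error rate.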
Penetration Rate • Percentage of templates that must be individually compared to a candidate, given some binning. • Search problem: usually exhaustive search, with some comparison algorithm, no reliable tree or hash classification. • Low penetration rate implies faster searching
Example: fingerprints AFIS (FBI automated classification system) classifies by: • Left loop/ right loop • Arch/whorl • Unknown Then • Exhaustive search of the subset of prints
Jain, Hong, Pankanti & Bolle, "An Identity-Authentication System Using Fingerprints," Proc. IEEE, vol. 85, no. 9, Sept. 1997
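As a hedged illustration of how binning lowers the penetration rate: assuming each search visits only the candidate's own bin, the expected fraction of the file examined is the sum over bins of (probability a candidate falls in the bin) times (fraction of the file stored in that bin). The bin proportions below are made up for the example and are not measured AFIS statistics.

    # illustrative bin proportions only, not measured AFIS figures
    bin_fraction = {"left loop": 0.32, "right loop": 0.32, "whorl": 0.28,
                    "arch": 0.05, "unknown": 0.03}

    # expected penetration rate when each candidate searches only its own bin;
    # filing templates in several bins, or routing 'unknown' prints to every bin, raises this
    penetration = sum(p * p for p in bin_fraction.values())
    print(f"penetration rate ~ {penetration:.1%}")   # roughly 29% for these made-up proportions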
Bin Error Rate • Probability that a search for a matching template will fail owing to an incorrect bin placement • Related to confidence in the binning strategy • AFIS Bin error typically < 1%
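To see how binning interacts with matcher accuracy, a first-order approximation (an assumption, not a figure from the slides) is that a genuine search is missed either because the template was filed in the wrong bin or because the matcher fails within the correct bin:

    P(\text{system miss}) \approx \varepsilon_{\text{bin}} + (1 - \varepsilon_{\text{bin}}) \cdot P(\text{false non-match})

so even a bin error rate under 1% sets a floor on the achievable system-level miss rate.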
Collection Variables • Physical variations during biometric collection that may change the measurement • Translation/scaling/rotation usually compensated in software • Tend to increase the width of the authentics distribution, and thus • ...make it easier to get a false rejection • ...cause a smaller d’
Liveness Issue Can the device detect that the subject is live? • Fake face recognition with a photograph? • ...or a rubber print image (fingerprint)? • ...or a glass eye (iris encoding)?
Collection Variables - Fingerprints • Pressure • Angle of contact • Stray fluids, film buildup • Liveness
Collection Variables - Hand Geometry • Finger positioning (usually constrained by pins) • Rings • Aging • Liveness
Collection Variables - Iris Identification • Lateral angle of head • Focus quality • Some people have very dark irises, hard to distinguish from the pupil • Outer diameter of the iris is difficult to establish • Eyelids, lashes may interfere • NO sunglasses • Liveness can be established from live video
Collection Variables - Palm Print • Pressure • Stray fluids, film buildup • Liveness
Collection Variables - Face Recognition • 3D angles • Lighting • Background • Expression • Hairline • Artifacts (beard, glasses) • Aging • Liveness: smiling, blinking
Collection Variables - Voice Recognition • Speed of delivery • Articulation • Nervousness • Aging • Laryngitis • Liveness: choose speech segments for the user to repeat, e.g. "Say 8. Say Q. Say X"
Example - Miros Face Recognition System • Lighting is specified • Static background, subtracted from candidate image to segment face • Camera mounted to a wall - standing candidate • Height of eyes above floor used as an auxiliary measure • Verification only recommended • Liveness - can be fooled with a color photograph
Example - FaceitTM Face Recognition System • No particular lighting specified; it expects similar lighting & expression of candidate and template • Face segmented from background using live video • Face lateral angles not well tolerated • Liveness: blinking, smiling test
Common Factors • Bio capture: the full raw image is usually easy to capture • Bio encoding algorithm: often proprietary • Bio encoded template format: usually proprietary • Database distance (comparison) function: may be proprietary
Convenience Factors • Many are concerned about intrusiveness • Some are concerned about touching • What is the candidate’s learning curve? ...device may require some training
Collecting a Template Database for Testing • Precise identity: code registration • Getting plenty of variety: gender, age, race • Getting many images of the same identity • Getting many different images • Significant time frame
Practical Databases • Many large template databases with unique identities & single images available • Many large databases with inaccurate identity correlation • Many databases with limited diversity • Difficult to collect data over time
Hand Geometry for INSPASS • INSPASS: INS Passenger Accelerated Service System • Collected 3,000 raw transaction records from three international airports • Unique individuals in the database, each identified by a separate magnetic identity card • Statistical modelling is suspect for this data • Experimental d' is 2.1; equal error rate ~2.5%
J. L. Wayman, Evaluation of the INSPASS Hand Geometry Data, 1997
FaceitTM General Comments • Supported by a flexible Software Development Kit (SDK), using Microsoft Visual C++TM • Several example applications • Well documented • Can use any video camera • Segments a face with motion video • Liveness: smile or eye blink
FaceitTM Face Recognition • No lighting conditions specified • No background conditions specified • Multiple faces can be segmented • The database includes full images, with a default limit of 100 templates • Image conversion and code comparison are separate steps, so each can be tested independently (see the sketch below)
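Because conversion (image to template) and comparison (template to score) are exposed as separate steps, an offline all-pairs test can be scripted. The sketch below, in Python, assumes hypothetical wrapper functions convert_image() and compare_templates() standing in for whatever the vendor SDK actually exposes; it is an evaluation pattern, not the vendor's API.

    from itertools import combinations

    def collect_scores(images, convert_image, compare_templates):
        # images: list of (person_id, image); returns genuine and impostor score lists
        templates = [(pid, convert_image(img)) for pid, img in images]
        genuine, impostor = [], []
        for (id_a, t_a), (id_b, t_b) in combinations(templates, 2):
            score = compare_templates(t_a, t_b)
            (genuine if id_a == id_b else impostor).append(score)
        return genuine, impostor

The two score lists can then be fed into a threshold sweep like the one shown earlier to estimate the cross-over error rate.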
FaceitTM Study Summary • Done by senior computer engineering students • Not a fully diversified, controlled experiment • 50 different persons, 10 images each • Overall time frame ~ 2 months • Equal error rate crossover point ~5.5%
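For a sense of the test's scale, and assuming an all-pairs cross-comparison of the 500 images (50 persons times 10 images each), the arithmetic below follows from the two numbers on the slide; it is not a reported result.

    from math import comb

    persons, images_per_person = 50, 10
    total = persons * images_per_person                 # 500 images
    genuine  = persons * comb(images_per_person, 2)     # 50 * 45 = 2,250 genuine pairs
    impostor = comb(total, 2) - genuine                 # 124,750 - 2,250 = 122,500 impostor pairs
    print(genuine, impostor)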
MirosTM Face Recognition • Designed for verification • Lighting conditions specified • Static background: the system takes a snapshot of the background and uses it to segment the face • Keyboard code plus two images • The double image helps with liveness • Software Development Kit similar to Faceit's • Equal error rate (cross-over) ~5%
IriscanTM Recognition System • Based on a patent by John Daugman, US 5,291,560, Mar. 1, 1994 • Uses iris patterns laid down a few months after birth; claims no significant aging over the lifespan • Claims a high d', yielding a cross-over error rate < 1 in 1.2 million • Claims a high code-comparison rate using Hamming distance: ~100,000 IrisCodes per second on a PC
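The comparison-rate claim rests on the match being a bitwise Hamming distance, which reduces to an XOR and a bit count. The sketch below, in Python with NumPy, shows the idea on unpacked bit arrays; the 2048-bit code length and the 0.32 decision cut-off are illustrative assumptions, and the occlusion masks used in Daugman's full method are omitted.

    import numpy as np

    def fractional_hamming(code_a, code_b):
        # codes are equal-length arrays of 0/1 bits; eyelid/lash masks are omitted here
        return float(np.count_nonzero(code_a != code_b)) / code_a.size

    rng = np.random.default_rng(1)
    a = rng.integers(0, 2, 2048)            # illustrative 2048-bit iris code
    b = a.copy(); b[:100] ^= 1              # a near-duplicate with 100 bits flipped

    hd = fractional_hamming(a, b)
    print(hd, hd < 0.32)                    # small distance -> treated as the same iris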
IrisScanTM Performance (performance chart; source: Face Recognition, Springer-Verlag, 1998)
IriscanTM Observations • Capture equipment more expensive: zoom / telephoto / swivel robotics / autofocus • Question of conversion and code standardization: most of the system is proprietary • Liveness • Promises to have the highest discrimination of all