Continual Keystroke Biometric Authentication on Short Bursts of Keyboard Input
Ned Bakelman, John V. Monaco, Sung-Hyuk Cha, Charles C. Tappert

Overview
• This study focuses on intruder detection using short bursts of keyboard input
• Text, spreadsheet, and browser input were used for the experiments
• For text input, performance was measured as a function of the number of keystrokes per sample (strong performance)
• For spreadsheet and browser input, performance was measured on whole samples (weaker performance)
• The focus is on detection in a continual authentication environment

Background
• Biometrics
  • The study of human traits to identify and verify a person based on physiological and behavioral characteristics
  • Physiological – fingerprint, DNA, iris, face
  • Behavioral – voice, walking gait, typing rhythm
Biometrics. Wikipedia (2011, October 17). [Online]. Available: http://en.wikipedia.org/wiki/Biometrics

Background (continued)
• Pace University Keystroke Biometric System (PKBS)
Guglielmo, M., Weisman, A., Prekelezaj, E., Camilo, D. (2012). Keystroke Biometric Intrusion Detection. Conference Proceedings, Pace University, New York.

Continual Burst Strategy
• Continual – ongoing verification, but with possible interruption
• Burst authentication – verification on short periods of computer input
• Pause – a period of inactivity from the computer input devices (keyboard, mouse)
• Continual burst authentication – ongoing verification that occurs on short bursts, only after a pause
[Figure: two timelines, "Uniform burst authentication" and "Burst authentication with pauses". Each shows Bursts 1–3 of 1 min each; the second marks the Pause Threshold between bursts. Interval labels: 8 minutes, 5 minutes, 30 minutes, 10 minutes.]

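The burst-after-pause idea can be sketched as follows. This is a minimal illustration, not the study's implementation: the function name, the 300-second pause threshold, and the representation of input as a list of timestamps in seconds are all assumptions; only the 1-minute burst length comes from the figure.

```python
from typing import List, Optional


def bursts_after_pauses(timestamps: List[float],
                        pause_threshold: float = 300.0,  # assumed value
                        burst_length: float = 60.0) -> List[List[float]]:
    """Group keystroke timestamps (seconds) into bursts.

    A burst opens at the first keystroke and again at the first keystroke
    that follows a gap longer than pause_threshold. Each burst keeps only
    the keystrokes within burst_length seconds of its opening keystroke.
    """
    bursts: List[List[float]] = []
    previous: Optional[float] = None
    burst_start = 0.0
    for t in sorted(timestamps):
        if previous is None or t - previous > pause_threshold:
            burst_start = t          # a pause ended: open a new burst
            bursts.append([])
        if t - burst_start <= burst_length:
            bursts[-1].append(t)     # still inside the burst period
        previous = t
    return bursts
```

Only the keystrokes returned in each burst would be passed to the verifier; input typed after the burst period and before the next pause is ignored.
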
Text Samples Experiment
Text Input – Equal Error Rate (EER) as a function of the number of keystrokes
• Performance improves as the number of keystrokes increases
• Good performance in the 200–300 keystroke range
• This matters when considering short bursts of keyboard input

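EER is the operating point at which the false accept rate (FAR) equals the false reject rate (FRR). The sketch below shows one simple way to estimate it from two lists of match scores. It assumes the scores are distances (lower means more similar) and sweeps only the observed score values; the function names are illustrative and are not taken from PKBS.

```python
from typing import Sequence, Tuple


def far_frr(genuine: Sequence[float], impostor: Sequence[float],
            threshold: float) -> Tuple[float, float]:
    """Accept a sample when its distance score is <= threshold."""
    far = sum(s <= threshold for s in impostor) / len(impostor)
    frr = sum(s > threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> float:
    """Return (FAR + FRR) / 2 at the threshold where |FAR - FRR| is smallest."""
    best_gap = None
    best_eer = 0.0
    for threshold in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, threshold)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer
```

With few samples FAR and FRR rarely cross exactly, which is why the sketch averages the two rates at the closest threshold.
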
Behavioral Biometrics and Cognitive Levels
• Features can be thought of as representing different cognitive levels of the person operating the computer; together they provide a behavioral-cognitive fingerprint
• Keystroke and mouse – operate at a subconscious, ballistic motor-control level
• Stylometric – focuses on characters, words, and syntax, and therefore operates at the linguistic level
• Intruder – operates at the semantic level of intentional motivation
• For example, intruder features are stylometric in that they help determine "authorship", so they operate at the linguistic level; they are also semantic in that the context in which they occur can help determine intentional motivation
[Figure: three levels. Intruder – semantic level; Stylometry – linguistic level; Keystroke + Mouse – motor-control level.]

Features
Numeric Features
• Separate numeric features for the numeric keypad and for the QWERTY number keys
• These include durations of, and transitions between, numeric keys, arithmetic operators, etc.
[Figure: keyboard layout with the QWERTY and Numeric Keypad areas labeled.]
Wikipedia.org, http://en.wikipedia.org/wiki/Computer_keyboard, last updated: March 6, 2012

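As a rough illustration of duration and transition features, the sketch below averages key hold times per key and two inter-key latencies per key pair. The event format, the key names such as `KP_4`, and the choice of release-to-press and press-to-press latencies are assumptions made for this example; the slides do not define the "type I" and "type II" transitions, so no mapping to them is implied.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# (key name, press time, release time), times in milliseconds (assumed format)
Event = Tuple[str, float, float]


def duration_features(events: List[Event]) -> Dict[str, float]:
    """Mean hold time (release minus press) for each key."""
    per_key = defaultdict(list)
    for key, press, release in events:
        per_key[key].append(release - press)
    return {key: mean(values) for key, values in per_key.items()}


def transition_features(
        events: List[Event]) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Mean (release-to-press, press-to-press) latency for each key pair."""
    per_pair = defaultdict(lambda: ([], []))
    for (k1, p1, r1), (k2, p2, _) in zip(events, events[1:]):
        release_to_press, press_to_press = per_pair[(k1, k2)]
        release_to_press.append(p2 - r1)
        press_to_press.append(p2 - p1)
    return {pair: (mean(a), mean(b)) for pair, (a, b) in per_pair.items()}
```

Keeping keypad keys and QWERTY number keys under different names (for example `KP_4` versus `4`) is what makes the two sets of numeric features separate.
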
Preliminary Spreadsheet Samples Experiment
Spreadsheet Input
• Equal Error Rate (EER) ≈ 13.5% (fair performance)
• Approximately 400 keystrokes per sample
• Mostly numeric entry using the keypad, with some QWERTY input

Preliminary Browser Samples Experiment
Browser Input
• Equal Error Rate (EER) ≈ 30% (poor performance)
• Fewer than 200 keystrokes per sample
• Mostly mouse input with sporadic keystroke entry

Conclusion
• Main contributions
  • Evaluated text-input performance as a function of keystrokes per sample
  • Performance improves (EER decreases) as the number of keystrokes per sample increases
  • Explored keystroke input from spreadsheet (numeric input) and browser samples
  • Text input appears to be more robust than spreadsheet or browser input (preliminary)
• Next steps
  • Explore other non-textual input, such as spreadsheets and browsers
  • Investigate intruder input
  • Include mouse features
  • Collect more data
  • Run more experiments

Intruder Experiment Design (continued)
• Authenticate the user on various window sizes, beginning with 300-keystroke windows
• Window Type 1: use overlapping windows to:
  • Minimize the "wait" period before the next authentication
  • Detect an intruder as quickly as possible
[Figure: eleven overlapping 300-keystroke (KS) windows along a Keystroke Count axis running from 1 to 1800 with ticks every 150 keystrokes.]

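Overlapping windows can be generated with a fixed step smaller than the window size. In this sketch the 300-keystroke window comes from the slide, while the 150-keystroke step is inferred from the tick marks in the figure and should be treated as an assumption; the function name is illustrative.

```python
from typing import List, Tuple


def overlapping_windows(total_keystrokes: int,
                        window: int = 300,
                        step: int = 150) -> List[Tuple[int, int]]:
    """Return 1-based (first, last) keystroke ranges of full windows only."""
    return [(start, start + window - 1)
            for start in range(1, total_keystrokes - window + 2, step)]
```

For 1800 keystrokes this gives windows starting at keystroke 1, 151, 301 and so on, so a new authentication decision becomes available every 150 keystrokes instead of every 300.
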
Intruder Experiment Design (continued)
• Window Type 2: non-overlapping windows that restart when the Pause Threshold is exceeded
  • Assumes that a pause precedes intruder input
  • Removes the need for overlapping windows
• Dependent variable – detection accuracy (FAR/FRR trade-off, EER, etc.)
• Independent variables
  • Window size (number of keystrokes)
  • Pause / reset time interval
  • New features (numeric input / keypad input)
[Figure: consecutive 300 KS windows along a Keystroke Count axis (1, 300, 600, 900); the count restarts at 1 after the Pause Threshold.]

Intruder Experiment Design (continued)
• Window Type 3: spaced, non-overlapping windows that restart when the Pause Threshold is exceeded
  • Assumes that a pause precedes intruder input
  • No need for overlapping windows
  • No need for continuous checking: authenticate only after pauses and after longer time intervals
• Dependent variable – detection accuracy (FAR/FRR trade-off, EER, etc.)
• Independent variables
  • Window size (number of keystrokes)
  • Pause / reset time interval
  • New features (numeric input / keypad input)
[Figure: spaced 300 KS windows along a Keystroke Count axis (1, 300, 600, 900); the count restarts at 1 after the Pause Threshold.]

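Window Types 2 and 3 differ only in whether keystrokes are skipped between windows, so one sketch can cover both. The function name, the `spacing` parameter (number of keystrokes skipped after each completed window), and the default pause threshold are assumptions made for illustration; the slides do not give these values.

```python
from typing import List, Optional, Tuple


def windows_with_restart(timestamps: List[float],
                         window: int = 300,
                         pause_threshold: float = 300.0,  # seconds, assumed
                         spacing: int = 0) -> List[Tuple[int, int]]:
    """Return 0-based (first, last) keystroke indices of completed windows.

    spacing=0 gives back-to-back windows (Type 2); spacing>0 skips that many
    keystrokes after each window (Type 3). A gap longer than pause_threshold
    discards any partial window and restarts the count.
    """
    windows: List[Tuple[int, int]] = []
    current: List[int] = []
    skip = 0
    previous: Optional[float] = None
    for i, t in enumerate(timestamps):
        if previous is not None and t - previous > pause_threshold:
            current = []   # pause detected: restart the window
            skip = 0
        previous = t
        if skip > 0:
            skip -= 1      # keystroke falls in the space between windows
            continue
        current.append(i)
        if len(current) == window:
            windows.append((current[0], current[-1]))
            current = []
            skip = spacing
    return windows
```

Restarting after a pause means the first window following a possible change of user contains only post-pause keystrokes.
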
Preliminary New Results
• Experiment 1 (train on text, test on spreadsheet – weak training)
  • 5 training user subjects; 5 testing user subjects
  • Train: Text – 225 authentic and 1000 imposter samples
  • Test: Excel – 225 authentic and 1000 imposter samples

Preliminary New Results (continued)
• Experiment 2 (train on spreadsheet, test on text – weak training)
  • 5 training user subjects; 5 testing user subjects
  • Train: Excel – 225 authentic and 1000 imposter samples
  • Test: Text – 225 authentic and 1000 imposter samples

Preliminary New Results (continued)
• Experiment 3 (train on spreadsheet, test on spreadsheet – strong training)
  • 5 training user subjects; 5 testing user subjects
  • Train: Excel – 50 authentic and 250 imposter samples
  • Test: Excel – 50 authentic and 250 imposter samples

Preliminary New Results (continued)
• Experiment 4 (train on spreadsheet, test on spreadsheet – weak training)
  • 5 training user subjects; 5 testing user subjects
  • Train: Excel – 225 authentic and 1000 imposter samples
  • Test: Excel – 225 authentic and 1000 imposter samples

Preliminary New Results (continued)
• Experiment 5 (train on browser, test on browser – weak training)
  • 5 training user subjects; 5 testing user subjects
  • Train: Browser – 225 authentic and 1000 imposter samples
  • Test: Browser – 225 authentic and 1000 imposter samples

Preliminary New Results (continued)
• Experiment 6 (train on browser, test on browser – strong training)
  • 5 training user subjects; 5 testing user subjects
  • Train: Browser – 100 authentic and 1125 imposter samples
  • Test: Browser – 100 authentic and 1125 imposter samples

Preliminary New Features Comparison
Experiment 1 Numeric (New Features Comparison)
• Weak training, spreadsheet (no additional numeric features)
  • Train with non-team members (50)
  • Test with team members (50)
• Weak training, spreadsheet (with numeric duration and transition features)
  • Train with non-team members (50)
  • Test with team members (50)
  • 4.5% improvement
  • Numeric keypad and QWERTY durations (96)
  • Numeric keypad and QWERTY transitions type II (272)

Preliminary New Features Comparison
Experiment 2 Numeric (New Features Comparison)
• Weak training, spreadsheet (no additional numeric features)
  • Train with non-team members (50)
  • Test with team members (50)
• Weak training, spreadsheet (with numeric duration and transition features)
  • Train with non-team members (50)
  • Test with team members (50)
  • 11.3% improvement
  • Numeric keypad and QWERTY durations (96)
  • Numeric keypad transitions type I per digit (200)
  • Numeric keypad transitions type II per digit (200)

Preliminary New Features Comparison
Experiment 3 Numeric (New Features Comparison)
• Weak training, spreadsheet (no additional numeric features)
  • Train with non-team members (50)
  • Test with team members (50)
• Weak training, spreadsheet (with numeric duration and transition features)
  • Train with non-team members (50)
  • Test with team members (50)
  • 14.6% improvement
  • Numeric keypad durations per digit (20)
  • Numeric keypad transitions type I per digit (200)
  • Numeric keypad transitions type II per digit (200)

Java Input System vs. Fimbel Keylogger
Experiment 7 (Similarity Comparison – Java Input System vs. Fimbel Keylogger)
• Strong training, spreadsheet + text (Java Input System)
  • 50 spreadsheet samples collected (10 per subject)
  • 10 text samples collected from the Input System
  • Samples split evenly for training and testing
• Strong training, spreadsheet + text (Fimbel Keylogger)
  • 50 spreadsheet samples collected (10 per subject)
  • 10 text samples collected from the Fimbel Keylogger
  • Samples split evenly for training and testing
• Text samples were collected simultaneously from the Java Input System and the Fimbel Keylogger

Features
Villani, M. (2006). Keystroke biometric identification studies on long text input. Doctoral dissertation, Pace University, New York.

Fallback Hierarchy – Durations
[Figure: tree of key-duration feature groups rooted at "All Keys". Recoverable node labels: All Letters, Non Letters, All Left Letters, All Right Letters, Vowels, Most Freq Cons, Other Left Cons, Other Right Cons, QWERTY, Numeric Keypad, Digits, Digits/Period, Punctuation, Symbols, Arithmetic, Logic, Other, Center Pad, Num Lock/Enter. The leaves are individual keys (letters, digits, punctuation, function, navigation, and modifier keys).]

Fallback Hierarchy – Numeric Transitions
[Figure: tree of numeric-transition feature groups. Recoverable node labels: Keypad / Keypad, Numeric / Numeric, Any Digit-Any Digit, Any Digit / Neighbor Digit, Any Digit / Non Neighbor, Any Double Digit, Any Digit / 0, Any Digit-(/*-+), (/*-+)-Any Digit. The leaves are individual key pairs such as 4-5, 9-0, 5-nn, 7-7, 4-(/*-+), and + - (0…9).]

PKBS: Old Versus New
[Figure: two processing pipelines side by side.
PKBS: Data Capture (Java Application Front End), Keystroke Data File, Feature Extraction (Keystroke Feature Extractor), Feature Data File, Classification (BAS).
New System: Front End Data Capture (Key Logger, Fimbel), Key Logger Output, Java Application Converter, Keystroke Data File (identical format), Feature Extraction (Keystroke Feature Extractor), Feature Data File, Classification (BAS).]

Typing Speed
• Average typing speed: 38–40 wpm (words per minute)
• Higher for experts: 50 wpm and above
• Average word length: about 5 characters per word
• 300 (keystroke window) / (5 characters + 1 space bar) = 50 words
• At 50 wpm: one keystroke window every 60 seconds
• At 40 wpm: one keystroke window every 75 seconds
Ostrach, Teresia, http://readi.info/documents/TypingSpeed.pdf, last accessed: October 13, 2011
Answers.yahoo.com, http://answers.yahoo.com/question/index?qid=20080526032554AAB28AF, last updated: May 5, 2006

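The arithmetic on this slide can be reproduced in a few lines. The function and parameter names are illustrative; the numbers are the ones given on the slide, with one space-bar keystroke counted per word.

```python
def seconds_per_window(window_keystrokes: int = 300,
                       chars_per_word: int = 5,
                       words_per_minute: float = 40.0) -> float:
    """Time needed to type one full keystroke window at a given speed."""
    keystrokes_per_word = chars_per_word + 1           # letters plus space bar
    words_per_window = window_keystrokes / keystrokes_per_word
    return 60.0 * words_per_window / words_per_minute
```

Calling `seconds_per_window(words_per_minute=50)` gives 60 seconds and the default of 40 wpm gives 75 seconds, matching the two figures above.
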
Preliminary Results
• Experiment 1
  • 50 samples collected (10 per team member)
  • Samples split evenly for training and testing
  • Used the Excel template to generate samples
• Experiment 3
  • 300 text samples obtained from a previous study (10 per test taker)
  • 50 spreadsheet samples obtained from Experiment 1
  • Text samples used for training
  • Spreadsheet samples used for testing
• Experiment 3 – Reversed
  • 300 text samples obtained from a previous study (10 per test taker)
  • 50 spreadsheet samples obtained from Experiment 1
  • Spreadsheet samples used for training
  • Text samples used for testing

Preliminary Results (continued)
• Experiment 3A
  • 50 text samples obtained from a previous study (10 per test taker)
  • 50 spreadsheet samples obtained from Experiment 1
  • Text samples used for training
  • Spreadsheet samples used for testing
• Experiment 3A – Reversed
  • 50 text samples obtained from a previous study (10 per test taker)
  • 50 spreadsheet samples obtained from Experiment 1
  • Spreadsheet samples used for training
  • Text samples used for testing
• Experiment 2
  • 50 spreadsheet samples collected (10 per team member)
  • 50 spreadsheet samples collected (10 per non-team member)
  • Samples split evenly for training and testing
  • Used the Excel template to generate samples

Vector-Difference Dichotomy Model
• Transforms the feature space into a feature-vector-difference space
• Two classes: within-class (same person) and between-class (different people)
Yoon, S., Choi, S-S., Cha, S-H., Lee, Y., & Tappert, C.C. (2005). On the individuality of the iris biometric. Proc. Int. J. Graphics, Vision & Image Processing, 5(5), 63-70.

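The transformation into difference space can be sketched as below. This is an illustration under stated assumptions, not the PKBS code: it pairs every two samples, uses the element-wise absolute difference as the difference vector, and labels each pair by whether the two samples come from the same person. The function name and the sample format are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

# (person identifier, feature vector); an assumed format for this example
Sample = Tuple[str, Sequence[float]]


def difference_space(samples: List[Sample]) -> List[Tuple[List[float], str]]:
    """Map every pair of samples to (absolute difference vector, class label)."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(samples, 2):
        diff = [abs(a - b) for a, b in zip(vec_a, vec_b)]
        label = "within" if id_a == id_b else "between"
        pairs.append((diff, label))
    return pairs
```

Under this pairing, 5 subjects with 10 samples each yield 225 within-class and 1000 between-class difference vectors, which is consistent with the authentic and imposter counts listed on the results slides.
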
Intruder Experiment Design (continued)
[Figure: two timelines of Bursts 1–3, each 1 min long, the second with the Pause Threshold marked between bursts. Interval labels: 8 minutes, 5 minutes, 30 minutes, 10 minutes.]