Masquerade Detection Mark Stamp Masquerade Detection
Masquerade Detection • Masquerader: someone who makes unauthorized use of a computer • How to detect a masquerader? • Here, we consider… • Anomaly-based intrusion detection (IDS) • Detection is based on UNIX commands • Lots and lots of prior work on this problem • We attempt to apply profile hidden Markov models (PHMMs) • For comparison, we also implement other techniques (HMM and N-gram)
Schonlau Data Set • Schonlau et al. collected a large data set • Contains UNIX commands for 50 users • 50 files, one for each user • Each file has 15k commands: 5k from the user plus 10k for masquerade test data • Test data: 100 blocks, 100 commands each • Data set includes a map file • 100 rows (test blocks), 50 columns (users) • 0 if block is user data, 1 if masquerade data
Schonlau Data Set • Map file structure • This data set has been used for many studies • Approximately 50 published papers
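The data layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the commands are assumed to be already loaded as a list of strings, the map file is assumed to be whitespace-separated 0/1 values, and the function names are hypothetical, not part of the data set.

```python
from typing import List, Tuple

TRAIN_LEN = 5000  # first 5k commands come from the legitimate user
BLOCK_LEN = 100   # test data is scored in blocks of 100 commands


def split_user_commands(commands: List[str]) -> Tuple[List[str], List[List[str]]]:
    """Split one user's commands into training data and fixed-size test blocks."""
    train = commands[:TRAIN_LEN]
    test = commands[TRAIN_LEN:]
    blocks = [test[i:i + BLOCK_LEN] for i in range(0, len(test), BLOCK_LEN)]
    return train, blocks


def parse_map(text: str) -> List[List[int]]:
    """Parse the map: one row per test block, one column per user.

    Assumes whitespace-separated 0/1 entries (an assumption about the format).
    """
    return [[int(v) for v in line.split()] for line in text.splitlines() if line.strip()]


def labels_for_user(label_map: List[List[int]], user: int) -> List[int]:
    """Return the column for `user` (0-based); 1 marks a masquerade block."""
    return [row[user] for row in label_map]
```

With a 15k-command file this yields 5k training commands and 100 test blocks, matching the 100 rows of the map file.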
Previous Work • Approaches to masquerade detection • Information theoretic • Text mining • Hidden Markov models (HMM) • Naïve Bayes • Sequences and bioinformatics • Support vector machines (SVM) • Other approaches • We briefly look at each of these
Information Theoretic • Original work by Schonlau included a compression technique • Based on theory (hope?) that legitimate commands compress more than attack commands • Results were disappointing • Some additional recent work • Still not competitive with best approaches
Text Mining • A few papers in this area • One approach extracts repetitive sequences from training data • Another paper uses principal component analysis (PCA) • Method of “exploratory data analysis” • Good results on Schonlau data set • But high cost during training phase
Hidden Markov Models • Several authors have used HMMs • One of the best-known approaches • We have implemented an HMM detector • We do sensitivity analysis on the parameters • In particular, determine optimal N (number of hidden states) • We also use HMMs for comparison with our PHMM results
Naïve Bayes • In simplest form, relies only on command frequencies • That is, no sequence info is used • Several papers analyze this approach • Among the simplest approaches • And, results are good
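A frequency-only scorer of this kind might look like the sketch below. It is not taken from any of the papers mentioned; the additive smoothing constant `alpha` and the `vocab_size` parameter are assumptions introduced here so that unseen commands get a nonzero probability.

```python
import math
from collections import Counter
from typing import List


def train_frequencies(train: List[str]) -> Counter:
    """Count how often each command appears in the user's training data."""
    return Counter(train)


def block_log_likelihood(block: List[str], counts: Counter,
                         vocab_size: int, alpha: float = 0.01) -> float:
    """Log-probability of a block using command frequencies only.

    Command order is ignored; `alpha` is an assumed smoothing constant.
    """
    total = sum(counts.values())
    denom = total + alpha * vocab_size
    return sum(math.log((counts[c] + alpha) / denom) for c in block)
```

A block made of commands the user rarely or never typed gets a much lower score than a block of familiar commands, and reordering the block does not change the score.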
Sequences • In a sense, this is the opposite extreme from naïve Bayes • Naïve Bayes only considers frequency stats • Sequence/bioinformatics focused on sequence-related information • Schonlau’s original work included elementary sequence-based analysis
Bioinformatics • We are aware of only one previous paper that uses a bioinformatics approach • Uses Smith-Waterman algorithm to create local alignments • Alignments then used directly for detection • In contrast, we do pairwise alignments, multiple sequence alignment (MSA), and PHMM • PHMM is used for scoring (forward algorithm) • Our scoring is much more efficient • Also, our results are at least as strong
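For readers unfamiliar with Smith-Waterman, the following is a minimal sketch of the local alignment score applied to command sequences. It illustrates the algorithm named above for the previous paper, not the PHMM scoring used in this work; the match, mismatch, and gap values are assumptions chosen for illustration.

```python
from typing import Sequence


def smith_waterman_score(a: Sequence[str], b: Sequence[str],
                         match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between two command sequences.

    Uses a linear gap penalty and keeps only two rows of the DP table.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

Two sequences that share the run `cd, vi, make` score 6 with these parameters, regardless of what surrounds the shared run.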
Support Vector Machines • Support vector machines (SVM) • Machine learning technique • Separate data points (i.e., classify) based on hyperplanes in high dimensional space • Original data mapped to higher dimension, where separation is likely easier • SVMs maximize separation • And have low computational costs • Used for classification and regression analysis
SVMs & Masquerade Detection • SVMs have been applied to the masquerade detection problem • Results are good • Comparable to naïve Bayes • Recent work using SVMs focused on improved efficiency
Other Approaches • The following have also been studied • Detect using low frequency commands • Detect using high frequency commands • Hybrid Bayes “one step Markov” • Natural to consider hybrid approaches • Multistep Markov • Markov process of order greater than 1 • None of these particularly successful
Other Approaches (Continued) • Non-negative matrix factorization (NMF) • At least 2 papers on this topic • Appears to be competitive • Other hybrids that attempt to combine several approaches • So far, no significant improvement over individual techniques
HMMs • See previous presentation
HMM for Masquerade Detection • Using the Schonlau data set we… • Train HMM for each user • Set thresholds • Test the models and plot results • Note that this has been done before • Here, we perform sensitivity analysis • That is, we test different numbers of hidden states, N • Also use it for comparison with PHMM
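Scoring a test block against a trained HMM can be sketched with the scaled forward algorithm, as below. Training (for example, by Baum-Welch) is not shown, commands are assumed to be mapped to integer symbols already, and the per-command normalization and `threshold` parameter are assumptions for illustration rather than the settings used in these experiments.

```python
import math
from typing import List, Sequence


def forward_log_likelihood(obs: Sequence[int], pi: List[float],
                           A: List[List[float]], B: List[List[float]]) -> float:
    """log P(obs | model) via the forward algorithm with scaling.

    pi: initial state distribution, A: N x N transition matrix,
    B: N x M emission matrix. `obs` must be non-empty.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # block is impossible under this model
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_prob += math.log(scale)
    return log_prob


def is_masquerade(block: Sequence[int], pi: List[float], A: List[List[float]],
                  B: List[List[float]], threshold: float) -> bool:
    """Flag a block whose per-command log-likelihood falls below the threshold."""
    return forward_log_likelihood(block, pi, A, B) / len(block) < threshold
```

Varying the number of rows in `pi`, `A`, and `B` corresponds to varying N, the number of hidden states.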
HMM Experiments • Plotted as “ROC” curves • Closer to origin is better • Useful region • That is, false positives below 5% • The shaded region
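The points on such a curve can be computed by sweeping the detection threshold over the block scores. The sketch below assumes the axes are false positive rate versus false negative rate, which is consistent with "closer to origin is better" but is not stated explicitly on the slide; it also assumes lower scores are more suspicious and that both classes are present.

```python
from typing import List, Tuple


def error_rates(scores: List[float], labels: List[int],
                threshold: float) -> Tuple[float, float]:
    """Return (false positive rate, false negative rate) at one threshold.

    labels: 1 = masquerade block, 0 = user block.
    A block is flagged as masquerade when its score is below the threshold.
    """
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    return fp / labels.count(0), fn / labels.count(1)


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Sweep the threshold over every observed score, plus one above the maximum."""
    thresholds = sorted(set(scores)) + [max(scores) + 1.0]
    return [error_rates(scores, labels, t) for t in thresholds]
```

Restricting attention to points whose first coordinate is below 0.05 gives the useful region described above.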
HMM Conclusion • Number of hidden states does not matter • So, use N=2 • Since most efficient
PHMM • See previous presentation
PHMM Experiments • A problem with Schonlau data… • For given user, 5000 commands • No begin/end session markers • So, must split it up to obtain multiple sequences • But where to split sequence? • And what about tradeoff between number of sequences and length of each sequence? • That is, how to decide length/number?
PHMM Experiments • Experiments done for following cases: • See next slide…
PHMM Experiments • Tested various numbers of sequences • Best results • 5 sequences, 1k commands per sequence • This case in next slide
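One simple way to obtain multiple training sequences from an unmarked command stream is to cut it into equal, contiguous pieces. The sketch below assumes exactly that; the slides do not say how split points were chosen, so the equal-length rule and the function name are assumptions.

```python
from typing import List


def split_into_sequences(commands: List[str], num_sequences: int) -> List[List[str]]:
    """Cut a command stream into `num_sequences` equal-length contiguous pieces.

    Any leftover commands at the end are dropped.
    """
    if num_sequences <= 0:
        raise ValueError("num_sequences must be positive")
    length = len(commands) // num_sequences
    if length == 0:
        raise ValueError("not enough commands for that many sequences")
    return [commands[i * length:(i + 1) * length] for i in range(num_sequences)]
```

With 5000 training commands, `num_sequences=5` gives the best-performing case above: 5 sequences of 1k commands each.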
PHMM Comparison • Compare PHMM to “weighted N-gram” and HMM • HMM is best • PHMM is competitive
PHMM Detector • PHMM at disadvantage on Schonlau data • PHMM uses positional information • Such info not available for Schonlau data • We have to guess the positions for PHMM • How to get fairer comparison between HMM and PHMM? • We need different data set • Only option is simulated data set
Simulated Data • We generate simulated data as follows • Using Schonlau data, construct Markov chain for each user • Use resulting Markov chain to generate sequences representing user behavior • Restrict “begin” to more common commands • What’s the point? • Simulated sequences have sensible begin and end
Simulated Data • Training data and user data for scoring generated using Markov chain • Attack data taken from Schonlau data • How much data to generate? • First test, we generate same amount of simulated data as is in Schonlau set • That is, 5k commands per user
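The generation procedure can be sketched as a first-order Markov chain over commands. This is an illustration only: the chain order, the fixed sequence length used to end a sequence, and the `top_k` cutoff for "more common" begin commands are all assumptions, since the slides do not specify them.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List


def build_chain(commands: List[str]) -> Dict[str, Counter]:
    """Count command-to-next-command transitions in a user's data."""
    chain: Dict[str, Counter] = defaultdict(Counter)
    for cur, nxt in zip(commands, commands[1:]):
        chain[cur][nxt] += 1
    return chain


def common_commands(commands: List[str], top_k: int) -> List[str]:
    """The `top_k` most frequent commands, used as allowed begin commands."""
    return [c for c, _ in Counter(commands).most_common(top_k)]


def generate_sequence(chain: Dict[str, Counter], start_candidates: List[str],
                      length: int, rng: random.Random) -> List[str]:
    """Random walk on the chain, starting from one of the common commands."""
    state = rng.choice(start_candidates)
    seq = [state]
    while len(seq) < length:
        successors = chain.get(state)
        if not successors:
            break  # command was never followed by anything in the data
        state = rng.choices(list(successors), weights=list(successors.values()))[0]
        seq.append(state)
    return seq
```

Passing a seeded `random.Random` makes the simulated data reproducible across runs.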
Detection with Simulated Data • PHMM vs HMM • Round 2 • It’s close, but HMM still wins!
Limited Training Data • What if less training data is available? • In a real application, initially, training data is limited • Can’t detect attacks until sufficient training data has been accumulated • So, the less data required, the better • Experiments using simulated data and limited training data • Used 200 to 800 commands for training
Limited Training Data • PHMM vs HMM • Round 3 • With 400 commands or fewer, PHMM wins big!
Conclusion • PHMM is competitive with best approaches • PHMM likely to do better, given better training data (begin/end info) • PHMM much better than HMM when limited training data available • Of practical importance • Why does it make sense that PHMM would do better with limited training data?
Conclusion • Given current state of research… • Optimal masquerade detection approach • Initially, collect small training set • Train PHMM and use for detection • No attack, then continue to collect data • When sufficient data available, train HMM • From then on, use HMM for detection
Future Work • Collect better real data set!!! • Many problems/limitations with Schonlau data • Improved data set could be basis for lots and lots of research • Directly compare PHMM/bioinformatics approaches with previous work (HMM, naïve Bayes, SVM, etc., etc.) • Consider hybrid techniques • Other techniques?
References • L. Huang and M. Stamp, Masquerade detection using profile hidden Markov models, to appear in Computers and Security • M. Schonlau, Masquerading user data