
Masquerade Detection


Presentation Transcript


  1. Masquerade Detection Mark Stamp

  2. Masquerade Detection • Masquerader --- someone who makes unauthorized use of a computer • How to detect a masquerader? • Here, we consider… • Anomaly-based intrusion detection (IDS) • Detection is based on UNIX commands • Lots and lots of prior work on this problem • We attempt to apply PHMMs • For comparison, we also implement other techniques (HMM and N-gram)

  3. Schonlau Data Set • Schonlau, et al, collected large data set • Contains UNIX commands for 50 users • 50 files, one for each user • Each file has 15k commands, 5k from user plus 10k for masquerade test data • Test data: 100 blocks, 100 commands each • Dataset includes map file • 100 rows (test blocks), 50 columns (users) • 0 if block is user data, 1 if masquerade data
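
To make the layout concrete, here is a minimal loading sketch in Python; the file names (User1, label.txt) and exact formatting are assumptions to check against the actual distribution.

    # Sketch: load a Schonlau user file and the masquerade map (file names assumed).
    import numpy as np

    def load_user(path):
        # One command per line; first 5000 are training, remaining 10000 are test data.
        with open(path) as f:
            cmds = [line.strip() for line in f if line.strip()]
        train, test = cmds[:5000], cmds[5000:]
        # Split test data into 100 blocks of 100 commands each.
        blocks = [test[i:i + 100] for i in range(0, len(test), 100)]
        return train, blocks

    # Map file: 100 rows (test blocks) x 50 columns (users); 1 marks a masquerade block.
    masq_map = np.loadtxt("label.txt", dtype=int)   # file name assumed
    train, blocks = load_user("User1")              # file name assumed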

  4. Schonlau Data Set • Map file structure • This data set has been used in many studies • Approximately 50 published papers

  5. Previous Work • Approaches to masquerade detection • Information theoretic • Text mining • Hidden Markov models (HMM) • Naïve Bayes • Sequences and bioinformatics • Support vector machines (SVM) • Other approaches • We briefly look at each of these

  6. Information Theoretic • Original work by Schonlau included a compression technique • Based on theory (hope?) that legitimate commands compress better than attack commands • Results were disappointing • Some additional recent work • Still not competitive with best approaches
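
For illustration only (not Schonlau's exact scheme), a compression-based score can be sketched with zlib: if appending a test block to the user's own history costs many extra compressed bytes, the block looks anomalous.

    # Sketch of a compression-based anomaly score (illustrative, not the original method).
    import zlib

    def compression_score(train_cmds, test_block):
        base = "\n".join(train_cmds).encode()
        both = base + b"\n" + "\n".join(test_block).encode()
        # Extra bytes needed to encode the test block given the training data;
        # a large increase suggests the block does not "look like" the user.
        return len(zlib.compress(both)) - len(zlib.compress(base))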

  7. Text Mining • A few papers in this area • One approach extracts repetitive sequences from training data • Another paper uses principal component analysis (PCA) • Method of “exploratory data analysis” • Good results on Schonlau data set • But high cost during training phase

  8. Hidden Markov Models • Several authors have used HMMs • One of the best known approaches • We have implemented HMM detector • We do sensitivity analysis on the parameters • In particular, determine optimal N (number of hidden states) • We also use HMMs for comparison with our PHMM results

  9. Naïve Bayes • In simplest form, relies only on command frequencies • That is, no sequence info is used • Several papers analyze this approach • Among the simplest approaches • And, results are good
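
A minimal frequency-only scorer along naive Bayes lines might look like the following sketch; the add-one smoothing and the vocab argument are illustrative assumptions, not taken from any particular paper.

    # Sketch: naive Bayes score from command frequencies only (add-one smoothing assumed).
    import math
    from collections import Counter

    def nb_log_likelihood(train_cmds, test_block, vocab):
        counts = Counter(train_cmds)
        total = len(train_cmds) + len(vocab)           # denominator with smoothing
        score = 0.0
        for c in test_block:
            score += math.log((counts.get(c, 0) + 1) / total)
        return score                                   # higher = more like this user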

  10. Sequences • In a sense, this is the opposite extreme from naïve Bayes • Naïve Bayes only considers frequency stats • Sequence/bioinformatics approaches focus on sequence-related information • Schonlau’s original work included elementary sequence-based analysis
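
As a toy illustration of using sequence information (not the weighted N-gram detector compared later), an N-gram profile can be built from the training commands and a test block scored by how many of its n-grams were seen before; the choice n=3 is an assumption.

    # Sketch: simple N-gram profile score (illustrative only).
    from collections import Counter

    def ngram_profile(cmds, n=3):
        return Counter(tuple(cmds[i:i + n]) for i in range(len(cmds) - n + 1))

    def ngram_score(profile, test_block, n=3):
        # Fraction of test n-grams already seen in training; lower suggests a masquerader.
        grams = [tuple(test_block[i:i + n]) for i in range(len(test_block) - n + 1)]
        return sum(1 for g in grams if g in profile) / max(len(grams), 1)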

  11. Bioinformatics • We are aware of only one previous paper that uses a bioinformatics approach • Uses Smith-Waterman algorithm to create local alignments • Alignments then used directly for detection • In contrast, we do pairwise alignments, MSA, PHMM • PHMM is used for scoring (forward algorithm) • Our scoring is much more efficient • Also, our results are at least as strong

  12. Support Vector Machines • Support vector machines (SVM) • Machine learning technique • Separate data points (i.e., classify) based on hyperplanes in high dimensional space • Original data mapped to higher dimension, where separation is likely easier • SVMs maximize separation • And have low computational costs • Used for classification and regression analysis

  13. SVMs & Masquerade Detection • SVMs have been applied to masquerade detection problem • Results are good • Comparable to naïve Bayes • Recent work using SVMs focused on improved efficiency Masquerade Detection

  14. Other Approaches • The following have also been studied • Detect using low frequency commands • Detect using high frequency commands • Hybrid Bayes “one step Markov” • Natural to consider hybrid approaches • Multistep Markov • Markov process of order greater than 1 • None of these particularly successful

  15. Other Approaches (Continued) • Non-negative matrix factorization (NMF) • At least 2 papers on this topic • Appears to be competitive • Other hybrids that attempt to combine several approaches • So far, no significant improvement over individual techniques

  16. HMMs • See previous presentation

  17. HMM for Masquerade Detection • Using the Schonlau data set we… • Train HMM for each user • Set thresholds • Test the models and plot results • Note that this has been done before • Here, we perform sensitivity analysis • That is, we test different numbers of hidden states, N • Also use it for comparison with PHMM
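
A per-user HMM detector along these lines might be sketched with hmmlearn (CategoricalHMM in recent releases, MultinomialHMM in older ones); the state count and iteration limit below are placeholders, and vocab_index is assumed to map each command string to an integer symbol.

    # Sketch: train an HMM on a user's commands and score a test block (hmmlearn assumed).
    import numpy as np
    from hmmlearn.hmm import CategoricalHMM   # MultinomialHMM in older hmmlearn releases

    def train_hmm(train_cmds, vocab_index, n_states=2):
        obs = np.array([[vocab_index[c]] for c in train_cmds])   # shape (T, 1) of symbols
        model = CategoricalHMM(n_components=n_states, n_iter=100)
        return model.fit(obs)

    def score_block(model, block, vocab_index):
        # Unseen commands are mapped to symbol 0 here purely to keep the sketch simple.
        obs = np.array([[vocab_index.get(c, 0)] for c in block])
        return model.score(obs)   # log-likelihood; threshold it per user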

  18. HMM Experiments • Plotted as “ROC” curves • Closer to origin is better • Useful region • That is, false positives below 5% • The shaded region
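
One way such curves can be traced is to sweep a threshold over the per-block scores; a minimal sketch, assuming higher scores mean "more like the user" and labels come from the map file:

    # Sketch: sweep a threshold over per-block scores to trace an ROC-style curve.
    import numpy as np

    def roc_points(scores, labels):
        # scores: higher = more "normal"; labels: 1 = masquerade block, 0 = user block.
        scores, labels = np.asarray(scores), np.asarray(labels)
        points = []
        for t in np.sort(scores):
            flagged = scores < t                   # blocks scored below threshold raise alarms
            fp = np.mean(flagged[labels == 0])     # false positive rate
            tp = np.mean(flagged[labels == 1])     # detection rate
            points.append((fp, tp))
        return points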

  19. HMM Conclusion • Number of hidden states does not matter • So, use N=2 • Since most efficient

  20. PHMM • See previous presentation

  21. PHMM Experiments • A problem with Schonlau data… • For given user, 5000 commands • No begin/end session markers • So, must split it up to obtain multiple sequences • But where to split sequence? • And what about tradeoff between number of sequences and length of each sequence? • That is, how to decide length/number? Masquerade Detection
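
The split itself is trivial once the number of sequences is chosen; a sketch, assuming equal-length pieces:

    # Sketch: split a user's 5000 training commands into n equal-length sequences.
    def split_sequences(train_cmds, n_seqs):
        length = len(train_cmds) // n_seqs
        return [train_cmds[i * length:(i + 1) * length] for i in range(n_seqs)]

    # e.g. split_sequences(train, 5) gives 5 sequences of 1000 commands each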

  22. PHMM Experiments • Experiments done for following cases: • See next slide… Masquerade Detection

  23. PHMM Experiments • Tested various numbers of sequences • Best results • 5 sequences, 1k commands per sequence • This case shown in next slide

  24. PHMM Comparison • Compare PHMM to “weighted N-gram” and HMM • HMM is best • PHMM is competitive

  25. PHMM Detector • PHMM at a disadvantage on Schonlau data • PHMM uses positional information • Such info is not available for Schonlau data • We have to guess the positions for PHMM • How to get fairer comparison between HMM and PHMM? • We need different data set • Only option is a simulated data set

  26. Simulated Data • We generate simulated data as follows • Using Schonlau data, construct Markov chain for each user • Use resulting Markov chain to generate sequences representing user behavior • Restrict “begin” to more common commands • What’s the point? • Simulated seqs have sensible begin and end
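
A sketch of the simulation idea, assuming a first-order chain and a hand-picked list of common starting commands; this illustrates the approach rather than reproducing the exact generator used in the experiments.

    # Sketch: first-order Markov chain per user, used to generate simulated sessions.
    import random
    from collections import Counter, defaultdict

    def build_chain(cmds):
        trans = defaultdict(Counter)
        for a, b in zip(cmds, cmds[1:]):
            trans[a][b] += 1
        return trans

    def generate(trans, start_cmds, length):
        # start_cmds: the more common commands, so a session begins sensibly
        cur = random.choice(start_cmds)
        seq = [cur]
        for _ in range(length - 1):
            nxt = trans.get(cur)
            if not nxt:
                cur = random.choice(start_cmds)   # dead end: restart from a common command
            else:
                cur = random.choices(list(nxt), weights=list(nxt.values()))[0]
            seq.append(cur)
        return seq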

  27. Simulated Data • Training data and user data for scoring generated using Markov chain • Attack data taken from Schonlau data • How much data to generate? • First test, we generate same amount of simulated data as is in Schonlau set • That is, 5k commands per user

  28. Detection with Simulated Data • PHMM vs HMM • Round 2 • It’s close, but HMM still wins!

  29. Limited Training Data • What if less training data is available? • In a real application, initially, training data is limited • Can’t detect attacks until sufficient training data has been accumulated • So, the less data required, the better • Experiments using simulated data and limited training data • Used 200 to 800 commands for training

  30. Limited Training Data • PHMM vs HMM • Round 3 • With 400 or fewer commands, PHMM wins big!

  31. Conclusion • PHMM is competitive with best approaches • PHMM likely to do better, given better training data (begin/end info) • PHMM much better than HMM when limited training data available • Of practical importance • Why does it make sense that PHMM would do better with limited training data?

  32. Conclusion • Given current state of research… • Optimal masquerade detection approach • Initially, collect small training set • Train PHMM and use for detection • If no attack, continue to collect data • When sufficient data available, train HMM • From then on, use HMM for detection

  33. Future Work • Collect better real data set! • Many problems/limitations with Schonlau data • Improved data set could be basis for lots of further research • Directly compare PHMM/bioinformatics approaches with previous work (HMM, naïve Bayes, SVM, etc.) • Consider hybrid techniques • Other techniques?

  34. References • L. Huang and M. Stamp, Masquerade detection using profile hidden Markov models, to appear in Computers and Security • M. Schonlau, Masquerading user data
