Machine learning in IDS. March 15, 2004. Source Papers. T. Lane and C. E. Brodley An application of machine learning to anomaly detection , NIST-NCSC National Information Systems Security Conference, 1997
Machine learning in IDS March 15, 2004
Source Papers
• T. Lane and C. E. Brodley, An application of machine learning to anomaly detection, NIST-NCSC National Information Systems Security Conference, 1997
• J. Ryan, M. Lin, R. Miikkulainen, Intrusion Detection with Neural Networks, MIT Press, 1998
• A. K. Ghosh, A. Schwartzbard and M. Schatz, Learning Program Behavior Profiles for Intrusion Detection, USENIX Workshop on Intrusion Detection and Network Monitoring, 1999
• D. Endler, Intrusion detection: Applying machine learning to Solaris audit data, ACSAC'98
Two Major Approaches
• Misuse detection: define intrusions ahead of time and watch for their occurrence
  • Can detect well-known attacks via patterns
  • Cannot detect new attacks for which no pattern has been defined
• Anomaly detection: detect behavior that deviates from normal system use
  • Learn a normal system activity profile
  • Can abstract information about normal behavior to detect attacks
Basic Terminology
• Concept drift: behavioral changes undergone by valid users during normal use
• On-line systems
  • Run in real-time with users
  • Computationally expensive
• Off-line systems
  • Run against stored user data at a scheduled time
  • Cannot respond in real-time
Paper #1
• IDS must learn characteristic sequences of actions
• These sequences differ on a per-user basis
• Characteristic differences between these sequences differentiate valid users from intruders
• Use the sequence as the fundamental unit of comparison
• Omit filenames for privacy and focus on behavior instead of content
Paper #1
• Parse the command stream into a token stream:
    > ls -laf
    > cd /tmp
    > gunzip -c foo.tar.gz | (cd \ ; tar xf -)
  becomes…
    ls -laf cd <1> gunzip -c <1> | ( cd <1> ; tar - <1> )
• This token stream is stored in the dictionary, along with a similarity measure and a set of system parameters
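The parsing step above can be sketched in Python. This is a simplified illustration, not the paper's parser: the name `tokenize`, the set of shell metacharacters, and the rule that every non-flag argument counts as a file name are all assumptions. Runs of file-like arguments collapse into a count token such as `<1>` or `<2>`, so its output will not match the slide's example token for token.

```python
import re

# Assumption: these characters separate commands on a shell line.
SHELL_META = {"|", ";", "(", ")", "&"}


def emit_count(tokens, run):
    """Append a <n> token for a run of hidden file names; return the reset run."""
    if run:
        tokens.append(f"<{run}>")
    return 0


def tokenize(lines):
    """Turn shell command lines into a flat token stream without file names."""
    tokens = []
    for line in lines:
        # Pad metacharacters with spaces so split() yields them as tokens.
        words = re.sub(r"([|;()&])", r" \1 ", line).split()
        expect_command = True  # the next plain word is a command name
        run = 0                # consecutive file-like arguments seen so far
        for word in words:
            if word in SHELL_META:
                run = emit_count(tokens, run)
                tokens.append(word)
                expect_command = True
            elif expect_command:
                tokens.append(word)
                expect_command = False
            elif word.startswith("-"):
                run = emit_count(tokens, run)
                tokens.append(word)  # flags are kept: they describe behavior
            else:
                run += 1  # hide the file name, remember only that it was there
        emit_count(tokens, run)
    return tokens


stream = tokenize(["ls -laf", "cd /tmp", "gunzip -c foo.tar.gz | tar xf -"])
print(" ".join(stream))
```

Only the command names, flags, and shell structure survive, which is the point of the privacy rule: the profile records what the user did, not which files were touched.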
Paper #1 • Compute a numerical similarity measure that scores how closely a pair of sequences resemble each other
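The slide does not give the formula, so the following is only one plausible measure of this kind, written as a minimal Python sketch. It assumes equal-length token sequences and assumes that adjacent matches should count for more than scattered ones. The names `similarity` and `best_match` are hypothetical.

```python
def similarity(seq_a, seq_b):
    """Score two equal-length token sequences, rewarding runs of matches."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    score = 0
    run = 0  # length of the current run of consecutive matches
    for a, b in zip(seq_a, seq_b):
        run = run + 1 if a == b else 0
        score += run
    return score


def best_match(sequence, dictionary):
    """Similarity of a sequence to a profile: its best score over the dictionary."""
    return max(similarity(sequence, known) for known in dictionary)


profile = [["ls", "-l", "cd", "<1>"], ["vi", "<1>", "make", "<1>"]]
print(best_match(["ls", "-l", "vi", "<1>"], profile))
```

Under this scoring, an identical sequence of length 4 scores 1 + 2 + 3 + 4 = 10, while a single mismatch in the third position drops the score to 1 + 2 + 0 + 1 = 4.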
Paper #1
• Collected data from four users
• Experimented with different analysis methods
• Sequence length had a major effect on accuracy
• Dictionary must be kept small to avoid false positives, and for performance reasons
• Informed, malicious users remain an open problem
• The system performed well, with some caveats:
  • No concept drift
  • Novice users
Paper #2
• Describes the NNID (Neural Network Intrusion Detector)
• Works off-line, identifying behavior from the distribution of commands a user executes
• Selected 100 commands to describe the user’s behavior
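As a rough illustration of a command-distribution input, the Python sketch below turns one user-day of commands into a fixed-length vector. The five-entry command list stands in for the 100 selected commands and is hypothetical, as is the function name `command_vector`. Normalizing counts to relative frequencies is also an assumption, since the slide does not say how counts are scaled.

```python
from collections import Counter

# Hypothetical stand-in for the 100 selected commands.
SELECTED_COMMANDS = ["ls", "cd", "vi", "gcc", "mail"]


def command_vector(commands, selected=SELECTED_COMMANDS):
    """Return the relative frequency of each selected command in one user-day."""
    counts = Counter(c for c in commands if c in selected)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(selected)
    # Commands outside the selected list are ignored entirely.
    return [counts[name] / total for name in selected]


day = ["ls", "cd", "ls", "vi", "make", "ls"]
print(command_vector(day))
```

Each user-day becomes one such vector, and the network is trained to map the vector to the user who produced it.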
Paper #2
• A machine was selected that had 10 users, for a total of 89 user-days
• The network was trained on 8 randomly chosen days of data and then tested against the remaining 4 days of data
• Two separate tests were run:
  • Identifying remaining vectors
  • Identifying randomly-generated vectors
Paper #2
• Identified user vectors 93% of the time
• False alarm rate of 7%
• Rejected 63% of the random user vectors
• Had an anomaly detection rate of 96%
• All the false alarms were the same user, and were attributed to lack of data
Paper #2
• Overall, the system was a success
• How well does the system scale with more users?
• To what extent does user behavior change over time?
Paper #3
• Three algorithms were tested:
  • Table lookup
  • Backpropagation network
  • Elman network
• These three algorithms range from memorization to generalization
Paper #3
• Equality matching is simple but effective
• Data is partitioned into fixed-size windows
• Results are analyzed with a ROC (Receiver Operating Characteristic) curve
• The curve plots the detection rate against the false-positive rate as the detection threshold varies
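A minimal Python sketch of window-based equality matching follows, assuming it corresponds to the table lookup approach: windows seen during normal operation are stored in a table, and a test trace is scored by the fraction of its windows that are missing from that table. The function names, the example event names, and the choice to slide the window one event at a time are assumptions for illustration.

```python
def windows(events, size):
    """Yield every run of `size` consecutive events (sliding by one event)."""
    for i in range(len(events) - size + 1):
        yield tuple(events[i:i + size])


def build_table(normal_traces, size):
    """Memorize every window observed in traces of normal behavior."""
    table = set()
    for trace in normal_traces:
        table.update(windows(trace, size))
    return table


def mismatch_rate(trace, table, size):
    """Fraction of the trace's windows that were never seen in normal data."""
    seen = list(windows(trace, size))
    if not seen:
        return 0.0
    misses = sum(1 for w in seen if w not in table)
    return misses / len(seen)


normal = [["open", "read", "write", "close"]]
table = build_table(normal, size=2)
suspect = ["open", "read", "exec", "close"]
print(mismatch_rate(suspect, table, size=2))
```

Sweeping a threshold over the mismatch rate and recording the resulting detection and false-positive rates yields the points of a ROC curve.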
Paper #3
• A backpropagation network attempts to learn normal program behavior
• Multiple networks were trained for each program, and the best was kept
• Networks were also fed random data so that they generalize unfamiliar input as anomalous
• Tolerates isolated anomalies, but flags sustained sequences of anomalies
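One common way to tolerate single anomalies while reacting to runs of them is a leaky-bucket counter. The slide does not name the mechanism, so the Python sketch below is an assumption, and the `leak` and `threshold` values are arbitrary illustrative choices.

```python
def leaky_bucket_alarm(flags, leak=0.5, threshold=2.0):
    """Return the index at which the alarm fires, or None if it never does.

    `flags` holds one boolean per event: True if the classifier called
    that event anomalous.
    """
    level = 0.0
    for i, anomalous in enumerate(flags):
        level = max(0.0, level - leak)  # the bucket drains a little each step
        if anomalous:
            level += 1.0                # each anomaly adds to the bucket
        if level >= threshold:
            return i                    # anomalies arrived faster than they drained
    return None


isolated = [False, True, False, False, True, False, False]
burst = [False, True, True, True, False]
print(leaky_bucket_alarm(isolated), leaky_bucket_alarm(burst))
```

Isolated anomalies drain away before the next one arrives, whereas closely spaced anomalies accumulate until the threshold is crossed.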
Paper #3
• An Elman network can recognize recurrent features in the input
• It classifies short sequences of events as they occur within a larger stream of events
• The Elman network was the least tuned, but the most successful
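The recurrence that distinguishes an Elman network can be sketched as a forward pass in plain Python: the hidden state from the previous event is fed back in alongside the current input. The weights below are arbitrary, untrained values chosen only for illustration, and the layer sizes and function names are assumptions rather than the paper's configuration.

```python
import math


def elman_step(x, h_prev, W_x, W_h, b):
    """One recurrent update: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(w * xi for w, xi in zip(W_x[i], x))
        total += sum(w * hi for w, hi in zip(W_h[i], h_prev))
        h.append(math.tanh(total))
    return h


def run_elman(inputs, W_x, W_h, b, w_out):
    """Feed a sequence through the network; return one output per event."""
    h = [0.0] * len(b)  # the context starts empty
    outputs = []
    for x in inputs:
        h = elman_step(x, h, W_x, W_h, b)
        z = sum(w * hi for w, hi in zip(w_out, h))
        outputs.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid output in (0, 1)
    return outputs


# Arbitrary illustrative weights: 2 inputs, 2 hidden units, 1 output.
W_x = [[0.5, -0.3], [0.2, 0.8]]
W_h = [[0.1, 0.4], [-0.6, 0.3]]
b = [0.0, 0.0]
w_out = [1.0, -1.0]

outs = run_elman([[1.0, 0.0], [1.0, 0.0]], W_x, W_h, b, w_out)
print(outs)
```

The same input is presented twice but produces two different outputs, because the second step also sees the hidden state left behind by the first. That context is what lets the network judge an event by the events that preceded it.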
Paper #3 • Overall results
Paper #4
• Used the Solaris SunSHIELD Basic Security Module (BSM) for user audit data
• A Perl script parsed the BSM data into separate audit files for four different users
Paper #4
• Testing data consisted of normal sessions interspersed with simulated account break-ins
• The number of signal features was reduced from 488 to 13
• The ideal window size was determined to be 6
Paper #4 • Ultimately, the best solution was a combination of both anomaly and misuse detection
Common Problems
• If an intruder can breach the system during the learning phase, the system can learn the malicious behavior
• All tests were performed against low user numbers
• No real-world testing was performed
Summary
• Creating system usage “fingerprints” is a valid methodology for IDS
• Systems can be run both on-line and off-line depending on the configuration needed
• Real-world testing is required before implementation