Learning Rules for Anomaly Detection of Hostile Network Traffic

Learning Rules for Anomaly Detection of Hostile Network Traffic Matthew V. Mahoney and Philip K. Chan Florida Institute of Technology

Problem: How to detect novel intrusions in network traffic given only a model of normal traffic • Normal web server request GET /index.html HTTP/1.0 • Code Red II worm GET /default.ida?NNNNNNNNN…

What has been done • Firewalls • Can’t block attacks on open ports (web, mail, DNS) • Signature Detection (SNORT, BRO) • Hand coded rules (search for “default.ida?NNN”) • Can’t detect new attacks • Anomaly Detection (eBayes, ADAM, SPADE) • Learn rules from normal traffic for low-level protocols (IP, TCP, ICMP) • But application protocols (HTTP, mail) are too hard to model

Learning Rules for Anomaly Detection (LERAD) • Associative mining (APRIORI, etc.) learns rules with high support and confidence for one value • LERAD learns rules with high support (n) and a small set of allowed values (r) • Any value seen at least once in training is allowed If port = 80 and word1 = “GET” then word3  {“HTTP/1.0”, “HTTP/1.1”} (r = 2)

LERAD Steps • Generate candidate rules • Remove redundant rules • Remove poorly trained rules LERAD is fast because steps 1-2 can be done on a small random sample (~100 tuples)

Step 1. Generate Candidate RulesSuggested by matching attribute values • S1 and S2 suggest: • port = 80 • if port = 80 then word1 = “GET” • if word3 = “HTTP/1.0” and word1 = “GET then port = 80 • S2 and S3 suggest no rules

Step 2. Remove Redundant RulesFavor rules with higher score = n/r Rule 1: if port = 80 then word1 = “GET” (n/r = 2/1) Rule 2: if word2 = “/index.html” then word1 = “GET” (n/r = 1/1) Rule 2 has lower score and covers no new values, so it is redundant

Step 3. Remove Poorly Trained RulesRules with violations in a validation set will probably generate false alarms r (number of allowed values) Incompletely trained rule (removed) Fully trained rule (kept) Train Validate Test

Inbound client packets (PKT) IP packet cut into 24 16-bit fields Inbound client TCP streams Date, time Source, destination IP addresses and ports Length, duration TCP flags First 8 application words Attribute Sets Anomaly score = tn/r summed over violated rules, t = time since previous violation

Experimental Evaluation • 1999 DARPA/Lincoln Laboratory Intrusion Detection Evaluation (IDEVAL) • Train on week 3 (no attacks) • Test on inside sniffer weeks 4-5 (148 simulated probes, DOS, and R2L attacks) • Top participants in 1999 detected 40-55% of attacks at 10 false alarms per day • 2002 university departmental server traffic (UNIV) • 623 hours over 10 weeks • Train and test on adjacent weeks (some unlabeled attacks in training data) • 6 known real attacks (some multiple instances)

Experimental ResultsPercent of attacks detected at 10 false alarms per day

UNIV Detection/False Alarm TradeoffPercent of attacks detected at 0 to 40 false alarms per day

Run Time Performance(750 MHz PC – Windows Me) • Preprocess 9 GB IDEVAL traffic = 7 min. • Train + test < 2 min. (all systems)

Anomalies are due to bugs and idiosyncrasies in hostile codeNo obvious way to distinguish from benign events

Contributions • LERAD differs from association mining in that the goal is to find rules for anomaly detection: a small set of allowed values • LERAD is fast because rules are generated from a small sample • Testing is fast (50-75 rules) • LERAD improves intrusion detection • Models application protocols • Detects more attacks

Thank you

Learning Rules for Anomaly Detection of Hostile Network Traffic

Learning Rules for Anomaly Detection of Hostile Network Traffic

Presentation Transcript

Anomaly Detection

Boundary Detection in Tokenizing Network Application Payload for Anomaly Detection

Learning Rules from System Call Arguments and Sequences for Anomaly Detection

Privacy-preserving collaborative network anomaly detection

Single Pass Anomaly Detection In Network

Anomaly Detection

An Anomaly-Based Approach for Intrusion Detection in Web Traffic

Anomaly Detection - Traffic Video Surveillance

Sensitivity of PCA for Traffic Anomaly Detection

Multi-Scale Analysis for Network Traffic Prediction and Anomaly Detection

Applying PCA for Traffic Anomaly Detection: Problems and Solutions

Traffic Anomaly Detection

Machine Learning for Network Anomaly Detection

Learning Rules and Clusters for Network Anomaly Detection

Locality, Network Control and Anomaly Detection

Design of Bloom Filter Array for Network Anomaly Detection

Bayesian Network Anomaly Pattern Detection for Disease Outbreaks

In/Out Traffic Proportion Based Analyses for Network Anomaly Detection

Visualizing Audio for Anomaly Detection

Learning Rules from System Call Arguments and Sequences for Anomaly Detection

Example of Anomaly Detection

Anomaly Detection Industry