Benchmarking Anomaly-Based Detection Systems Written by: Roy A. Maxion and Kymie M. C. Tan Presented by: Yi Hu
Agenda • Introduction • Benchmarking Approach • Structure in categorical data • Constructing the benchmark datasets • Experiment one • Experiment two • Conclusion & Suggestion
Introduction • Applications of anomaly detection; • Problems: • Differences in data regularity; • Environment variation
Benchmarking Approach • A methodology that provides quantitative results from running an anomaly detector on datasets with different amounts of structure; • Addresses environment variation, i.e., the structuring of the data.
Structure in Categorical Data • Structure ranges from perfect regularity to perfect randomness (index 0 — perfect regularity; 1 — perfect randomness); • Entropy is used to measure the randomness (see the sketch below).
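A minimal sketch (not the paper's exact procedure) of how a regularity index in [0, 1] could be computed for a categorical sequence, assuming regularity is measured as the first-order conditional entropy of the next symbol given the previous one, normalized by log2 of the alphabet size. The function name regularity_index is hypothetical.

```python
# Sketch: regularity index for a categorical sequence, assuming (as one
# plausible reading of the paper) that regularity is the conditional entropy
# H(X_t | X_{t-1}) normalized by log2(alphabet size):
# 0 = perfectly regular, 1 = perfectly random.
import math
from collections import Counter

def regularity_index(sequence, alphabet):
    pair_counts = Counter(zip(sequence, sequence[1:]))  # counts of (prev, next) bigrams
    prev_counts = Counter(sequence[:-1])                # counts of the conditioning symbol
    total_pairs = sum(pair_counts.values())

    h = 0.0  # conditional entropy H(next | prev), in bits
    for (prev, nxt), c in pair_counts.items():
        p_pair = c / total_pairs            # P(prev, next)
        p_cond = c / prev_counts[prev]      # P(next | prev)
        h -= p_pair * math.log2(p_cond)

    max_h = math.log2(len(alphabet))        # entropy of a uniform i.i.d. source
    return h / max_h if max_h > 0 else 0.0

# A perfectly regular sequence scores 0; a long random one scores near 1.
print(regularity_index(list("ABABABABABAB"), "AB"))   # -> 0.0
```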
Benchmarking datasets • Training data (background); • Test data (background + anomalies); • Anomaly data.
Benchmarking datasets (cont’d) • Defining the sequences: • Alphabet symbols (English letters); • Alphabet size (2, 4, 6, 8, 10 — 5 suites); • Regularity (0 to 1 at 0.1 intervals); • Sequence length (500,000 characters for all datasets).
Defining the anomalies • Foreign-symbol anomalies (e.g., Q, a symbol outside the alphabet A, B, C, D, E); • Foreign n-gram anomalies (e.g., CC — every symbol is in the alphabet A, B, C, D, but the n-gram itself never appears in the training data); • Rare n-gram anomalies (n-grams that do appear, but with relative frequency usually below 0.05). A sketch distinguishing the three types follows.
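As an illustration only (the helper name and default threshold are hypothetical, not taken from the paper), the three anomaly categories could be distinguished relative to a training sequence like this:

```python
# Hypothetical helper illustrating the three anomaly categories relative to a
# training sequence: foreign symbols, foreign n-grams, and rare n-grams.
from collections import Counter

def classify_ngram(ngram, training_sequence, alphabet, n=2, rare_threshold=0.05):
    ngrams = [tuple(training_sequence[i:i + n])
              for i in range(len(training_sequence) - n + 1)]
    counts = Counter(ngrams)
    total = len(ngrams)

    if any(sym not in alphabet for sym in ngram):
        return "foreign-symbol"        # e.g. 'Q' when the alphabet is A..D
    if tuple(ngram) not in counts:
        return "foreign-ngram"         # e.g. 'CC': symbols known, bigram never seen
    if counts[tuple(ngram)] / total < rare_threshold:
        return "rare-ngram"            # seen, but below the rarity threshold
    return "normal"

train = list("ABCDABCDABCD")
print(classify_ngram("QA", train, "ABCD"))   # -> foreign-symbol
print(classify_ngram("CC", train, "ABCD"))   # -> foreign-ngram
print(classify_ngram("AB", train, "ABCD"))   # -> normal (frequent bigram)
```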
Generating the training and test data • A table of 500,000 random numbers; • 11 transition matrices used to produce the desired regularities (a generation sketch follows); • Regularity indices between 0 and 1, in 0.1 increments.
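A minimal sketch of how a background sequence might be generated from a first-order Markov transition matrix; the matrix values below are illustrative only and are not the paper's 11 matrices.

```python
# Illustrative generation of a background sequence from a first-order Markov
# transition matrix; a strong self-transition bias yields fairly regular data.
import random

def generate_sequence(transition, alphabet, length, seed=0):
    rng = random.Random(seed)
    state = rng.choice(alphabet)
    out = [state]
    for _ in range(length - 1):
        probs = transition[state]                                  # row of P(next | current)
        state = rng.choices(alphabet, weights=[probs[s] for s in alphabet])[0]
        out.append(state)
    return out

alphabet = ["A", "B", "C", "D"]
# Example matrix: each symbol repeats with probability 0.85 (illustrative values).
transition = {s: {t: (0.85 if t == s else 0.05) for t in alphabet} for s in alphabet}
background = generate_sequence(transition, alphabet, 500_000)
```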
Generating the anomalies • Generated independently of the test data; • Each anomaly type is generated in a different way.
Injecting the anomalies into the test data • The system determines the maximum number of anomalies (no more than 0.24% of the uninjected data); • Injection intervals are then selected (see the sketch below).
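A hypothetical sketch of the injection step (function and parameter names are my own, not the paper's): injected material is capped at a small fraction of the uninjected data and spread at evenly spaced positions, with the injection points recorded as ground truth for later scoring.

```python
# Sketch: cap injected anomalies at max_fraction of the background, then
# insert them at evenly spaced positions, recording ground-truth indices.
def inject_anomalies(background, anomalies, max_fraction=0.0024):
    budget = int(len(background) * max_fraction)   # cap on injected symbols
    kept, total_len = [], 0
    for a in anomalies:
        if total_len + len(a) > budget:
            break
        kept.append(a)
        total_len += len(a)
    if not kept:
        return list(background), []

    step = len(background) // (len(kept) + 1)      # evenly spaced injection points
    injected, positions, cursor = [], [], 0
    for i, anomaly in enumerate(kept, start=1):
        injected.extend(background[cursor:i * step])
        positions.append(len(injected))            # ground-truth start index
        injected.extend(anomaly)
        cursor = i * step
    injected.extend(background[cursor:])
    return injected, positions

# Example: two foreign bigrams injected into a 4,000-symbol background.
test_data, truth = inject_anomalies(list("ABCD" * 1000), [list("CC"), list("CC")])
```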
Experiment one: • Data sets: • Rare-n-gram anomalies were 4-grams occurring in less than 5% of the training data; • All variables were held constant except dataset regularity; • 275 benchmark datasets in total, 165 of which were anomaly-injected.
Experiment one: • Steps: • Training the detector — 11 training datasets; 55 training sessions were conducted; • Testing the detector — for each of the 5 alphabet sizes, the detector was run on 33 test datasets, 11 per anomaly type; • Scoring the detection outcomes — event outcomes; ground truth; threshold; scope and presentation of results.
Experiment one: • ROC analysis: • Relative operating characteristic curve; • Compares two aspects of detection systems: hit rate on the Y axis and false-alarm rate on the X axis (a scoring sketch follows).
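A small sketch of how one operating point of a ROC curve could be scored; the per-event scores, labels, and threshold are placeholders, and sweeping the threshold over the detector's score range traces out the full curve.

```python
# Sketch: given per-event anomaly scores, ground-truth labels, and a
# threshold, compute the hit rate (Y axis) and false-alarm rate (X axis).
def roc_point(scores, is_anomaly, threshold):
    hits = misses = false_alarms = correct_rejects = 0
    for score, truth in zip(scores, is_anomaly):
        alarm = score >= threshold
        if truth and alarm:
            hits += 1
        elif truth and not alarm:
            misses += 1
        elif not truth and alarm:
            false_alarms += 1
        else:
            correct_rejects += 1
    hit_rate = hits / (hits + misses) if (hits + misses) else 0.0
    fa_rate = false_alarms / (false_alarms + correct_rejects) if (false_alarms + correct_rejects) else 0.0
    return fa_rate, hit_rate    # one (x, y) point on the ROC curve
```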
Experiment one: • Results: None of the curves overlap until they reach the 100% hit rate, demonstrating that regularity does influence detector performance. If regularity had no effect, all of the ROC curves would superimpose on one another.
Experiment one: • Results: The false-alarm rate rises as the regularity index grows (i.e., as the data become more random), again showing that regularity affects detector performance.
Experiment two • Natural dataset (plot: regularity index on the Y axis, users on the X axis). The data were taken from undergraduate student computer usage. The diagram demonstrates clearly that regularity is a characteristic of natural data.
Conclusion • In the experiments conducted here, all variables were held constant except regularity, and a strong relationship was established between detector accuracy and regularity; • An anomaly detector cannot be evaluated on the basis of its performance on a dataset of a single regularity; • Differences in regularity occur not only between different users and environments, but also within a single user's sessions.
Suggestion • Overcoming this obstacle may require a mechanism to swap anomaly detectors or change the parameters of the current anomaly detector whenever regularity changes.