
Weighting versus Pruning in Rule Validation for Detecting Network and Host Anomalies

Explore the impact of rule weighting and pruning on detecting network and host anomalies using machine learning techniques. The study compares the effectiveness of these strategies for improved rule quality and anomaly detection.


Presentation Transcript


  1. Weighting versus Pruning in Rule Validation for Detecting Network and Host Anomalies • Gaurav Tandon • (joint work with Philip K. Chan) • Center for Computation and Intelligence • Department of Computer Sciences • Florida Institute of Technology • Melbourne, Florida 32901. • gtandon@fit.edu

  2. Outline • Intrusion detection systems taxonomy • Aspects of rule quality • Rule pruning and weighting • Weight update methods • Experimental evaluation and results • Summary Gaurav Tandon

  3. Intrusion Detection Systems • Signature Detection • Model “known attacks” • Advantage: Accuracy • Disadvantage: Unable to detect novel attacks • Anomaly Detection • Model “normal behavior” • Advantage: Detecting new attacks • Disadvantage: False alarms • Machine learning for Anomaly Detection • training from normal data only • “one-class” learning Gaurav Tandon

  4. Learning Rules for Anomaly Detection (LERAD) • LERAD (Mahoney and Chan, ICDM 2003) • A, B, and X - attributes • a, b, x1, x2- values for corresponding attributes • Anomaly Score • Abnormal events: Degree of anomaly • Normal events: Zero Gaurav Tandon

  5. Aspects of Rule Quality • Predictiveness • Measure of accuracy of consequent given antecedent • P (consequent | antecedent) • Examples: RIPPER, C4.5 rules • Belief • Measure of trust for entire rule • Example: Weights in ensemble methods, boosting Gaurav Tandon

  6. Predictiveness vs. Belief for LERAD rule • Predictiveness: p • P (not consequent | antecedent) • Belief: w • Weight for the entire rule Gaurav Tandon

  7. Motivation and Problem Statement • Rule Pruning • Reduce overfitting • Rule Weighting • Use “belief” to combine predictions • Previous studies: • Pruning vs. no-pruning • Weighting vs. non-weighting • Current work: • Pruning vs. weighting Gaurav Tandon

  8. Overview of LERAD • Generate candidate rules from a small training sample • Perform coverage test to minimize the rule set • Update rules with the entire training set • Validate rules on a separate validation set Gaurav Tandon

  9. Anomaly score • p: probability of observing a value not in the consequent • r: cardinality of the set {x1, x2, …} in the consequent • n: number of instances that satisfy the antecedent • p is estimated as r / n (Witten and Bell, 1991) • Anomaly score = 1/p Gaurav Tandon
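
A small sketch of this estimate, assuming p is computed as r / n as stated above; the function names and the example counts are illustrative only.

```python
def novel_value_probability(r, n):
    """Estimated probability of seeing a consequent value not observed in
    training: r distinct values over n instances satisfying the antecedent
    (the r / n estimate attributed to Witten and Bell above)."""
    return r / n

def anomaly_score(r, n):
    """Score = 1 / p, so rules that were almost never 'surprised' in
    training (small p) produce large scores when violated."""
    return 1.0 / novel_value_probability(r, n)

# Example: 3 allowed consequent values over 100 matching instances gives
# p = 0.03 and an anomaly score of about 33.3 when the rule is violated.
print(anomaly_score(r=3, n=100))
```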

  10. Revisit Validation Step • Generate candidate rules from a small training sample • Perform coverage test to minimize the rule set • Update rules with the entire training set • Validate rules on a separate validation set Gaurav Tandon

  11. Rule Pruning • Rules (r1 … r10) built from the (normal) training data are validated on a separate (normal) validation set; each rule either conforms or violates • Conformed rules kept • Violated rules pruned (False Alarm) Gaurav Tandon

  12. Rule Pruning • Given a rule and a data instance, three cases apply: • rule conformed • rule violated • rule inapplicable – no changes Gaurav Tandon

  13. Case 1 - Rule Conformed (Rule Pruning) • Rule: • Data instance: <SrcIP=128.1.2.3, DestIP=128.4.5.6, DestPort=80> • Updated rule: • Consequent - no changes • p = 3/101 Gaurav Tandon

  14. Case 2 - Rule Violated (Rule Pruning) • Rule: • Data instance: <SrcIP=128.1.2.3, DestIP=128.4.5.6, DestPort=23> • Updated rule: • Any rule violation is a false alarm - remove rule Gaurav Tandon
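
A sketch of the pruning pass over the validation data, reusing the hypothetical Rule sketch from slide 4; the three branches correspond to the cases listed on slides 12 through 14.

```python
def validate_with_pruning(rules, validation_data):
    """Keep only rules that never raise a false alarm on the (normal)
    validation data; a single violation removes the rule."""
    kept = []
    for rule in rules:
        violated = False
        for instance in validation_data:
            if not rule.applies(instance):
                continue                  # case 3: rule inapplicable, no change
            rule.n += 1                   # antecedent matched, so n grows (p = r/n)
            if not rule.conforms(instance):
                violated = True           # case 2: violation = false alarm
                break
        if not violated:
            kept.append(rule)             # case 1 only: conformed rules are kept
    return kept
```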

  15. LERAD Rule Generation • Generate candidate rules from a small training sample • Perform coverage test to minimize the rule set • Update rules with the entire training set • Validate rules on a separate validation set Gaurav Tandon

  16. Coverage and Rule Pruning • Minimal set of rules to cover the training set • Each rule has large coverage on training set • Pruning reduces coverage • Potentially miss detections Gaurav Tandon

  17. LERAD Rule Generation • Generate candidate rules from a small training sample • Perform coverage test to minimize the rule set • Update rules with the entire training set • Validate rules on a separate validation set Gaurav Tandon

  18. Rule Weighting • Weighted rules (r1,w1 … r10,w10) built from the (normal) training data are validated on a separate (normal) validation set • Weight increase for conformed rules • Weight decrease for violated rules (False Alarm) Gaurav Tandon

  19. Case 1 - Rule Conformed (Rule Weighting) • Rule: • Data instance: <SrcIP=128.1.2.3, DestIP=128.4.5.6, DestPort=80> • Updated rule: • Consequent - no change • p = 3/101 • w increases to w' Gaurav Tandon

  20. Case 2 - Rule Violated (Rule Weighting) • Rule: • Data instance: <SrcIP=128.1.2.3, DestIP=128.4.5.6, DestPort=23> • Updated rule: • Consequent: add DestPort value 23 • p = 4/101 • w decreases to w' Gaurav Tandon
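
A matching sketch of validation with weighting, again reusing the hypothetical Rule sketch; the multiplicative weight changes here are placeholders, since the actual updates are the methods described on slides 22 through 24.

```python
def validate_with_weighting(rules, weights, validation_data, up=1.1, down=0.5):
    """Violated rules are kept: their consequent absorbs the new value and
    their belief weight drops; conformed rules gain weight. The up/down
    factors are placeholders for the weighting methods on later slides."""
    for i, rule in enumerate(rules):
        for instance in validation_data:
            if not rule.applies(instance):
                continue                                        # rule inapplicable
            rule.n += 1                                         # p = r/n denominator grows
            if rule.conforms(instance):
                weights[i] *= up                                # weight increase
            else:
                rule.allowed_values.add(instance[rule.target])  # e.g. add DestPort 23
                weights[i] *= down                              # weight decrease
    return rules, weights
```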

  21. Anomaly Score • Rule Pruning: uses rule predictiveness only – the score sums t / p over the rules an event violates • Rule Weighting: uses rule predictiveness and rule belief – the score sums w · t / p over the violated rules • where t – time elapsed since last anomaly Gaurav Tandon
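
In code form, the two scoring variants might look like the sketch below; the tuple-based inputs are an assumption made purely for illustration.

```python
def score_pruned(violations):
    """Pruning variant: sum t / p over the rules an event violates,
    where t is the time since that rule last flagged an anomaly."""
    return sum(t / p for t, p in violations)

def score_weighted(violations):
    """Weighting variant: each violated rule's t / p contribution is
    scaled by its belief weight w."""
    return sum(w * t / p for w, t, p in violations)

# Example: one violated rule with w = 1.0, t = 60 seconds, p = 0.03
print(score_pruned([(60, 0.03)]), score_weighted([(1.0, 60, 0.03)]))
```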

  22. Weighting Method 1: Winnow-specialist • Rule k • Decrease weight when the rule is violated • Increase weight when the rule conforms • 2 parameters • Sum of rewards might not be equal to sum of penalties Gaurav Tandon
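
A sketch assuming the standard multiplicative Winnow-style update, with a promotion factor alpha > 1 and a demotion factor beta < 1 as the two parameters; the precise update used in the paper may differ.

```python
def winnow_update(weight, conformed, alpha=2.0, beta=0.5):
    """Multiply the weight of rule k by alpha when it conforms and by beta
    when it is violated. Because promotions and demotions are independent,
    the total reward need not equal the total penalty."""
    return weight * alpha if conformed else weight * beta

w = 1.0
w = winnow_update(w, conformed=True)    # rule conformed on this validation instance
w = winnow_update(w, conformed=False)   # rule violated: weight decreased
print(w)                                # 1.0 * 2.0 * 0.5 = 1.0 in this toy trace
```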

  23. Weighting Method 2: Equal Reward Apportioning • Weight sum does not change • Total reward = Total Penalty (TP) • Violated rules: penalized, their losses summing to TP • Conformed rules: each rewarded TP / Nc, where Nc is the number of conformed rules • 1 parameter Gaurav Tandon
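
A sketch of one way to realize this scheme, assuming each violated rule loses a fixed fraction delta of its weight (the single parameter) and the resulting total penalty TP is shared equally by the Nc conformed rules, so the weight sum is preserved.

```python
def equal_reward_apportioning(weights, violated, conformed, delta=0.5):
    """Penalize violated rules, then redistribute the total penalty TP
    equally over the conformed rules; the sum of weights stays constant."""
    penalty = 0.0
    for i in violated:
        loss = delta * weights[i]
        weights[i] -= loss
        penalty += loss                    # accumulate total penalty TP
    if conformed:
        reward = penalty / len(conformed)  # each conformed rule gets TP / Nc
        for i in conformed:
            weights[i] += reward
    return weights

# Toy example: rule 0 is violated, rules 1 and 2 conform; the sum stays 3.0.
print(equal_reward_apportioning([1.0, 1.0, 1.0], violated=[0], conformed=[1, 2]))
```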

  24. Weighting Method 3: Weight of Evidence • Keeps a subset of the rules that pruning would discard • Only rules with negative weight of evidence are removed • 0 parameters Gaurav Tandon
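
The sketch below uses a generic log-odds weight of evidence, woe = log(P(conform) / P(violate)), estimated from validation counts; this is only one plausible reading of the slide and may not match the paper's exact formula.

```python
import math

def weight_of_evidence(n_conform, n_violate):
    """Generic (assumed) log-odds form: positive when a rule conforms more
    often than it is violated, negative otherwise, with no parameters."""
    if n_violate == 0:
        return float("inf")      # never violated: strong evidence to keep
    if n_conform == 0:
        return float("-inf")     # never conformed: remove
    return math.log(n_conform / n_violate)

# Hypothetical conform/violate counts on the validation data.
rules = {"r1": (99, 1), "r2": (2, 5)}
kept = {name for name, (c, v) in rules.items() if weight_of_evidence(c, v) >= 0}
print(kept)  # r2 has negative weight of evidence and would be removed
```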

  25. Empirical Evaluation • Experimental Data • Network: IDEVAL-TCP, IDEVAL-PKT, IDEVAL-COMB, UNIV-TCP, UNIV-PKT, UNIV-COMB • Host: IDEVAL-BSM, UNM, FIT-UTK • Evaluation Criteria • AUC: Area under ROC curve, computed up to 0.1% and 1% False Alarm (FA) rates Gaurav Tandon
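
A sketch of the truncated-ROC criterion, computing the area under the ROC curve only up to a given false-alarm ceiling (0.1% or 1% FPR); the labels and scores below are synthetic, not results from the paper.

```python
import numpy as np

def partial_auc(labels, scores, max_fpr):
    """Area under the ROC curve restricted to FPR <= max_fpr, using a simple
    trapezoidal sum; labels are 1 for attacks and 0 for normal events."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    labels = np.asarray(labels, dtype=float)[order]
    tpr = np.cumsum(labels) / max(labels.sum(), 1)
    fpr = np.cumsum(1 - labels) / max((1 - labels).sum(), 1)
    keep = fpr <= max_fpr
    x = np.r_[0.0, fpr[keep]]
    y = np.r_[0.0, tpr[keep]]
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2))

# Synthetic example: 50 attacks scored higher on average than 5000 normal events.
rng = np.random.default_rng(0)
labels = np.r_[np.ones(50), np.zeros(5000)]
scores = np.r_[rng.normal(3, 1, 50), rng.normal(0, 1, 5000)]
print(partial_auc(labels, scores, 0.001), partial_auc(labels, scores, 0.01))
```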

  26. AUC% (0.1% FA) [Random detector AUC = 0.005%] Gaurav Tandon

  27. AUC% (1% FA) [Random detector AUC = 0.5%] Gaurav Tandon

  28. Analysis of new attack(s) detected by rule weighting • New detections due to higher anomaly scores • 1) Increased weights of conformed rules (kept by both pruning and weighting): 2 new detections • 2) Decreased weights of violated rules (removed by pruning but retained by weighting): 18 new detections Gaurav Tandon

  29. Overhead • Training time • Avg. increase: 2.9% • Testing (detection) time • Avg. increase: 0.8% • Number of rules in rule set • Avg. increase: 2.9% Gaurav Tandon

  30. Summary • Proposed weights representing rule belief for anomaly detection • Presented three weighting schemes • Compared Pruning and Weighting LERAD variants on various network and host data sets • The Weighting scheme detects more attacks at low false alarm rates than Pruning • Most new attacks were detected by violated rules that Pruning would have discarded • Weighting has higher memory and time requirements than Pruning but remains feasible for an online system Gaurav Tandon

  31. Thank You • Poster #2 tonight • Questions/Comments? Gaurav Tandon
