Data Mining in Intrusion Detection

Areej Al-Bataineh

Outline
• Data Mining Basics
  • Definition
  • Some techniques
    • Association Rules
    • Classification
    • Clustering
• Data mining meets Intrusion Detection
  • Detection Approaches
  • Data mining use in IDS
• Case Study
  • Behavioral Feature for Network Anomaly Detection
• Conclusions

Data Mining – Big Picture
• Knowledge Discovery in Databases (KDD)
  • “Process of extracting useful information from large databases”
• KDD basic steps
  • Understanding the application domain
  • Data integration and selection
  • Data mining
  • Pattern evaluation
  • Knowledge representation
• Related Fields
  • Machine learning, statistics, others

Data Mining
• “concerned with uncovering patterns, associations, changes, anomalies, and statistically significant structures and events in data”
• Why Data Mining?
  • Understand existing data
  • Predict new data
• Components
  • Representation
    • Decide what kind of model to build
    • A model is a compact summary of examples
  • Learning Element
    • Builds a model from a set of examples
  • Performance Element
    • Applies the model to new observations

Data Mining Techniques
• Well-known and used in Intrusion Detection
  • Association Rules [Descriptive]
  • Classification [Predictive]
  • Clustering [Descriptive]
• Preliminary step
  • Raw Data -> Database Table (Training set)
  • Columns – Attributes
  • Rows – Records

Association Rules
• Motivated by market-basket analysis
• Generate rules that capture implications between attribute values
• Rule Example
  • Lettuce & Tomato -> Salad Dressing [0.4, 0.9]
• Parameters [s, c]
  • Support (s) = % of records that satisfy both LHS and RHS
  • Confidence (c) = P(satisfies RHS | satisfies LHS)
• Mining Problem
  • “Find all association rules that have support and confidence > user-defined minimum value”
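
The support and confidence definitions above can be checked on a small example. The sketch below is only an illustration: the five basket records and the helper names `support` and `confidence` are made up for this purpose, so the numbers it yields (0.4 and about 0.67) are not the [0.4, 0.9] of the slide's rule.

```python
# Hypothetical market-basket records; each record is a set of items.
records = [
    {"lettuce", "tomato", "salad dressing"},
    {"lettuce", "tomato", "salad dressing", "bread"},
    {"lettuce", "tomato"},
    {"bread", "milk"},
    {"milk", "salad dressing"},
]


def support(records, lhs, rhs):
    """Fraction of all records that contain every item of LHS and RHS."""
    both = lhs | rhs
    return sum(1 for r in records if both <= r) / len(records)


def confidence(records, lhs, rhs):
    """Estimate of P(RHS | LHS): records with LHS and RHS over records with LHS."""
    lhs_count = sum(1 for r in records if lhs <= r)
    if lhs_count == 0:
        return 0.0
    both_count = sum(1 for r in records if (lhs | rhs) <= r)
    return both_count / lhs_count


lhs = {"lettuce", "tomato"}
rhs = {"salad dressing"}
s = support(records, lhs, rhs)      # 2 of 5 records contain all three items
c = confidence(records, lhs, rhs)   # 2 of the 3 records with LHS also have RHS
print(f"Lettuce & Tomato -> Salad Dressing [{s:.2f}, {c:.2f}]")
```

A rule miner would enumerate candidate LHS/RHS pairs and keep only those whose `s` and `c` exceed the user-defined minimums.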

Classification
• Predefined set of classes
• Training set has Class as one of the attributes
• Supervised Learning
• Mining Problem
  • “Find a model for class attribute as a function of the values of other attributes”
  • Use model to predict class for new records
• Classifier representation
  • If-then Rules
  • Decision Trees
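
To make the if-then rule representation concrete, here is a minimal sketch of a one-attribute rule learner (the OneR idea). It is not taken from the slides. The training records, attribute names, and the functions `learn_one_rule` and `predict` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical training set: every record carries a "class" attribute.
training = [
    {"protocol": "tcp", "flag": "SYN", "class": "attack"},
    {"protocol": "tcp", "flag": "ACK", "class": "normal"},
    {"protocol": "udp", "flag": "none", "class": "normal"},
    {"protocol": "tcp", "flag": "SYN", "class": "attack"},
    {"protocol": "udp", "flag": "none", "class": "normal"},
    {"protocol": "tcp", "flag": "ACK", "class": "normal"},
]


def learn_one_rule(records, class_attr="class"):
    """Pick the single attribute whose value -> class rules make the fewest errors."""
    best = None
    attributes = [a for a in records[0] if a != class_attr]
    for attr in attributes:
        counts = defaultdict(Counter)
        for r in records:
            counts[r[attr]][r[class_attr]] += 1
        # One if-then rule per attribute value: predict the majority class.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(1 for r in records if rules[r[attr]] != r[class_attr])
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


def predict(model, record, default="normal"):
    """Apply the learned rules to a new record; unseen values get the default."""
    attr, rules, _ = model
    return rules.get(record[attr], default)


model = learn_one_rule(training)
attr, rules, errors = model
for value, label in rules.items():
    print(f"if {attr} == {value!r} then class = {label!r}")
```

The learning element here is `learn_one_rule` and the performance element is `predict`. A decision tree learner extends the same idea by splitting recursively on further attributes.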

Clustering
• Given a Data Set and a Similarity Measure
• Unsupervised Learning
• Mining Problem
  • “Group records into clusters such that records within a cluster are more similar to one another, and records in separate clusters are less similar to one another”
• Similarity Measures
  • Euclidean distance if attributes are continuous
  • Other problem-specific measures
• Clustering Methods
  • Partitioning
    • Divide data into disjoint partitions
  • Hierarchical
    • Root is the complete data set, leaves are individual records, and intermediate layers -> partitions
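
As an illustration of the partitioning method with Euclidean distance, the sketch below runs a plain k-means loop. The slide does not name k-means; it is used here as one common partitioning algorithm. The sample points and the choice of initial centroids are assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two records with continuous attributes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, centroids, iterations=10):
    """Partition points into disjoint clusters around the given initial centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: euclidean(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, clusters


# Hypothetical two-attribute records forming two obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(clusters)
```

Fixing the initial centroids keeps the example deterministic. In practice they are usually chosen at random, and the result can depend on that choice.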

Intrusion Detection Systems
• Detection Approach
  • Misuse Detection
    • Based on known malicious patterns (signatures)
  • Anomaly Detection
    • Based on deviations from established normal patterns (profiles)
• Data Source
  • Network-based (NIDS)
    • Network traffic
  • Host-based (HIDS)
    • Audit trails
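
The difference between the two detection approaches can be sketched in a few lines. Everything in the example is hypothetical: the byte-string signatures, the "requests per minute" profile, and the three-standard-deviation threshold. Real systems use far richer signatures and profiles.

```python
import statistics

# Hypothetical signatures of known malicious patterns.
SIGNATURES = [b"/etc/passwd", b"cmd.exe"]


def misuse_detect(payload):
    """Misuse detection: alarm when a known signature appears in the payload."""
    return any(sig in payload for sig in SIGNATURES)


def build_profile(normal_values):
    """Anomaly detection, training: summarize normal behavior as mean and std."""
    return statistics.mean(normal_values), statistics.pstdev(normal_values)


def anomaly_detect(value, profile, k=3.0):
    """Anomaly detection: alarm when a value deviates more than k std from normal."""
    mean, std = profile
    return abs(value - mean) > k * std


# Hypothetical normal training data: requests per minute seen from one host.
profile = build_profile([98, 102, 100, 101, 99])

print(misuse_detect(b"GET /../../etc/passwd HTTP/1.0"))  # matches a signature
print(anomaly_detect(500, profile))                      # far from the profile
```

Misuse detection cannot flag an attack without a signature, while anomaly detection flags anything unusual, including unusual but legitimate behavior.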

Data Mining Usage
• Signature extraction
• Rule matching
• Alarm data analysis
  • Reduce false alarms
  • Eliminate redundant alarms
• Feature selection
• Training data cleaning

Case Study
• Behavioral Feature for Network Anomaly Detection
• Training set = normal network traffic
• A feature provides the semantics of the data values
• Feature selection is important
• Proposed method:
  • Feature extraction based on protocol behavior
  • Many attacks use protocols improperly
    • Ping of Death
    • SYN Flood
    • Teardrop

Terminology
• Attributes
  • Packet header fields
• Feature
  • Single or multiple attributes
• Protocol Specifications
  • Policy for interaction
  • Define attributes and the range of values
• Flow
  • Collection of packets exchanged between entities engaged in a protocol
  • Client/Server flows

Protocol Analysis
• Inter-Flow vs Intra-Flow Analysis (IVIA)
• First step
  • Identify attributes used in partitioning traffic data into flows -> Src/Dst ports
  • Result: HTTP flows, DNS flows, etc.
• Next step
  • Examine change of attribute values
    • Between flows (inter-flow)
    • Within a flow (intra-flow)
• Results (attribute categories)
  • Operationally Variable Attributes
  • Flow Descriptors
  • Operationally Invariant
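
The two steps above can be sketched as follows. The packet dictionaries, field names, and port table are hypothetical. The categorization rule is an assumed reading of the three result categories, not something the slide states: constant everywhere = operationally invariant, constant within each flow but differing between flows = flow descriptor, changing within a flow = operationally variable.

```python
from collections import defaultdict

# Hypothetical mapping from well-known server ports to protocol names.
WELL_KNOWN_PORTS = {80: "HTTP", 53: "DNS"}


def flow_key(pkt):
    """Direction-independent key built from the two (address, port) endpoints."""
    a = (pkt["src_ip"], pkt["src_port"])
    b = (pkt["dst_ip"], pkt["dst_port"])
    return tuple(sorted([a, b]))


def partition_into_flows(packets):
    """First step: group packets into flows using the Src/Dst ports."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)


def flow_label(key):
    """Name a flow after the well-known port it uses, if any."""
    for _, port in key:
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    return "other"


def categorize(flows, attr):
    """Next step: compare attribute values within flows and between flows."""
    per_flow = [{pkt[attr] for pkt in pkts} for pkts in flows.values()]
    if any(len(values) > 1 for values in per_flow):
        return "operationally variable"   # changes within a flow
    if len(set().union(*per_flow)) > 1:
        return "flow descriptor"          # changes only between flows
    return "operationally invariant"      # never changes


packets = [
    {"src_ip": "10.0.0.1", "src_port": 40000, "dst_ip": "10.0.0.2",
     "dst_port": 80, "version": 4, "proto": "tcp", "ip_id": 1},
    {"src_ip": "10.0.0.2", "src_port": 80, "dst_ip": "10.0.0.1",
     "dst_port": 40000, "version": 4, "proto": "tcp", "ip_id": 7},
    {"src_ip": "10.0.0.1", "src_port": 40001, "dst_ip": "10.0.0.3",
     "dst_port": 53, "version": 4, "proto": "udp", "ip_id": 2},
    {"src_ip": "10.0.0.3", "src_port": 53, "dst_ip": "10.0.0.1",
     "dst_port": 40001, "version": 4, "proto": "udp", "ip_id": 2},
]
flows = partition_into_flows(packets)
labels = sorted(flow_label(key) for key in flows)
for attr in ("version", "proto", "ip_id"):
    print(attr, "->", categorize(flows, attr))
```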

Operationally Variable Attributes (OVA)
• Uses 1999 DARPA IDS Evaluation data set
• Build association rules for IP fragments using OVAs
• Result – Top 8 ranking rules

Deriving Behavioral Features
• Transform OVAs into features that capture the protocol behavior
• Behavior features
  • Attribute observed over time/events
• For an attribute, observe
  • Entropy
  • Mean and standard deviation
  • Percentage of events with a given value
  • Percentage of events that are monotonic
  • Step size in attribute value
• Training data requirements are reduced
  • Normal – acceptable uses of the protocol
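
A rough sketch of how such behavior features might be computed from one attribute observed over a sequence of events is given below. The exact definitions are assumptions, since the slide only names the statistics. In particular, "monotonic" is read here as "did not decrease from the previous event" and "step size" as the mean difference between consecutive values. The function names are hypothetical.

```python
import math
import statistics
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the observed value distribution."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def percent_monotonic(values):
    """Percentage of consecutive events whose value did not decrease (assumed reading)."""
    steps = list(zip(values, values[1:]))
    if not steps:
        return 0.0
    return 100.0 * sum(1 for a, b in steps if b >= a) / len(steps)


def mean_step(values):
    """Mean difference between consecutive attribute values (assumed reading)."""
    steps = [b - a for a, b in zip(values, values[1:])]
    return statistics.mean(steps) if steps else 0.0


def behavior_features(values):
    """Turn one attribute's observations into a small behavior feature vector."""
    return {
        "entropy": entropy(values),
        "mean": statistics.mean(values),
        "std": statistics.pstdev(values),
        "pct_monotonic": percent_monotonic(values),
        "mean_step": mean_step(values),
    }


# Hypothetical observations of an IP identification field within one flow.
features = behavior_features([100, 101, 102, 103, 104])
print(features)
```

A counter-like field that increments by one gives high entropy, 100% monotonic events, and a step of 1, regardless of its raw values. That is the sense in which the feature captures behavior and not the attribute value itself.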

Protocol Behavior Analysis – Aggregate Model
• Uses aggregate attribute values for some window of packets
  • Window size = 10
• Examples
  • TcpPerFIN = % of packets with FIN set
  • meanIAT = Mean inter-arrival time
• 50 flows for each protocol = 250 flows
  • Number of packets per flow (5 – 37000)
• Use decision tree classifier (C5)
  • FTP, SSH, Telnet, SMTP, HTTP
• Classifier tested on DARPA data set

  Protocol   FTP    SSH    Telnet   SMTP   WWW
  Result     100%   100%   100%     82%    98%

• Real Network Traffic (85% - 100%)
  • Kazaa: 100%
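
The two example features can be computed per window roughly as follows. The packet representation (a `time` stamp and a `fin` flag), the use of non-overlapping windows, and the percent scale for TcpPerFIN are assumptions. The slide fixes only the window size of 10.

```python
def window_features(packets, window_size=10):
    """Aggregate features over consecutive, non-overlapping windows of packets."""
    features = []
    for start in range(0, len(packets) - window_size + 1, window_size):
        window = packets[start:start + window_size]
        fin_count = sum(1 for p in window if p["fin"])
        # Inter-arrival times between consecutive packets in the window.
        gaps = [b["time"] - a["time"] for a, b in zip(window, window[1:])]
        features.append({
            "TcpPerFIN": 100.0 * fin_count / len(window),
            "meanIAT": sum(gaps) / len(gaps),
        })
    return features


# Hypothetical flow: 20 packets, 0.5 s apart, FIN set only on the last one.
packets = [{"time": i * 0.5, "fin": i == 19} for i in range(20)]
features = window_features(packets)
for row in features:
    print(row)
```

Each resulting row, labeled with the protocol of its flow, would be one training record for the decision tree classifier.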

Decision Tree (only small part)
[Figure: excerpt of the learned decision tree; only the split thresholds survive, the node labels are not recoverable]

Conclusions
• Behavioral Features for Network Anomaly Detection
  • Attribute values cannot be used as features
  • Interpretation of protocol specifications
  • Transform attributes into behavior features
    • Aggregation of the attribute values
• Data Mining Challenges
  • Self-tuning data mining techniques
  • Pattern-finding and prior knowledge
  • Modeling of temporal data
  • Scalability
  • Incremental mining

Tools and Data Sets
• Tools
  • KDnuggets
    • Web portal: http://www.kdnuggets.com
  • WEKA
    • Free, comprehensive collection of tools
    • http://www.cs.waikato.ac.nz/ml/weka
• Data Sets
  • Machine Learning Database Repository
  • Knowledge Discovery in Databases Archive
    • http://kdd.ics.uci.edu
  • MIT Lincoln Labs
    • http://www.ll.mit.edu/IST/ideval

References
• “Applications of Data Mining in Computer Security” by Barbara and Jajodia
• “Machine Learning and Data Mining for Computer Security” by Maloof
• “Data Mining: Challenges and Opportunities for Data Mining During the Next Decade” by Grossman
• “Data Mining: Concepts and Techniques” by Han and Kamber
• SANS IDS FAQs
  • https://www2.sans.org/resources/idfaq/
• ACM Crossroads: IDS
  • http://www.acm.org/crossroads/xrds2-4/intrus.html

Snort Detection Engine
• Old
  • Represents rules as a decision tree in memory
  • Very inefficient
    • Matching time grows linearly with the number of rules
    • Rule set is growing fast
• New
  • Multi-pattern search algorithm
  • Applies multiple rules in parallel
  • Set-wise methodology
  • Fires the rule with the longest match
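
To illustrate what a multi-pattern search buys, the sketch below uses the Aho-Corasick automaton, one well-known set-wise matching algorithm. It is not claimed to be Snort's actual implementation. The rule patterns and rule names are hypothetical. All patterns are matched in a single pass over the payload, and the rule with the longest match is reported.

```python
from collections import deque


def build_automaton(patterns):
    """Build Aho-Corasick goto, failure, and output tables for a set of patterns."""
    goto, fail, output = [{}], [0], [[]]
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(pattern)
    # Breadth-first pass to fill in failure links and inherited outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output


def search(text, automaton):
    """Return (start index, pattern) for every pattern occurrence, in one pass."""
    goto, fail, output = automaton
    state, matches = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in output[state]:
            matches.append((i - len(pattern) + 1, pattern))
    return matches


def longest_match(text, rules):
    """Fire the rule whose pattern gives the longest match, or None."""
    matches = search(text, build_automaton(rules))
    if not matches:
        return None
    _, pattern = max(matches, key=lambda m: len(m[1]))
    return rules[pattern]


# Hypothetical rule set: content pattern -> rule name.
rules = {"cmd": "generic-cmd", "cmd.exe": "windows-shell", "/etc/passwd": "passwd-access"}
payload = "GET /scripts/cmd.exe?/c+dir"
print(sorted(search(payload, build_automaton(rules))))
print(longest_match(payload, rules))
```

The scan cost depends on the payload length and the number of matches, not on the number of rules. That is the contrast with checking each rule in turn.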
