Paul Dokas, Levent Ertoz, Vipin Kumar, Aleksandar Lazarevic, Jaideep Srivastava, Pang-Ning Tan
Computer Science Department, University of Minnesota
CS685 Presentation
Data Mining for Network Intrusion Detection
Presented by: Song.Yuan@uky.edu
Outline
• Motivation
• Related Work
• Detection Models and Approaches
• Experimental Evaluation
• Conclusion
Motivation
• Organizations are becoming increasingly vulnerable to potential cyber threats, e.g., network intrusions.
[Figure: cyber incidents reported to CERT/CC]
Motivation (cont.)
• Intrusion Detection System (IDS)
  • collect signatures of known attacks
  • enter the attack signatures into the IDS signature database
  • extract features from various audit streams
  • compare these features with the attack signatures
  • raise an alarm when a possible intrusion occurs
• Limitations of traditional signature-based methods
  • the signature database must be updated manually
  • inability to detect emerging cyber threats
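The signature-matching loop above can be sketched in a few lines of Python. This is only an illustration: the signature names, the field names (`protocol`, `flag`, `service`, `root_login`) and the two sample records are assumptions made for the example, not part of the presentation.

```python
# Hypothetical signature database: each signature lists the feature
# values that must ALL be present in a connection record.
SIGNATURES = {
    "syn-flood-like": {"protocol": "tcp", "flag": "S0"},
    "telnet-root-login": {"service": "telnet", "root_login": True},
}


def match_signatures(record, signatures=SIGNATURES):
    """Return the names of all signatures whose fields all match the record."""
    return [
        name
        for name, pattern in signatures.items()
        if all(record.get(key) == value for key, value in pattern.items())
    ]


# A record that matches a stored signature raises an alert ...
known = {"protocol": "tcp", "service": "http", "flag": "S0"}
# ... while an attack with no stored signature passes silently.
novel = {"protocol": "udp", "service": "dns", "flag": "SF"}

alerts_known = match_signatures(known)
alerts_novel = match_signatures(novel)
```

The second record illustrates the limitation listed on the slide: anything not already in the database produces no alert.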
Motivation (cont.)
• Why data mining?
  • large volumes of network data
  • different data mining techniques
    • clustering, classification
Related Work
• Data mining based intrusion detection techniques
  • anomaly detection
    • build models of normal data
    • detect any deviation from normal data
    • flag deviations as suspect
    • identify new types of intrusions as deviations from normal behavior
  • misuse detection
    • label all instances in the data set (“normal” or “intrusion”)
    • run learning algorithms over the labeled data to generate classification rules
    • automatically retrain intrusion detection models on different input data
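As a minimal sketch of the anomaly-detection recipe (model normal data, then flag deviations), the snippet below profiles a single numeric feature by its mean and standard deviation. The feature (bytes per connection), the sample values and the threshold of 3 standard deviations are assumptions chosen for illustration; the presentation does not prescribe this particular model.

```python
from statistics import mean, pstdev


def fit_normal_profile(values):
    """Model 'normal' behaviour as the mean and standard deviation of a feature."""
    return mean(values), pstdev(values)


def is_suspect(value, profile, threshold=3.0):
    """Flag a value that deviates from the normal profile by more than threshold sigmas."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical bytes-per-connection values observed during normal operation.
normal_bytes = [480, 510, 495, 505, 520, 490, 500, 515]
profile = fit_normal_profile(normal_bytes)

typical = is_suspect(505, profile)    # close to the profile
unusual = is_suspect(9000, profile)   # far outside the profile
```

Misuse detection differs in that it needs labeled examples of both classes before any model can be trained.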
Related Work --- misuse detection
• Classification models
  • Bayesian classifier
  • Decision tree
  • Association rule
  • Support vector machine
  • Learning from rare class
Related Work --- anomaly detection
• Anomaly detection models
  • Association rule
  • Neural network
  • Unsupervised SVM
  • Outlier detection
Detection Models
• misuse detection
  • rare class prediction model
  • known intrusions and their variations
• anomaly detection
  • outlier detection model
  • novel attacks whose nature is unknown
Learning from Rare Class
• Problem: how to build a classification model for a data set with a skewed class distribution?
  • intrusion class << normal class
  • Mining a needle in a haystack
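A small worked example shows why a skewed class distribution is a problem. The 990-to-10 split below is an assumed ratio picked for the illustration, not a figure from the presentation.

```python
def accuracy_and_recall(labels, predictions, positive="intrusion"):
    """Return (overall accuracy, recall on the positive class)."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    positives = sum(1 for y in labels if y == positive)
    true_pos = sum(1 for y, p in zip(labels, predictions) if y == p == positive)
    accuracy = correct / len(labels)
    recall = true_pos / positives if positives else 0.0
    return accuracy, recall


# Hypothetical data set: 990 normal connections and 10 intrusions.
labels = ["normal"] * 990 + ["intrusion"] * 10

# A useless classifier that predicts "normal" for every connection.
always_normal = ["normal"] * len(labels)

accuracy, recall = accuracy_and_recall(labels, always_normal)
```

Here `accuracy` is 0.99 while `recall` on the intrusion class is 0.0, so plain accuracy rewards a model that never finds the needle.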
Learning from Rare Class (cont.)
• Novel classification algorithms
  • PN-rule
    • P-rules: cover most of the intrusive examples
    • N-rules: eliminate false alarms
  • SMOTEBoost
    • SMOTE (Synthetic Minority Over-sampling TEchnique)
    • Boosting
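The over-sampling half of SMOTEBoost can be sketched as follows. This is a simplified illustration of the SMOTE idea (interpolate between a minority example and one of its nearest minority neighbours), not the authors' implementation: the base example is drawn at random, the boosting rounds are omitted, and the function name, parameters and sample points are assumptions. It needs at least two minority examples.

```python
import math
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority points on segments between nearest neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a minority example and find its k nearest minority neighbours.
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        # Interpolate a random fraction of the way towards one neighbour.
        neighbor = rng.choice(neighbors)
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Hypothetical two-feature intrusion examples.
intrusions = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
new_points = smote(intrusions, n_synthetic=6)
```

Because each synthetic point lies between two real minority examples, the minority region is filled in rather than merely duplicated, which is the motivation for combining it with boosting.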
Anomaly Detection
• Novel attacks/intrusions
  • deviation from normal behavior
• Outlier detection algorithms
  • Nearest neighbor approach
  • Distance based approach
  • Density based approach
  • Unsupervised support vector machines
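One common way to realise the nearest neighbor approach is to score each point by the distance to its k-th nearest neighbour. The sketch below assumes Euclidean distance, `k=2` and five made-up points; the presentation does not specify these choices.

```python
import math


def knn_distance_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Four points in a tight cluster and one point far away from it.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8)]
scores = knn_distance_scores(points, k=2)

# Index of the point with the largest score, i.e. the top-ranked outlier.
most_anomalous = max(range(len(points)), key=scores.__getitem__)
```

The isolated point gets by far the largest score, so ranking by score puts it first.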
Anomaly Detection
• Density based approach (LOF, Local Outlier Factor)
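A compact sketch of the Local Outlier Factor computation follows. It makes simplifying assumptions that are not from the presentation: Euclidean distance, exactly `k` neighbours per point (ties are broken by index instead of being kept), and no duplicate points (duplicates would cause a division by zero).

```python
import math


def lof_scores(points, k=2):
    """Return the Local Outlier Factor of every point (about 1 means inlier)."""
    n = len(points)
    neighbors, k_dist = [], []
    for i in range(n):
        ranked = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neighbors.append([j for _, j in ranked[:k]])  # k nearest neighbours
        k_dist.append(ranked[k - 1][0])               # distance to the k-th one

    def reach_dist(i, j):
        # Reachability distance of i from j: never less than j's k-distance.
        return max(k_dist[j], math.dist(points[i], points[j]))

    # Local reachability density: inverse of the mean reachability distance.
    lrd = [
        1.0 / (sum(reach_dist(i, j) for j in neighbors[i]) / k) for i in range(n)
    ]
    # LOF: mean density of the neighbours relative to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]


points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8)]
lof = lof_scores(points, k=2)
```

Points inside the cluster score about 1, while the isolated point scores far above 1 because its neighbours are much denser than it is.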
Anomaly Detection
• Identify normal behavior
  • Construct a useful set of features
  • Define a similarity function
• Flag deviations as suspect
Experimental Evaluation
• Public data sets
  • DARPA 1998 Intrusion Detection Evaluation Data Set
    • prepared and managed by MIT Lincoln Lab
    • training data and test data
  • KDD Cup 1999 Data
    • an extension of DARPA’98
    • training data and test data
• Real network data
  • Network data from the University of Minnesota
Experimental Evaluation --- feature construction
• Purpose:
  • derive a more informative data set from the public data set
• Method:
  • build connection records
  • label connection records
    • “normal” or “intrusion”
  • features for each connection record
    • # of {packets, bytes}, {ACK, Re-Tx} packets, SYN/FIN, …
    • time-based features (DoS attacks)
    • connection-based features (PROBING attacks)
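The two derived feature families can be illustrated with a short sketch. The record fields, the 5-second window, the window of 3 connections and the sample trace are all assumptions made for the example; the records are assumed to be sorted by time.

```python
from collections import namedtuple

# Hypothetical connection record with only the fields this example needs.
Connection = namedtuple("Connection", "time src dst dst_port")


def time_based_count(connections, index, window_seconds=5.0):
    """Earlier connections to the same destination within the last window_seconds."""
    current = connections[index]
    return sum(
        1
        for c in connections[:index]
        if c.dst == current.dst and current.time - c.time <= window_seconds
    )


def connection_based_count(connections, index, last_n=3):
    """Among the previous last_n connections, those from the same source."""
    current = connections[index]
    start = max(0, index - last_n)
    return sum(1 for c in connections[start:index] if c.src == current.src)


trace = [
    Connection(0.0, "a", "victim", 80),      # burst against one host
    Connection(1.0, "b", "victim", 80),
    Connection(2.0, "c", "victim", 80),
    Connection(60.0, "scanner", "h1", 22),   # slow scan, one probe per minute
    Connection(120.0, "scanner", "h2", 22),
    Connection(180.0, "scanner", "h3", 22),
]

burst_feature = time_based_count(trace, 2)
scan_time_feature = time_based_count(trace, 5)
scan_conn_feature = connection_based_count(trace, 5)
```

In this trace the burst shows up in the time window, while the slow scan leaves the time-based feature at zero and is only visible in the connection-based feature.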
Experimental Evaluation --- single connection attacks
[Figure: ROC curves for single connection attacks]
Experimental Evaluation --- bursty attacks
[Figure: ROC curves for bursty attacks]
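For readers unfamiliar with ROC curves, the sketch below shows how the points of such a curve are typically computed from outlier scores and ground-truth labels. The scores and labels are invented for the example and are unrelated to the results in the presentation; tied scores are not given special treatment here.

```python
def roc_points(scores, labels):
    """Return (false alarm rate, detection rate) pairs, one per score threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Lower the threshold one example at a time, highest score first.
    for _, label in sorted(zip(scores, labels), reverse=True):
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical outlier scores; label 1 marks an intrusion, 0 a normal connection.
scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
curve = roc_points(scores, labels)
```

A detector is better the closer its curve passes to the top-left corner, that is, a high detection rate at a low false alarm rate.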
Experimental Evaluation --- real network data
• Why?
  • Limitations of the DARPA’98 data set
• How?
  • Detect network intrusions in live network traffic
• Result?
  • Successfully identified some novel intrusions (top-ranked outliers)
Conclusion
• promising intrusion detection models
• performance of algorithm (on-line detection)
• new classification and anomaly detection algorithms
Thanks! Questions?