
Network Intrusion Detection Using Random Forests



Presentation Transcript


  1. Network Intrusion Detection Using Random Forests
     Jiong Zhang, Mohammad Zulkernine
     School of Computing, Queen's University
     Kingston, Ontario, Canada

  2. Outline
     • Motivation
     • Intrusion detection system
     • Data mining meets intrusion detection
     • Proposed architecture
     • Challenges and solutions
     • Experimental results
     • Conclusion and future work

  3. Motivation
     • An intrusion prevention system (firewall) cannot prevent all attacks.
     [Figure: intruders, firewall, Internet, victim]

  4. Motivation (contd.)
     Statistical data for intrusions
     • Total reported losses in 2004: $141,496,560 (source: FBI survey for 2004)
     • 50% of security breaches are undetected (source: FBI statistics for 2000)

  5. Intrusion Detection Techniques
     • Misuse Detection
        • Extracts patterns of known intrusions
        • Cannot detect novel intrusions
        • Has low false positive rate
     • Anomaly Detection
        • Builds profiles for normal activities
        • Uses the deviations from the profiles to detect attacks
        • Can detect unknown attacks
        • Has high false positive rate
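The anomaly-detection idea on this slide (build a profile of normal activity, then flag deviations) can be illustrated with a minimal sketch. The slides do not say how profiles are built; the mean/standard-deviation profile, the `threshold` of 3, and the sample values below are all assumptions made purely for illustration.

```python
import statistics

def build_profile(normal_values):
    """Summarize normal activity as (mean, sample standard deviation)."""
    return statistics.mean(normal_values), statistics.stdev(normal_values)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations from the mean."""
    mean, std = profile
    if std == 0:
        return value != mean
    return abs(value - mean) / std > threshold

# Hypothetical per-window connection counts observed during normal operation.
normal_counts = [4, 5, 6, 5, 4, 6, 5, 5]
profile = build_profile(normal_counts)

# Two ordinary observations and one burst that deviates strongly from the profile.
flags = [is_anomalous(v, profile) for v in (5, 7, 120)]
```

A lower threshold catches more attacks but raises more false alarms, which is the trade-off behind the high false positive rate noted on the slide.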

  6. Network Intrusion Detection System (NIDS)
     • Monitors network traffic to detect intrusions
     • Monitors more targets on a network
     • Detects some attacks that host-based systems miss
     • Does not affect network operations

  7. Current NIDS
     Many current NIDSs (such as Snort):
     • Are rule-based
     • Are unable to detect novel attacks
     • Have a high maintenance cost

  8. Rule Based vs. Data Mining
     • Rule based systems
       [Diagram: Intrusion Data → Security Experts → Rules]
     • Data mining based systems
       [Diagram: Labeled Data → Data Mining Engine → Patterns]

  9. Data Mining Meets Intrusion Detection
     • Extract patterns of intrusions for misuse detection
     • Build profiles of normal activities for anomaly detection
     • Build classifiers to detect attacks
     • Some IDSs have successfully applied data mining techniques in intrusion detection

  10. Proposed Architecture
     [Figure: Architecture of the proposed NIDS.
      On-line part: Networks, Sensors, Packets, On-line Pre-Processors, Audited data, Feature vectors, Detector, Alarmer, Alarms, Database (On line).
      Off-line part: Data Set, Training data, Off-line Pre-processor, Feature vectors, Pattern Builder, Patterns, Database (Off line).]

  11. Random Forests
     • Unsurpassable in accuracy among the current data mining algorithms
     • Runs efficiently on large data sets with many features
     • Gives estimates of which features are important
     • No nominal data problem
     • No over-fitting

  12. Imbalanced Intrusion
     • Problems
        • Higher error rate for minority intrusions
        • Some minority intrusions are more dangerous
        • Need to improve the performance for the minority intrusions
     • Proposed Solution
        • Down-sample the majority intrusions and over-sample the minority intrusions
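The proposed solution can be sketched as follows. The slides do not give sampling ratios, so the single `target_per_class` size, the function name `rebalance`, and the toy class labels are assumptions for illustration only, not the authors' procedure.

```python
import random
from collections import Counter, defaultdict

def rebalance(records, labels, target_per_class, seed=0):
    """Down-sample classes larger than the target; over-sample smaller ones."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for record, label in zip(records, labels):
        by_class[label].append(record)

    out_records, out_labels = [], []
    for label, recs in by_class.items():
        if len(recs) >= target_per_class:
            # Majority class: keep a random subset, without replacement.
            chosen = rng.sample(recs, target_per_class)
        else:
            # Minority class: keep every record, then add random duplicates.
            chosen = recs + rng.choices(recs, k=target_per_class - len(recs))
        out_records.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_records, out_labels

# Toy data: 1000 records of a frequent class and 5 of a rare one.
records = list(range(1005))
labels = ["frequent"] * 1000 + ["rare"] * 5
balanced_records, balanced_labels = rebalance(records, labels, target_per_class=50)
counts = Counter(balanced_labels)
```

Over-sampling by duplication adds no new information about the rare class; it only keeps the learner from being dominated by the frequent one.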

  13. Feature Selection
     • Essential for improving detection rate
     • Reduces the computational cost
     • Many NIDSs select features by intuition or domain knowledge

  14. Feature Selection over the KDD’99 Dataset
     • Calculate variable importance using random forests.
     • Select the 38 most important features in detection.
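A rough sketch of importance-based feature selection, assuming scikit-learn is available. The slides do not name an implementation, and scikit-learn's `feature_importances_` is an impurity-based score that may differ from the importance measure the authors used. The data is synthetic (not KDD'99), and `k = 6` is a hypothetical stand-in for the 38 features kept on the slide.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a labeled feature matrix (not KDD'99 data).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=4, n_redundant=0, random_state=0
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

k = 6  # hypothetical number of features to keep
ranking = np.argsort(forest.feature_importances_)[::-1]  # most important first
selected = np.sort(ranking[:k])  # column indices of the k top-ranked features
X_reduced = X[:, selected]       # data restricted to the selected features
```

The reduced matrix can then be used to train the final detector, which is where the lower computational cost comes from.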

  15. Some Features
     • The two most important features
        • Feature 3. service type, such as http, telnet, and ftp
        • Feature 23. count, number of connections to the same host as the current one during the past two seconds
     • The three least important features
        • Feature 7. land, 1 if connection is from/to the same host/port; 0 otherwise
        • Feature 20. num_outbound_cmds, number of outbound commands in an ftp session
        • Feature 21. is_hot_login, 1 if the login belongs to the “hot” list; 0 otherwise

  16. Parameter Optimization for Random Forests
     • Optimize the parameter Mtry of random forests to improve detection rate.
     • Choose 15 as the optimal value, which reaches the minimum of the out-of-bag (OOB) error rate.
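The tuning step above might look like the following sketch, assuming scikit-learn, where `max_features` plays the role of Mtry (the number of features tried at each split). The synthetic data and the candidate values are assumptions; the value 15 reported on the slide applies to the authors' KDD'99 setup, not to this toy example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (not KDD'99).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5, n_redundant=0, random_state=1
)

oob_error = {}
for mtry in (1, 2, 4, 6, 8, 12):  # hypothetical candidate values
    forest = RandomForestClassifier(
        n_estimators=100,
        max_features=mtry,  # features considered at each split
        oob_score=True,     # score each sample with trees that did not train on it
        random_state=1,
    )
    forest.fit(X, y)
    oob_error[mtry] = 1.0 - forest.oob_score_  # OOB error = 1 - OOB accuracy

best_mtry = min(oob_error, key=oob_error.get)
```

Because the OOB estimate reuses the bootstrap samples already drawn for training, no separate validation split is needed for this search.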

  17. Performance Comparison on the KDD’99 Dataset
     • Our approach provides a lower overall error rate and cost compared to the best KDD’99 result.
     • Feature selection can improve the performance of intrusion detection.

  18. Conclusion and Future Work
     • The random forests algorithm can help improve detection performance and select features.
     • Sampling techniques can reduce the time to build patterns and increase the detection rate of minority intrusions.
     • In the future, we will focus on anomaly detection and a multiple classifier architecture.
