Proposal of New Benchmark Data for Intrusion Detection Evaluation

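This presentation proposes a new benchmark data set for evaluating intrusion detection algorithms. The widely used KDD Cup 99 data set has limitations: its attacks no longer reflect current malicious activity, it covers only TCP, UDP, and ICMP, and it was generated on a largely simulated network. The authors propose collecting data with IDSs and honeypots, sanitizing IP addresses and removing payload data to protect privacy, and comparing IDS alerts against honeypot traffic to find attacks the IDS missed.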

Presentation Transcript


  1. A Proposal of New Benchmark Data to Evaluate Mining Algorithms for Intrusion Detection
     Jungsuk SONG†, Hiroki TAKAKURA‡, Yasuo OKABE‡
     †Graduate School of Informatics, Kyoto Univ.
     ‡Academic Center for Computing and Media Studies, Kyoto Univ.
     oaktree@net.ist.i.kyoto-u.ac.jp, takakura@media.kyoto-u.ac.jp, okabe@i.kyoto-u.ac.jp

  2. Overview
     • Introduction
       • Intrusion Detection System
       • Intrusion Detection Evaluation Data
     • KDD Cup 99 Data Set
       • Details
       • Problems
     • Our Experimental Result
     • Our Proposal
     23rd Asia Pacific Advanced Network Meeting

  3. Introduction
     • Intrusion Detection System (IDS)
       • combination of software and hardware that attempts to perform intrusion detection
       • raise the alarm when possible intrusion or suspicious patterns are observed
     [Figure: network diagram; labels: The Internet, Attacker, Intrusion, Firewall, IDS, Internal Network]

  4. Introduction
     • Why do we need an IDS?
       • Unknown weaknesses or bugs
       • Complex, unforeseen attacks
       • Firewalls, security policies
     • Using information detected
       • Recover compromised system
       • Understand the attack mechanism
       • Detect novel attacks
       • Defend our systems

  5. Introduction
     • We need evaluation data for IDS
       • Performance improvement
       • Technical progress
       • Research guide…
     • KDD Cup 99 Data Set
       • Most commonly used evaluation data, but...
     • Propose new benchmark data

  6. KDD Cup 99 Data Set
     • Modification of DARPA 1998 data set
     • DARPA 1998 data set
       • Managed by Lincoln Lab. (under DARPA sponsorship)
       • Simulated nine weeks of raw TCP dump data
     • Attacks
       • 38 different attacks against Unix/Linux machines
       • DoS, Scan, Buffer overflow and so on.
     • Normal traffic
       • 1000’s of virtual hosts and 100’s of user automata

  7. KDD Cup 99 Data Set
     • Each connection ⇒ 41-dimensional vector
     • Samples
       5,tcp,smtp,SF,959,337,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,144,192,0.70,0.02,0.01,0.01,0.00,0.00,0.00,0.00,normal.
       0,tcp,http,SF,54540,8314,0,0,0,2,0,1,1,0,0,0,0,0,0,0,0,0,2,2,0.00,0.00,0.00,0.00,1.00,0.00,0.00,118,118,1.00,0.00,0.01,0.00,0.00,0.00,0.02,0.02,back.
       0,tcp,http_443,S0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,114,2,1.00,1.00,0.00,0.00,0.02,0.06,0.00,255,2,0.01,0.07,0.00,0.00,1.00,1.00,0.00,0.00,neptune.
     • Numerical: 34, Categorical: 7
     • Basic feature: “duration”, “protocol”…
     • Statistical feature: “number of connections to the same host as the current connection in the past two seconds”…
     • Label ⇒ “normal” or the name of the attack
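The record layout on this slide can be sketched in Python. This is a minimal illustration, not the authors' tooling: `parse_kdd_record`, `same_host_counts`, and `LEADING_FIELDS` are hypothetical names, the first six field names are taken from the public KDD Cup 99 feature list rather than from the slide, and the windowed count assumes the current connection counts itself.

```python
from collections import deque

# Assumption: names of the first six features follow the public KDD Cup 99
# feature list; the other 35 features stay positional in this sketch.
LEADING_FIELDS = ["duration", "protocol_type", "service", "flag",
                  "src_bytes", "dst_bytes"]
NUM_FEATURES = 41


def parse_kdd_record(line):
    """Split one comma-separated record into (41 features, label)."""
    parts = [p.strip() for p in line.strip().split(",")]
    if len(parts) != NUM_FEATURES + 1:
        raise ValueError(f"expected {NUM_FEATURES + 1} fields, got {len(parts)}")
    # The label is the last field and carries a trailing period.
    return parts[:-1], parts[-1].rstrip(".")


def same_host_counts(connections, window=2.0):
    """For each (timestamp, dst_host), sorted by time, count connections
    to the same host within the past `window` seconds (itself included)."""
    recent = deque()
    counts = []
    for ts, host in connections:
        recent.append((ts, host))
        # Drop connections that fell out of the time window.
        while recent and recent[0][0] < ts - window:
            recent.popleft()
        counts.append(sum(1 for _, h in recent if h == host))
    return counts


# First sample record from the slide, split into chunks for readability.
SAMPLE = (
    "5,tcp,smtp,SF,959,337,"
    "0,0,0,0,0,1,"
    "0,0,0,0,0,0,0,0,0,0,"
    "1,1,"
    "0.00,0.00,0.00,0.00,1.00,0.00,0.00,"
    "144,192,"
    "0.70,0.02,0.01,0.01,0.00,0.00,0.00,0.00,"
    "normal."
)

features, label = parse_kdd_record(SAMPLE)
named = dict(zip(LEADING_FIELDS, features))
print(label, len(features), named["service"])
```

The windowed count is only a stand-in for the kind of statistical feature the slide quotes; the real data set derives it from the raw TCP dump.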

  8. KDD Cup 99 Data Set
     • Problems
       • Attacks
         • Cannot reflect current malicious activities
         • Stealthy scan ⇒ short time interval, no multiple IP address scan
         • No attacks against Windows machines
       • Protocol types
         • Only TCP, UDP, ICMP
         • Cannot detect attacks such as ARP Spoofing
       • Simplicity
         • Only 3 real victim hosts
         • 1000’s of virtual hosts and 100’s of user automata (custom software)

  9. Our Experimental Results
     • PCA (Principal Component Analysis)
       • Technique for reducing the dimensions of a data set
       • Transforms the data to a new coordinate system
     • What we know from PCA
       • The number of dimensions that are actually required to represent the original data
     • Accumulative Contribution Ratio
       • Indicates what percentage of the original data can be represented
       • For example
         • 2 dimensions ⇒ 90%: two dimensions represent 90% of the original data
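The contribution ratio described on this slide can be illustrated with a short NumPy sketch. It assumes the standard definition (sorted covariance eigenvalues, cumulative sum divided by their total); the slide does not say how the authors computed it, and the function names and synthetic data are made up for the example.

```python
import numpy as np


def cumulative_contribution_ratio(X):
    """Return the cumulative share of variance explained by the first
    1, 2, ..., d principal components of the rows of X."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigvalsh returns eigenvalues in ascending order; reverse them.
    eigvals = np.linalg.eigvalsh(cov)[::-1]
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negatives
    return np.cumsum(eigvals) / eigvals.sum()


def components_needed(X, threshold=0.90):
    """Smallest number of components whose ratio reaches `threshold`."""
    ratios = cumulative_contribution_ratio(X)
    k = int(np.searchsorted(ratios, threshold)) + 1
    return min(k, len(ratios))


# Synthetic data (not KDD Cup 99): two high-variance axes, three noisy ones.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5)) * np.array([10.0, 5.0, 0.1, 0.1, 0.1])

ratios = cumulative_contribution_ratio(X)
print(ratios)
print(components_needed(X, 0.90))
```

If a handful of components already reaches a high ratio, most of the feature dimensions add little information, which is the kind of conclusion the slide draws from PCA.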

  10. Our Experimental Results
      There is no guarantee that their performance will also be good in a real environment

  11. Our Proposal
      • New benchmark data
        • IDS
        • Honeypots
      • Privacy problems
        • Sanitize IP address
        • Remove payload data
      • Goal
        • Comparison analysis of IDS alert and Honeypots traffic data
        • Detect the attacks that are missed by IDS
      [Figure labels: KDD Cup 99 form; Open; Update every month]
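The privacy steps and the comparison goal on this slide might look like the following sketch. The slides do not describe how sanitization is done; keyed hashing into a private address range is one common approach and is an assumption here, as are the record fields (`src_ip`, `dst_ip`, `payload`) and all function names.

```python
import hashlib
import hmac
import ipaddress


def sanitize_ip(ip, key):
    """Map an IPv4 address to a stable pseudonym inside 10.0.0.0/8.

    Assumption: HMAC-SHA256 with a secret key. Only 24 bits of the digest
    are kept, so two addresses can occasionally share a pseudonym.
    """
    packed = ipaddress.IPv4Address(ip).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    return str(ipaddress.IPv4Address(b"\x0a" + digest[:3]))


def sanitize_record(record, key):
    """Return a copy with pseudonymized addresses and no payload."""
    clean = {k: v for k, v in record.items() if k != "payload"}
    clean["src_ip"] = sanitize_ip(record["src_ip"], key)
    clean["dst_ip"] = sanitize_ip(record["dst_ip"], key)
    return clean


def missed_by_ids(honeypot_sources, ids_alert_sources):
    """Sources seen by the honeypot that never raised an IDS alert."""
    return set(honeypot_sources) - set(ids_alert_sources)


key = b"demo-key"  # hypothetical secret; keep the real one private
record = {"src_ip": "192.0.2.7", "dst_ip": "198.51.100.3",
          "dst_port": 445, "payload": b"\x90\x90\x90"}
clean = sanitize_record(record, key)
print(clean)
```

Because the mapping is keyed and deterministic, the same host keeps the same pseudonym across IDS alerts and honeypot logs, so the two sources can still be compared after sanitization.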

  12. Thank you for your attention!
