
KDD Cup ’99: Classifier Learning Predictive Model for Intrusion Detection






Presentation Transcript


  1. KDD Cup ’99: Classifier Learning Predictive Model for Intrusion Detection Charles Elkan 1999 Conference on Knowledge Discovery and Data Mining Presented by Chris Clifton

  2. KDD Cup Overview • Held annually in conjunction with the Knowledge Discovery and Data Mining Conference (now ACM-sponsored) • Challenge problem(s) released well before the conference • Goal is to give the best solution to the problem • Relatively informal “contest” • Gives a “standard” test for comparing techniques • Winner announced at the KDD conference • Lots of recognition for the winner

  3. Classifier Learning for Intrusion Detection • One of two KDD ’99 challenge problems • The other was a knowledge discovery problem • Goal is to learn a classifier that labels TCP/IP connections as intrusion or okay • Data: collection of features describing a TCP connection • Class: non-attack or type of attack • Scoring: cost per test sample • Wrong answers penalized based on the type of “wrong”

  4. Data: TCP “connection” information • Dataset developed for the 1998 DARPA Intrusion Detection Evaluation Program • Nine weeks of raw TCP dump data from a simulated USAF LAN • Simulated attacks to give positive examples • Processed into 5 million training “connections” and 2 million test connections • Some attributes derived from the raw data • Twenty-four attack types in the training data, grouped into four classes (a label-mapping sketch follows this slide): • DOS: denial of service, e.g., syn flood • R2L: unauthorized access from a remote machine, e.g., guessing a password • U2R: unauthorized access to local superuser (root) privileges, e.g., various “buffer overflow” attacks • probing: surveillance and other probing, e.g., port scanning • Test set includes fourteen attack types not found in the training set
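
The slide names only one or two example attacks per class; the sketch below shows how raw connection labels collapse into the four scoring categories plus normal. The attack names here are an illustrative subset of the 24 training types (the full list ships with the contest data), and the trailing-dot handling matches the label format of the distributed files.

```python
# Minimal sketch: collapsing raw attack labels into the four scoring
# categories plus "normal". The attack names are an illustrative subset
# of the 24 training types; the full mapping ships with the contest data.
CATEGORY_OF = {
    "normal":          "normal",
    "smurf":           "dos",    # denial of service
    "neptune":         "dos",    # syn flood
    "guess_passwd":    "r2l",    # password guessing
    "buffer_overflow": "u2r",    # buffer overflow exploit
    "portsweep":       "probe",  # port scanning
    "nmap":            "probe",
}

def category(raw_label: str) -> str:
    """Map a raw label (labels in the data files end with '.') to a category."""
    return CATEGORY_OF[raw_label.rstrip(".")]
```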

  5. Basic features of individual TCP connections

  6. Content features within a connection suggested by domain knowledge

  7. Traffic features computed using a two-second time window
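The feature tables from slides 5 through 7 do not survive in this transcript. As a stand-in, the grouping below follows the feature names published with the contest data (the kddcup.names file); the remaining ten dst_host_* features extend the two-second traffic statistics to a window over the last 100 connections.

```python
# Feature groups from the KDD Cup '99 task description (41 features total).
# Names follow the kddcup.names file distributed with the contest data.

# Basic features of individual TCP connections (slide 5)
BASIC = ["duration", "protocol_type", "service", "flag", "src_bytes",
         "dst_bytes", "land", "wrong_fragment", "urgent"]

# Content features within a connection, suggested by domain knowledge (slide 6)
CONTENT = ["hot", "num_failed_logins", "logged_in", "num_compromised",
           "root_shell", "su_attempted", "num_root", "num_file_creations",
           "num_shells", "num_access_files", "num_outbound_cmds",
           "is_host_login", "is_guest_login"]

# Traffic features over a two-second time window (slide 7);
# ten further dst_host_* features repeat these over 100 connections.
TRAFFIC = ["count", "srv_count", "serror_rate", "srv_serror_rate",
           "rerror_rate", "srv_rerror_rate", "same_srv_rate",
           "diff_srv_rate", "srv_diff_host_rate"]
```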

  8. Scoring • Each prediction is scored against a cost matrix (reconstructed in the sketch below): • Row is the correct answer • Column is the prediction made • Score is the average cost over all predictions
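
The cost matrix itself did not survive in the transcript. The sketch below uses the matrix published with the contest results (rows are the true category, columns the prediction, in the order normal, probe, DOS, U2R, R2L) and computes the contest score as the average cost over a confusion matrix of counts.

```python
import numpy as np

# KDD'99 cost matrix as published with the contest results:
# rows = true category, columns = prediction,
# in the order normal, probe, dos, u2r, r2l.
COST = np.array([[0, 1, 2, 2, 2],
                 [1, 0, 2, 2, 2],
                 [2, 1, 0, 2, 2],
                 [3, 2, 2, 0, 2],
                 [4, 2, 2, 2, 0]])

def average_cost(confusion: np.ndarray) -> float:
    """Contest score: total misclassification cost / number of test examples.

    confusion[i, j] counts test examples of true class i predicted as j.
    """
    return (confusion * COST).sum() / confusion.sum()
```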

  9. Results • Twenty-four entries, with scores: 0.2331 0.2356 0.2367 0.2411 0.2414 0.2443 0.2474 0.2479 0.2523 0.2530 0.2531 0.2545 0.2552 0.2575 0.2588 0.2644 0.2684 0.2952 0.3344 0.3767 0.3854 0.3899 0.5053 0.9414 • 1-Nearest Neighbor scored 0.2523

  10. Winning Method: Bagged Boosting • Submitted by Bernhard Pfahringer, ML Group, Austrian Research Institute for AI • 50 samples drawn from the original set of roughly 5 million examples • Unlike standard bagging, the sampling was slightly biased: • all of the examples of the two smallest classes, U2R and R2L • 4000 PROBE, 80000 NORMAL, and 400000 DOS examples • duplicate entries in the original data set removed • Ten C5 decision trees induced from each sample • used both C5's error-cost and boosting options • Final predictions computed from the 50 per-sample predictions by minimizing “conditional risk” • minimizes the sum of error costs times class probabilities (see the sketch below) • Training took approximately one day on a two-processor 200 MHz Sparc
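
A minimal sketch of the final combination step: given class probabilities averaged over the 50 per-sample ensembles, the prediction is the column that minimizes conditional risk under the contest cost matrix. The function name and shape conventions are assumptions; only the argmin-of-expected-cost rule comes from the slide.

```python
import numpy as np

def min_risk_prediction(probs: np.ndarray, cost: np.ndarray) -> int:
    """Pick the prediction j minimizing sum_i cost[i, j] * probs[i].

    probs: class probabilities, e.g., averaged over the 50 ensembles.
    cost:  cost matrix with rows = true class, columns = prediction
           (e.g., the COST array from the scoring sketch above).
    """
    risk = probs @ cost  # risk[j] = expected cost of predicting class j
    return int(np.argmin(risk))
```

With a zero-one cost matrix this reduces to predicting the most probable class; the contest matrix shifts decisions away from costly mistakes, such as labeling U2R or R2L traffic as normal.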

  11. Confusion Matrix (Breakdown of score)

  12. Analysis of winning entry • Result comparable to 1-NN except on the “rare” classes • The winner's training sample was biased toward the rare classes • Does this give us a general principle? • Misses badly on some attack categories • True for 1-NN as well • Problem with the feature set?

  13. Second and Third Places (Probably not statistically significant) • Itzhak Levin, LLSoft, Inc.: Kernel Miner • Link broken? • Vladimir Miheev, Alexei Vopilov, and Ivan Shabalin, MP13, Moscow, Russia • Verbal rules constructed by an expert • A first echelon of voting decision trees • A second echelon of voting decision trees • Stages are applied sequentially • Control branches to the next stage whenever the current one fails to recognize the connection (see the cascade sketch below) • Trees constructed using their own (previously developed) tree-learning algorithm
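
A sketch of the third-place cascade structure: stages (the expert rules, then the two echelons of voting trees) are tried in order, and control falls through only when the current stage fails to recognize the connection. The Stage signature and the fallback default are assumptions for illustration, not the authors' implementation.

```python
from typing import Callable, Optional, Sequence

# A stage returns a category, or None when it fails to recognize the connection.
Stage = Callable[[dict], Optional[str]]

def cascade(stages: Sequence[Stage], connection: dict) -> str:
    """Apply the echelons sequentially, branching to the next on failure."""
    for stage in stages:
        decision = stage(connection)
        if decision is not None:
            return decision
    return "normal"  # assumed fallback when every stage abstains
```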
