
Applied Anomaly Based IDS

Applied Anomaly Based IDS. Craig Buchanan, University of Illinois at Urbana-Champaign, CS 598 MCC, 4/30/13. Outline: K-Nearest Neighbor, Neural Networks, Support Vector Machines, Lightweight Network Intrusion Detection (LNID).


Presentation Transcript


  1. Applied Anomaly Based IDS Craig Buchanan University of Illinois at Urbana-Champaign CS 598 MCC 4/30/13

  2. Outline • K-Nearest Neighbor • Neural Networks • Support Vector Machines • Lightweight Network Intrusion Detection (LNID)

  3. K-Nearest Neighbor • “Use of K-Nearest Neighbor classifier for intrusion detection” [Liao, Computers and Security]

  4. K-nearest neighbor on text • Categorize training documents into vector space model, A • Word-by-document matrix A • Rows = words • Columns = documents • Represents weight of each word in set of documents • Build vector for test document, X • Classify X into A using K-nearest neighbor

  5. Text categorization • Create vector space model A • a_ij – weight of word i in document j • Useful variables • N – number of documents in the collection • M – number of distinct words in the collection • f_ij – frequency of word i in document j • n_i – total number of times word i occurs in the collection

  6. Text categorization • Frequency weighting • Term frequency – inverse document frequency (tf*idf)
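
The two weightings named on this slide can be sketched as follows. This is a minimal sketch, not the paper's code: it assumes frequency weighting means a_ij = f_ij, and it assumes the common length-normalized tf*idf form a_ij = f_ij / sqrt(sum_l f_lj^2) * log(N / n_i), where f_ij is the frequency of word i in document j, N is the number of documents, and n_i is the total count of word i in the collection. The function names are illustrative.

```python
import math


def frequency_weights(counts):
    """Frequency weighting (assumed form): a_ij = f_ij.

    counts[i][j] is f_ij, the frequency of word i in document j.
    """
    return [row[:] for row in counts]


def tfidf_weights(counts):
    """tf*idf weighting (assumed form, see lead-in).

    a_ij = f_ij / sqrt(sum_l f_lj^2) * log(N / n_i)
    """
    num_words = len(counts)                   # M distinct words
    num_docs = len(counts[0])                 # N documents
    totals = [sum(row) for row in counts]     # n_i for each word
    # Euclidean length of each document's raw frequency column.
    norms = [
        math.sqrt(sum(counts[i][j] ** 2 for i in range(num_words)))
        for j in range(num_docs)
    ]
    weights = [[0.0] * num_docs for _ in range(num_words)]
    for i in range(num_words):
        if totals[i] == 0:
            continue  # word never occurs, so its weight stays 0
        idf = math.log(num_docs / totals[i])
        for j in range(num_docs):
            if norms[j] > 0:
                weights[i][j] = counts[i][j] / norms[j] * idf
    return weights
```

With n_i defined as a total occurrence count, log(N / n_i) is zero or negative for words that occur at least N times, so very common words are pushed toward zero or below.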

  7. Text categorization • System call = “word” • Program execution = “document” • close, execve, open, mmap, open, mmap, munmap, mmap, mmap, close, …, exit
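
To make the mapping concrete, the sketch below counts system calls per program execution and arranges the counts as a word-by-document matrix (rows = system calls, columns = executions). The traces are made-up toy data, not the elided trace on the slide, and `word_by_document_matrix` is an illustrative name.

```python
from collections import Counter


def word_by_document_matrix(traces):
    """Build a frequency matrix from system call traces.

    Each trace (one program execution) is a list of system call names.
    Returns (vocabulary, matrix) where matrix[i][j] is the number of
    times vocabulary[i] appears in traces[j].
    """
    vocabulary = sorted({call for trace in traces for call in trace})
    per_trace = [Counter(trace) for trace in traces]
    matrix = [[counts[call] for counts in per_trace] for call in vocabulary]
    return vocabulary, matrix


# Toy traces for illustration only.
traces = [
    ["execve", "open", "mmap", "mmap", "close", "exit"],
    ["execve", "open", "close", "exit"],
]
vocabulary, matrix = word_by_document_matrix(traces)
```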

  8. Document Classification • Distance measured by Euclidean distance • X – test document • Dj – jth training document • t_i – word shared by X and Dj • x_i – weight of word t_i in X • d_ij – weight of word t_i in Dj
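
One way to compare a test document with a training document is sketched below. It is an assumption that the measure is cosine similarity, that is, the sum of weight products over shared words divided by the Euclidean norms of both documents. That reading fits the similarity threshold used on the next slide, but it is not confirmed by the transcript. Documents are represented as dictionaries from word to weight.

```python
import math


def cosine_similarity(x, d):
    """Similarity between test document x and training document d.

    Both arguments map a word (system call) to its weight. Only words
    present in both documents contribute to the numerator.
    """
    shared = set(x) & set(d)
    dot = sum(x[word] * d[word] for word in shared)
    norm_x = math.sqrt(sum(value * value for value in x.values()))
    norm_d = math.sqrt(sum(value * value for value in d.values()))
    if norm_x == 0 or norm_d == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_x * norm_d)
```

The result is 1 (up to rounding) for documents with proportional weights and 0 for documents that share no words.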

  9. Anomaly detection • If X contains an unknown system call, then abnormal • If X is the same as any Dj, then normal • K-nearest neighbor • Calculate sim_avg for the k nearest neighbors • If sim_avg > threshold, then normal • Else abnormal
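
The decision procedure above can be sketched as a single function. The similarity measure is assumed to be cosine similarity, "the same as" is implemented as exact equality of weight vectors, and the default values of `k` and `threshold` are placeholders rather than values from the paper.

```python
import math


def cosine_similarity(x, d):
    """Cosine similarity between two word -> weight dictionaries."""
    dot = sum(x[word] * d[word] for word in set(x) & set(d))
    norm_x = math.sqrt(sum(v * v for v in x.values()))
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    return dot / (norm_x * norm_d) if norm_x and norm_d else 0.0


def classify(x, training_docs, k=3, threshold=0.8):
    """Label test document x as "normal" or "abnormal"."""
    known_calls = set().union(*training_docs)
    # Rule 1: any system call never seen in training is abnormal.
    if any(call not in known_calls for call in x):
        return "abnormal"
    # Rule 2: an exact match with a training document is normal.
    if any(x == d for d in training_docs):
        return "normal"
    # Rule 3: average similarity to the k nearest neighbors.
    nearest = sorted(
        (cosine_similarity(x, d) for d in training_docs), reverse=True
    )[:k]
    if not nearest:
        return "abnormal"  # nothing to compare against
    sim_avg = sum(nearest) / len(nearest)
    return "normal" if sim_avg > threshold else "abnormal"
```

Raising `threshold` flags more executions as abnormal, which trades a higher detection rate for more false alarms.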

  10. Results

  11. Results

  12. Neural Networks • “Intrusion Detection with Neural Networks” [Ryan, AAAI Technical Report 1997] • Learn user profiles (“prints”) to detect intrusion

  13. NNID System • Collect training data • Audit logs from each user • Train the neural network • Obtain new command distribution vector • Compare to training data • Anomaly if: • Associated with a different user • Not clearly associated with any user
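
The two anomaly conditions can be read as a check on the network's output activations, one per known user. The sketch below assumes that "not clearly associated with any user" means the strongest activation falls below a cutoff. The cutoff value `min_activation=0.5` is a placeholder, and `is_anomalous` is an illustrative name.

```python
def is_anomalous(outputs, actual_user, min_activation=0.5):
    """Decide whether a command distribution vector looks anomalous.

    outputs: one activation per known user, from the trained network.
    actual_user: index of the user the audit log claims to belong to.
    """
    predicted_user = max(range(len(outputs)), key=outputs.__getitem__)
    # Not clearly associated with any user.
    if outputs[predicted_user] < min_activation:
        return True
    # Associated with a different user.
    return predicted_user != actual_user
```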

  14. Collect training data • Type of data • as, awk, bc, bibtex, calendar, cat, chmod, comsat, cp, cpp, cut, cvs, date, df, diff, du, dvips, egrep, elm, emacs, …, w, wc, whereis, xbiff++, xcalc, xdvi, xhost, xterm • Type of platform • Audit trail logging • Small number of users • Not a large target

  15. Train Neural Network • Map frequency of command to nonlinear scale • 0.0 to 1.0 in 0.1 increments • 0.0 – never used • 0.1 – used once or twice • 1.0 – used > 500x • Concatenate values to 100-dimensional command distribution vector
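
A sketch of the frequency mapping follows. The slide gives only three points on the scale (never, once or twice, more than 500 times), so the boundaries of the levels between 0.2 and 0.9 are an assumption here: they are spaced logarithmically between 2 and 500. The function names are illustrative.

```python
import math


def scale_frequency(count, low=2, high=500):
    """Map a raw command count to a level in {0.0, 0.1, ..., 1.0}."""
    if count <= 0:
        return 0.0   # never used
    if count <= low:
        return 0.1   # used once or twice
    if count > high:
        return 1.0   # used more than 500 times
    # Assumed: eight log-spaced levels (0.2 to 0.9) between low and high.
    position = math.log(count / low) / math.log(high / low)
    level = 2 + min(int(position * 8), 7)
    return level / 10


def command_vector(counts, commands):
    """Concatenate scaled frequencies in a fixed command order.

    With the 100 tracked commands this yields the 100-dimensional
    command distribution vector.
    """
    return [scale_frequency(counts.get(command, 0)) for command in commands]
```

For example, `command_vector({"cat": 1, "emacs": 600}, ["cat", "cp", "emacs"])` gives `[0.1, 0.0, 1.0]`.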

  16. Neural Network • 3-layer backpropagation architecture Input (x100) Hidden (x30) Output (x10)
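
The layer sizes can be illustrated with a forward pass through a 100-30-10 network. This sketch shows only the shape of the computation: the sigmoid activation, the weight range, and the random initialization are assumptions, and the backpropagation training step is omitted.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """One weight row per output unit; the last entry of each row is the bias."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward_layer(weights, inputs):
    # zip stops after the n_in input weights, so the bias is added separately.
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
        for row in weights
    ]


def forward(network, inputs):
    activations = inputs
    for layer in network:
        activations = forward_layer(layer, activations)
    return activations


rng = random.Random(0)
# Input (100 commands) -> hidden (30 units) -> output (10 users).
network = [init_layer(100, 30, rng), init_layer(30, 10, rng)]
outputs = forward(network, [0.0] * 100)
```

Each of the 10 outputs corresponds to one user, so the index of the largest activation is the network's guess at who produced the command distribution vector.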

  17. Results

  18. Results • Rejected 63% of random user vectors • Anomaly detection rate 96% • Correctly identified user 93% • False alarm rate 7%

  19. Support Vector Machines • “Intrusion Detection Using Neural Networks and Support Vector Machines” [Mukkamala, IEEE 2002]

  20. SVM IDS • Preprocess randomly selected raw TCP/IP traffic • Train SVM • 41 input features • 1 – normal • -1 – attack • Classify new traffic as normal or anomaly
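
The +1 / -1 labeling can be illustrated with a tiny linear SVM. This is a stand-in, not the classifier from the paper: it trains with plain sub-gradient descent on the hinge loss, uses two made-up features instead of the 41 input features, and the learning rate, regularization strength, and epoch count are arbitrary.

```python
def train_linear_svm(samples, labels, lr=0.1, lam=0.01, epochs=50):
    """Sub-gradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample is inside the margin: move the boundary toward it.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Sample is fine: apply only the regularization shrinkage.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return 1 (normal) or -1 (attack)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy, linearly separable data: 1 = normal, -1 = attack.
samples = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -1), (-2, -3)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(samples, labels)
```

New traffic is then classified by the sign of the decision function, for example `predict(w, b, (4, 4))`.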

  21. SVM IDS Features

  22. Results

  23. Recent Anomaly-based IDS • “An efficient network intrusion detection” [Chen, Computer Communications 2010] • Lightweight Network Intrusion Detection (LNID) system

  24. LNID Approach • Detect R2L and U2R • Assume attack is in first few packets • Calculate anomaly score of packets

  25. LNID System Architecture

  26. Anomaly Score • Based on Mahoney’s network IDS [21-24] • M.V. Mahoney, P.K. Chan, PHAD: packet header anomaly detection for identifying hostile network traffic, Florida Institute of Technology Technical Report CS-2001-04, 2001. • M.V. Mahoney, P.K. Chan, Learning nonstationary models of normal network traffic for detecting novel attacks, in: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002a, pp. 376-385. • M.V. Mahoney, P.K. Chan, Learning models of network traffic for detecting novel attacks, Florida Institute of Technology Technical Report CS-2002-08, 2002b. • M.V. Mahoney, Network traffic anomaly detection based on packet bytes, in: Proceedings of the 2003 ACM Symposium on Applied Computing, 2003, pp. 346-350.

  27. Anomaly Score (Mahoney) • t = time elapsed since last time attribute was anomalous • n = number of training or observed instances • r = number of novel values of attribute
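
A sketch of how these three quantities combine, assuming the PHAD form of the score: each attribute whose value was never seen in training contributes t * n / r, and a packet's score is the sum of those contributions. Here t is the time since the attribute was last anomalous, n is the number of training or observed instances, and r is the number of novel values of the attribute. The function name and the sample numbers are illustrative.

```python
def phad_score(anomalous_attributes):
    """Sum t * n / r over the attributes that look anomalous.

    anomalous_attributes: iterable of (t, n, r) tuples, one per
    attribute whose current value was not seen during training.
    """
    return sum(t * n / r for t, n, r in anomalous_attributes if r > 0)


# Two anomalous attributes: one rarely novel (r = 5), one often novel (r = 50).
score = phad_score([(10, 1000, 5), (2, 100, 50)])
```

An attribute that has rarely produced new values (small r over many instances n) and has not been anomalous for a long time (large t) contributes the most to the score.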

  28. Anomaly Score (revised) • n = number of training or observed instances • r = number of novel values of attribute

  29. Anomaly Scoring Comparison

  30. Attributes • Attribute = packet byte • 256 possible values • 48 attributes (packet bytes) • 20 bytes of IP header • 20 bytes of TCP header • 8 bytes of payload
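
Extracting the 48 attributes can be sketched as taking the first 48 bytes of a packet. The sketch assumes the byte string starts at the IP header and that neither header carries options, so 20 + 20 + 8 bytes line up with the IP header, TCP header, and first payload bytes. Zero-padding short packets is also an assumption, and `packet_attributes` is an illustrative name.

```python
def packet_attributes(packet, count=48):
    """Return the first `count` bytes of a packet as integers (0-255).

    packet: raw bytes starting at the IP header.
    Packets shorter than `count` bytes are padded with zeros.
    """
    head = packet[:count]
    return list(head) + [0] * (count - len(head))
```

Each returned value is one attribute with 256 possible values, matching the per-byte model on the slide.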

  31. Results • Detection rate • Workload • LNID – 0.3% of traffic • NETAD – 3.16% of traffic • Lee et al. – 100% of traffic

  32. Results • Hard detected attacks

  33. Questions or Comments
