
Statistical based IDS background introduction


garron




Presentation Transcript


  1. Statistical based IDS background introduction

  2. Statistical IDS background • Why do we do this project • Attack introduction • IDS architecture • Data description • Feature extraction • Statistical method introduction • Result analysis

  3. Project goals • Related work • The Internet faces various network attacks, including denial-of-service (DoS) attacks and port scans • Overall traffic detection • Flow-level detection • Our goals • Detect both attacks at the same time • Differentiate DoS attacks from port scans

  4. Attack introduction • TCP SYN flooding - An important form of DoS attack - Exploits TCP's three-way handshake mechanism and its limited capacity for maintaining half-open connections - Feature: spoofed source IP address - Recent reflected SYN/ACK flooding attacks

  5. Attack introduction • Port scan - Horizontal scan - Vertical scan - Block scan - Feature: real source IP address

  6. Statistical IDS architecture • Learning part • Detection part

  7. Data description • DARPA98 data • The first standard corpus for evaluation of network intrusion detection systems • From the Information Systems Technology Group (IST) of MIT Lincoln Laboratory • Under Defense Advanced Research Projects Agency (DARPA ITO) and Air Force Research Laboratory (AFRL/SNHS) sponsorship • Seven weeks of training data • Two weeks of detection data

  8. Data description • DARPA98 data format 897048008.080700 172.16.114.169.1024 > 195.73.151.50.25: S ACK 1055330111:1055330111(0) win 512 <mss 1460> - Time stamp: 897048008.080700 - Source IP address + port: 172.16.114.169.1024 - Destination IP address + port: 195.73.151.50.25 - TCP flag: S (other possible values: R, F, P) - ACK flag: ACK - Rest of the packet header: 1055330111:1055330111(0) win 512 <mss 1460>
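The field breakdown on this slide can be sketched as a small parser. This is only an illustration, not the project's own code: the `Packet` type, the `parse_line` function and the regular expression are hypothetical names and choices, written to match the single example line shown above.

```python
import re
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    """Fields named on the slide (the type itself is a hypothetical helper)."""
    timestamp: float
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    flag: str   # TCP flag field, e.g. S, R, F, P
    ack: bool   # True when the ACK field is present
    rest: str   # remaining part of the packet header


# Assumed layout: "<time> <ip>.<port> > <ip>.<port>: <flag> [ACK] <rest>"
LINE_RE = re.compile(
    r"^(?P<ts>\d+\.\d+)\s+"
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+)\s+>\s+"
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+):\s+"
    r"(?P<flag>[SRFP.]+)\s+"
    r"(?P<ack>ACK\s+)?"
    r"(?P<rest>.*)$"
)


def parse_line(line: str) -> Optional[Packet]:
    """Return a Packet for a line in the assumed layout, or None otherwise."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return Packet(
        timestamp=float(m["ts"]),
        src_ip=m["src"],
        src_port=int(m["sport"]),
        dst_ip=m["dst"],
        dst_port=int(m["dport"]),
        flag=m["flag"],
        ack=m["ack"] is not None,
        rest=m["rest"],
    )
```

Lines that do not fit the assumed layout return `None`, so a caller can skip them rather than guess at their fields.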

  9. Feature extraction • Calculate the metrics over every 5 minutes of traffic • Metrics, for example: - SYN-SYN_ACK pair - SYN-FIN + SYN-RST_active pair - Traffic volume - SYN packet volume - ……
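A minimal sketch of per-window feature extraction follows. The slides do not define the metrics exactly, so the definitions here are assumptions: traffic volume is taken as a packet count, and the SYN-SYN_ACK metric as the difference between SYN and SYN/ACK counts in a window. The function name `window_metrics` and the `(timestamp, flag, ack)` input tuples are hypothetical.

```python
from collections import defaultdict

WINDOW_SECONDS = 5 * 60  # the 5-minute interval named on the slide


def window_metrics(packets):
    """Aggregate simple counts per 5-minute window.

    packets: iterable of (timestamp, flag, ack) tuples (assumed input shape).
    Returns {window_index: {"packets", "syn", "syn_ack", "syn_minus_syn_ack"}}.
    """
    metrics = defaultdict(lambda: {"packets": 0, "syn": 0, "syn_ack": 0})
    for ts, flag, ack in packets:
        window = int(ts // WINDOW_SECONDS)
        m = metrics[window]
        m["packets"] += 1            # assumed definition of traffic volume
        if flag == "S":
            if ack:
                m["syn_ack"] += 1    # SYN/ACK reply
            else:
                m["syn"] += 1        # SYN packet volume
    for m in metrics.values():
        # Assumed reading of the "SYN-SYN_ACK pair" metric: unanswered SYNs.
        m["syn_minus_syn_ack"] = m["syn"] - m["syn_ack"]
    return dict(metrics)
```

Under this reading, a large positive `syn_minus_syn_ack` in a window means many SYNs went unanswered, which is the kind of imbalance a SYN flood would produce.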

  10. Statistical method • Statistical-based IDS goals: use statistical metrics and algorithms to distinguish anomalous traffic from benign traffic, and to distinguish between different types of attacks. - Advantage: can detect unknown attacks - Disadvantage: false positives and false negatives

  11. Hidden Markov Model (HMM) • The HMM is a very useful statistical learning model. It has been applied successfully in speech recognition. - Advantages 1. Analyzes sequence data (represented with observation probabilities and transition probabilities) 2. Supports both unsupervised and supervised training 3. High accuracy - Disadvantage: comparatively long training time
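To illustrate how observation and transition probabilities combine over a sequence, here is a minimal sketch of the standard scaled forward algorithm for a discrete HMM. It is not taken from the project: the function name is hypothetical, and the two-state "normal"/"attack" parameters in the usage lines are invented purely for illustration.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of an observation sequence under a discrete HMM.

    obs:   list of observation symbol indices
    start: start[i]    = P(first state is i)
    trans: trans[i][j] = P(next state is j | current state is i)
    emit:  emit[i][k]  = P(observing symbol k | state is i)
    """
    if not obs:
        return 0.0
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


# Hypothetical toy model: state 0 = normal, state 1 = attack;
# symbol 0 = low SYN imbalance, symbol 1 = high SYN imbalance.
start = [0.9, 0.1]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.8, 0.2], [0.3, 0.7]]
score = forward_log_likelihood([0, 0, 1, 1], start, trans, emit)
```

One common way to use such a score in detection is to compare the likelihood of a traffic sequence under models trained on different classes; whether the project did this is not stated in the slides.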

  12. Double Gaussian model • Introduction - Two Gaussian distribution models are used to represent two classes of behavior - Compute the probability of the current behavior under each class's Gaussian parameters - Compare them: the current behavior is assigned to the class with the larger probability • Training period - Estimate the Gaussian parameters of the two classes • Detection period - Use the two sets of Gaussian parameters to compute the probabilities and compare them
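The training and detection periods described above can be sketched for a single metric as follows. This is a simplified illustration under stated assumptions: one univariate Gaussian per class, maximum-likelihood parameter estimates, and hypothetical class labels and function names; the slides do not say which features or classes the project used.

```python
import math
from statistics import mean, pvariance


def fit_gaussian(values):
    """Estimate (mean, variance) of one class from its training values."""
    mu = mean(values)
    var = pvariance(values, mu)
    return mu, max(var, 1e-9)  # floor the variance to avoid division by zero


def log_pdf(x, mu, var):
    """Log of the Gaussian density N(x; mu, var)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def train(class_a_values, class_b_values):
    """Training period: get the Gaussian parameters of the two classes."""
    return {
        "class_a": fit_gaussian(class_a_values),
        "class_b": fit_gaussian(class_b_values),
    }


def classify(x, params):
    """Detection period: pick the class with the larger (log-)density."""
    return max(params, key=lambda label: log_pdf(x, *params[label]))


# Hypothetical metric values for the two classes, for illustration only.
params = train([1, 2, 3, 2, 1], [20, 22, 25, 21, 24])
label = classify(23, params)  # "class_b"
```

Comparing log-densities instead of raw densities gives the same decision and is numerically safer when a value lies far from both class means.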

  13. Double Gaussian model • Advantages • Simple, easy to understand • Fast • Disadvantage • Does not capture sequence characteristics

  14. Result analysis • Evaluation - Key quantitative measures: false positives and false negatives - Examining the metric values to find the reasons for errors - Repeating experiments
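The false-positive and false-negative measures above can be computed per time window as in the sketch below. The function name and the boolean input lists are hypothetical; the slides do not specify how the project counted its errors.

```python
def error_rates(truth, predicted):
    """Return (false_positive_rate, false_negative_rate).

    truth, predicted: equal-length lists of booleans, True = attack window.
    """
    pairs = list(zip(truth, predicted))
    false_pos = sum(1 for t, p in pairs if not t and p)
    false_neg = sum(1 for t, p in pairs if t and not p)
    negatives = sum(1 for t, _ in pairs if not t)
    positives = sum(1 for t, _ in pairs if t)
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr
```

Reporting both rates together matters because lowering a detection threshold usually reduces one at the cost of the other.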
