Statistical based IDS background introduction

Statistical IDS background • Why do we do this project • Attack introduction • IDS architecture • Data description • Feature extraction • Statistical method introduction • Result analysis

Project goals • Related work • The Internet faces a variety of network attacks, including denial-of-service (DoS) attacks and port scans • Overall traffic detection • Flow-level detection • Our goals • Detect both attacks at the same time • Differentiate DoS attacks from port scans

Attack introduction • TCP SYN flooding - An important form of DoS attack - Exploits TCP's three-way handshake mechanism and its limited capacity for maintaining half-open connections - Feature: spoofed source IP addresses - Recent variant: reflected SYN/ACK flooding attacks

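Attack introduction • Port scan - Horizontal scan (one port probed across many hosts) - Vertical scan (many ports probed on one host) - Block scan (many ports probed across many hosts) - Feature: real source IP address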
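
The three scan shapes can be told apart by how widely a single source spreads its probes over destination hosts and ports. The sketch below is illustrative only: the function name `scan_type` and the threshold `min_spread` are assumptions, not part of the presented system.

```python
from typing import Iterable, Set, Tuple


def scan_type(targets: Iterable[Tuple[str, int]], min_spread: int = 5) -> str:
    """Label the probes sent by one source IP.

    targets: (destination IP, destination port) pairs seen from that source.
    min_spread: hypothetical threshold for "many" hosts or ports.
    """
    hosts: Set[str] = set()
    ports: Set[int] = set()
    for ip, port in targets:
        hosts.add(ip)
        ports.add(port)

    many_hosts = len(hosts) >= min_spread
    many_ports = len(ports) >= min_spread

    if many_hosts and many_ports:
        return "block"       # many ports across many hosts
    if many_hosts:
        return "horizontal"  # same port(s) across many hosts
    if many_ports:
        return "vertical"    # many ports on few hosts
    return "none"
```

A real detector would also need to bound the observation window and tolerate legitimate clients that contact many hosts; those choices are left open here.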

Statistical IDS architecture • Learning part • Detection part

Data description • DARPA98 data • The first standard corpus for evaluating network intrusion detection systems • Produced by the Information Systems Technology Group (IST) of MIT Lincoln Laboratory • Sponsored by the Defense Advanced Research Projects Agency (DARPA ITO) and the Air Force Research Laboratory (AFRL/SNHS) • Seven weeks of training data • Two weeks of detection data

Data description • DARPA98 data format • Example record: 897048008.080700 172.16.114.169.1024 > 195.73.151.50.25: S ACK 1055330111:1055330111(0) win 512 <mss 1460> - Timestamp: 897048008.080700 - Source IP address + port: 172.16.114.169.1024 - Destination IP address + port: 195.73.151.50.25 - TCP flag: S (other possible values: R, F, P) - ACK flag: ACK - Rest of the packet header: 1055330111:1055330111(0) win 512 <mss 1460>
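
As a minimal sketch of how a record in this format might be split into the fields listed above, the following parser uses a regular expression. The `Packet` type and `parse_record` function are hypothetical names, and the pattern covers only the layout of the example record, not every line a packet dump can contain.

```python
import re
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    timestamp: float
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    flag: str   # one of S, R, F, P
    ack: bool   # True when the ACK marker is present


# Assumed layout: "<ts> <ip>.<port> > <ip>.<port>: <flag> [ACK] <rest>"
RECORD = re.compile(
    r"^(?P<ts>\d+\.\d+)\s+"
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+)\s+>\s+"
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+):\s+"
    r"(?P<flag>[SRFP])\s+(?P<ack>ACK\s+)?"
)


def parse_record(line: str) -> Optional[Packet]:
    """Return the parsed fields, or None if the line does not match."""
    m = RECORD.match(line)
    if m is None:
        return None
    return Packet(
        timestamp=float(m["ts"]),
        src_ip=m["src"],
        src_port=int(m["sport"]),
        dst_ip=m["dst"],
        dst_port=int(m["dport"]),
        flag=m["flag"],
        ack=m["ack"] is not None,
    )
```

Lines that do not fit the assumed layout return `None`, so the caller can count or log them instead of failing.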

Feature extraction • Calculate the metrics for every 5-minute interval of traffic • Example metrics - SYN-SYN_ACK pairs - SYN-FIN + SYN-RSTactive pairs - Traffic volume - SYN packet volume - ……
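
The per-interval computation can be sketched as a bucketing step over timestamped packets. This is an assumption-laden illustration: `window_metrics` and the counter names are hypothetical, and it only counts packets per flag type rather than matching SYN packets to their SYN/ACK, FIN or RST partners as the pair metrics would require.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

WINDOW_SECONDS = 300  # 5-minute interval, as stated on the slide


def window_metrics(
    packets: Iterable[Tuple[float, str, bool]]
) -> Dict[int, Counter]:
    """Group (timestamp, flag, ack) packets into 5-minute windows.

    Returns a mapping from window index to simple per-window counts.
    """
    windows: Dict[int, Counter] = defaultdict(Counter)
    for ts, flag, ack in packets:
        w = windows[int(ts // WINDOW_SECONDS)]
        w["volume"] += 1                      # total traffic volume
        if flag == "S":
            w["syn_ack" if ack else "syn"] += 1
        elif flag == "F":
            w["fin"] += 1
        elif flag == "R":
            w["rst"] += 1
    return dict(windows)
```

A large gap between `syn` and `syn_ack` in one window is the kind of signal the SYN-SYN_ACK metric is meant to expose; how the gap is normalized is not specified in the slides.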

Statistical method • Statistical-based IDS - Goal: use statistical metrics and algorithms to distinguish anomalous traffic from benign traffic, and to distinguish between different types of attacks - Advantage: can detect unknown attacks - Disadvantage: false positives and false negatives

Hidden Markov Model (HMM) • The HMM is a widely used statistical learning model that has been applied successfully to speech recognition. - Advantages 1. Analyzes sequence data (represented by observation probabilities and transition probabilities) 2. Supports both unsupervised and supervised training 3. High accuracy - Disadvantage: comparatively long training time
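
To show how observation and transition probabilities combine when scoring a sequence, here is a minimal sketch of the standard forward algorithm. The function name and the example numbers in the usage note are illustrative assumptions; the slides do not specify the states, the observation symbols, or the training procedure used in the project.

```python
from typing import Sequence


def forward_likelihood(
    start: Sequence[float],
    trans: Sequence[Sequence[float]],
    emit: Sequence[Sequence[float]],
    observations: Sequence[int],
) -> float:
    """Probability of an observation sequence under a discrete HMM.

    start[s]    : probability of starting in state s
    trans[p][s] : transition probability from state p to state s
    emit[s][o]  : probability that state s emits observation symbol o
    """
    if not observations:
        return 1.0  # the empty sequence is certain
    n_states = len(start)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [start[s] * emit[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)
```

For example, with two hypothetical states, `start = [0.5, 0.5]`, `trans = [[0.9, 0.1], [0.2, 0.8]]` and `emit = [[0.8, 0.2], [0.3, 0.7]]`, the call `forward_likelihood(start, trans, emit, [0, 1])` scores the sequence "symbol 0, then symbol 1". For long sequences a practical implementation would work in log space or rescale `alpha` to avoid underflow.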

Double Gaussian model • Introduction - Two Gaussian distribution models represent two classes of behavior - Compute the probability of the current behavior under each class's Gaussian parameters - Compare the two probabilities; the current behavior is assigned to the class with the larger probability • Training period - Estimate the Gaussian parameters for each of the two classes • Detection period - Use the two sets of Gaussian parameters to compute the probabilities and compare them
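
The training and detection periods can be sketched for a single metric as follows. This is a simplified illustration under stated assumptions: one-dimensional Gaussians, equal class priors, densities compared in place of probabilities, and hypothetical names (`fit_gaussian`, `log_density`, `classify`). The project's actual features and parameters are not given in the slides.

```python
import math
from statistics import mean, pvariance
from typing import Sequence, Tuple

Params = Tuple[float, float]  # (mean, variance)


def fit_gaussian(samples: Sequence[float]) -> Params:
    """Training period: estimate one class's mean and variance."""
    mu = mean(samples)
    var = max(pvariance(samples, mu), 1e-9)  # floor avoids division by zero
    return mu, var


def log_density(x: float, params: Params) -> float:
    """Log of the Gaussian density at x."""
    mu, var = params
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def classify(x: float, benign: Params, attack: Params) -> str:
    """Detection period: pick the class with the larger density."""
    if log_density(x, attack) > log_density(x, benign):
        return "attack"
    return "benign"
```

Usage would be to call `fit_gaussian` once per class on labeled training values of a metric, then call `classify` on each new interval's value. Comparing log densities gives the same decision as comparing the densities themselves and is numerically safer for values far from either mean.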

Double Gaussian model • Advantages • Simple, easy to understand • Fast • Disadvantage • Does not capture sequence characteristics

Result analysis • Evaluation - Key quantitative measures: false positives and false negatives - Examine the metric values to find the reasons behind errors - Repeat the experiments
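
The two error measures can be computed from labeled intervals as in the sketch below. The function name `error_rates` and the convention that `True` means "attack" are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def error_rates(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (false positive rate, false negative rate).

    truth[i] and predicted[i] are True when interval i is, or is flagged
    as, an attack. A rate is reported as 0.0 when its denominator is empty.
    """
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)

    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # benign flagged as attack
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # attacks that were missed
    return fpr, fnr
```

Reporting both rates together matters because lowering a detection threshold typically trades one for the other.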
