Bayesian Classifiers and Software Sensors for Intrusion Detection Systems
By: Kaushal Mittal
Guide: Prof. Sunita Sarawagi
Bayesian Classifiers
• Classification
  • Supervised learning
  • Classes known
  • Number of classes known
• Statistical classifiers
  • Based on Bayes' theorem
  • Calculate the probability of a sample belonging to a class
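For reference, Bayes' theorem for a sample and a class can be written as below. The symbols x (the sample) and c (the class) are notation chosen here for illustration and do not appear on the slides.

```latex
P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)}
```

Since P(x) is the same for every class, comparing classes only requires the numerator.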
Naive Bayesian classifier
• Assumes attribute values to be conditionally independent given the target class.
• Each training sample X is a vector of n attributes (a_1, ..., a_n).
• The set of classes is C = {c_1, ..., c_m}.
• Every new sample S is assigned to the class with the maximum posterior probability.
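The classification rule above can be sketched for categorical attributes as follows. This is a minimal illustration, not the presenter's implementation: the function names, the toy data, and the use of Laplace (add-one) smoothing are all assumptions introduced here.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(samples, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(class, attribute index)][value] -> count
    value_counts = defaultdict(Counter)
    # Distinct values seen per attribute, used for Laplace smoothing (assumption).
    values_seen = defaultdict(set)
    for x, c in zip(samples, labels):
        for i, a in enumerate(x):
            value_counts[(c, i)][a] += 1
            values_seen[i].add(a)
    return class_counts, value_counts, values_seen


def classify(x, model):
    """Return the class with the maximum posterior probability for sample x."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        # log P(c) + sum_i log P(a_i | c); P(x) is identical for all classes.
        score = math.log(n_c / total)
        for i, a in enumerate(x):
            k = len(values_seen[i])
            score += math.log((value_counts[(c, i)][a] + 1) / (n_c + k))
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Hypothetical toy data: (CPU load, thread count) with made-up labels.
samples = [("high", "many"), ("high", "few"), ("low", "few"), ("low", "few")]
labels = ["intrusion", "intrusion", "normal", "normal"]
model = train_naive_bayes(samples, labels)
print(classify(("high", "many"), model))
```

Working in log space avoids numeric underflow when many attribute probabilities are multiplied together.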
Application
• Text classification
  • All words as attributes.
  • Assume attributes to be independent.
  • Use a naive Bayes classifier.
• M. Shavlik and J. Shavlik have used naive Bayesian classifiers for intrusion detection systems.
  • Low detection rate of 59.2%.
  • Proposed a Winnow-based algorithm.
Intrusion Detection System
• Intrusion detection system
  • Anomaly detection
  • Misuse detection
• Goals
  • High detection rate
  • Few false negatives (missed intrusions)
  • Few false positives (false alarms)
  • Few CPU cycles
  • Quick detection
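The detection-rate and false-alarm goals can be measured from labeled test data. The sketch below assumes the common definitions (detection rate = detected intrusions / all intrusions, false alarm rate = false alarms / all normal samples); the function name and sample data are hypothetical.

```python
def detection_metrics(predicted, actual):
    """Compute (detection rate, false alarm rate) from boolean alarm labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # intrusions detected
    fn = sum(1 for p, a in pairs if not p and a)      # intrusions missed
    fp = sum(1 for p, a in pairs if p and not a)      # false alarms
    tn = sum(1 for p, a in pairs if not p and not a)  # normal, no alarm
    detection_rate = tp / (tp + fn) if tp + fn else 0.0
    false_alarm_rate = fp / (fp + tn) if fp + tn else 0.0
    return detection_rate, false_alarm_rate


# Made-up example: five measurements, two of them real intrusions.
predicted = [True, False, True, False, False]
actual = [True, True, False, False, False]
print(detection_metrics(predicted, actual))
```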
IDS Cont.
• Problem
  • Detect intrusions quickly, with a low false alarm rate and a high detection rate.
• Approaches
  • Naive Bayes classifiers
  • Winnow-based algorithm
• Alternative approaches
  • Density-based local outlier approach
  • Elman network
IDS - Phases
• Data Collection
• Discretization
• Training
• Tuning
• Operational
Data Collection
• The training data
  • System properties such as CPU, memory, network connections, and number of threads.
  • Collected with Perfmon on Windows, strace on Linux.
• Features such as
  • Actual value measured
  • Average of the last 10 values
  • Average of the last 100 values
  • Difference between the current and previous values
  • Difference between the current value and the average of the last 10
  • Difference between the current value and the average of the last 100
  • Difference between the average of the previous 10 and the previous 100
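The derived features listed above can be computed from a stream of raw measurements roughly as follows. This is a sketch under stated assumptions: the averages include the current value, early averages use however many values are available so far, and the function and key names are invented for illustration.

```python
from collections import deque


def make_feature_extractor(short=10, long=100):
    """Return a function mapping each new measurement to the seven features."""
    history = deque(maxlen=long)  # most recent measurements, newest last

    def mean_of_last(n):
        recent = list(history)[-n:]
        return sum(recent) / len(recent)

    def extract(value):
        # For the very first measurement there is no previous value (assumption:
        # treat the difference as zero).
        previous = history[-1] if history else value
        history.append(value)
        avg_short = mean_of_last(short)
        avg_long = mean_of_last(long)
        return {
            "value": value,
            "avg_short": avg_short,
            "avg_long": avg_long,
            "diff_previous": value - previous,
            "diff_avg_short": value - avg_short,
            "diff_avg_long": value - avg_long,
            "diff_short_long": avg_short - avg_long,
        }

    return extract


# Hypothetical CPU-load readings.
extract = make_feature_extractor()
for cpu in [12.0, 15.0, 90.0]:
    features = extract(cpu)
print(features)
```

One extractor would be created per measured system property, so each property contributes seven features.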
Discretization
• Data is continuous.
• Samples are divided into 10 bins.
• Selects the best-fitting distribution function:
  • Uniform
  • Gaussian
  • Exponential
  • Erlang
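As an illustration of the binning step only, the sketch below divides the observed range into 10 equal-width bins, which corresponds to the uniform case. Fitting and choosing among the Gaussian, exponential, and Erlang distributions is not shown; the function names are hypothetical.

```python
def make_equal_width_binner(samples, n_bins=10):
    """Return a function mapping a continuous value to a bin index 0..n_bins-1."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / n_bins

    def bin_index(value):
        if width == 0:
            return 0  # all training samples were identical
        index = int((value - lo) / width)
        # Clamp values outside the training range into the first or last bin.
        return min(max(index, 0), n_bins - 1)

    return bin_index


# Made-up training range of 0..100 (e.g. CPU percent).
to_bin = make_equal_width_binner([0.0, 100.0])
print([to_bin(v) for v in (5.0, 50.0, 100.0, 250.0)])
```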
Training
• Initialize weights for each feature.
• For each training sample
  • Calculate votes for each feature
    • Relative probability for the value of the feature
  • Adjust weights
• In the naive Bayes approach
  • Use the exact probability of the feature.
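The training loop above resembles a Winnow-style weighted vote. The sketch below is one possible reading, not the published algorithm: it assumes a feature votes "alarm" when its value is less probable under the normal user's model than under a model of other users, that weights start at 1.0, and that weights of wrongly voting features are multiplied by a factor `beta` only when the overall decision is wrong. All names and parameters are hypothetical.

```python
def train_winnow(samples, labels, prob_normal, prob_other, beta=0.5):
    """Learn one weight per feature from discretized samples.

    samples:     list of tuples of bin indices, one index per feature
    labels:      True where the sample is an intrusion
    prob_normal: prob_normal[i][v] = P(feature i in bin v) for the normal user
    prob_other:  prob_other[i][v]  = P(feature i in bin v) for other users
    """
    weights = [1.0] * len(samples[0])
    for x, is_intrusion in zip(samples, labels):
        # Each feature votes for an alarm when the value is relatively unlikely
        # for the normal user.
        votes = [prob_normal[i][v] < prob_other[i][v] for i, v in enumerate(x)]
        alarm_weight = sum(w for w, vote in zip(weights, votes) if vote)
        quiet_weight = sum(w for w, vote in zip(weights, votes) if not vote)
        predicted = alarm_weight > quiet_weight
        if predicted != is_intrusion:
            # Mistake-driven update: demote the features that voted wrongly.
            weights = [
                w * beta if vote != is_intrusion else w
                for w, vote in zip(weights, votes)
            ]
    return weights


# Made-up two-feature, two-bin example.
prob_normal = [[0.9, 0.1], [0.2, 0.8]]
prob_other = [[0.1, 0.9], [0.8, 0.2]]
samples = [(1, 1), (0, 0)]
labels = [True, False]
print(train_winnow(samples, labels, prob_normal, prob_other))
```

Unlike naive Bayes, which multiplies the exact feature probabilities, this scheme only uses each feature's vote and learns how much to trust it.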
Tuning
• Goal: to calculate W, thresh_mini, and thresh_full
  • W: window to avoid overlapping.
  • thresh_mini: threshold for a mini alarm.
  • thresh_full: threshold for intrusion detection.
• Test set used.
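One way the three tuned parameters could interact is sketched below. The interpretation is an assumption, not taken from the slides: each measurement yields a weighted fraction of alarm votes, a mini alarm is raised when that fraction exceeds the mini-alarm threshold, and an intrusion is reported for a non-overlapping window of W measurements when the share of mini alarms exceeds the full-alarm threshold. Function and parameter names are hypothetical.

```python
def detect_intrusions(alarm_fractions, window, thresh_mini, thresh_full):
    """Return one boolean per non-overlapping window: True means intrusion.

    alarm_fractions: per-measurement weighted fraction of alarm votes (0..1)
    window:          number of measurements per window (W)
    """
    results = []
    # Step by `window` so that consecutive windows never overlap.
    for start in range(0, len(alarm_fractions) - window + 1, window):
        chunk = alarm_fractions[start:start + window]
        mini_alarms = sum(1 for f in chunk if f > thresh_mini)
        results.append(mini_alarms / window > thresh_full)
    return results


# Made-up vote fractions for eight measurements, two windows of four.
fractions = [0.9, 0.8, 0.1, 0.7, 0.1, 0.2, 0.6, 0.1]
print(detect_intrusions(fractions, window=4, thresh_mini=0.5, thresh_full=0.5))
```

Tuning would then amount to trying candidate values of the three parameters on held-out data and keeping the combination with the best detection rate at an acceptable false alarm rate.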
Analysis
• False negatives
  • System learning the intruder’s behaviour.
• False positives
• Comparison to the naive Bayes classifier approach.
Alternatives
• All suffer from false learning and false alarms.
• Other possible approaches
  • Elman networks
  • Density-based local outlier approach