Panacea
by Damiano Bolzoni, Sandro Etalle, Pieter H. Hartel
Presented by James O’Reilly
Intrusion Detection Systems (IDS): Signature-based IDS (SBS)
Matches activity/payload to known attacks. Attacks are classified at the same time they are detected.
• Cannot recognize non-trivial variations of known attacks
• Useless against new, zero-day attacks
• With classification, security personnel can automate responses and prioritize alerts
Intrusion Detection Systems (IDS): Anomaly-based IDS (ABS)
Recognizes suspicious activity, even novel attacks, but cannot classify the activity.
• High false positive rate, which is a burden on support staff
• Without classification, alert prioritization and automated response are impossible
Problem Statement
Anomaly-based IDSes appear to have more potential than signature-based IDSes because they can handle novel attacks, but they are handicapped: they cannot tell security personnel anything about, e.g., a packet other than that it is “odd.”
Panacea’s Goal
To accurately sort the anomalies reported by anomaly-based IDSes into attack classes, using the payload information in each alert. The ability of an ABS to detect attacks is thereby paired with a system that can efficiently classify attacks as they happen, making ABSes far less costly in man-hours to use.
Alert Information Extractor
“Boils down” an alert into a Bloom Filter representation that the classification engine can analyze. Goes through two stages:
• Building the n-gram bitmap
• Computing the Bloom Filter
During training only, it additionally passes along classification information (labels).
The more samples the Alert Classification Engine has, the more accurately it can classify alerts. As we will see, storing all the alert payloads is impractical.
Representing the data features
• There are 256^length possible messages of a given length in bytes.
• The problem for classification is which features to pick: how to represent the payload without losing its “meaning”
• Each feature has a space and time cost, especially during training.
• Too few features means a lack of resolution, which hampers the classifier’s task or makes it impossible
N-grams
• The information in the payload is represented using binary n-gram analysis. The n-gram order, n, is the number of adjacent symbols that are analyzed together.
• Each feature is the presence or absence of a particular n-gram in the payload, and is stored in a bitmap
• The n-gram bitmap size is on the order of 256^n bits
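The binary n-gram features described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `ngram_set` and `ngram_index` and the sample payload are assumptions made here, and the big-endian index is just one possible way to map an n-gram to a bitmap position.

```python
def ngram_set(payload: bytes, n: int = 3) -> set:
    """Return the distinct n-grams (runs of n adjacent bytes) in the payload."""
    return {payload[i:i + n] for i in range(len(payload) - n + 1)}


def ngram_index(gram: bytes) -> int:
    """Map an n-gram to a position in a bitmap with 256**n entries.

    Assumption: the n bytes are read as one big-endian integer.
    """
    return int.from_bytes(gram, "big")


# Hypothetical payload: only presence or absence matters, not the count.
grams = ngram_set(b"GET /a", n=3)
print(sorted(grams))
print(sorted(ngram_index(g) for g in grams))
```

Because the feature is binary, a repeated n-gram contributes nothing new, which is why a set is enough here.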
Bloom Filter
• A 3-gram bitmap is about 2 MB; a 5-gram bitmap is about 128 GB.
• A Bloom Filter offers aggressive compression of the n-gram features, at the risk of false positives when reading the data back.
• The authors state that a space of 10 KB would be acceptable in the 5-gram case.
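The bitmap sizes quoted above follow from one bit per possible n-gram. A short check, assuming binary units (1 MB = 2^20 bytes, 1 GB = 2^30 bytes); `bitmap_bytes` is a name chosen here for illustration:

```python
def bitmap_bytes(n: int) -> int:
    """Bytes needed for a bitmap holding one bit per possible byte n-gram."""
    return 256 ** n // 8


for n in (3, 5):
    size = bitmap_bytes(n)
    print(f"{n}-gram bitmap: {size} bytes = {size / 2**20:g} MB = {size / 2**30:g} GB")
```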
Bloom Filter
The binary Bloom Filter data structure is basically a vector of bits of some length, used for determining set membership.
Insertion: hash the value with several different hash functions (each with a range equal to the Bloom Filter vector length) and mark the positions that are hashed to.
Membership: hash the value with the same hash functions and look up the positions in the vector. If all positions are marked, the value is reported present (possibly a false positive); otherwise it is absent.
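The insertion and membership steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the structure used in Panacea: the class name `BloomFilter`, the use of SHA-256 salted with an index to stand in for k different hash functions, and the example 5-grams are all choices made here for illustration.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits: int, num_hashes: int) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)  # all positions unmarked

    def _positions(self, item: bytes):
        # Assumption: k "different" hash functions are simulated by
        # prefixing the item with the function index before hashing.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: bytes) -> None:
        """Insertion: mark every position the item hashes to."""
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        """Membership: present only if all hashed positions are marked."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# A 10 KB filter holding a few hypothetical 5-grams.
bf = BloomFilter(size_bits=10 * 1024 * 8, num_hashes=3)
for gram in (b"GET /", b"ET /i", b"T /in"):
    bf.add(gram)
print(b"GET /" in bf)  # inserted items are always reported present
```

An inserted item is never reported absent; an item that was never inserted can still be reported present if other insertions happen to have marked all of its positions.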
Inserting into a Bloom Filter
“The error rate can be decreased by increasing the number of hash transforms and the space allocated to store the table.” [1]
Collisions in the Map
The probability of a collision depends on l, the size of the space; k, the number of hash functions; and n, the number of insertions.
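The formula this slide refers to is not reproduced in the text. A common estimate for the probability that a lookup collides with previously marked positions (a false positive), written with the slide's symbols and assuming independent, uniformly distributed hash functions, is given below. Whether this is exactly the expression shown on the original slide is an assumption.

```latex
P_{\text{collision}}
  = \left(1 - \left(1 - \frac{1}{l}\right)^{kn}\right)^{k}
  \approx \left(1 - e^{-kn/l}\right)^{k}
```

Here (1 - 1/l)^{kn} is the chance that a given position is still unmarked after n insertions with k hash functions each, and a collision requires all k looked-up positions to be marked.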
Alert Classification Engine
This engine has essentially two modes: training and classification.
Training is when the classifier (SVM or RIPPER) learns how to classify the attacks. It does this with labeled Bloom Filter data (supervised learning).
Once trained, the classifier can be given unlabeled Bloom Filter data and classify it.
Accuracy and Training Set
The accuracy of the classifier depends on the size of the training set and, of course, on its quality. Quality is affected by the way the training data is labeled (more on this shortly). The training set needs to be fairly large: the larger the better, but with diminishing returns.
Bolzoni et al. chose SVM and RIPPER for their accuracy, but they are non-incremental learners: to update with new samples, they must essentially add the samples to the original data and retrain completely. Therefore, the data must be as compact as possible without destroying the distinguishing features of the payloads.
Classifier
A classifier takes an input and assigns it to a class. A binary classifier takes an input and decides, essentially, whether or not it is a member of a class.
Training a supervised-learning classifier involves taking labeled data and minimizing the error on that training data, using whatever method the particular classifier implements.
SVM
Training: the SVM takes its sample set, maps it into a high-dimensional space using a non-linear function, and then divides the data with a hyperplane (a plane in a higher-dimensional space).
The signed distance from the hyperplane is the metric used to evaluate class membership (a hyperplane has a positive and a negative side).
Multiple classes are essentially handled by adding multiple hyperplanes.
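The signed-distance test described above can be illustrated with a fixed hyperplane. This is only a sketch of the decision step: the weights `w` and offset `b` below are hypothetical values rather than the result of training, and the non-linear mapping into the high-dimensional space is omitted.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def classify(w, b, x):
    """Return +1 on the positive side of the hyperplane, otherwise -1."""
    return 1 if signed_distance(w, b, x) >= 0 else -1


w, b = (3.0, 4.0), -5.0  # hypothetical hyperplane, not learned from data
print(signed_distance(w, b, (3.0, 4.0)), classify(w, b, (3.0, 4.0)))
print(signed_distance(w, b, (0.0, 0.0)), classify(w, b, (0.0, 0.0)))
```

The sign gives the class and the magnitude indicates how far the sample lies from the boundary.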
RIPPER
• RIPPER is a rule-based classifier. It begins with an empty rule set and adds rules until there is no error on the growing set.
• Handles multiple classes by first identifying the least common class, then the second-least common…
• Has an optimization step to reduce the size of the rule set
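The class-ordering step in the second bullet can be sketched as follows. This is only a minimal sketch of the ordering, with hypothetical labels; rule growing, pruning, and the optimization step are omitted, and `class_order` is a name chosen here for illustration.

```python
from collections import Counter


def class_order(labels):
    """Return the classes sorted from least to most frequent.

    Ties are broken alphabetically so the result is deterministic
    (an assumption of this sketch).
    """
    counts = Counter(labels)
    return sorted(counts, key=lambda c: (counts[c], c))


# Hypothetical alert labels.
labels = ["web"] * 5 + ["overflow"] * 2 + ["scan"] * 3
print(class_order(labels))
```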
Labeling Alerts (input to the Alert Information Extractor)
Three methods:
• Automatic: use the input from an SBS
• Semi-automatic: use the SBS input and add data from an ABS with manual labeling
• Manual: all alerts are manually classified
Test 1: Automatic
DSa: Data Set a consists of 3200 automatically generated Snort (SBS) alerts in 14 classes, triggered with vulnerability assessment tools. 4 classes were excluded because they had fewer than 10 samples.
Test 2: Web attacks, semi-automatic
DSb: as DSa, but focused on web attacks alone, with the addition of some Milw0rm attacks. 1400 alerts, all manually classified. Two most common
Live attacks
DSc: manually classified alerts from a university server. No attacks were injected; the alerts were raised by the ABSes Poseidon and Sphinx. 100 alerts over 2 weeks.
Panacea is trained on DSb and tested against the 100 ABS alerts.
Novelty: SVM vs. RIPPER
Extra buffer overflow attacks were created by mutating known ones with the Sploit framework.
Results
Bolzoni et al. present an attack payload classifier for anomaly-based intrusion detection systems.
The paper also provides a framework for adding other classifiers, and this framework can be extended to hybrid responses (SVM early, RIPPER once the sample size exceeds some amount, SVM for high-risk cases…)
References
[1] J. Bluestein, A. El-Maazawi. “Bloom Filters: A Tutorial, Analysis, and Survey.” Technical Report CS-2002-10, Faculty of Computer Science, Dalhousie University, Canada.
Questions?