Single Pass Anomaly Detection

Single Pass Anomaly Detection Deepak Garg 05329015 KReSIT MTech 1 IIT Bombay Guide by: Prof. Om P. Damani

What is Anomaly Detection Detection of deviation from normal behaviour of the system Capable of detecting Novel attacks or new attacks Identify a activity that are different from users or a system normal behaviour To detect unauthorized attempts to access the system

Anomaly v/s Misuse Detection Anomaly Detection Refer to network behaviour, which deviate from normal network behaviour Study typical normal use and detect abnormal usage Nature of attack is unknown Misuse Detection Use previously known attacks and flag matching patterns Nature of attack is known

Causes of Anomaly DoS (Denial of Services) File Server Failure Network over load System mis-configure Implementation Bugs etc...

Different Technique for Anomaly Detection • BIRCH • ADWICE • Y-Means • PHAD

BIRCH • BIRCH algorithm is order of O(N) to form the clusters. • Important term for BIRCH • T – Size of the cluster, initially T=0 • B – Branching factor of the tree • P – Available memory Size • LS – Max clusters on leaf node • It maintain a binary tree type tree structure. • If memory size reach to P, increase the T and rebuild the tree.

Cont..... • Problems with this algorithm are • distance based measures for all calculation • Take input parameters • Some data points may be classified to be wrong • All the clusters uses same threshold

ADWICE • This technique deal with massive data • Efficient data structure. • New search index. • Dynamic nature of normal request and services. • Use clustering for training data • Where similar data point group together into cluster. Cluster using a distance function for identify closest cluster. • ADWICE store cluster feature in main memory instead of all training data points.

Cont….. • CF = (n, Ls, ss) • Centroid • Distance • D(CF, v) • D(CF1, CF2) • Merging of cluster • Total no. of Cluster M • Leaf Size (LS) • Threshold T

Algorithm: 1: procedure TRAIN(v,model) 2: closestCF = findClosestCF() 3: if thresholdRequirementOK(v,closestCF) then 4: merge(v,closestCF) 5: else 6: if size(model)<M then 7: leaf = getLeaf(v,model) 8: if spaceInLeaf(leaf) then 9: insert(newCF(v), leaf) 10: else 11: splitLeaf(leaf, newCF(v)) 12: end if 13: else 14: increaseThreshold() 15: model = rebuild(model) 16: TRAIN(v, model) 17: end if 18: end if 19: end procedure

Cont….. • In Tree Index • Non leaf node contain one CF for its child • This is not based on optimal search • Closest cluster not always found • Processing performance, accuracy suffer

Cont…..

Cont….. • Grid Based Index • Each subspace of d-dimensional space has a Maximum and Minimum value. • User divide each dimension d in no. of slices. • Slice width = (Max – Min)/slices for each d. • Non leaf node is mapping slice number to child node. • Leaf node have CF same as previous index. • Top down search for find a closest cluster in grid tree. • Search perform in all slices they over lapping the search width (V[i] – T, V[i] + T).

Example

Experiments on ADWICE • Nakul have made one change in ADWICE • find closest cluster on the basis of density • we have made other change in ADWICE • change the cluster pattern in the form of BOX

Cont..... • Modify the CF • Add min value vector • Add max value vector • Modify the radius to radius vector • radius = (max - min)/2 • Modify the centre • centre = (min + max)/2

Cont.....

Cont..... Detection Rate = TP / (TP + FN) FP Rate = FP / (FP + TN)

Cont.....

Y-Means • Y-means algorithm, similar to K-means algorithm its partition the normalized data into k clusters. • K-means Algorithm • Randomly choose k-data points and make them initial cluster centre • assign each instance to its closest centre • replace each centre with the mean of its members • repeat step 2 and 3 until there is no more updation

Cont..... • The two basic problem in K-means • Number of cluster dependency • Degeneracy • Y-means algorithm partition the data points in k cluster where k in 1 to n • remove all the empty clusters • Find out optimum value of k basis on split and merge of the clusters. • Cumulative Standard Normal Distribution • 99 % instance of the cluster stay within the circle with a radius of 2.32 * SD. • split and merge depends on CSND

Cont.....

Packet Header Anomaly Detection • Trained on attack free traffic • Checking anomaly field of packet header. • Link Layer • Network Layer • Transport Layer • The model detect novel attacks. • Split large field • Merge small field • During training record each value of fields

Cont….. • Problems • required excessive memory • not enough training data results over fits the data • To Solve • PHAD-H1000 • PHAD-C32 • PHAD-C1000

Cont….. • For each field record the ‘ r ‘ number of different value that occur in training period. • ‘n’ is the number of training instances. • Estimated probability P = r/n (given field observation is anomalous). • In testing time fields of given packet is anomalous than calculate t/p ratio for anomalous field and sum up this ratio, this is the anomaly score to the packet (t is the time since the previous new value in the same field).

Future Work • Make the algorithm independent from the the maximum number of clusters • We will also concentrate on learning from input data set, so it will not depend on the order of input dataset • Reduce the running time complexity and false positive as much as possible

References • Source code of ADWICE. http://www.ida.liu.se/~snt/research/adwice/ • KDD Cup 1999 Dataset. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html • Nakul Aggrwal, Improving the Efficiency of Network Intrusion Detection Systems. http://www.it.iitb.ac.in/~deepak/deepak/courses/mtp/nakul.pdf • K-Mean Clustering. http://people.revoledu.com/kardi/tutorial/kMean/index.html • Adaptive Real-Time Anomaly Detection with Improved Index and Ability to Forget [Kalle Burbeck and Simin Nadjm-Tehrani] 2005 IEEE. • Network Traffic anomaly detection based on packet header [Matthew v. Mahoney ] ACM 2003.

Cont….. • T. Zhang, R. Ramakrishnan, and M. Livny. Birch an efficient data clustering method for very large databases. SIGMOD Record 1996 ACM SIGMOD International Conference on Management of Data 25(2):10314, 1996. • Anomaly Detection in IP network [ Marina Thottan and chuanyi ji ] 2003 IEEE • Technical Report • Matthew v. Mahoney, Network Traffic Anomaly Detection Based on Packet Header, ACM 2003 • Y. Guan, A. A. Ghorbani, and N. Belacel. Y-means a clustering method for intrusion detection. In Canadian Conference on AI, volume 2671 of Lecture Notes in Computer Science, pages 616617, Montreal, Canada, 2003. Springer.

Thank You

Single Pass Anomaly Detection

Single Pass Anomaly Detection

Presentation Transcript

Data Mining Anomaly Detection

Anomaly Detection

Data Mining Anomaly Detection

Single Pass Anomaly Detection In Network

Population-Wide Anomaly Detection

Anomaly Detection

Anomaly Detection and Mitigation

Anomaly Detection Systems

Misuse and Anomaly Detection

Traffic Anomaly Detection

Anomaly Detection Using “Normal” Data

Anomaly Detection Systems

Volume Anomaly Detection

Anomaly Detection: A Tutorial

Global Anomaly Detection Market

Global Anomaly Detection Market

Anomaly Detection in Communication Networks

Visualizing Audio for Anomaly Detection

Example of Anomaly Detection

Benchmarking Anomaly-based Detection Systems

Anomaly Detection Industry