300 likes | 559 Views
Single Pass Anomaly Detection. Deepak Garg 05329015 KReSIT MTech 1 IIT Bombay Guide by: Prof. Om P. Damani. What is Anomaly Detection. Detection of deviation from normal behaviour of the system Capable of detecting Novel attacks or new attacks
E N D
Single Pass Anomaly Detection Deepak Garg 05329015 KReSIT MTech 1 IIT Bombay Guide by: Prof. Om P. Damani
What is Anomaly Detection Detection of deviation from normal behaviour of the system Capable of detecting Novel attacks or new attacks Identify a activity that are different from users or a system normal behaviour To detect unauthorized attempts to access the system
Anomaly v/s Misuse Detection Anomaly Detection Refer to network behaviour, which deviate from normal network behaviour Study typical normal use and detect abnormal usage Nature of attack is unknown Misuse Detection Use previously known attacks and flag matching patterns Nature of attack is known
Causes of Anomaly DoS (Denial of Services) File Server Failure Network over load System mis-configure Implementation Bugs etc...
Different Technique for Anomaly Detection • BIRCH • ADWICE • Y-Means • PHAD
BIRCH • BIRCH algorithm is order of O(N) to form the clusters. • Important term for BIRCH • T – Size of the cluster, initially T=0 • B – Branching factor of the tree • P – Available memory Size • LS – Max clusters on leaf node • It maintain a binary tree type tree structure. • If memory size reach to P, increase the T and rebuild the tree.
Cont..... • Problems with this algorithm are • distance based measures for all calculation • Take input parameters • Some data points may be classified to be wrong • All the clusters uses same threshold
ADWICE • This technique deal with massive data • Efficient data structure. • New search index. • Dynamic nature of normal request and services. • Use clustering for training data • Where similar data point group together into cluster. Cluster using a distance function for identify closest cluster. • ADWICE store cluster feature in main memory instead of all training data points.
Cont….. • CF = (n, Ls, ss) • Centroid • Distance • D(CF, v) • D(CF1, CF2) • Merging of cluster • Total no. of Cluster M • Leaf Size (LS) • Threshold T
Algorithm: 1: procedure TRAIN(v,model) 2: closestCF = findClosestCF() 3: if thresholdRequirementOK(v,closestCF) then 4: merge(v,closestCF) 5: else 6: if size(model)<M then 7: leaf = getLeaf(v,model) 8: if spaceInLeaf(leaf) then 9: insert(newCF(v), leaf) 10: else 11: splitLeaf(leaf, newCF(v)) 12: end if 13: else 14: increaseThreshold() 15: model = rebuild(model) 16: TRAIN(v, model) 17: end if 18: end if 19: end procedure
Cont….. • In Tree Index • Non leaf node contain one CF for its child • This is not based on optimal search • Closest cluster not always found • Processing performance, accuracy suffer
Cont….. • Grid Based Index • Each subspace of d-dimensional space has a Maximum and Minimum value. • User divide each dimension d in no. of slices. • Slice width = (Max – Min)/slices for each d. • Non leaf node is mapping slice number to child node. • Leaf node have CF same as previous index. • Top down search for find a closest cluster in grid tree. • Search perform in all slices they over lapping the search width (V[i] – T, V[i] + T).
Experiments on ADWICE • Nakul have made one change in ADWICE • find closest cluster on the basis of density • we have made other change in ADWICE • change the cluster pattern in the form of BOX
Cont..... • Modify the CF • Add min value vector • Add max value vector • Modify the radius to radius vector • radius = (max - min)/2 • Modify the centre • centre = (min + max)/2
Cont..... Detection Rate = TP / (TP + FN) FP Rate = FP / (FP + TN)
Y-Means • Y-means algorithm, similar to K-means algorithm its partition the normalized data into k clusters. • K-means Algorithm • Randomly choose k-data points and make them initial cluster centre • assign each instance to its closest centre • replace each centre with the mean of its members • repeat step 2 and 3 until there is no more updation
Cont..... • The two basic problem in K-means • Number of cluster dependency • Degeneracy • Y-means algorithm partition the data points in k cluster where k in 1 to n • remove all the empty clusters • Find out optimum value of k basis on split and merge of the clusters. • Cumulative Standard Normal Distribution • 99 % instance of the cluster stay within the circle with a radius of 2.32 * SD. • split and merge depends on CSND
Packet Header Anomaly Detection • Trained on attack free traffic • Checking anomaly field of packet header. • Link Layer • Network Layer • Transport Layer • The model detect novel attacks. • Split large field • Merge small field • During training record each value of fields
Cont….. • Problems • required excessive memory • not enough training data results over fits the data • To Solve • PHAD-H1000 • PHAD-C32 • PHAD-C1000
Cont….. • For each field record the ‘ r ‘ number of different value that occur in training period. • ‘n’ is the number of training instances. • Estimated probability P = r/n (given field observation is anomalous). • In testing time fields of given packet is anomalous than calculate t/p ratio for anomalous field and sum up this ratio, this is the anomaly score to the packet (t is the time since the previous new value in the same field).
Future Work • Make the algorithm independent from the the maximum number of clusters • We will also concentrate on learning from input data set, so it will not depend on the order of input dataset • Reduce the running time complexity and false positive as much as possible
References • Source code of ADWICE. http://www.ida.liu.se/~snt/research/adwice/ • KDD Cup 1999 Dataset. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html • Nakul Aggrwal, Improving the Efficiency of Network Intrusion Detection Systems. http://www.it.iitb.ac.in/~deepak/deepak/courses/mtp/nakul.pdf • K-Mean Clustering. http://people.revoledu.com/kardi/tutorial/kMean/index.html • Adaptive Real-Time Anomaly Detection with Improved Index and Ability to Forget [Kalle Burbeck and Simin Nadjm-Tehrani] 2005 IEEE. • Network Traffic anomaly detection based on packet header [Matthew v. Mahoney ] ACM 2003.
Cont….. • T. Zhang, R. Ramakrishnan, and M. Livny. Birch an efficient data clustering method for very large databases. SIGMOD Record 1996 ACM SIGMOD International Conference on Management of Data 25(2):10314, 1996. • Anomaly Detection in IP network [ Marina Thottan and chuanyi ji ] 2003 IEEE • Technical Report • Matthew v. Mahoney, Network Traffic Anomaly Detection Based on Packet Header, ACM 2003 • Y. Guan, A. A. Ghorbani, and N. Belacel. Y-means a clustering method for intrusion detection. In Canadian Conference on AI, volume 2671 of Lecture Notes in Computer Science, pages 616617, Montreal, Canada, 2003. Springer.