Anomaly intrusion detection by clustering transactional audit streams in a host computer • Nam Hun Park, Sang Hyun Oh, Won Suk Lee • InS, Vol. 180, 2010, pp. 2375–2389 • Presenter: Wei-Shen Tai • 2010/4/14
Outline • Introduction • Related works • Clustering transactional user activities • Anomaly detection of clusters on transactional features • Experimental results • Conclusion • Comments
Motivation • Most anomaly intrusion detection approaches • Model only the static behavior of a user in the audit data set. • In a real-time environment, a user's current activities should be processed as soon as possible so that they are reflected in the anomaly detection.
Objective • A grid-based clustering algorithm for an audit data stream • Detects anomaly intrusions on continuous transactional audit streams based on partitioned grids.
A transactional data stream and initial cell • A transaction in an audit data stream • Contains a set of activities (logs) performed in sequence by a user. • The number of data values in a transaction is • The number of transactions in the current data stream is • Initial cell g • For each feature, the range of an initial cell g becomes the united intervals of
Grid-based clusters • Distribution statistics of an initial cell g • Example: t = 5, 100 transactions in Dt; the support of g is ct = 20 • Example: t = 5, 250 transactions in Dt; t_avg = {50, 20, 30, 50, 20, …} • Example: t = 5, 250 transactions in Dt; Tg is the number of data in this range of T
Split of initial cells • When a new data element et is generated, the distribution statistics of the cell g are updated. • When the current support (ct) of the cell g becomes greater than the split support threshold, two intermediate cells g11 and g2 2 are created as the children of the initial cell. • The children are split in turn until their support is less than the split support threshold.
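The split rule above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the `Cell` class, the `add_element` function, the midpoint split, and the `min_width` stopping parameter are all assumptions made for the example (the slides choose the split point with the μ-, σ-, or hybrid-partition instead). Only counts and sums are kept, so no data element is stored.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cell:
    """One grid-cell over a single feature (hypothetical structure)."""
    low: float
    high: float
    count: int = 0        # current support c_t, kept as an absolute count
    total: float = 0.0    # running sum, enough to recover the mean
    children: List["Cell"] = field(default_factory=list)

    def leaf_for(self, x: float) -> "Cell":
        """Walk down to the leaf cell whose range covers x."""
        cell = self
        while cell.children:
            left, right = cell.children
            cell = left if x < left.high else right
        return cell


def add_element(root: Cell, x: float, split_threshold: int,
                min_width: float = 1.0) -> Cell:
    """Update the statistics of the covering leaf and split it when
    its support exceeds the split support threshold."""
    leaf = root.leaf_for(x)
    leaf.count += 1
    leaf.total += x
    if leaf.count > split_threshold and (leaf.high - leaf.low) > min_width:
        # Assumption: split at the midpoint of the range.
        mid = (leaf.low + leaf.high) / 2.0
        leaf.children = [Cell(leaf.low, mid), Cell(mid, leaf.high)]
    return leaf
```

Because no elements are retained, the two children start with empty statistics and accumulate support only from elements that arrive after the split.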
Dividing grid-cells on distributions • To partition a dense grid-cell • μ-partition, σ-partition, and hybrid-partition • hybrid-partition: If dev > deve, pick μ-partition. Otherwise, pick σ-partition
Cluster properties • A cluster C contains a set of v adjacent dense unit grid-cells
Decaying weights on activities • Forgetting factor • It is employed to diminish the effects of old patterns. • A decay-base b determines the amount of weight reduction per decay-unit and is greater than 1. A decay-base-life w is defined as the number of decay-units. • A new transaction is generated in the current data stream
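A forgetting factor of this kind can be sketched as below. The slide does not give the formula, so the decay rate d = b^(-1/w) is an assumption here: it makes a weight shrink to 1/b of its value after w decay-units. The names `decay_rate` and `DecayedCount` are hypothetical.

```python
def decay_rate(b: float, w: float) -> float:
    """Assumed per-decay-unit rate: after w units a weight falls to 1/b."""
    if b <= 1 or w < 1:
        raise ValueError("expects decay-base b > 1 and decay-base-life w >= 1")
    return b ** (-1.0 / w)


class DecayedCount:
    """Support counter in which old transactions gradually lose weight."""

    def __init__(self, b: float, w: float) -> None:
        self.d = decay_rate(b, w)
        self.value = 0.0
        self.last_t = 0

    def add(self, t: int, weight: float = 1.0) -> float:
        """Decay the stored count over the elapsed decay-units, then add
        the weight of the transaction generated at time t."""
        self.value = self.value * self.d ** (t - self.last_t) + weight
        self.last_t = t
        return self.value
```

With b = 2 and w = 2, a transaction counted at t = 1 contributes half its weight by t = 3, so recent activity dominates the statistics.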
Profiling method • Internal summary contains the properties of each cluster. • External summary represents the statistics of noise data objects, i.e., the data objects outside all clusters.
Anomaly detection method • Internal distance, ratio • Normalizing factor γ is a user-defined parameter that can control the effect of an internal difference. • External distance, ratio
Conclusion • An anomaly detection method based on a grid-based clustering algorithm • For each feature, clusters can be found effectively without physically maintaining any data elements of an audit data stream. • A user’s new activities are continuously reflected in the ongoing clustering results and, at the same time, in the profile of the user.
Comments • Advantage • The proposed method provides a solution for anomaly intrusion detection on continuous audit streams. • It seems plausible to apply this method to detect anomalous activities in other fields. • Drawback • A cold-start problem occurs without manual supervision. That is, the system cannot distinguish normal clusters from abnormal clusters at the beginning. • Application • Dynamic data clustering for continuous data streams.