
COD (Cluster Onset Detection): Online Temporal Clustering for Outbreak Detection

Tomas Singliar (U. Pitt.), Denver H. Dash (Intel Research, U. Pitt.). AAAI’07 (American Association for AI National Conference).


Presentation Transcript


  1. COD (Cluster Onset Detection): Online Temporal Clustering for Outbreak Detection. Tomas Singliar (U. Pitt.), Denver H. Dash (Intel Research, U. Pitt.). AAAI’07 (American Association for AI National Conference)

  2. Reference
     • When Gossip is Good: Distributed Probabilistic Inference for Detection of Slow Network Intrusions. Denver H. Dash et al., AAAI’06
     • COD: Online Temporal Clustering for Outbreak Detection. Tomas Singliar, Denver H. Dash, AAAI’07
     Speaker: Li-Ming Chen

  3. Challenge: Slowly Propagating Attacks
     • Worm attacks tend toward two opposite extremes:
       1. Much faster, to allow rapid spread
       2. Much slower, to evade detection
     • Most existing detection techniques rely on the fact that worms reproduce quickly
     • Slowly propagating attacks are:
       • Difficult to detect: they hide under the veil of normal network traffic
       • Still dangerous: they can propagate exponentially

  4. Other Challenges
     • Global infection:
       • IDSes (individual entities) can see only a partial picture of the worm’s network-wide behavior
       → requires collaborative detection (AAAI’06)
     • Homogeneity assumption:
       • Detection techniques treat the population as a monolithic entity
       → note, however, that hosts and detectors (collaborators) are not always homogeneous (AAAI’07)

  5. Architecture Model
     • [Diagram: several “weak” host-based Local Detectors (LDs) reporting to a Global Detector (GD)]
     • Global Detector:
       • Aggregates messages from the LDs
       • Performs probabilistic inference to determine whether an infection is present
     • Concept of collaborative detection:
       • LDs (designed to be weak but general classifiers) may raise false alarms at a relatively high frequency
       • The GD can combine the LDs’ weak information to infer the existence of an attack
     • Where should the GDs be placed in the network?
       • Centralized or distributed placement

  6. Paper 1: When Gossip is Good: Distributed Probabilistic Inference for Detection of Slow Network Intrusions (Denver H. Dash et al., AAAI’06)

  7. Architecture

  8. About the “Weak” LDs
     • Each LD is a binary classifier: normal or abnormal
     • The space of LDs: inward-looking or outward-looking
     • Detection heuristic: count the number of new outgoing connections to unique destination addresses and ports
     • LD threshold: predefined as 4 (CPI) for slow-worm detection, based on the observed distribution (see picture)
     • Observation data: 37 hosts observed for 5 weeks, one observation per 50 seconds, give 37 × 5 × 7 × 24 × 60 × 60 / 50 = 2,237,760 observations, from which the distribution is computed
     • [Figure: propagation rates of previous worms (Blaster, Slapper, CR2, Slammer, Witty)]
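The LD heuristic on slide 8 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `(timestamp, dst_addr, dst_port)` record layout is hypothetical, and treating "new" as "never contacted before by this host" is an assumption. The firing rule (more than 4 new connections in a 50-second interval) and the observation count come from the slides.

```python
from collections import defaultdict

WINDOW_SECONDS = 50   # interval length from the slides
THRESHOLD = 4         # LD threshold from the slides


def local_detector(connections, window=WINDOW_SECONDS, threshold=THRESHOLD):
    """Return {interval_index: fired} for one host.

    `connections` is an iterable of (timestamp_seconds, dst_addr, dst_port)
    tuples; this record layout is hypothetical.
    """
    seen = set()                   # destinations this host has already contacted
    new_counts = defaultdict(int)  # new destinations per interval
    for ts, addr, port in sorted(connections):
        dest = (addr, port)
        if dest not in seen:       # count only the first contact (assumption)
            seen.add(dest)
            new_counts[int(ts // window)] += 1
    return {j: count > threshold for j, count in new_counts.items()}


# Number of 50-second observations for 37 hosts over 5 weeks (slide 8).
hosts, weeks = 37, 5
observations = hosts * (weeks * 7 * 24 * 60 * 60) // WINDOW_SECONDS
```

Intervals with no new destinations are simply absent from the result, which a caller would treat as "did not fire".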

  9. Four possible GD models
     • Traditional collaborative counting schemes:
       • PosCount: tests whether Σ(positive counts) > threshold
       • CuSum: detects changes in the trend of a statistic
     • DBN-based schemes:
       • CP-DBN: a simplified causal model that treats an attack as occurring uniformly across the population or not at all
       • E-DBN: models the dynamics of a system that is being swept by an epidemic outbreak
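The two counting schemes on slide 9 can be illustrated with a short sketch. The slides do not say which CuSum variant the paper uses, so the standard one-sided CUSUM recursion is assumed here; `baseline` and `slack` are illustrative parameter names.

```python
def poscount_alarm(ld_bits, threshold):
    """PosCount: alarm when the number of positive LD reports exceeds a threshold."""
    return sum(ld_bits) > threshold


def cusum_scores(counts, baseline, slack=0.0):
    """One-sided CUSUM over per-step positive counts (assumed variant).

    The score accumulates the excess over `baseline + slack` and is clipped
    at zero, so it grows only while the counts trend upward.
    """
    score, scores = 0.0, []
    for c in counts:
        score = max(0.0, score + c - baseline - slack)
        scores.append(score)
    return scores
```

PosCount looks at a single time step, whereas the CUSUM score carries evidence across steps, which is why it can react to a slow upward drift that never crosses the PosCount threshold in any one step.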

  10. How do GDs work?
      • GD input: Lt, a binary subset of LD observations at time t
      • GD output: St, a measure of how likely it is that a global anomaly is occurring at time t
      • The system of GDs makes up an ensemble
        • Many ensemble techniques could be used
        • This paper uses only the max function to determine whether a global alarm should be raised
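A minimal sketch of the max-based ensemble described on slide 10, assuming each GD sees its own subset of the LD bits. The names `gd_subsets` and `score_fn` are hypothetical, and the scoring function in the usage note stands in for whichever GD model is actually used.

```python
def ensemble_alarm(ld_bits, gd_subsets, score_fn, alarm_threshold):
    """Score each GD on its subset of LD bits and combine the scores with max.

    Returns (alarm_raised, per_gd_scores).
    """
    scores = [score_fn([ld_bits[i] for i in subset]) for subset in gd_subsets]
    return max(scores) > alarm_threshold, scores
```

For example, with `score_fn = lambda bits: sum(bits) / len(bits)` each GD reports the fraction of its LDs that fired, and a global alarm is raised as soon as any single GD crosses the threshold.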

  11. How do GDs work? (cont’d)
      • Traditional collaborative counting schemes:
        • PosCount: tests whether Σ(positive counts) > threshold
        • CuSum: detects changes in the trend of a statistic
      • DBN-based schemes:
        • CP-DBN: a simplified causal model that treats an attack as occurring uniformly across the population or not at all
        • E-DBN: models the dynamics of a system that is being swept by an epidemic outbreak

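  12. CP-DBN
      • Hidden states: Ai ∈ {T, F}, whether or not an attack has taken place at time i
      • Observable states: Oli ∈ {on, off}, whether LD l is on or off at time i
      • M LDs in total, observed over observation time T
      • [Diagram: DBN linking the hidden attack states to the LD observations, annotated with the TP rate and FP rate]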

  13. E-DBN
      • Models the exponentially growing trend:
        • T denotes the observation time
        • At ∈ {0, 1}: the anomaly state at time t
        • Nt ∈ {0, …, N}: number of infected hosts
        • S: the spreading rate
        • Ot ∈ {0, …, N}: number of observed LDs that fired (observable states)
      • [Diagram: hidden and observable states, with state transitions between the unobserved state variables]

  14. E-DBN (cont’d)
      • Assuming a worm attack, the growth in the number of infected hosts, ΔNt+1, is modeled by a binomial
      • The likelihood of ot detectors firing when nt hosts are infected is also modeled by a binomial
      • [Formulas not preserved in the transcript; their terms were annotated “susceptible” and “chance of a hit”]

  15. How do DBN-based GDs work?
      • Given the DBN model and the observations from t−T to t, infer the anomaly Am at the most likely time m
      • Then make the ensemble decision (using the max function)

  16. Performance Evaluation
      • Parameters:
        • Spread rate S = 1 connection per 20 sec.
        • Address density = 1/1000 (ratio of vulnerable hosts)
        • LD threshold = 4 connections per 50 sec.
        • LDs communicate with the GD every 10 sec.
      • PosCount raises a detection only after the entire network is infected
      • [Figure annotations: “Desired FP rate”, “better”]

  17. Paper 2: COD: Online Temporal Clustering for Outbreak Detection (Tomas Singliar, Denver H. Dash, AAAI’07)

  18. New Approach: COD (Cluster Onset Detection)
      • What to cluster?
        • Partition the population (e.g., hosts) into subgroups; COD then tries to detect susceptible subgroups
      • Why cluster?
        • Traditional outbreak detection methods treat the population as a monolithic entity
        • Real populations are heterogeneous
        • Different subpopulations are susceptible to different degrees
        • Clustering can boost the signal-to-noise ratio for detection

  19. COD Model – detection architecture
      • “Weak” host-based LDs
        • Periodically send their status to a GD
        • All use the same feature and rule: fire whenever the number of outgoing connections exceeds 4 in a 50-second interval
      • Centralized GD
        • Collects messages and determines whether the positive local detections corroborate each other
        • Periodically outputs a signal that represents its belief that an infection is present

  20. COD Model – data
      • Dataset X
        • Row: Xi corresponds to a single LD i
        • Column: X*j corresponds to the value of a feature function in a discrete time interval j
        • Each entry is a sum of alarms (some of which may be false positives)
      • Uses temporal stratified sampling
        • Each time interval has a fixed position, e.g., 12am-1am, 1am-2am, etc.
        • This accounts for the obvious diurnal behavior in the system
      • [Diagram: matrix X with rows indexed by LD i and columns by time j]
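One possible reading of the data layout on slide 20 is sketched below: alarms are binned into fixed time-of-day slots so that column j always covers the same hour. The `(ld_index, timestamp_seconds)` alarm format, the hourly slot length, and the pooling of the same slot across days are all assumptions, not details confirmed by the slides.

```python
def build_count_matrix(alarms, num_lds, interval_seconds=3600, slots_per_day=24):
    """Build X where X[i][j] counts alarms from LD i in time-of-day slot j.

    `alarms` is an iterable of (ld_index, timestamp_seconds) pairs
    (hypothetical format); timestamps are seconds since midnight of day 0.
    """
    X = [[0] * slots_per_day for _ in range(num_lds)]
    for ld, ts in alarms:
        slot = int(ts // interval_seconds) % slots_per_day  # fixed position in the day
        X[ld][slot] += 1
    return X
```

Because the slot index wraps around every day, an alarm at 00:00:30 on day 1 lands in the same column as one at 00:00:10 on day 0, which is how the diurnal pattern stays aligned.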

  21. COD Model – clustering
      • Naïve Bayes clustering model
      • The NB features are the positive local detection counts Xij arriving from machine i during time interval j
        • F() = sum(alarms) for each machine; within one time interval, an LD may fire several times
      • Assuming that different classes generate their detections randomly at different rates, and that the counts can take a fairly large range of values, Xij can be modeled as Poisson distributed

  22. COD Model – clustering (cont’d)
      • Some details:
        • How is the number of clusters m determined?
          • A greedy heuristic is used to find the optimal value
          • Nothing is said about λkjx
        • At the end of each interval, the feature values are updated and the model is re-learned
      • How to cluster?
        • The posterior on the cluster variable M defines the assignment of local detectors to clusters [formula not preserved in the transcript]
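The posterior formula on slide 22 is missing from the transcript. For a naïve Bayes mixture with Poisson features, Bayes' rule gives P(M = k | Xi) ∝ P(M = k) · Πj Poisson(Xij; λkj), and the sketch below computes that quantity. Fitting the parameters with EM is an assumption here (the slides only say the model is "re-learned"), and the function names and the `floor` parameter are illustrative.

```python
import math


def poisson_log_pmf(x, lam):
    """log P(X = x) for a Poisson distribution with rate lam > 0."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def posterior(row, priors, rates):
    """P(M = k | row) for one LD's count vector under a Poisson naive Bayes model."""
    logs = [
        math.log(pk) + sum(poisson_log_pmf(x, lam) for x, lam in zip(row, rk))
        for pk, rk in zip(priors, rates)
    ]
    top = max(logs)                        # log-sum-exp trick for stability
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]


def em_step(X, priors, rates, floor=1e-6):
    """One EM iteration (assumed learning procedure, not confirmed by the slides)."""
    resp = [posterior(row, priors, rates) for row in X]    # E-step: soft assignments
    n, d = len(X), len(X[0])
    new_priors, new_rates = [], []
    for k in range(len(priors)):                           # M-step: weighted averages
        weight = max(sum(r[k] for r in resp), floor)
        new_priors.append(weight / n)
        new_rates.append([
            max(sum(r[k] * row[j] for r, row in zip(resp, X)) / weight, floor)
            for j in range(d)
        ])
    norm = sum(new_priors)
    return [p / norm for p in new_priors], new_rates
```

Assigning each LD to the cluster with the largest posterior gives a hard clustering; the `floor` keeps rates and priors strictly positive so that the logarithms stay defined.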

  23. COD Model, example
      • [Figure: cluster assignment of each host ID over time (hr), including the burn-in period]
      • A typical example of how the hosts in the dataset are assigned to clusters
      • 5 clusters (colors) and a 1-day burn-in period
      • Clusters are rather stable, and cluster membership rarely changes
      • By the end, most hosts have been infected

  24. COD Model, demonstrating the daily pattern
      • [Figure: local detection count in a time interval, per host ID, over time (hr)]
      • Clustering → groups hosts according to the daily pattern of their local detection activity
      • 5 groups (two of which consist of a single host)
      • The pattern reflects each host’s applications and habits and can provide better estimates for detection

  25. 4-step Cluster Interpretation
      • Detect a “highly active” cluster (presumably infected):
        • Compute the “average detection rate” for each host (from the number of positive detections at host i)
        • Compute the “average (local) detection rate” for each cluster and identify the most active cluster
        • Perform a one-sided, unbalanced-design t-test with the null hypothesis that host detection rates in the most active cluster and in the remainder of the population are the same
        • Compare the outcome of the t-test to a historical histogram of values to determine whether the system is in an anomalous state
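The interpretation steps on slide 25 can be sketched as follows. The slides do not specify the exact test statistic, so a pooled-variance (Student) two-sample t statistic for unequal group sizes is assumed; the quantile rule for comparing against historical values is likewise an illustrative choice, and all function names are hypothetical.

```python
import math
from statistics import mean


def host_rate(positive_counts):
    """Average detection rate of one host: mean positive detections per interval."""
    return mean(positive_counts)


def most_active_cluster(rates_by_cluster):
    """Return the key of the cluster whose hosts have the highest mean rate."""
    return max(rates_by_cluster, key=lambda k: mean(rates_by_cluster[k]))


def sum_sq_dev(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def pooled_t(active, rest):
    """Pooled-variance t statistic (assumed form); needs len(active) + len(rest) >= 3.

    Large positive values mean the active cluster fires more than the rest,
    which is the direction the one-sided test looks at.
    """
    n1, n2 = len(active), len(rest)
    diff = mean(active) - mean(rest)
    pooled_var = (sum_sq_dev(active) + sum_sq_dev(rest)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    if se == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def is_anomalous(t_value, historical_t, quantile=0.99):
    """Flag t_value if it exceeds the chosen quantile of past t values (assumed rule)."""
    ranked = sorted(historical_t)
    cutoff = ranked[min(len(ranked) - 1, int(quantile * len(ranked)))]
    return t_value > cutoff
```

The pooled form still works when the most active cluster holds a single host, a case that slide 24 shows does occur, because a one-element group contributes zero to the pooled sum of squares.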

  26. Experimental Evaluation
      • Configuration details:
        • Normal traffic trace: 5 weeks of traces from 37 hosts
        • Worm traffic is injected for testing
        • LDs send a message every 10 seconds
      • Metrics of interest: FAR, TTD (FI)
        • False Alarm Rate, Time To Detect, Fraction of Infection
        • The aim is to hold the FAR to 1 per week
      • Results are compared with E-DBN (the baseline)
      • The traffic trace is recycled to simulate more hosts
      • The effects of the number of clusters, network size, and interval length are observed

  27. COD vs. E-DBN
      • AMOC (Activity Monitoring Operating Characteristic) curve: plots the expected time to detection (since the outbreak began) as a function of the false alarm rate
      • COD outperforms E-DBN (FI is reduced)
      • COD/adaptive performs better but is more costly to run

  28. Scaling with Network Size
      • Performance actually improves as the system scales
      • A larger number of data points gives the model more information and refines the clustering

  29. Effect of Interval Length
      • Interval length affects performance in two (opposite) ways:
        • More frequent re-clustering eliminates part of the “mid-interval” blind spot
        • Longer intervals yield features with less variance
      • The results show that:
        • Better performance is achieved with longer intervals (better smoothing over random fluctuations)
        • Less frequent invocation of the detection algorithm gives fewer false alarms
        • For slow worms, delayed detection is acceptable
      • [Figure axis label: standard deviation (in a day)]

  30. Conclusion
      • Uses a distributed scheme and collaborative inference to support slow-worm detection
      • Dividing the population into subgroups according to susceptibility increases the SNR and can boost detection performance
      • Subgroups are more homogeneous in their usage and application patterns
      • Does not require prior knowledge of the population

  31. My Comments
      • Could other features on a host also reveal diurnal patterns?
      • A host-based LD can acquire rich information about the attack, but building a host-based distributed detection system is much harder
      • Clustering is one way to deal with stealthy attacks
