Profiling Internet Backbone Traffic: Behavior Models and Applications. Kuai Xu, Zhi-Li Zhang (Univ. of Minnesota), Supratik Bhattacharyya (Sprint ATL). SIGCOMM 2005.
Outline
• Introduction
• Background: Entropy & Relative Uncertainty (RU)
• Significant Clusters Extraction
• Cluster Behavioral Classification
• Structural Models (Cluster Behavioral Interpretation)
• Applications
• Conclusion & Comments
Speaker: Li-Ming Chen
Introduction
• Why profile traffic?
  • Changes in Internet traffic dynamics
  • Increase in unwanted traffic
  • Wide diversity of end hosts, applications, and services
  • New services on traditional ports
  • Traditional services on non-standard ports
• Existing tools
  • Port-based, volume-based, content-based
• Need better techniques to discover behavior patterns (especially interesting behavior)
Problem Settings
• Problems
  • How to characterize communication patterns?
  • Are these patterns meaningful?
  • How to automatically discover such patterns?
• Challenges
  • Vast amount of traffic data
  • Large number of end hosts
  • Diverse applications
• A more specific problem setting
  • Use one-way traffic data from a single backbone link
  • Use only packet header information
  • No assumption of normal (or anomalous) behavior
Objectives
• Develop a general methodology for profiling Internet backbone traffic
  • Automatically discover significant behaviors of interest from massive traffic data
  • Automatically interpret these behaviors
• Easy to understand
• Quick to identify anomalous events of significance
• Help network operators secure and manage networks
Methodology
[Figure: processing pipeline: raw packets → flows → clusters → behavior classes (BC1, BC2)]
• Data pre-processing
  • Aggregate packets into 5-tuple flows
  • Group flows into clusters
• Extract significant clusters
  • Data reduction step using entropy
• Classify cluster behavior
  • Based on similarity/dissimilarity of communication patterns
  • Clusters classified into behavior classes (BCs)
• Interpret behavior classes
  • Structural modeling for dominant activities, e.g.:
    • dstPrt(.) → (srcPrt(*), dstIP(*)) [scanning attacks]
    • srcPrt(.) → dstIP(…) → dstPrt(*) [server replying to a few hosts]
Datasets (& data pre-processing)
• Collect packet header traces from multiple backbone links in a large ISP network (Sprint)
• Aggregate packet header traces into 5-tuple flows
• Group flows associated with the same end hosts/ports into clusters
  • 4-feature space: srcIP, dstIP, srcPrt, dstPrt
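
The two pre-processing steps above can be sketched as follows. This is a minimal illustration, not the authors' tooling: the packet dictionaries, the `proto` field name, and the helper names `aggregate_flows` and `group_into_clusters` are assumptions made for the example.

```python
from collections import defaultdict

# The four cluster dimensions named on the slide.
DIMENSIONS = ("srcIP", "dstIP", "srcPrt", "dstPrt")


def aggregate_flows(packets):
    """Collapse packet headers into 5-tuple flows, counting packets per flow."""
    flows = defaultdict(int)
    for pkt in packets:
        # "proto" is an assumed field name for the fifth element of the 5-tuple.
        key = (pkt["srcIP"], pkt["dstIP"], pkt["srcPrt"], pkt["dstPrt"], pkt["proto"])
        flows[key] += 1
    return flows


def group_into_clusters(flows, dimension):
    """Group flows that share the same value in one dimension (the cluster key)."""
    idx = DIMENSIONS.index(dimension)
    clusters = defaultdict(list)
    for flow in flows:
        clusters[flow[idx]].append(flow)
    return clusters


# Hypothetical packet headers, for illustration only.
packets = [
    {"srcIP": "10.0.0.1", "dstIP": "10.0.0.9", "srcPrt": 80, "dstPrt": 1025, "proto": 6},
    {"srcIP": "10.0.0.1", "dstIP": "10.0.0.9", "srcPrt": 80, "dstPrt": 1025, "proto": 6},
    {"srcIP": "10.0.0.1", "dstIP": "10.0.0.8", "srcPrt": 80, "dstPrt": 2000, "proto": 6},
    {"srcIP": "10.0.0.2", "dstIP": "10.0.0.9", "srcPrt": 4000, "dstPrt": 22, "proto": 6},
]
flows = aggregate_flows(packets)
src_clusters = group_into_clusters(flows, "srcIP")
```

Calling `group_into_clusters` once per dimension yields the four families of clusters (srcIP, dstIP, srcPrt, dstPrt) that the rest of the talk analyzes.
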
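Entropy
• Assume a random variable X has $N_X$ discrete values
• Suppose we sample X randomly m times
• An empirical probability distribution on X: $p(x_i) = m_i / m$, where $m_i$ is the number of times the value $x_i$ is observed
• The (empirical) entropy of X: $H(X) = -\sum_{x_i} p(x_i) \log p(x_i)$
• Max. entropy of X: $H_{max}(X) = \log \min\{N_X, m\}$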
Relative Uncertainty (RU)
• RU of X: $RU(X) = H(X) / H_{max}(X) = H(X) / \log \min\{N_X, m\}$
• Provides an index of variety or uniformity regardless of the support or sample size
• RU(X) → 0: X is deterministic
  • Most of the observations of X are of the same kind
• RU(X) → 1: X is randomly distributed
  • All observations of X are different or unique
  • Nearly indistinguishable when m < N_X
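
To make the two limiting cases concrete, here is a small sketch that computes the empirical entropy and RU of a list of observed values. The function names and the sample data are illustrative assumptions; the formulas are the standard empirical entropy and its normalization by log min(N_X, m).

```python
import math
from collections import Counter


def empirical_entropy(samples):
    """H(X) = -sum p(x) log2 p(x), with p(x) estimated from observed counts."""
    m = len(samples)
    counts = Counter(samples)
    return sum(-(c / m) * math.log2(c / m) for c in counts.values())


def relative_uncertainty(samples, support_size):
    """RU(X) = H(X) / log2(min(N_X, m)); defined as 0 when at most one outcome is possible."""
    bound = min(support_size, len(samples))
    if bound <= 1:
        return 0.0
    return empirical_entropy(samples) / math.log2(bound)


PORT_SPACE = 65536  # number of possible 16-bit port values

# Every observation is the same port: RU is 0 (deterministic).
ru_same = relative_uncertainty([80] * 100, PORT_SPACE)

# Every observation is a different port: RU is 1 (maximally varied).
ru_unique = relative_uncertainty(list(range(1024, 1124)), PORT_SPACE)
```

Because the denominator uses min(N_X, m), a sample of 100 distinct ports already reaches RU = 1 even though the port space has 65536 values.
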
Extract Significant Clusters
• Focus on significant clusters
  • Sufficiently large number of flows
  • Represent behavior of significant interest
• (One definition) using a fixed threshold
  • A cluster is significant if it contains at least x% of the flows
  • How to choose “x” for all links?
• (Authors’ definition) adaptive thresholding using RU
  • A cluster is significant if it “stands out” from the rest
  • Use RU to quantify whether the rest looks random!
Extract Significant Clusters (an example)
• Let S be a subset of A
• S contains the most significant values of A if S is the smallest subset of A such that:
  • the probability of any value in S is larger than those of the remaining values
  • the (conditional) probability distribution on the set of remaining values, R := A – S, is close to uniform (RU threshold β = 0.9)
• An efficient approximation algorithm is presented
An Approximation Algorithm (for significant cluster extraction)
• Feature values of A are ordered by their probability: $P_A(a_1) > P_A(a_2) > \dots$
• Uses a threshold α0 (e.g., α0 = 2%)
• Ends at the largest “cut-off” threshold for which the remaining set R is close to uniformly distributed
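
The adaptive thresholding idea can be sketched as below. This is an illustration under stated assumptions, not the authors' exact algorithm: the slide gives only α0 and β, so the rule for lowering the threshold (halving each round) and the names `ru_of_counts` and `extract_significant` are assumptions.

```python
import math


def ru_of_counts(counts):
    """RU of the conditional distribution over the values with the given counts."""
    n, total = len(counts), sum(counts)
    if n <= 1:
        return 1.0  # degenerate case: nothing left to distinguish, so stop
    h = sum(-(c / total) * math.log2(c / total) for c in counts)
    return h / math.log2(min(n, total))


def extract_significant(value_counts, alpha0=0.02, beta=0.9):
    """Move values with probability >= alpha into S until the rest looks uniform."""
    total = sum(value_counts.values())
    remaining = dict(value_counts)
    significant = {}
    alpha = cutoff = alpha0
    while remaining and ru_of_counts(list(remaining.values())) <= beta:
        cutoff = alpha
        for value in [v for v, c in remaining.items() if c / total >= alpha]:
            significant[value] = remaining.pop(value)
        alpha /= 2  # assumed schedule; the slide does not state how alpha shrinks
    return significant, cutoff


# Two heavy values plus 100 equally rare ones (hypothetical flow counts).
counts = {"a": 500, "b": 300}
counts.update({f"r{i}": 2 for i in range(100)})
significant, cutoff = extract_significant(counts)
```

In this example the first pass at α0 = 2% removes the two heavy values, after which the remaining 100 values are uniform and the loop stops, so the cut-off stays at α0. On a link with a flatter distribution the loop would keep lowering the threshold, which is why the cut-off differs across links and time slots.
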
Clusters (Significant vs. Total)
• Clusters are extracted in every 5-minute time slot
[Figure: four panels, one per dimension: srcIP, dstIP, srcPrt, dstPrt]
Clusters (Significant vs. Total): Cut-off Threshold in the Approximation Algorithm
[Figure: four panels, one per dimension: srcIP, dstIP, srcPrt, dstPrt]
Summary: Significant Clusters
• (Observation) Behavior changes:
  • While the total number of distinct values (clusters) may not fluctuate much, the number of significant feature values (clusters) may vary dramatically
  • This also results in different cut-off thresholds being used
• Dramatic changes in the number of significant clusters also signify major changes in the underlying traffic patterns
Behavior characterization
• The flows in each cluster share the same cluster key (e.g., srcIP)
• The other 3 “free” dimensions can take any possible value (and exhibit some behavior)
• RU vector $[RU_X, RU_Y, RU_Z]$ (3 free dimensions)
  • e.g., the RU vector of a srcIP cluster is $[RU_{srcPrt}, RU_{dstPrt}, RU_{dstIP}]$
[Figure: one-hour data; panels labeled srcPrt, dstPrt, srcIP; RU ranges marked low, medium, high; multi-modal]
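
As an illustration of the RU vector, the sketch below computes [RU_srcPrt, RU_dstPrt, RU_dstIP] for one srcIP cluster. The flow dictionaries, the IPv4 address-space size, and the helper names are assumptions made for the example.

```python
import math
from collections import Counter

# Support sizes N_X: 16-bit ports; a 32-bit address space is assumed (IPv4).
SUPPORT = {"srcPrt": 2**16, "dstPrt": 2**16, "dstIP": 2**32}


def ru(values, support_size):
    """Relative uncertainty H(X) / log2(min(N_X, m)) of the observed values."""
    m = len(values)
    bound = min(support_size, m)
    if bound <= 1:
        return 0.0
    counts = Counter(values)
    h = sum(-(c / m) * math.log2(c / m) for c in counts.values())
    return h / math.log2(bound)


def ru_vector(cluster_flows, free_dims=("srcPrt", "dstPrt", "dstIP")):
    """RU of each free dimension over the flows of a single cluster."""
    return [ru([f[d] for f in cluster_flows], SUPPORT[d]) for d in free_dims]


# Hypothetical srcIP cluster: one source port, every flow to a new host and port.
web_server = [
    {"srcPrt": 80, "dstPrt": 1024 + i, "dstIP": f"192.0.2.{i}"} for i in range(200)
]
vector = ru_vector(web_server)
```

Here the source port is fixed while destination ports and hosts are all distinct, so the vector is approximately [0, 1, 1].
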
Behavior classifications
• Group clusters with similar behavior (similar RU vectors)
• $[L(RU_X), L(RU_Y), L(RU_Z)] \in \{0, 1, 2\}^3$, where L(·) labels each RU value as 0 (low), 1 (medium), or 2 (high)
• 27 possible BCs
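
A minimal sketch of the labeling step follows. The cut points between low, medium, and high are not given on the slide, so the parameter `eps` (and its value 0.2) is an assumption. The base-3 numbering of BCs is also an assumption, chosen because it agrees with the "BC15 [1,2,0]" example that appears later in the deck.

```python
def label(ru_value, eps=0.2):
    """Map an RU value to 0 (low), 1 (medium), or 2 (high); eps is an assumed cut point."""
    if ru_value <= eps:
        return 0
    if ru_value >= 1 - eps:
        return 2
    return 1


def behavior_class(ru_vector, eps=0.2):
    """Return (labels, BC id), reading the three labels as a base-3 number."""
    lx, ly, lz = (label(r, eps) for r in ru_vector)
    return [lx, ly, lz], 9 * lx + 3 * ly + lz


labels, bc_id = behavior_class([0.5, 0.95, 0.05])
```

Three labels with three levels each give 3^3 = 27 classes, numbered 0 to 26 under this encoding.
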
27 Behavior Classes
• What is the difference between BCs?
  • Are there common vs. rare BCs?
  • Do BCs have many or few clusters?
  • Is membership in BCs stable?
• Temporal properties of BCs (the metrics):
  • Popularity: number of times we observe a particular BC appearing
  • (Avg.) Size: average number of clusters belonging to a given BC
  • (Membership) Volatility: does a BC contain the same clusters over time?
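
The three metrics can be approximated as in the sketch below. The slide does not give exact formulas, so this is one reading: popularity is the number of time slots in which a BC has at least one cluster, size is averaged over those slots, and the count of unique member clusters is used as a rough stand-in for membership volatility. The input layout and all names are assumptions.

```python
from collections import defaultdict


def bc_temporal_metrics(slots):
    """slots: one dict per time slot, mapping cluster key -> BC id."""
    appearances = defaultdict(int)   # slots in which the BC was observed
    sizes = defaultdict(list)        # clusters per slot, for observed slots
    members = defaultdict(set)       # all clusters ever seen in the BC
    for slot in slots:
        per_bc = defaultdict(set)
        for cluster, bc in slot.items():
            per_bc[bc].add(cluster)
        for bc, clusters in per_bc.items():
            appearances[bc] += 1
            sizes[bc].append(len(clusters))
            members[bc] |= clusters
    return {
        bc: {
            "popularity": appearances[bc],
            "avg_size": sum(sizes[bc]) / len(sizes[bc]),
            "unique_clusters": len(members[bc]),
        }
        for bc in appearances
    }


# Hypothetical data: BC8 keeps the same members, BC2 sees new clusters every slot.
slots = [
    {"A": 8, "B": 8, "s1": 2},
    {"A": 8, "B": 8, "s2": 2},
    {"A": 8, "s3": 2, "s4": 2},
]
metrics = bc_temporal_metrics(slots)
```

A BC whose unique-cluster count is far above its average size (BC2 in this toy data) has members that rarely repeat, which is the intuition behind high volatility.
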
[Figures (24-hour): num. of unique clusters per BC, with BC2 and BC20 marked; high volatility (BC2, BC20)]
How about Individual Clusters?
• Behavior characteristics of individual clusters over time (dynamic or stable?)
  • The relation between the frequency of a cluster and the BCs it appears in
  • The behavior stability of a cluster if it appears multiple times
  • Whether a cluster tends to re-appear in the same BC or in different BCs
Behavior of Individual Clusters
• Heavy-tailed distribution (log-log scale; cluster IDs ordered by frequency)
• The most frequent clusters all fall into the five popular but non-volatile BCs (BC6, BC7, BC8, BC18, BC19)
• The majority of the least frequent clusters belong to BC2 and BC20
• Few behavior transitions, and most behavior transitions are between akin BCs
[Figure annotations: 89.6%, 90.3%]
Summary: Behavior Classifications
• Behavior classes classify similar clusters based on communication patterns
• Behavior classes have distinct temporal properties
  • Popularity, avg. size, and membership volatility
• Clusters in general evince consistent behavior over time
• How can we interpret observed behavior?
Dominant State Analysis
• Each cluster has hundreds or thousands of flows
  • An exhaustive approach is not practical
  • Need a compact summary
• Dominant state analysis
  • Explore dominant activities of the clusters
• Observation:
  • Clusters within the same BC have similar structural models
  • They could have different dominant states (or activities)
How? (Structural Modeling)
• An example: a web server from the srcIP perspective
• Re-order the 3 free dimensions based on their RU values (i.e., A < B < C)
  • $RU_{srcPrt} < RU_{dstIP} < RU_{dstPrt}$
• Find substantial values in A, B, and C hierarchically (conditionally)
  • srcPrt 80 has 95%
  • srcPrt 80, dstIP 1 has 50%
  • srcPrt 80, dstIP 1, dstPrt 1025 has x%...
[Figure: tree for the example. Root: cluster; level 1: srcPrt 80 (95%), srcPrt 443 (5%); level 2: dstIP 1 (50%), dstIP … (<1%); level 3: dstPrt 1025, dstPrt … (<1%, …%)]
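
The hierarchical search for substantial values can be sketched as follows. It is an illustration only: the slide does not say what share counts as "substantial", so the threshold `delta` (0.2 here) is an assumption, as are the flow layout, the IPv4 support size, and the function names.

```python
import math
from collections import Counter

# Support sizes N_X: 16-bit ports; a 32-bit address space is assumed (IPv4).
SUPPORT = {"srcPrt": 2**16, "dstPrt": 2**16, "dstIP": 2**32}


def ru(values, support_size):
    """Relative uncertainty H(X) / log2(min(N_X, m)) of the observed values."""
    m = len(values)
    bound = min(support_size, m)
    if bound <= 1:
        return 0.0
    h = sum(-(c / m) * math.log2(c / m) for c in Counter(values).values())
    return h / math.log2(bound)


def order_by_ru(flows, dims):
    """Re-order the free dimensions from lowest to highest RU."""
    return sorted(dims, key=lambda d: ru([f[d] for f in flows], SUPPORT[d]))


def dominant_states(flows, ordered_dims, delta=0.2):
    """Chains of (dimension, value, conditional share) with share >= delta."""
    if not ordered_dims:
        return [[]]
    dim, rest = ordered_dims[0], ordered_dims[1:]
    counts = Counter(f[dim] for f in flows)
    chains = []
    for value, count in counts.items():
        share = count / len(flows)
        if share >= delta:
            subset = [f for f in flows if f[dim] == value]
            for tail in dominant_states(subset, rest, delta):
                chains.append([(dim, value, share)] + tail)
    return chains or [[]]  # no substantial value: the chain ends here


# Hypothetical web server cluster: mostly port 80, half of the flows to one host.
flows = [
    {
        "srcPrt": 80 if i < 95 else 443,
        "dstIP": "10.0.0.1" if i < 50 else f"10.1.0.{i}",
        "dstPrt": 1025 + i,
    }
    for i in range(100)
]
ordered = order_by_ru(flows, ["srcPrt", "dstPrt", "dstIP"])
chains = dominant_states(flows, ordered)
```

For this toy cluster the only chain is srcPrt 80 followed by dstIP 10.0.0.1; the chain stops before dstPrt because no destination port reaches the threshold, which corresponds to the "any value" case in the structural model.
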
Dominant State for srcIP
• Notation:
  • ‧ : a specific value
  • … : multiple values
  • * : any value (a large number of targets)
Canonical behavior profiles
• A large majority of the (significant) clusters fall into three canonical profiles
[Table: canonical profiles with RU-label patterns [0,2,x], [2,0,x], and [x,0,2]; columns include variability and avg. flow sizes per cluster]
Deviant or Rare Behaviors
• Building a comprehensive traffic profile can also lead to the identification of possible deviant behaviors
• Clusters in rare behavior classes
  • e.g., dstPrt BC15 [1,2,0] → DDoS
• Behavior changes for clusters
  • e.g., srcIP (a Yahoo web server) BC8 → BC6 → BC8
• Unusual profiles for popular service ports
  • Clusters associated with common service ports should follow their canonical profiles
Conclusion
• Develop a systematic methodology to automatically discover and interpret communication patterns
  • Use information-theoretic techniques to build behavior models of end hosts and applications
  • Apply dominant state analysis to explain traffic behavior
• Discover typical behavior profiles as well as rare and deviant behaviors
Comments
• Observe the behavior from different points of view
  • Flow (source - destination)
  • Connection (initiator/requester - replier/responder)
• Connection-level statistics
• Lack of P2P application analysis
• Hard to choose the observation period for generating and analyzing clusters
  • Tradeoff between timeliness and data size
• Correlating behavior profiles across multiple backbone links