Profiling Internet Backbone Traffic: Behavior Models and Applications

Kuai Xu, Zhi-Li Zhang (Univ. of Minnesota), Supratik Bhattacharyya (Sprint ATL). SIGCOMM 2005.

Presentation Transcript


  1. Profiling Internet Backbone Traffic: Behavior Models and Applications. Kuai Xu, Zhi-Li Zhang (Univ. of Minnesota), Supratik Bhattacharyya (Sprint ATL). SIGCOMM 2005

  2. Outline • Introduction • Background: Entropy & Relative Uncertainty (RU) • Significant Clusters Extraction • Cluster Behavioral Classification • Structural Models (Cluster Behavioral Interpretation) • Applications • Conclusion & Comments Speaker: Li-Ming Chen

  3. Introduction • Why profile traffic? • Changes in Internet traffic dynamics • Increase in unwanted traffic • Wide diversity of end-hosts, applications and services • New services on traditional ports • Traditional services on non-standard ports • Existing tools • Port-based, volume-based, content-based • Need better techniques to discover behavior patterns (especially interesting behavior)

  4. Problem Settings • Problems • How to characterize communication patterns? • Are these patterns meaningful? • How to automatically discover such patterns? • Challenges • Vast amount of traffic data • Large number of end hosts • Diverse applications • A more specific problem setting • Use one-way traffic data from a single backbone link • Use only packet header information • No assumption of normal (or anomalous) behavior

  5. Objectives • Develop a general methodology for profiling Internet backbone traffic • Automatically discover significant behaviors of interest from massive traffic data • Automatically interpret these behaviors • Easy to understand • Quick to identify anomalous events of significance • Help network operators secure and manage networks

  6. Methodology • Data pre-processing • Aggregate packets into 5-tuple flows • Group flows into clusters • Extract significant clusters • Data reduction step using entropy • Classify cluster behavior • Based on similarity/dissimilarity of communication patterns • Clusters classified into behavior classes (BCs) • Interpret behavior classes • Structural modeling for dominant activities • e.g., dstPrt(.) → (srcPrt(*), dstIP(*)) [scanning attacks] • e.g., srcPrt(.) → dstIP(…) → dstPrt(*) [server replying to a few hosts] [Diagram: raw packets → flows → clusters → behavior classes (BC1, BC2)]

  7. Datasets (& data pre-processing) • Collect packet header traces from multiple backbone links in a large ISP network (Sprint) • Aggregate packet header traces into 5-tuple flows • Group flows associated with the same end hosts/ports into clusters • 4-feature space: srcIP, dstIP, srcPrt, dstPrt
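The pre-processing described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Packet` record, its field names, and the per-flow counters are assumptions made for the example, and the slide does not say which statistics are kept per flow.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet-header record (field names are assumptions)."""
    src_ip: str
    dst_ip: str
    src_prt: int
    dst_prt: int
    proto: int
    size: int


# The 4-feature space named on the slide, in 5-tuple order.
DIMENSIONS = ("srcIP", "dstIP", "srcPrt", "dstPrt")


def aggregate_flows(packets):
    """Aggregate packets into flows keyed by the 5-tuple."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_prt, p.dst_prt, p.proto)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p.size
    return dict(flows)


def group_clusters(flows, dimension):
    """Group flow keys that share the same value in one dimension."""
    idx = DIMENSIONS.index(dimension)
    clusters = defaultdict(list)
    for key in flows:
        clusters[key[idx]].append(key)
    return dict(clusters)
```

Calling `group_clusters(flows, "srcIP")` yields one cluster per source address; repeating the call for the other three dimensions gives the four cluster views used in the rest of the talk.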

  8. Outline • Introduction • Background: Entropy & Relative Uncertainty (RU) • Significant Clusters Extraction • Cluster Behavioral Classification • Structural Models (Cluster Behavioral Interpretation) • Applications • Conclusion & Comments

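  9. Entropy • Assume a random variable X has $N_X$ discrete values • Suppose we randomly sample X m times • An empirical probability distribution on X: $p(x_i) = m_i / m$, where $m_i$ is the number of times $x_i$ is observed • The (empirical) entropy of X: $H(X) = -\sum_{x_i} p(x_i) \log p(x_i)$ • Max. entropy of X: $H_{max}(X) = \log \min\{N_X, m\}$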

  10. Relative Uncertainty (RU) • RU of X: $RU(X) = H(X) / H_{max}(X) = H(X) / \log \min\{N_X, m\}$ • Provides an index of variety or uniformity regardless of the support or sample size • RU(X) → 0: X is deterministic • Most of the observations of X are of the same kind • RU(X) → 1: X is randomly distributed • All observations of X are different or unique • Nearly indistinguishable when m < $N_X$
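A small sketch of how empirical entropy and RU could be computed from a list of observed feature values. It assumes the normalization $H_{max}(X) = \log \min\{N_X, m\}$, which is consistent with the slide's remark about the case m < $N_X$; the function names and the `support_size` parameter are illustrative.

```python
import math
from collections import Counter


def entropy(samples):
    """Empirical entropy (in bits) of a list of observed values."""
    m = len(samples)
    if m == 0:
        raise ValueError("need at least one observation")
    counts = Counter(samples)
    return sum(-(c / m) * math.log2(c / m) for c in counts.values())


def relative_uncertainty(samples, support_size):
    """RU(X) = H(X) / log2(min(N_X, m)), a value between 0 and 1.

    support_size is N_X, the number of values X could take
    (e.g., 65536 for a port number).
    """
    m = len(samples)
    h_max = math.log2(min(support_size, m))
    if h_max == 0:
        # A single observation (or a single possible value) has no variety.
        return 0.0
    return entropy(samples) / h_max
```

For example, ten flows that all use the same source port give an RU of 0, while eight flows with eight distinct ports give an RU of 1, even though only a tiny part of the port space was observed.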

  11. Outline • Introduction • Background: Entropy & Relative Uncertainty (RU) • Significant Clusters Extraction • Cluster Behavioral Classification • Structural Models (Cluster Behavioral Interpretation) • Applications • Conclusion & Comments

  12. Extract Significant Clusters • Focus on significant clusters • Sufficiently large number of flows • Represent behavior of significant interest • (One definition) using a fixed threshold • A cluster is significant if it contains at least x% of flows • How to choose “x” for all links? • (Authors’ definition) adaptive thresholding using RU • A cluster is significant if it “stands out” from the rest • Use RU to quantify whether the rest looks random

  13. Extract Significant Clusters (an example) • Let S be a subset of A • S contains the most significant values of A if S is the smallest subset of A such that: • the prob. of any value in S is larger than those of the remaining values • the (conditional) prob. distribution on the set of remaining values, R := A - S, is close to being uniformly distributed (β = 0.9) • An efficient approximation algorithm is presented

  14. An Approximation Algorithm (for significant cluster extraction) • Start from an initial threshold α0 (e.g., α0 = 2%) • Feature values of A are ordered by probability, $P_A(a_1) > P_A(a_2) > \dots$ • End at the largest “cut-off” threshold at which the remaining set R is close to uniformly distributed
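The adaptive thresholding idea can be sketched as below. This is a hedged illustration rather than the algorithm as published: the halving schedule `alpha0 * 2**(-k)`, the normalization of RU by the number of remaining distinct values, and all function names are assumptions; only α0 = 2% and β = 0.9 come from the slides.

```python
import math
from collections import Counter


def ru_of_counts(counts):
    """RU of the conditional distribution over the remaining values."""
    total = sum(counts)
    n = len(counts)
    if n <= 1 or total == 0:
        return 1.0  # nothing left to tell apart; treat as uniform
    h = sum(-(c / total) * math.log2(c / total) for c in counts)
    return h / math.log2(min(n, total))


def extract_significant(values, alpha0=0.02, beta=0.9):
    """Return (significant values, cut-off threshold used).

    Lowers the threshold step by step until the values that remain
    look close to uniform (RU above beta).
    """
    m = len(values)
    remaining = dict(Counter(values))
    prob = {a: c / m for a, c in remaining.items()}  # P_A
    significant = set()
    alpha = alpha0
    k = 0
    theta = ru_of_counts(list(remaining.values()))
    while remaining and theta <= beta:
        alpha = alpha0 * 2 ** (-k)  # assumed schedule: halve each round
        k += 1
        for a in [a for a in remaining if prob[a] >= alpha]:
            significant.add(a)
            del remaining[a]
        theta = ru_of_counts(list(remaining.values()))
    return significant, alpha
```

With one value accounting for half of 1000 observations and 500 values seen once each, the first round at 2% already pulls out the heavy value and leaves a uniform remainder, so the search stops there.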

  15. Clusters (Significant vs. Total) • Clusters are extracted in every 5-minute time slot [Figure: four panels, one each for the srcIP, dstIP, srcPrt and dstPrt dimensions]

  16. Clusters (Significant vs. Total): cut-off threshold in the approximation algorithm [Figure: four panels, one each for the srcIP, dstIP, srcPrt and dstPrt dimensions]

  17. Summary: Significant Clusters • (Observation) Behavior changes: • While the total number of distinct values (clusters) may not fluctuate very much, the number of significant feature values (clusters) may vary dramatically • This also results in different cut-off thresholds being used • Dramatic changes in the number of significant clusters also signify major changes in the underlying traffic patterns

  18. Outline • Introduction • Background: Entropy & Relative Uncertainty (RU) • Significant Clusters Extraction • Cluster Behavioral Classification • Structural Models (Cluster Behavioral Interpretation) • Applications • Conclusion & Comments

  19. Behavior characterization • The flows in each cluster share the same cluster key (e.g., srcIP) • The other 3 “free” dimensions can take any possible value (and so exhibit some behavior) • RU vector $[RU_X, RU_Y, RU_Z]$ over the 3 free dimensions • e.g., the RU vector of a srcIP cluster is $[RU_{srcPrt}, RU_{dstPrt}, RU_{dstIP}]$ [Figure: one-hour plot; labels low, medium, high, srcPrt, dstPrt, srcIP; annotated “multi-modal”]

  20. Behavior classifications • Group clusters with similar behaviors (RU vectors) • $[L(RU_X), L(RU_Y), L(RU_Z)] \in \{0, 1, 2\}^3$, giving $3^3 = 27$ possible BCs
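A minimal sketch of the labeling step, mapping each RU value to a level in {0, 1, 2} and the resulting triple to a BC number. The cut-off `epsilon = 0.2` is an assumption for illustration (the slides do not state it). The base-3 numbering is inferred from the deck's own example "BC15 [1,2,0]", since 1·9 + 2·3 + 0 = 15.

```python
def ru_level(ru, epsilon=0.2):
    """Map an RU value to 0 (low), 1 (medium) or 2 (high).

    epsilon is an assumed cut-off, not a value given on the slides.
    """
    if ru <= epsilon:
        return 0
    if ru >= 1 - epsilon:
        return 2
    return 1


def behavior_class(ru_vector, epsilon=0.2):
    """Return the BC number (0..26) for a 3-element RU vector."""
    lx, ly, lz = (ru_level(r, epsilon) for r in ru_vector)
    return 9 * lx + 3 * ly + lz  # read [lx, ly, lz] as a base-3 number
```

Under this numbering, BC2 is [0,0,2] and BC20 is [2,0,2].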

  21. 27 Behavior Classes • What is the difference between BCs? • Are there common vs. rare BCs? • Do BCs have many or few clusters? • Is membership in BCs stable? • Temporal properties of BCs (the metrics): • Popularity: number of times we observe a particular BC appearing • (Avg.) Size: avg. number of clusters belonging to a given BC • (Membership) Volatility: does a BC contain the same clusters over time?
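One possible way to turn the three temporal metrics into numbers is sketched below. The exact formulas are assumptions: popularity is taken as the fraction of time slots in which the BC appears, average size as the mean number of clusters per appearance, and volatility as the ratio of unique clusters to total cluster appearances (close to 1 when membership keeps changing). The input layout is also hypothetical.

```python
def bc_temporal_metrics(slots):
    """Compute per-BC popularity, average size and membership volatility.

    slots: list with one dict per time slot, mapping
           BC number -> set of cluster keys observed in that BC.
    """
    total_slots = len(slots)
    stats = {}
    for slot in slots:
        for bc, clusters in slot.items():
            if not clusters:
                continue
            s = stats.setdefault(
                bc, {"observed": 0, "appearances": 0, "unique": set()}
            )
            s["observed"] += 1                 # slots in which the BC shows up
            s["appearances"] += len(clusters)  # cluster appearances overall
            s["unique"] |= set(clusters)       # distinct clusters ever seen
    return {
        bc: {
            "popularity": s["observed"] / total_slots,
            "avg_size": s["appearances"] / s["observed"],
            "volatility": len(s["unique"]) / s["appearances"],
        }
        for bc, s in stats.items()
    }
```

A BC whose clusters are new in every slot scores a volatility of 1.0 under this definition, while a BC that keeps seeing the same few clusters scores close to 0.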

  22. [Figure: number of unique clusters per BC over a 24-hour period; BC2 and BC20 show high volatility]

  23. How about Individual Clusters? • Behavior characteristics of individual clusters over time (dynamic or stable?) • The relation between the frequency of a cluster and the BCs it appears in • The behavior stability of a cluster if it appears multiple times • Does a cluster tend to re-appear in the same BC or in different BCs?

  24. Behavior of Individual Clusters • Cluster frequency follows a heavy-tailed distribution (log-log scale; cluster IDs ordered by frequency) • The most frequent clusters all fall into the five popular but non-volatile BCs (BC6, BC7, BC8, BC18, BC19) • The majority of the least frequent clusters belong to BC2 and BC20 • A2: few behavior transitions, and most of the behavior transitions are between akin BCs [Figure annotations: 89.6%, 90.3%]

  25. Summary: Behavior Classifications • Behavior classes group similar clusters based on communication patterns • Behavior classes have distinct temporal properties • Popularity, avg. size and membership volatility • Clusters in general evince consistent behavior over time • How can we interpret observed behavior?

  26. Outline • Introduction • Background: Entropy & Relative Uncertainty (RU) • Significant Clusters Extraction • Cluster Behavioral Classification • Structural Models (Cluster Behavioral Interpretation) • Applications • Conclusion & Comments

  27. Dominant State Analysis • Each cluster has hundreds or thousands of flows • An exhaustive approach is not practical • Need a compact summary • Dominant State Analysis • Explore dominant activities of the clusters • Observation: • Clusters within the same BC have similar structural models • They could have different dominant states (or activities)

  28. How? (Structural Modeling) • An example: a web server from the srcIP perspective • Re-order the 3 free dimensions based on their RU values (i.e., A < B < C) • $RU_{srcPrt} < RU_{dstIP} < RU_{dstPrt}$ • Find substantial values in A, B and C hierarchically (conditionally) • srcPrt 80 has 95% • srcPrt 80, dstIP 1 has 50% • srcPrt 80, dstIP 1, dstPrt 1025 has x%... [Tree diagram: cluster → srcPrt 80 (95%), srcPrt 443 (5%); under srcPrt 80 → dstIP 1 (50%), other dstIPs (<1% each); under dstIP 1 → dstPrt 1025 (<1%), other dstPrts]
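The hierarchical search for substantial values can be sketched as a short recursion. This is an illustration under assumptions: the threshold `delta = 0.2` for what counts as "substantial", the dict-based flow representation, and the choice to report each fraction conditionally (relative to the flows matching the parent state) are not taken from the slides.

```python
from collections import Counter


def dominant_states(flows, dims_by_ru, delta=0.2):
    """Find substantial values dimension by dimension.

    flows:      list of dicts, one per flow in the cluster
    dims_by_ru: the free dimensions ordered by ascending RU
    delta:      assumed cut-off for a "substantial" conditional share
    Returns a list of (state, conditional_fraction) pairs, parents first.
    """
    results = []

    def expand(subset, depth, state):
        if depth == len(dims_by_ru):
            return
        dim = dims_by_ru[depth]
        counts = Counter(f[dim] for f in subset)
        for value, c in counts.items():
            share = c / len(subset)
            if share >= delta:
                new_state = {**state, dim: value}
                results.append((new_state, share))
                # Condition on this value and look at the next dimension.
                matching = [f for f in subset if f[dim] == value]
                expand(matching, depth + 1, new_state)

    expand(flows, 0, {})
    return results
```

For a web-server-like cluster, the first entry would fix the source port (srcPrt 80 in the slide's example), and deeper entries appear only where a destination address or port still holds a substantial share of the matching flows.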

  29. Dominant State for srcIP • Notation: (.) a specific value • (…) multiple values • (*) any value (a large number of targets)

  30. Outline • Introduction • Background: Entropy & Relative Uncertainty (RU) • Significant Clusters Extraction • Cluster Behavioral Classification • Structural Models (Cluster Behavioral Interpretation) • Applications • Conclusion & Comments

  31. Canonical behavior profiles • A large majority of the (significant) clusters fall into three canonical profiles [Table: columns include variability and avg. flow size per cluster; RU-vector patterns shown are [0,2,x], [2,0,x] and [x,0,2]]

  32. Deviant or Rare Behaviors • Building a comprehensive traffic profile can also lead to the identification of possible deviant behaviors • Clusters in rare behavior classes • e.g., dstPrt BC15 [1,2,0] -> DDoS • Behavior changes for clusters • e.g., srcIP (a Yahoo web server) BC8 -> BC6 -> BC8 • Unusual profiles for popular service ports • Clusters associated with common service ports should follow their canonical profiles

  33. Outline • Introduction • Background: Entropy & Relative Uncertainty (RU) • Significant Clusters Extraction • Cluster Behavioral Classification • Structural Models (Cluster Behavioral Interpretation) • Applications • Conclusion & Comments

  34. Conclusion • Develop a systematic methodology to automatically discover and interpret communication patterns • Use information-theoretical techniques to build behavior models of end hosts and applications • Apply dominant state analysis to explain traffic behavior • Discover typical behavior profiles as well as rare and deviant behaviors

  35. Comments • Observe the behavior from different points of view: • Flow (source - destination) • Connection (initiator/requester – replier/responder) • Connection-level statistics • Lack of P2P application analysis • Hard to choose the observation period for generating and analyzing clusters • Tradeoff between timeliness and data size • Correlating behavior profiles across multiple backbone links
