
The Evolution of Traffic Matrix Techniques and Applications: Past, Present and Future

The Evolution of Traffic Matrix Techniques and Applications: Past, Present and Future. Fan Tongliang 20081201005 College of Communication Engineering. Outline. Problem Statement Summary: traffic measurement How have traffic matrix estimation techniques evolved?


Presentation Transcript


  1. The Evolution of Traffic Matrix Techniques and Applications: Past, Present and Future Fan Tongliang 20081201005 College of Communication Engineering

  2. Outline • Problem Statement • Summary: traffic measurement • How have traffic matrix estimation techniques evolved? • Applications of traffic matrices • Reference

  3. Internet Evolution • Grows over time…

  4. Internet Evolution • Say the network doubles in size • Key question: where to add capacity?

  5. Internet Evolution • Uniformly scale all capacities? • Is Moore’s-law-like scaling sufficient? • If so, good scaling!

  6. Internet Evolution • Scale some links faster? • Is Moore’s-law-like scaling insufficient?

  7. Internet Evolution • Scale some links faster? • Congested hot-spots • If so, poor scaling!

  8. Internet Evolution • How does the worst congestion grow? • Ideal: O(n) • How much of this is due to… • Topology? • “Power-law” structure • Routing algorithm? • BGP policy routing • Traffic demand matrix? • Uniform vs. non-uniform • What can be done? • Redesign the network?

  9. Why Measurements? • Optimizing the Internet performance of well-connected Internet end-points • Wide-area bottlenecks • Identify and characterize bottlenecks • Multihoming route control • Quantify benefits and compare against alternatives • Will these techniques work in the future? • Current best-performing BGP path • Smart selection

  10. Why Measurements are Difficult • To effectively measure the global Internet, wide cooperation is needed. However, ISPs are reluctant to coordinate their efforts • Statistics collection is viewed as a luxury (OC48mon = $100,000) - only large ISPs can afford statistics collection and analysis - demand is still dormant • Best Effort service, low profit margins for ISPs make operational support difficult – data collection is low priority • Traffic volume, high trunk capacity, diversity of protocols, technologies and applications make traffic monitoring and analysis a challenging endeavor • Results get obsolete very rapidly: Internet is under very active development – traffic, technology and topology change very fast • Tremendous growth of Internet – it is difficult to scale measurements • Overprovisioning is a widely practiced solution to network congestion

  11. How to Measure? • Measurement is: data collection, analysis and visualization • Traffic data: • Network Topology and Mapping (connectivity) • Workload (passive or non-intrusive) • Performance (active) • Routing (BGP routing tables) • Active approach • Inject traffic and wait for it to arrive at the destination or for a reply • Passive approach • No traffic injected; measurements are taken by a collection of network monitors

  12. Measurement Tools • Can be classified into hardware and software measurement tools • Hardware: specialized equipment • Examples: HP 4972 LAN Analyzer, DataGeneral Network Sniffer, others... • Software: special software tools • Examples: tcpdump, xtr, SNMP, others...

  13. Measurement Tools (Cont’d) • Measurement tools can also be classified as real-time or non-real-time • Real-time: collects traffic data as it happens, and may even be able to display traffic info as it happens • Non-real-time: collected traffic data may only be a subset (sample) of the total traffic, and is analyzed off-line (later)

  14. Measurement Tools • Link Based Tools • CoralReef • Tcpdump • Router Based Tools • SNMP Based • MRTG • NetFlow Based • FlowScan • Cflowd • MADAS • Flowtools • CISCO NetFlow FlowCollector, NetFlow Data Analyzer

  15. Detecting Performance Problems • High utilization or loss statistics for the link • High delay or low throughput for probes • Angry customers (complaining via phone?) overload!

  16. Network Operations: Excess Traffic • Two large flows of traffic • Multi-homed customer • New egress point for first flow

  17. Network Operations: Denial-of-Service Attack • Web server at its knees… • Install packet filter • Web server back to life…

  18. Network Operations: Link Failure • Link failure • New route overloads a link • Routing change alleviates congestion

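  19. What’s a traffic matrix? • Y = AX (also written Y = RX) • Y: link measurement vector (Yi is the load on link i) • A (or R): routing matrix • X: the “traffic matrix” written as a vector (Xj is the volume of OD flow j, from an ingress PoP to an egress PoP) • PoP = Point of Presence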

  20. Example Problem • How much traffic flows between the origin-destination pairs A->C, A->D, B->C, B->D? • SNMP byte counts per link: A = 5, B = 3, C = 4, D = 4

  21. Example: One Solution • How much traffic flows between each pair? • A->D: 4 • A->C: 1 • B->C: 3 • B->D: 0 • Link counts: A = 5, B = 3, C = 4, D = 4

  22. Example: Another Solution • How much traffic flows between each pair? • A->D: 2 • A->C: 3 • B->C: 1 • B->D: 2 • Link counts: A = 5, B = 3, C = 4, D = 4 • Type of equations, e.g. for link 1: Link1 = XAD + XBD

  23. Inference: Network Tomography • From link counts to the traffic matrix • Sources: 5 Mbps, 3 Mbps • Destinations: 4 Mbps, 4 Mbps
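
The two solutions on slides 21 and 22 can be checked mechanically. The sketch below writes the example as Y = AX, assuming the flow order A->C, A->D, B->C, B->D and one row per link (out of A, out of B, into C, into D); the function names are illustrative only. It confirms that both solutions reproduce the link counts and that the routing matrix has rank 3 for 4 unknowns, which is why the link counts alone cannot single out one answer.

```python
from fractions import Fraction

# OD flow order (an assumption of this sketch): A->C, A->D, B->C, B->D.
A = [
    [1, 1, 0, 0],  # link out of A
    [0, 0, 1, 1],  # link out of B
    [1, 0, 1, 0],  # link into C
    [0, 1, 0, 1],  # link into D
]
Y = [5, 3, 4, 4]  # SNMP counts on the four links


def link_loads(routing, flows):
    """Compute Y = A X for a routing matrix and an OD flow vector."""
    return [sum(a * x for a, x in zip(row, flows)) for row in routing]


def rank(matrix):
    """Matrix rank via Gauss-Jordan elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


solution_one = [1, 4, 3, 0]  # slide 21
solution_two = [3, 2, 1, 2]  # slide 22

print(link_loads(A, solution_one) == Y)  # True
print(link_loads(A, solution_two) == Y)  # True
print(rank(A), "independent equations for", len(A[0]), "unknowns")
```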

  24. 1st Generation Approaches • Linear Programming (LP) approach. • O. Goldschmidt - ISMA Workshop 2000 • Bayesian estimation. • C. Tebaldi, M. West - J. of American Statistical Association, June 1998. • Expectation Maximization (EM) approach. • J. Cao, D. Davis, S. Vander Wiel, B. Yu - J. of American Statistical Association, 2000.

  25. Linear Programming • Objective: • Constraints:

  26. Statistical Approaches

  27. Bayesian Approach • Assumes P(Xj) follows a Poisson distribution with mean λj (independently distributed). • Λ = (λ1, λ2, …) needs to be estimated (a prior is needed). • Conditioning on link counts: P(X,Λ|Y) • Uses the Markov Chain Monte Carlo (MCMC) simulation method to get posterior distributions. • Ultimate goal: compute P(X|Y)

  28. Expectation Maximization (EM) • Assumes the Xj are independently distributed Gaussian. • Y = AX implies that Y is also Gaussian. • Requires a prior for initialization. • Incorporates multiple sets of link measurements. • Uses the EM algorithm to compute the MLE.
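
The formula that followed "Y=AX implies" is not preserved in the transcript. Under the slide's Gaussian assumption, the standard result for a linear map of a Gaussian vector applies. The symbols λ (mean vector) and Σ (diagonal covariance, since the Xj are independent) are notation assumed here, not taken from the slide:

```latex
X \sim \mathcal{N}(\lambda, \Sigma)
\quad\text{and}\quad Y = AX
\;\;\Longrightarrow\;\;
Y \sim \mathcal{N}\!\left(A\lambda,\; A\,\Sigma\,A^{\top}\right)
```

The likelihood of the observed link counts is then a function of λ and Σ only, which is what the EM algorithm maximizes.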

  29. 2nd generation methods • MOTIVATION: The fundamental problem is that of an under-constrained, or ill-posed, system. Some sort of side information or assumptions must then be added to make the estimation problem well-posed. • What options do we have for getting more data into the problem? • Approach 1: • MLE estimation methods require a “starting point” (initial condition, prior, etc.) • Can we find “intelligent starting points” based on network properties? • Approach 2: • What can we do to increase the rank of the routing matrix?

  30. Directions • Lessons learned: • Model assumptions do not reflect the true nature of traffic. (multimodal behavior) • Dependence on priors • Link count is not sufficient (Generally more data is available to network operators.) • Proposed Solutions: • Use choice models to incorporate additional information. • Generate a good prior solution: Gravity model. • Information-Theoretic • Assignment model

  31. Choice Models • Figure: POP 1 sends traffic to POP 2, POP 3 and POP 4 with fanouts a12, a13, a14 • Let Ri be the total amount of traffic entering the network that is sourced at POP i • Traffic POP(i->j) = Ri aij • What is aij? • The proportion of traffic at ingress node ‘i’ headed to egress node ‘j’ • {aij for all j} is called the “fanout” • Problem: estimate the fanouts aij
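
As a small illustration of the fanout definition, the sketch below derives the aij for one ingress PoP from known OD volumes and then applies Traffic(i->j) = Ri aij to a new ingress total. The function names and the numbers are hypothetical.

```python
def fanouts(od_volumes):
    """Return a_ij = X_ij / R_i for one ingress PoP, where R_i = sum over j of X_ij."""
    total = sum(od_volumes.values())
    if total == 0:
        return {egress: 0.0 for egress in od_volumes}
    return {egress: volume / total for egress, volume in od_volumes.items()}


def od_traffic(ingress_total, fanout):
    """Apply Traffic(i->j) = R_i * a_ij for every egress PoP j."""
    return {egress: ingress_total * share for egress, share in fanout.items()}


# Hypothetical volumes leaving POP 1 during a calibration period.
fan = fanouts({"POP2": 20, "POP3": 50, "POP4": 30})

# If the fanout is stable, a new ingress total R_1 = 200 splits the same way.
estimate = od_traffic(200, fan)
print(fan)
print(estimate)
```

The appeal of this formulation is that Ri is directly measurable at the edge, so only the fanouts remain to be estimated.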

  32. Gravity model • Router-to-router gravity model: [Zhang, Roughan, et al. Sigcomm04] • Use this as a smart initial condition for optimization • Solve min ||X – Xg|| s.t. ||AX – Y|| is minimized • Use a least-squares type solution

  33. Gravity-based OD Flow Model • What does the gravity model say about OD flows? • Assume nodes are independent • The gravity model is a spatial model among OD flows • Gravity model is calibrated using SNMP from access and peering links entering/exiting router nodes • this is not the same SNMP data as the inter-router links used in estimation
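
The gravity formula itself is not preserved in the transcript. A commonly used simple form, assumed here, makes each OD flow proportional to the traffic entering at the ingress node times the traffic leaving at the egress node: Xij = Ri · Cj / Σk Ck. The sketch below applies it to the edge totals of the earlier four-node example; the function name is hypothetical, and the generalized gravity model of the cited work (which separates access and peering links) is not reproduced.

```python
def gravity_estimate(ingress, egress):
    """Simple gravity model: X_ij = R_i * C_j / sum_k C_k.

    ingress maps node -> traffic entering there (R_i),
    egress maps node -> traffic leaving there (C_j).
    """
    egress_total = sum(egress.values())
    if egress_total == 0:
        raise ValueError("egress totals must not all be zero")
    return {
        (i, j): r_i * c_j / egress_total
        for i, r_i in ingress.items()
        for j, c_j in egress.items()
    }


# Edge totals from the example problem: A and B are sources, C and D sinks.
prior = gravity_estimate({"A": 5, "B": 3}, {"C": 4, "D": 4})
for od_pair, volume in sorted(prior.items()):
    print(od_pair, volume)
```

Because the estimate is built from products of independent row and column totals, it matches the edge totals by construction but carries no information about which pairs actually talk to each other. That is why it serves as a prior rather than a final answer.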

  34. Route Change Method • Idea: change the link weights; the new shortest paths computed will lead to new routes between some OD pairs [Soule, Nucci, Cruz, et al. Sigmetrics04] • Each routing induces a different Y = A(r)·X, where A(r) is the routing matrix for weight setting ‘r’. • Hope: by combining all the linear constraints into one big system, we increase the rank of A relative to the original system. It works! • Caveat: the SNMP link counts from different routing configurations need to be collected over many hours or even days, so we are in the non-stationary regime of OD traffic flows.
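
The rank argument can be seen on a toy case. In the sketch below, the two routing matrices are hypothetical (3 OD flows, 2 links, two weight settings) and are not taken from the cited paper. Each setting alone gives rank 2, while stacking the two sets of constraints gives rank 3, enough to determine all three flows if the traffic stays stationary across both measurements.

```python
from fractions import Fraction


def rank(matrix):
    """Matrix rank via Gauss-Jordan elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


# Hypothetical routing matrices: rows are links, columns are OD flows.
routing_1 = [[1, 1, 0],   # weight setting 1: flows 0 and 1 share link 0
             [0, 1, 1]]
routing_2 = [[1, 0, 0],   # weight setting 2: flow 1 moved off link 0
             [0, 1, 1]]

stacked = routing_1 + routing_2  # all constraints in one system

print(rank(routing_1), rank(routing_2), rank(stacked))
```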

  35. An Information-Theoretic Approach • Maximum Entropy • Entropy is a measure of uncertainty • More information = less entropy • To include measurements, maximize entropy subject to the constraints imposed by the data • Impose the fewest assumptions on the results • Instantiation: Maximize “relative entropy” • Minimum Mutual Information

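  36. Assignment model • We may see our problem as follows: [equation not preserved in the transcript; only a normalization constraint “= 1” survives] • The ultimate value of an OD pair can be described by: [equation not preserved in the transcript; again with a constraint “= 1”]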

  37. 3rd generation models • Carriers set a 10% average error rate as a general target. • 2nd generation methods achieve average errors roughly in the range of 15-20%. • Can we further reduce errors? • What other kinds of information/measurements can be brought into the picture?

  38. Two-step Statistical Approach • Step 1: Mlogit and Linear Choice Models • Step 2: Expectation Maximization Algorithm • The division of the TM estimation process into two steps offers great flexibility for combining and evaluating different strategies that could be applied to solve the inference problem.

  39. Tomogravity • Two step modeling. • Gravity Model: Initial solution obtained using edge link load data and ISP routing policy. • Tomographic Estimation: Initial solution is refined by applying quadratic programming to minimize distance to initial solution subject to tomographic constraints (link counts).
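
The refinement step can be sketched as "find the X closest to the prior that satisfies AX = Y". The slide names quadratic programming; the sketch below swaps in cyclic row projections (the Kaczmarz method), which, for consistent link counts, converges to the solution nearest to the starting point in Euclidean distance. Non-negativity constraints and the weighting used in practice are left out, and the prior values are hypothetical.

```python
def refine_with_link_counts(prior, routing, link_counts, sweeps=200):
    """Move a prior estimate onto the set {X : A X = Y} by cyclic row projections.

    Each step projects the current estimate onto one link equation. Assumes the
    link counts are consistent (noise-free); ignores non-negativity.
    """
    x = [float(v) for v in prior]
    for _ in range(sweeps):
        for row, y in zip(routing, link_counts):
            norm_sq = sum(a * a for a in row)
            if norm_sq == 0:
                continue  # link carries none of the OD flows
            residual = y - sum(a * v for a, v in zip(row, x))
            step = residual / norm_sq
            x = [v + step * a for v, a in zip(x, row)]
    return x


# Four-node example; flow order (assumed): A->C, A->D, B->C, B->D.
A = [
    [1, 1, 0, 0],  # link out of A
    [0, 0, 1, 1],  # link out of B
    [1, 0, 1, 0],  # link into C
    [0, 1, 0, 1],  # link into D
]
Y = [5, 3, 4, 4]

prior = [4, 1, 1, 2]  # hypothetical initial solution that misses the C and D counts
refined = refine_with_link_counts(prior, A, Y)
loads = [sum(a * v for a, v in zip(row, refined)) for row in A]
print(refined)
print(loads)
```

The correction only moves the estimate along directions the link counts can see. Any component of the prior that lies in the null space of A is left untouched, which is why the quality of the initial solution matters so much.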

  40. Genetic-Assignment algorithm • The key link: C = R Rᵀ • “Troublesome” OD pairs: Q = Rᵀ R • Problem

  41. PCA Method • Using the measured time series of all the OD flows, perform PCA • Output of PCA: eigenflows, a new set of time series • Cyclical ones, bursty ones, and noisy ones • Each OD flow can be represented by a weighted sum of a small number (<10) of eigenflows

  42. PCA Solution • Rather than estimate the traffic matrix, estimate the eigenflows (elements of the low-dim representation) • this is well posed. • Rebuild the traffic matrix using the appropriate weighted sum of the eigenflows.
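
To make the "weighted sum of eigenflows" idea concrete, the sketch below extracts a single eigenflow from synthetic OD time series and rebuilds the traffic matrix from it. It uses power iteration as a stand-in for a full eigendecomposition and keeps only one component; the data, names and single-component choice are assumptions of this sketch, whereas the slides speak of fewer than 10 eigenflows on real data.

```python
import math


def rank_one_pca(series, iterations=100):
    """Approximate OD flow time series with one eigenflow.

    series[t][f] is the volume of OD flow f at time t. Returns per-flow means,
    the eigenflow scores (one value per time step) and per-flow weights, so
    that series[t][f] is approximately means[f] + scores[t] * weights[f].
    """
    n_times, n_flows = len(series), len(series[0])
    means = [sum(row[f] for row in series) / n_times for f in range(n_flows)]
    centered = [[row[f] - means[f] for f in range(n_flows)] for row in series]

    # Power iteration on centered^T * centered finds the dominant principal axis.
    weights = [1.0 / math.sqrt(n_flows)] * n_flows
    for _ in range(iterations):
        scores = [sum(c * w for c, w in zip(row, weights)) for row in centered]
        update = [sum(centered[t][f] * scores[t] for t in range(n_times))
                  for f in range(n_flows)]
        norm = math.sqrt(sum(u * u for u in update))
        if norm == 0:
            break  # no variance along the current direction
        weights = [u / norm for u in update]
    scores = [sum(c * w for c, w in zip(row, weights)) for row in centered]
    return means, scores, weights


def reconstruct(means, scores, weights):
    """Rebuild the time series as mean plus weighted eigenflow."""
    return [[m + s * w for m, w in zip(means, weights)] for s in scores]


# Synthetic data: three OD flows that share one cyclical pattern.
pattern = [0, 1, 2, 1, 0, -1, -2, -1]
offsets, gains = [10, 20, 30], [1, 2, 3]
series = [[o + g * p for o, g in zip(offsets, gains)] for p in pattern]

means, scores, weights = rank_one_pca(series)
rebuilt = reconstruct(means, scores, weights)
max_error = max(abs(a - b)
                for row_a, row_b in zip(series, rebuilt)
                for a, b in zip(row_a, row_b))
print(max_error)
```

Here 24 measurements (8 time steps for 3 flows) are described by one 8-point eigenflow plus 3 weights and 3 means, which is the dimensionality reduction that makes the estimation problem well posed.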

  43. Issues in 3rd gen methods • Model Recalibration: need to keep models up to date as traffic evolves • For models based on 24-hours of measurements: need scheme for detecting change and deciding when to launch a new measurement collection episode. • For model with 1-flow at a time, no change detection needed; the model is essentially self updating on an ongoing basis. • Overheads • a tradeoff is induced: measurement overhead versus gain in error reduction

  44. Areas of Application • Route selection • how to choose link weights for shortest path routing • Evaluating the impact of policy changes on traffic • Anomaly detection

  45. Application Area #1: Selecting Link Weights for Routing • Link weight selection algorithms use a traffic matrix as input. Goal: balance traffic across links well. • Suppose the input TM has errors? • How does this affect our ability to choose routes? • Want a set of routes to last many days without requiring changes. But the TM is a dynamic, fluctuating thing. • Can a single set of weights be good for a long time, i.e., over a variety of TMs?

  46. Application Area #1: Some Findings • [Roughan et al. IMC03] • Yes, there is some sensitivity, but it’s not too bad • Except: “optimal” routing (MPLS) is more sensitive than near-optimal algorithms (OSPF) • Can find a routing that is robust to daily fluctuations • [Applegate/Cohen Sigcomm03] • Theoretical result, using oblivious routing... • Showed that one can find a single routing that works well under a wide variety of traffic matrices

  47. Application Area #2: Impact of Routing Policy Change • Using a TM, can get a broad view of policy changes • Questions: • What kinds of fluctuations do we see in the TM due to changes in internal routing (IGP)? [Agarwal, et al. Sigmetrics04] • What kinds of fluctuations do we see in the TM due to changes in inter-domain routing (BGP)? [Teixeira, et al. PAM05] • Answer: Not often, but when they happen, they are big (affect a lot of traffic).

  48. Application Area #3: Anomaly Detection • A set of traffic matrices over time can be used to describe “normal” traffic. • We now have lots of models for OD flows. • Can we then identify abnormalities? • Subspace Method [Lakhina, et al. SIGCOMM04] • Builds on the PCA idea: projects traffic flows onto a low-dimensional representation and extracts outliers • There is much more that can be done here ...

  49. Application Area #3: Anomaly Detection • Advantages of using TMs for security: they give a network-wide perspective • If we see an attack on a set of links, maybe it all belongs to one OD flow, i.e., the same attack • Permits easy identification of the point of entry • If one attacker attacks multiple victims, anomalies show up in one row of the TM • If multiple zombies attack a single victim, anomalies show up in one column of the TM
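
The row/column observation can be sketched as follows, assuming rows index ingress points and columns index egress points. The thresholding rule (flag a cell when it exceeds the baseline by a fixed amount, then report rows or columns with at least two flagged cells) is a deliberately crude, hypothetical stand-in for a real detector such as the subspace method.

```python
def flag_rows_and_columns(baseline, observed, threshold, min_cells=2):
    """Return (rows, columns) of the TM that contain several anomalous OD cells."""
    n_rows, n_cols = len(baseline), len(baseline[0])
    anomalous = [
        [observed[i][j] - baseline[i][j] > threshold for j in range(n_cols)]
        for i in range(n_rows)
    ]
    rows = [i for i in range(n_rows) if sum(anomalous[i]) >= min_cells]
    cols = [j for j in range(n_cols)
            if sum(anomalous[i][j] for i in range(n_rows)) >= min_cells]
    return rows, cols


baseline = [[10, 10, 10],
            [10, 10, 10],
            [10, 10, 10]]

# One source (ingress 1) hitting several destinations: a row anomaly.
one_to_many = [[10, 10, 10],
               [50, 10, 60],
               [10, 10, 10]]

# Several sources hitting one destination (egress 2): a column anomaly.
many_to_one = [[10, 10, 55],
               [10, 10, 70],
               [10, 10, 10]]

print(flag_rows_and_columns(baseline, one_to_many, threshold=20))
print(flag_rows_and_columns(baseline, many_to_one, threshold=20))
```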

  50. Traffic Matrix: Operational Uses • Short-term congestion and performance problems • Problem: predicting link loads after a routing change • Map the traffic matrix onto the new set of routes • Long-term congestion and performance problems • Problem: predicting link loads after topology changes • Map traffic matrix onto the routes on new topology • Reliability despite equipment failures • Problem: allocating spare capacity for failover • Find link weights such that no failure causes overload
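
"Map the traffic matrix onto the new set of routes" amounts to recomputing Y = A'X with the new routing matrix A'. The sketch below does this for a hypothetical two-link network; the link names, capacities, routes and traffic volumes are all invented for illustration.

```python
def predict_link_loads(routing, traffic):
    """Compute per-link load: sum over OD pairs of (fraction routed on link) * volume.

    routing maps link -> {od_pair: fraction of that OD flow carried by the link},
    traffic maps od_pair -> volume.
    """
    return {
        link: sum(fraction * traffic[od] for od, fraction in flows.items())
        for link, flows in routing.items()
    }


def overloaded(loads, capacity):
    """List the links whose predicted load exceeds their capacity."""
    return sorted(link for link, load in loads.items() if load > capacity[link])


traffic = {("A", "C"): 3, ("A", "D"): 2, ("B", "C"): 1, ("B", "D"): 2}
capacity = {"L1": 6, "L2": 4}  # hypothetical link capacities

current_routes = {
    "L1": {("A", "C"): 1, ("A", "D"): 1},
    "L2": {("B", "C"): 1, ("B", "D"): 1},
}
# Proposed change: A->D is moved from L1 to L2.
proposed_routes = {
    "L1": {("A", "C"): 1},
    "L2": {("A", "D"): 1, ("B", "C"): 1, ("B", "D"): 1},
}

before = predict_link_loads(current_routes, traffic)
after = predict_link_loads(proposed_routes, traffic)
print(before, overloaded(before, capacity))
print(after, overloaded(after, capacity))
```

The same function covers the topology-change and failure cases on the slide: only the routing input changes, while the traffic matrix stays fixed.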
