1 / 37

Internet Measurement Conference 2003 27-29 Of October, 2003 Miami, Florida, USA

Internet Measurement Conference 2003 27-29 Of October, 2003 Miami, Florida, USA http://www.icir.org/vern/imc-2003/ Date for student travel grant applications: Sept 5th. Shannon Lab.

vlora
Download Presentation

Internet Measurement Conference 2003 27-29 Of October, 2003 Miami, Florida, USA

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Internet Measurement Conference 2003 27-29 Of October, 2003 Miami, Florida, USA http://www.icir.org/vern/imc-2003/ Date for student travel grant applications: Sept 5th AT&T Labs - Research

  2. Shannon Lab An Information-Theoretic Approach to Traffic Matrix EstimationYin Zhang, Matthew Roughan, Carsten Lund – AT&T ResearchDavid Donoho – Stanford AT&T Labs - Research

  3. Problem Have link traffic measurements Want to know demands from source to destination B C A AT&T Labs - Research

  4. Example App: reliability analysis Under a link failure, routes change want to find an traffic invariant B C A AT&T Labs - Research

  5. Approach Principle * “Don’t try to estimate something if you don’t have any information about it” • Maximum Entropy • Entropy is a measure of uncertainty • More information = less entropy • To include measurements, maximize entropy subject to the constraints imposed by the data • Impose the fewest assumptions on the results • Instantiation: Maximize “relative entropy” • Minimum Mutual Information AT&T Labs - Research

  6. Mathematical Formalism Only measure traffic at links Traffic y1 1 link 1 router 2 link 2 link 3 3 AT&T Labs - Research

  7. Mathematical Formalism Traffic y1 1 Traffic matrix element x1 route 1 router 2 route 3 route 2 3 Problem: Estimate traffic matrix (x’s) from the link measurements (y’s) AT&T Labs - Research

  8. Mathematical Formalism 1 route 1 router 2 route 3 route 2 3 Problem: Estimate traffic matrix (x’s) from the link measurements (y’s) AT&T Labs - Research

  9. Mathematical Formalism 1 route 1 router 2 route 3 route 2 3 Problem: Estimate traffic matrix (x’s) from the link measurements (y’s) AT&T Labs - Research

  10. Mathematical Formalism 1 route 1 router 2 route 3 Routing matrix route 2 3 y = Ax For non-trivial network UNDERCONSTRAINED AT&T Labs - Research

  11. Regularization • Want a solution that satisfies constraints: y = Ax • Many more unknowns than measurement: O(N2) vs O(N) • Underconstrained system • Many solutions satisfy the equations • Must somehow choose the “best” solution • Such (ill-posed linear inverse) problems occur in • Medical imaging: e.g CAT scans • Seismology • Astronomy • Statistical intuition => Regularization • Penalty function J(x) • solution: AT&T Labs - Research

  12. How does this relate to other methods? • Previous methods are just particular cases of J(x) • Tomogravity (Zhang, Roughan, Greenberg and Duffield) • J(x) is a weighted quadratic distance from a gravity model • A very natural alternative • Start from a penalty function that satisfies the “maximum entropy” principle • Minimum Mutual Information AT&T Labs - Research

  13. Minimum Mutual Information (MMI) • Mutual Information I(S,D) • Information gained about Source from Destination • I(S,D) = -relative entropy with respect to independent S and D I(S,D) = 0 S and D are independent p(D|S) = p(D) gravity model • Natural application of principle * • Assume independence in the absence of other information • Aggregates have similar behavior to network overall • When we get additional information (e.g. y = Ax) • Maximize entropy  Minimize I(S,D) (subject to constraints) • J(x) = I(S,D) equivalent AT&T Labs - Research

  14. MMI in practice • In general there aren’t enough constraints • Constraints give a subspace of possible solutions y = Ax AT&T Labs - Research

  15. MMI in practice • Independence gives us a starting point independent solution y = Ax AT&T Labs - Research

  16. MMI in practice • Find a solution which • Satisfies the constraint • Is closest to the independent solution solution Distance measure is the Kullback-Lieber divergence AT&T Labs - Research

  17. Is that it? • Not quite that simple • Need to do some networking specific things • e.g. conditional independence to model hot-potato routing • Can be solved using standard optimization toolkits • Taking advantage of sparseness of routing matrix A • Back to tomogravity • Conditional independence = generalized gravity model • Quadratic distance function is a first order approximation to the Kullback-Leibler divergence • Tomogravity is a first-order approximation to MMI AT&T Labs - Research

  18. Results – Single example • ±20% bounds for larger flows • Average error ~11% • Fast (< 5 seconds) • Scales: • O(100) nodes AT&T Labs - Research

  19. More results Large errors are in small flows >80% of demands have <20% error tomogravity method simple approximation AT&T Labs - Research

  20. Other experiments • Sensitivity • Very insensitive to lambda • Simple approximations work well • Robustness • Missing data • Erroneous link data • Erroneous routing data • Dependence on network topology • Via Rocketfuel network topologies • Additional information • Netflow • Local traffic matrices AT&T Labs - Research

  21. Dependence on Topology star (20 nodes) clique AT&T Labs - Research

  22. Additional information – Netflow AT&T Labs - Research

  23. Local traffic matrix (George Varghese) for reference previous case 0% 1% 5% 10% AT&T Labs - Research

  24. Conclusion • We have a good estimation method • Robust, fast, and scales to required size • Accuracy depends on ratio of unknowns to measurements • Derived from principle • Approach gives some insight into other methods • Why they work – regularization • Should provide better idea of the way forward • Additional insights about the network and traffic • Traffic and network are connected • Implemented • Used in AT&T’s NA backbone • Accurate enough in practice AT&T Labs - Research

  25. Results • Methodology • Use netflow based partial (~80%) traffic matrix • Simulate SNMP measurements using routing sim, and y = Ax • Compare estimates, and true traffic matrix • Advantage • Realistic network, routing, and traffic • Comparison is direct, we know errors are due to algorithm not errors in the data • Can do controlled experiments (e.g. introduce known errors) • Data • One hour traffic matrices (don’t need fine grained data) • 506 data sets, comprising the majority of June 2002 • Includes all times of day, and days of week AT&T Labs - Research

  26. Robustness (input errors) AT&T Labs - Research

  27. Robustness (missing data) AT&T Labs - Research

  28. Point-to-multipoint We don’t see whole Internet – What if an edge link fails? Point-to-point traffic matrix isn’t invariant AT&T Labs - Research

  29. Point-to-multipoint • Included in this approach • Implicit in results above • Explicit results worse • Ambiguity in demands in increased • More demands use exactly the same sets of routes • use in applications is better Link failure analysis Point-to-multipoint Point-to-point AT&T Labs - Research

  30. Independent model AT&T Labs - Research

  31. peer links access links Conditional independence • Internet routing is asymmetric • A provider can control exit points for traffic going to peer networks AT&T Labs - Research

  32. peer links access links Conditional independence • Internet routing is asymmetric • A provider can control exit points for traffic going to peer networks • Have much less control of where traffic enters AT&T Labs - Research

  33. Conditional independence AT&T Labs - Research

  34. Minimum Mutual Information (MMI) • Mutual Information I(S,D)=0 • Information gained about S from D • I(S,D) = relative entropy with respect to independence • Can also be given by Kullback-Leibler information divergence • Why this model • In the absence of information, let’s assume no information • Minimal assumption about the traffic • Large aggregates tend to behave like overall network? AT&T Labs - Research

  35. Dependence on Topology * These are not the actual networks, but only estimates made by Rocketfuel AT&T Labs - Research

  36. Bayesian (e.g. Tebaldi and West) • J(x) = -log(x), where (x) is the prior model • MLE (e.g. Vardi, Cao et al, …) • In their thinking the prior model generates extra constraints • Equally, can be modeled as a (complicated) penalty function • Uses deviations from higher order moments predicted by model AT&T Labs - Research

  37. Acknowledgements • Local traffic matrix measurements • George Varghese • PDSCO optimization toolkit for Matlab • Michael Saunders • Data collection • Fred True, Joel Gottlieb • Tomogravity • Albert Greenberg and Nick Duffield AT&T Labs - Research

More Related