370 likes | 549 Views
Internet Measurement Conference 2003 27-29 Of October, 2003 Miami, Florida, USA http://www.icir.org/vern/imc-2003/ Date for student travel grant applications: Sept 5th. Shannon Lab.
E N D
Internet Measurement Conference 2003 27-29 Of October, 2003 Miami, Florida, USA http://www.icir.org/vern/imc-2003/ Date for student travel grant applications: Sept 5th AT&T Labs - Research
Shannon Lab An Information-Theoretic Approach to Traffic Matrix EstimationYin Zhang, Matthew Roughan, Carsten Lund – AT&T ResearchDavid Donoho – Stanford AT&T Labs - Research
Problem Have link traffic measurements Want to know demands from source to destination B C A AT&T Labs - Research
Example App: reliability analysis Under a link failure, routes change want to find an traffic invariant B C A AT&T Labs - Research
Approach Principle * “Don’t try to estimate something if you don’t have any information about it” • Maximum Entropy • Entropy is a measure of uncertainty • More information = less entropy • To include measurements, maximize entropy subject to the constraints imposed by the data • Impose the fewest assumptions on the results • Instantiation: Maximize “relative entropy” • Minimum Mutual Information AT&T Labs - Research
Mathematical Formalism Only measure traffic at links Traffic y1 1 link 1 router 2 link 2 link 3 3 AT&T Labs - Research
Mathematical Formalism Traffic y1 1 Traffic matrix element x1 route 1 router 2 route 3 route 2 3 Problem: Estimate traffic matrix (x’s) from the link measurements (y’s) AT&T Labs - Research
Mathematical Formalism 1 route 1 router 2 route 3 route 2 3 Problem: Estimate traffic matrix (x’s) from the link measurements (y’s) AT&T Labs - Research
Mathematical Formalism 1 route 1 router 2 route 3 route 2 3 Problem: Estimate traffic matrix (x’s) from the link measurements (y’s) AT&T Labs - Research
Mathematical Formalism 1 route 1 router 2 route 3 Routing matrix route 2 3 y = Ax For non-trivial network UNDERCONSTRAINED AT&T Labs - Research
Regularization • Want a solution that satisfies constraints: y = Ax • Many more unknowns than measurement: O(N2) vs O(N) • Underconstrained system • Many solutions satisfy the equations • Must somehow choose the “best” solution • Such (ill-posed linear inverse) problems occur in • Medical imaging: e.g CAT scans • Seismology • Astronomy • Statistical intuition => Regularization • Penalty function J(x) • solution: AT&T Labs - Research
How does this relate to other methods? • Previous methods are just particular cases of J(x) • Tomogravity (Zhang, Roughan, Greenberg and Duffield) • J(x) is a weighted quadratic distance from a gravity model • A very natural alternative • Start from a penalty function that satisfies the “maximum entropy” principle • Minimum Mutual Information AT&T Labs - Research
Minimum Mutual Information (MMI) • Mutual Information I(S,D) • Information gained about Source from Destination • I(S,D) = -relative entropy with respect to independent S and D I(S,D) = 0 S and D are independent p(D|S) = p(D) gravity model • Natural application of principle * • Assume independence in the absence of other information • Aggregates have similar behavior to network overall • When we get additional information (e.g. y = Ax) • Maximize entropy Minimize I(S,D) (subject to constraints) • J(x) = I(S,D) equivalent AT&T Labs - Research
MMI in practice • In general there aren’t enough constraints • Constraints give a subspace of possible solutions y = Ax AT&T Labs - Research
MMI in practice • Independence gives us a starting point independent solution y = Ax AT&T Labs - Research
MMI in practice • Find a solution which • Satisfies the constraint • Is closest to the independent solution solution Distance measure is the Kullback-Lieber divergence AT&T Labs - Research
Is that it? • Not quite that simple • Need to do some networking specific things • e.g. conditional independence to model hot-potato routing • Can be solved using standard optimization toolkits • Taking advantage of sparseness of routing matrix A • Back to tomogravity • Conditional independence = generalized gravity model • Quadratic distance function is a first order approximation to the Kullback-Leibler divergence • Tomogravity is a first-order approximation to MMI AT&T Labs - Research
Results – Single example • ±20% bounds for larger flows • Average error ~11% • Fast (< 5 seconds) • Scales: • O(100) nodes AT&T Labs - Research
More results Large errors are in small flows >80% of demands have <20% error tomogravity method simple approximation AT&T Labs - Research
Other experiments • Sensitivity • Very insensitive to lambda • Simple approximations work well • Robustness • Missing data • Erroneous link data • Erroneous routing data • Dependence on network topology • Via Rocketfuel network topologies • Additional information • Netflow • Local traffic matrices AT&T Labs - Research
Dependence on Topology star (20 nodes) clique AT&T Labs - Research
Additional information – Netflow AT&T Labs - Research
Local traffic matrix (George Varghese) for reference previous case 0% 1% 5% 10% AT&T Labs - Research
Conclusion • We have a good estimation method • Robust, fast, and scales to required size • Accuracy depends on ratio of unknowns to measurements • Derived from principle • Approach gives some insight into other methods • Why they work – regularization • Should provide better idea of the way forward • Additional insights about the network and traffic • Traffic and network are connected • Implemented • Used in AT&T’s NA backbone • Accurate enough in practice AT&T Labs - Research
Results • Methodology • Use netflow based partial (~80%) traffic matrix • Simulate SNMP measurements using routing sim, and y = Ax • Compare estimates, and true traffic matrix • Advantage • Realistic network, routing, and traffic • Comparison is direct, we know errors are due to algorithm not errors in the data • Can do controlled experiments (e.g. introduce known errors) • Data • One hour traffic matrices (don’t need fine grained data) • 506 data sets, comprising the majority of June 2002 • Includes all times of day, and days of week AT&T Labs - Research
Robustness (input errors) AT&T Labs - Research
Robustness (missing data) AT&T Labs - Research
Point-to-multipoint We don’t see whole Internet – What if an edge link fails? Point-to-point traffic matrix isn’t invariant AT&T Labs - Research
Point-to-multipoint • Included in this approach • Implicit in results above • Explicit results worse • Ambiguity in demands in increased • More demands use exactly the same sets of routes • use in applications is better Link failure analysis Point-to-multipoint Point-to-point AT&T Labs - Research
Independent model AT&T Labs - Research
peer links access links Conditional independence • Internet routing is asymmetric • A provider can control exit points for traffic going to peer networks AT&T Labs - Research
peer links access links Conditional independence • Internet routing is asymmetric • A provider can control exit points for traffic going to peer networks • Have much less control of where traffic enters AT&T Labs - Research
Conditional independence AT&T Labs - Research
Minimum Mutual Information (MMI) • Mutual Information I(S,D)=0 • Information gained about S from D • I(S,D) = relative entropy with respect to independence • Can also be given by Kullback-Leibler information divergence • Why this model • In the absence of information, let’s assume no information • Minimal assumption about the traffic • Large aggregates tend to behave like overall network? AT&T Labs - Research
Dependence on Topology * These are not the actual networks, but only estimates made by Rocketfuel AT&T Labs - Research
Bayesian (e.g. Tebaldi and West) • J(x) = -log(x), where (x) is the prior model • MLE (e.g. Vardi, Cao et al, …) • In their thinking the prior model generates extra constraints • Equally, can be modeled as a (complicated) penalty function • Uses deviations from higher order moments predicted by model AT&T Labs - Research
Acknowledgements • Local traffic matrix measurements • George Varghese • PDSCO optimization toolkit for Matlab • Michael Saunders • Data collection • Fred True, Joel Gottlieb • Tomogravity • Albert Greenberg and Nick Duffield AT&T Labs - Research