Traffic Matrix Estimation: Existing Techniques and New Directions

A. Medina (Sprint Labs, Boston University), N. Taft (Sprint Labs), K. Salamatian (University of Paris VI), S. Bhattacharyya, C. Diot (Sprint Labs) • Presented by Matthew Caesar

Problem scope • Environment: • Single ISP that provides SLAs to customers • Goal: Estimate the traffic matrix • Amount of traffic flowing between each (origin, destination) pair • Hard to measure exactly (requires extensive logging and/or offline parsing) • Why would we want to know the traffic matrix? • Helps determine load balancing, routing protocol configuration, dimensioning, provisioning, failover strategies • Allows quantification of the cost of providing QoS vs. overprovisioning

Solution idea • Main idea: • Measure utilization (“link count”) on each network link • Can be easily done in router fast path • Done via SNMP query • Find a set of OD flows that would produce the measured link counts • Sticky issue: how to find the set of OD flows? • Three techniques: • Linear Programming (LP) • Bayesian estimation • Expectation Maximization (EM)

Traffic Estimation • Assumptions can encode the operator’s knowledge (e.g., some OD pairs may always be zero) • Prior TM: a seed TM is sometimes needed as a starting point • Routing Matrix • Link counts (link utilizations)

Problem setup • See whiteboard
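
The whiteboard material is not reproduced here. As a minimal sketch, assuming the usual linear formulation in which the link counts Y equal a 0/1 routing matrix A times the vector X of OD flows (Y = AX), the hypothetical three-node example below shows why the problem is hard: two different traffic matrices produce identical link counts. The topology, names, and numbers are invented for illustration.

```python
# Hypothetical chain topology A -> B -> C with two links.
# Rows of ROUTING are links, columns are OD pairs.
OD_PAIRS = ["A->B", "A->C", "B->C"]
LINKS = ["A-B", "B-C"]
ROUTING = [
    [1, 1, 0],  # link A-B carries A->B and A->C
    [0, 1, 1],  # link B-C carries A->C and B->C
]


def link_counts(routing, od_flows):
    """Return Y = A X: the load each link sees for the given OD flows."""
    return [sum(a * x for a, x in zip(row, od_flows)) for row in routing]


tm_one = [30, 20, 50]
tm_two = [10, 40, 30]
counts_one = link_counts(ROUTING, tm_one)
counts_two = link_counts(ROUTING, tm_two)
print(counts_one, counts_two)  # [50, 70] [50, 70]
```

With more OD pairs than links, the link counts alone cannot single out one traffic matrix, which is why each technique needs extra assumptions or a prior.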

Scheme #1: Linear Programming (LP) • Linear program: • Objective function + constraints • Main idea: • Try to maximize the total amount of traffic routed through the network • Given constraints: • Total traffic on each link must not exceed the measured link count • Flow conservation • Observations: • Leads to solutions where OD pairs with few intermediate hops are assigned large amounts of bandwidth, while more distant pairs get much less bandwidth • Solution: put more weight on pairs separated by greater distances
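
One way to write the program described on this slide, offered as a sketch rather than the authors' exact formulation: let x_j be the traffic of OD pair j, y_l the measured count on link l, a_{lj} = 1 if OD pair j crosses link l (0 otherwise), and w_j a weight that grows with the path length of pair j. The symbols are assumptions introduced here, and the flow-conservation constraints are omitted for brevity.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{j} w_j\, x_j \\
\text{s.t.}\quad & \sum_{j} a_{lj}\, x_j \le y_l \quad \text{for every link } l, \\
& x_j \ge 0 \quad \text{for every OD pair } j.
\end{aligned}
```

With w_j = 1 for all j, short paths are favored because each unit of their traffic consumes capacity on fewer links; a larger w_j for longer paths counteracts this bias.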

Scheme #2: Bayesian Inference • See whiteboard
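
The whiteboard derivation is not reproduced here. As a reminder of the general principle only, and not the specific model from the paper, Bayesian estimation treats the OD flows X as random, combines a prior p(X) with the likelihood of the observed link counts Y, and works with the resulting posterior:

```latex
p(X \mid Y) \;=\; \frac{p(Y \mid X)\, p(X)}{p(Y)} \;\propto\; p(Y \mid X)\, p(X)
```

The prior p(X) is where a seed TM enters, so the quality of the estimate depends on the quality of that seed.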

Scheme #3: Expectation Maximization (EM) • See whiteboard
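
The whiteboard derivation is not reproduced here. To give a feel for an EM iteration, the sketch below uses the classic multiplicative EM update for a Poisson linear model, in which each link count is assumed Poisson with mean (AX)_l. This is a simplified stand-in and not necessarily the model evaluated in the paper; the topology, priors, and iteration count are hypothetical.

```python
def em_poisson(routing, counts, prior, iterations=200):
    """Multiplicative EM update for counts[l] ~ Poisson(sum_j routing[l][j] * x[j]).

    Assumes every OD pair crosses at least one link and the prior is positive.
    """
    n_links, n_od = len(routing), len(prior)
    # Number of links each OD pair crosses (column sums of the routing matrix).
    col_sums = [sum(routing[l][j] for l in range(n_links)) for j in range(n_od)]
    x = [float(v) for v in prior]
    for _ in range(iterations):
        predicted = [sum(routing[l][j] * x[j] for j in range(n_od))
                     for l in range(n_links)]
        ratios = [counts[l] / predicted[l] if predicted[l] > 0 else 0.0
                  for l in range(n_links)]
        # Scale each flow by the average observed/predicted ratio on its path.
        x = [x[j] * sum(routing[l][j] * ratios[l] for l in range(n_links)) / col_sums[j]
             for j in range(n_od)]
    return x


# Hypothetical chain topology A -> B -> C; OD pairs are A->B, A->C, B->C.
ROUTING = [[1, 1, 0], [0, 1, 1]]
COUNTS = [50, 70]

estimate_flat = em_poisson(ROUTING, COUNTS, prior=[25, 25, 25])
estimate_seeded = em_poisson(ROUTING, COUNTS, prior=[10, 40, 30])
fitted_flat = [sum(a * x for a, x in zip(row, estimate_flat)) for row in ROUTING]
print(estimate_flat, estimate_seeded, fitted_flat)
```

Both runs reproduce the link counts, yet they settle on different traffic matrices: the iteration is deterministic, and its answer is steered by the prior it starts from.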

Evaluation Method • Impossible to obtain “real” traffic matrix via direct measurement. • Therefore, use simulations • How to characterize flow between OD pairs? • Tried Constant, Poisson, Gaussian, Uniform and Bimodal (flash crowd) TMs
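
A minimal sketch of how such synthetic demands might be drawn, using only the Python standard library. The function names, the mean, the spreads, and the two-mode mixture used for the bimodal case are all assumptions made for illustration; the parameters used in the study are not given on the slide.

```python
import math
import random


def poisson_sample(rng, mean):
    """Knuth's method; adequate for the moderate means used here."""
    threshold = math.exp(-mean)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def synthetic_od_flows(kind, n_pairs, rng, mean=100.0):
    """Draw one synthetic demand per OD pair; negative draws are clipped to 0."""
    if kind == "constant":
        return [mean] * n_pairs
    if kind == "poisson":
        return [float(poisson_sample(rng, mean)) for _ in range(n_pairs)]
    if kind == "gaussian":
        return [max(0.0, rng.gauss(mean, mean / 4)) for _ in range(n_pairs)]
    if kind == "uniform":
        return [rng.uniform(0.0, 2 * mean) for _ in range(n_pairs)]
    if kind == "bimodal":
        # Assumed mixture: most pairs near `mean`, about 20% near 5x `mean`.
        return [
            max(0.0, rng.gauss(5 * mean if rng.random() < 0.2 else mean, mean / 4))
            for _ in range(n_pairs)
        ]
    raise ValueError(f"unknown traffic model: {kind}")


KINDS = ["constant", "poisson", "gaussian", "uniform", "bimodal"]
rng = random.Random(0)  # fixed seed so runs are repeatable
samples = {kind: synthetic_od_flows(kind, 6, rng) for kind in KINDS}
for kind in KINDS:
    print(kind, [round(v, 1) for v in samples[kind]])
```

Each synthetic TM can then be pushed through the routing matrix to produce link counts, and the estimate compared against the known synthetic truth.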

Results: Linear programming vs. Statistical methods • Linear programming method performs poorly • Assigns zero to many OD pairs, increasing error • Problem: tries to match OD pairs to link counts • Different objective functions give similar results • Error too high for use in practical networks • Bayesian and EM: • EM beats Bayesian in terms of average error and worst case error • Estimation errors correlated to heavily shared links (links with many OD flows are more likely to be mis-estimated)
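
Average and worst-case error can be defined in several ways. The sketch below assumes per-OD-pair relative error, which may differ from the exact metric used in the paper; the traffic values are hypothetical.

```python
def relative_errors(true_tm, estimated_tm):
    """Relative error per OD pair; pairs with zero true traffic are skipped."""
    return [abs(est - true) / true
            for true, est in zip(true_tm, estimated_tm) if true > 0]


def summarize(true_tm, estimated_tm):
    """Return (average relative error, worst-case relative error)."""
    errors = relative_errors(true_tm, estimated_tm)
    return sum(errors) / len(errors), max(errors)


true_tm = [30.0, 20.0, 50.0]    # hypothetical OD flows
estimate = [24.0, 26.0, 44.0]   # hypothetical estimate
average_error, worst_error = summarize(true_tm, estimate)
print(f"average={average_error:.3f} worst={worst_error:.3f}")
```

Reporting both numbers matters because an estimator can look good on average while badly missing a few OD pairs.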

Results: Goodness of prior • Goodness of prior matrix (seed values) • Bayesian is much more sensitive to the prior matrix than EM • However, EM is also quite sensitive • Perhaps because: EM method has deterministic convergence behavior (can be analyzed) while Bayesian has stochastic convergence (it oscillates) • After a certain point, additional measurements don’t provide additional gain • Measuring over long periods of time only gives small additional improvement

Results: Marginal gains • What improvement could be gained if we could measure some components of the traffic matrix directly? • Carrier may have the option to deploy a certain amount of monitoring equipment • 3 ways to add rows: • Randomly, by row sum (traffic volume), and by error magnitude • Results: • Error rate drops off roughly linearly with each additional row added • Bayesian is not sensitive to the order in which rows are added • EM does better when rows are added largest-error first • Error reduction from adding a row is 2% for 13 OD pairs
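
The three orderings can be sketched as follows, assuming a simulation setting where the true TM is known and each row holds the flows from one origin. The function name and the data are hypothetical, and the sketch only ranks rows once; it does not re-run the estimator after each measured row.

```python
import random


def row_order(true_tm, estimated_tm, strategy, rng=None):
    """Return origin indices in the order their rows would be measured directly."""
    rows = list(range(len(true_tm)))
    if strategy == "random":
        (rng or random.Random()).shuffle(rows)
        return rows
    if strategy == "row_sum":
        # Largest traffic volume first.
        return sorted(rows, key=lambda i: sum(true_tm[i]), reverse=True)
    if strategy == "error":
        # Largest absolute estimation error first.
        def row_error(i):
            return sum(abs(t - e) for t, e in zip(true_tm[i], estimated_tm[i]))
        return sorted(rows, key=row_error, reverse=True)
    raise ValueError(f"unknown strategy: {strategy}")


true_tm = [[0, 10, 20], [5, 0, 5], [40, 40, 0]]
estimated_tm = [[0, 25, 5], [5, 0, 5], [38, 42, 0]]
by_volume = row_order(true_tm, estimated_tm, "row_sum")
by_error = row_order(true_tm, estimated_tm, "error")
by_chance = row_order(true_tm, estimated_tm, "random", random.Random(0))
print(by_volume, by_error, by_chance)
```

In this example the highest-volume row is not the worst-estimated one, so the two orderings disagree about where to place the first monitor.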

Other results • Which OD pairs are most difficult to estimate? • Error increases as the link-sharing factor increases, and also as path length increases • How to characterize OD flows? • Poisson and Gaussian assumptions hold well, but only for certain hours during the day.

Recommendations • Network operators know a lot about their network. We need to devise methods that allow network-specific information to be incorporated into the estimation scheme. • We need a better model of OD flows through an ISP. • Possible solution: “gravity models” based on utility factor (see whiteboard) • We need a good way to generate good prior TMs.
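
As a rough illustration of the gravity idea, the sketch below uses the simplest form, in which traffic from origin i to destination j is proportional to the total leaving i times the total arriving at j. This is an assumed textbook variant, not the utility-factor version referred to on the whiteboard; the function name and the per-node totals are hypothetical.

```python
def gravity_prior(outbound, inbound):
    """Simple gravity model: x[i][j] = outbound[i] * inbound[j] / sum(inbound).

    Self-traffic (i == j) is left in for simplicity.
    """
    total = sum(inbound)
    return [[out * inn / total for inn in inbound] for out in outbound]


outbound = [60.0, 30.0, 10.0]  # traffic entering the network at each node
inbound = [20.0, 50.0, 30.0]   # traffic leaving the network at each node
prior_tm = gravity_prior(outbound, inbound)
for row in prior_tm:
    print([round(v, 1) for v in row])
```

Such a matrix needs only per-node totals, which makes it one candidate for producing a prior TM to seed the statistical methods.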

