
Server-based Characterization and Inference of Internet Performance





Presentation Transcript


  1. Server-based Characterization and Inference of Internet Performance Venkat Padmanabhan Lili Qiu Helen Wang Microsoft Research UCLA/IPAM Workshop March 2002

  2. Outline • Overview • Server-based characterization of performance • Server-based inference of performance • Passive Network Tomography • Summary and future work

  3. Overview • Goals • characterize end-to-end performance • infer characteristics of interior links • Approach: server-based monitoring • passive monitoring → relatively inexpensive • enables large-scale measurements • diversity of network paths

  4. [Figure: a Web server sends DATA packets to many clients; the clients' ACKs flow back to the server, enabling passive monitoring]

  5. Research Questions • Server-based characterization of end-to-end performance • correlation with topological metrics • spatial locality • temporal stability • Server-based inference of internal link characteristics • identification of lossy links

  6. Related Work • Server-based passive measurement • 1996 Olympics Web server study (Berkeley, 1997 & 1998) • characterization of TCP properties (Allman 2000) • Active measurement • NPD (Paxson 1997) • stationarity of Internet path properties (Zhang et al. 2001)

  7. Experiment Setting • Packet sniffer at microsoft.com • 550 MHz Pentium III • sits on spanning port of Cisco Catalyst 6509 • packet drop rate < 0.3% • traces up to 2+ hours long, 20-125 million packets, 50-950K clients • Traceroute source • sits on a separate Microsoft network, but all external hops are shared • infrequent and in the background

  8. Topological Metrics and Loss Rate • Topological distance is a poor predictor of packet loss rate • All links are not equal → need to identify the lossy links

  9. Spatial Locality • Do clients in the same cluster see similar loss rates? • Loss rate is quantized into buckets • 0-0.5%, 0.5-2%, 2-5%, 5-10%, 10-20%, 20+% • suggested by Zhang et al. (IMW 2001) • Focus on lossy clusters • average loss rate > 5% • Spatial locality → there may be a shared cause for packet loss
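
The bucketing is easy to make concrete. A minimal sketch in Python, using the bucket boundaries from the slide; the function name is illustrative:

```python
import bisect

# Bucket upper bounds from the slide: 0-0.5%, 0.5-2%, 2-5%, 5-10%, 10-20%, 20+%
BUCKETS = [0.005, 0.02, 0.05, 0.10, 0.20]

def loss_bucket(loss_rate):
    """Index of the quantization bucket for a loss rate in [0, 1]."""
    return bisect.bisect_right(BUCKETS, loss_rate)

assert loss_bucket(0.03) == 2   # a 3% loss rate falls in the 2-5% bucket
assert loss_bucket(0.25) == 5   # a 25% loss rate falls in the 20+% bucket
```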

  10. Temporal Stability • Loss rate again quantized into buckets • Metric of interest: stability period (i.e., time until transition into new bucket) • Median stability period ≈ 10 minutes • Consistent with previous findings based on active measurements
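
To make the stability-period metric concrete: a sketch that quantizes a time series of loss-rate samples with the buckets above and measures how long each bucket persists. The (timestamp, loss_rate) sample representation is an assumption, not the study's trace format:

```python
import bisect

BUCKETS = [0.005, 0.02, 0.05, 0.10, 0.20]  # same buckets as the previous sketch

def stability_periods(samples):
    """samples: (timestamp_seconds, loss_rate) pairs in time order. Returns
    the durations for which the quantized loss rate stayed in one bucket;
    the slide's stability period is the median of these durations."""
    periods = []
    start_t = samples[0][0]
    cur = bisect.bisect_right(BUCKETS, samples[0][1])
    for t, rate in samples[1:]:
        b = bisect.bisect_right(BUCKETS, rate)
        if b != cur:                      # transition into a new bucket
            periods.append(t - start_t)
            start_t, cur = t, b
    return periods
```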

  11. Putting it all together • All links are not equal → need to identify the lossy links • Spatial locality of packet loss rate → lossy links may well be shared • Temporal stability → worthwhile to try and identify the lossy links

  12. Passive Network Tomography • Goal: determine characteristics of internal network links using end-to-end, passive measurements • We focus on the link loss rate metric • primary goal: identifying lossy links • Why is this interesting? • locating trouble spots in the network • keeping tabs on your ISP • server placement and server selection

  13. [Figure: a Web server reached through multiple ISP networks (AT&T, Sprint, C&W, Earthlink, UUNET, AOL, Qwest); the server operator asks "Why is it so slow?" while a client grumbles "Darn, it's slow!"]

  14. Related Work • MINC (Caceres et al. 1999) • multicast-based active probing • Striped unicast (Duffield et al. 2001) • unicast-based active probing • Passive measurement (Coates et al. 2002) • look for back-to-back packets • Shared bottleneck detection • Padmanabhan 1999, Rubenstein et al. 2000, Katabi et al. 2001

  15. Active Network Tomography • [Figure: multicast probes and striped unicast probes sent from source S toward receivers A and B]

  16. Problem Formulation • Collapse linear chains into virtual links • (1-l1)*(1-l2)*(1-l4) = (1-p1), (1-l1)*(1-l2)*(1-l5) = (1-p2), …, (1-l1)*(1-l3)*(1-l8) = (1-p5) • Under-constrained system of equations • [Figure: server at the root of a tree with links l1-l8; clients at the leaves observe end-to-end loss rates p1-p5]
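
In the log domain the products become sums, so the constraints form a linear system. A sketch of that system for the tree above; the exact grouping of leaf links under l2 and l3 is inferred from the figure and may differ from the original:

```python
import numpy as np

# With L_i = log(1/(1-l_i)) and P_j = log(1/(1-p_j)), each client path
# contributes one equation: the sum of L_i over its links equals P_j.
# Link ids are 0-based: l1 -> 0, ..., l8 -> 7.
paths = [
    [0, 1, 3], [0, 1, 4], [0, 1, 5],   # clients assumed behind link l2
    [0, 2, 6], [0, 2, 7],              # clients assumed behind link l3
]
A = np.zeros((len(paths), 8))
for row, links in enumerate(paths):
    A[row, links] = 1.0                # 1 if the link lies on the path

# 5 equations for 8 unknowns: the system is under-constrained
print(np.linalg.matrix_rank(A), "equations,", A.shape[1], "unknown links")
```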

  17. #1: Random Sampling • Randomly sample the solution space • Repeat this several times • Draw conclusions based on overall statistics • How to do random sampling? • determine loss rate bound for each link using best downstream client • iterate over all links: • pick loss rate at random within bounds • update bounds for other links • Problem: little tolerance for estimation error
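
A runnable sketch of one sampling pass, under the assumption that loss rates compose additively in the log domain (L_i = log(1/(1-l_i))); function and variable names are illustrative:

```python
import math
import random

def sample_link_losses(paths, path_loss):
    """One random draw from the solution space. paths[j] lists the 0-based
    link ids on the path to client j; path_loss[j] is the observed p_j."""
    n_links = max(i for pl in paths for i in pl) + 1
    # remaining log-loss budget on each path
    residual = [math.log(1 / (1 - p)) for p in path_loss]
    L = [0.0] * n_links
    for i in range(n_links):
        # bound from the best downstream client: tightest remaining budget
        cap = min(residual[j] for j, pl in enumerate(paths) if i in pl)
        L[i] = random.uniform(0.0, cap)      # pick at random within bounds
        for j, pl in enumerate(paths):
            if i in pl:
                residual[j] -= L[i]          # update bounds for other links
    return [1 - math.exp(-Li) for Li in L]

# Repeat many draws; links that come out lossy in most draws are candidates.
```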

  18. #2: Linear Optimization • Goals • Parsimonious explanation • Robust to estimation error • Li = log(1/(1-li)), Pj = log(1/(1-pj)) • minimize Σi Li + Σj |Sj| subject to: L1+L2+L4 + S1 = P1, L1+L2+L5 + S2 = P2, …, L1+L3+L8 + S5 = P5, Li >= 0 • Can be turned into a linear program
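
A sketch of the resulting linear program, using variable splitting (Sj = Sj⁺ - Sj⁻) to make |Sj| linear; scipy's linprog is one solver choice, and equal weighting of the two objective terms is an assumption:

```python
import numpy as np
from scipy.optimize import linprog

def lp_link_losses(A, P):
    """A: 0/1 path-link matrix (n_paths x n_links); P: observed path losses
    in the log domain. Returns the inferred per-link L_i >= 0."""
    n_paths, n_links = A.shape
    # variables: [L (n_links), S+ (n_paths), S- (n_paths)], all >= 0
    c = np.concatenate([np.ones(n_links), np.ones(2 * n_paths)])
    A_eq = np.hstack([A, np.eye(n_paths), -np.eye(n_paths)])
    res = linprog(c, A_eq=A_eq, b_eq=P, bounds=(0, None))
    return res.x[:n_links]

# Links with large L_i (loss rate 1 - exp(-L_i)) are flagged as lossy; the
# slack terms absorb estimation error in the observed path losses.
```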

  19. #3: Bayesian Inference • Basics: • D: observed data • sj: # packets successfully sent to client j • fj: # packets that client j fails to receive • Θ: unknown model parameters • li: packet loss rate of link i • Goal: determine the posterior P(Θ|D) • inference is based on loss events, not loss rates • Bayes theorem • P(Θ|D) = P(D|Θ)P(Θ)/∫P(D|Θ)P(Θ)dΘ • hard to compute since Θ is multidimensional • [Figure: the tree of slide 16, with client j observing (sj, fj)]
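
Since each packet here is unicast to a single client, the likelihood factorizes across clients: client j sees sj successes out of sj + fj packets, each succeeding with probability ∏(1-li) over the links on its path. A sketch of that log-likelihood, with an illustrative path encoding:

```python
import math

def log_likelihood(ls, paths, s, f):
    """ls: per-link loss rates; paths[j]: 0-based link ids on the path to
    client j; s[j]/f[j]: packets client j received / lost."""
    ll = 0.0
    for j, pl in enumerate(paths):
        p_ok = math.prod(1 - ls[i] for i in pl)   # P(packet reaches client j)
        ll += s[j] * math.log(p_ok) + f[j] * math.log(1 - p_ok)
    return ll
```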

  20. Gibbs Sampling • Markov Chain Monte Carlo (MCMC) • construct a Markov chain whose stationary distribution is P(Θ|D) • Gibbs Sampling: defines the transition kernel • start with an arbitrary initial assignment of li • consider each link i in turn • compute P(li|D) assuming lj is fixed for j≠i • draw sample from P(li|D) and update li • after burn-in period, we obtain samples from the posterior P(Θ|D)

  21. Gibbs Sampling Algorithm 1) Initialize link loss rates arbitrarily 2) For j = 1 : burn-in • for each link i, compute P(li|D, {li′}) and draw a new li, where li is the loss rate of link i and {li′} = {lj : j ≠ i} 3) For j = 1 : realSamples • for each link i, compute P(li|D, {li′}) and draw a new li 4) Use all the samples obtained at step 3 to approximate P(Θ|D)
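
The conditional P(li|D, {li′}) has no convenient closed form here, so this sketch evaluates it on a discrete grid of candidate loss rates and samples from the normalized weights; the grid resolution, flat prior, and burn-in defaults are all choices of this sketch, not the paper's:

```python
import numpy as np

def gibbs_link_losses(paths, s, f, n_links, burn_in=500, n_samples=2000, grid=200):
    """paths[j]: 0-based link ids to client j; s[j]/f[j]: received/lost."""
    rng = np.random.default_rng(0)
    ls = np.full(n_links, 0.01)              # arbitrary initial assignment
    cand = np.linspace(1e-4, 0.999, grid)    # candidate loss rates for one link
    out = []
    for step in range(burn_in + n_samples):
        for i in range(n_links):             # consider each link in turn
            log_post = np.zeros(grid)        # log P(li|D, rest) up to a constant
            for j, pl in enumerate(paths):
                if i not in pl:
                    continue
                # success prob contributed by the other links on path j
                q = np.prod([1 - ls[k] for k in pl if k != i])
                p = (1 - cand) * q
                log_post += s[j] * np.log(p) + f[j] * np.log1p(-p)
            w = np.exp(log_post - log_post.max())
            ls[i] = rng.choice(cand, p=w / w.sum())  # draw from P(li|D, rest)
        if step >= burn_in:                  # keep post-burn-in samples only
            out.append(ls.copy())
    return np.array(out)                     # draws approximating P(Θ|D)
```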

  22. Experimental Evaluation • Simulation experiments • Internet traffic traces

  23. Simulation Experiments • Advantage: no uncertainty about link loss rate • Methodology • Topologies used: • randomly-generated: 20 - 3000 nodes, max degree = 5-50 • real topology obtained by tracing paths to microsoft.com clients • randomly-generated packet loss events at each link • a fraction f of the links are good, and the rest are "bad" • LM1: good links: 0 - 1%, bad links: 5 - 10% • LM2: good links: 0 - 1%, bad links: 1 - 100% • Goodness metrics: • Coverage: # correctly inferred lossy links • False positives: # incorrectly inferred lossy links
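
A sketch of the LM1/LM2 link loss models as described: a fraction f of the links is good, the rest bad, with loss rates drawn from the stated ranges (uniform draws within each range are an assumption of this sketch):

```python
import random

def assign_link_losses(n_links, f_good, model="LM1"):
    """Good links: 0-1% loss. Bad links: 5-10% (LM1) or 1-100% (LM2)."""
    bad_lo, bad_hi = (0.05, 0.10) if model == "LM1" else (0.01, 1.00)
    return [random.uniform(0.0, 0.01) if random.random() < f_good
            else random.uniform(bad_lo, bad_hi)
            for _ in range(n_links)]
```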

  24. Simulation Results

  25. Simulation Results

  26. Simulation Results • High confidence in top few inferences

  27. Trade-off

  28. Internet Traffic Traces • Challenge: validation • Divide client traces into two: tomography set and validation set • Tomography data set → loss inference • Validation set → check if clients downstream of the inferred lossy links experience high loss • Results • false positive rate is between 5 - 30% • likely candidates for lossy links: • links crossing an inter-AS boundary • links having a large delay (e.g. transcontinental links) • links that terminate at clients • example lossy links: • San Francisco (AT&T) → Indonesia (Indo.net) • Sprint → PacBell in California • Moscow → Tyumen, Siberia (Sovam Teleport)

  29. Summary • Poor correlation between topological metrics & performance • Significant spatial locality and temporal stability • Passive network tomography is feasible • Tradeoff between computational cost and accuracy • Future directions • real-time inference • selective active probing • Acknowledgements: • MSR: Dimitris Achlioptas, Christian Borgs, Jennifer Chayes, David Heckerman, Chris Meek, David Wilson • Infrastructure: Rob Emanuel, Scott Hogan http://www.research.microsoft.com/~padmanab
