
Server-based Inference of Internet Performance V. N. Padmanabhan, L. Qiu, and H. Wang.

This paper discusses the diagnosis engine for identifying and troubleshooting slow internet performance. It proposes a passive network tomography approach using Gibbs sampling for inferring link loss rates and provides a comprehensive evaluation of the methodology.


Presentation Transcript


  1. Server-based Inference of Internet Performance • V. N. Padmanabhan, L. Qiu, and H. Wang • Modified by Yao Zhao

  2. Motivation • “It’s so slow!” • “Why is it so slow?” • Diagnosis engine • [Figure labels: Web Server, Ethernet, AT&T, C&W, UUNet, Sprint, AOL, Qwest, Earthlink]

  3. Network Diagnosis • Diagnosis results: • Qwest access link: 63.232.180.230->63.232.33.134 • Peering between UUNET and AOL: 64.45.216.154->172.139.89.74 • [Figure labels: diagnosis engine, network topology, Netmon/tcpdump traces, trouble spot locations]

  4. Network Diagnosis (Cont.) • Goal: Determine internal network characteristics using passive end-to-end measurements • Primary focus: identifying lossy links • Applications • Troubleshooting • Server selection • Server placement • Overlay network path construction

  5. Previous Work • Active probing to infer link loss rate • multicast probes • striped unicast probes • Pros & cons • accurate since individual loss events are identified • expensive because of extra probe traffic • [Figure labels, two diagrams: S, A, B]

  6. Problem Formulation • (1-l1)*(1-l2)*(1-l4) = (1-p1) • (1-l1)*(1-l2)*(1-l5) = (1-p2) • … • (1-l1)*(1-l3)*(1-l8) = (1-p5) • Challenges: • Under-constrained system of equations • Measurement errors • [Figure: tree topology rooted at the server, with links l1 to l8 and clients with end-to-end loss rates p1 to p5]
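To make the "under-constrained" challenge concrete, here is a minimal sketch of the forward model using only the three paths that the slide spells out. The loss values and the names `PATHS`, `path_loss`, `shared`, and `split` are hypothetical, chosen for illustration.

```python
import math

# Paths exactly as written in the slide's equations (p3 and p4 are elided there).
PATHS = {
    "p1": ["l1", "l2", "l4"],
    "p2": ["l1", "l2", "l5"],
    "p5": ["l1", "l3", "l8"],
}

def path_loss(link_loss):
    """End-to-end loss per path: 1 - product of (1 - l_i) over the path's links.

    Links missing from link_loss are treated as loss-free.
    """
    return {
        path: 1 - math.prod(1 - link_loss.get(link, 0.0) for link in links)
        for path, links in PATHS.items()
    }

# Two different (hypothetical) link assignments...
shared = {"l1": 0.1}             # all loss on the shared first link
split = {"l2": 0.1, "l3": 0.1}   # same loss pushed one level down

# ...produce the same end-to-end observations, so the paths alone
# cannot tell them apart.
print(path_loss(shared))
print(path_loss(split))
```

Both assignments give every client a 10% end-to-end loss rate, which is why the methods on the following slides need extra assumptions (random sampling, an optimization goal, or a prior) to pick among the consistent solutions.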

  7. Three methods • Random sampling • Linear optimization • Bayesian inference using Gibbs sampling (we’ll focus on the last one)

  8. Random Sampling • Each sample: randomly assign loss rates to links, subject to the constraints • Iterate R times and average • [Figure: tree topology rooted at the server, with links l1 to l8 and clients with end-to-end loss rates p1 to p5]
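One plausible reading of this slide, sketched under stated assumptions: walk the tree from the server downward, draw each interior link's loss rate uniformly from the range that still lets every downstream client's equation be satisfied, and let each last-hop link absorb the remainder. The tree shape, the client loss values, and all function names below are hypothetical; the slides do not give the exact sampling rule.

```python
import random

# Hypothetical tree: interior link -> child links.
CHILDREN = {"l1": ["l2", "l3"], "l2": ["l4", "l5"], "l3": ["l8"]}
# Hypothetical observed end-to-end loss rate of the client behind each last-hop link.
CLIENT_LOSS = {"l4": 0.10, "l5": 0.02, "l8": 0.02}

def min_downstream_loss(link):
    """Smallest end-to-end loss among clients below this link."""
    if link in CLIENT_LOSS:
        return CLIENT_LOSS[link]
    return min(min_downstream_loss(child) for child in CHILDREN[link])

def sample_once(rng, link="l1", upstream_success=1.0, out=None):
    """Draw one assignment of link loss rates consistent with CLIENT_LOSS."""
    if out is None:
        out = {}
    if link in CLIENT_LOSS:
        # Last hop: forced so the whole path matches the observation.
        out[link] = max(0.0, 1 - (1 - CLIENT_LOSS[link]) / upstream_success)
    else:
        # Interior link: cannot lose more than the best client below it allows.
        max_loss = max(0.0, 1 - (1 - min_downstream_loss(link)) / upstream_success)
        out[link] = rng.uniform(0.0, max_loss)
        for child in CHILDREN[link]:
            sample_once(rng, child, upstream_success * (1 - out[link]), out)
    return out

def random_sampling(R, seed=0):
    """Average R random consistent assignments, as the slide describes."""
    rng = random.Random(seed)
    totals = {}
    for _ in range(R):
        for link, rate in sample_once(rng).items():
            totals[link] = totals.get(link, 0.0) + rate
    return {link: total / R for link, total in totals.items()}

print(random_sampling(500))
```

Each individual sample satisfies the path equations exactly; the averaged estimate generally does not, which is one reason a scheme like this can mislabel links.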

  9. Linear Programming • Linear equations • LP constraints • Optimization goal
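The equations on this slide did not survive in the transcript. As a hedged reconstruction of the standard first step only, taking logarithms of the path equations from the problem formulation turns the products into sums. The symbols L_i and P_j are introduced here for illustration and do not appear in the slides.

```latex
\begin{align*}
\prod_{i \in \mathrm{path}(j)} (1 - l_i) &= 1 - p_j \\
\sum_{i \in \mathrm{path}(j)} L_i &= P_j,
\qquad L_i = \log\frac{1}{1 - l_i} \ge 0,
\quad P_j = \log\frac{1}{1 - p_j}
\end{align*}
```

The system is linear in the L_i, with L_i ≥ 0 as a natural constraint. Because it is under-constrained, an LP also needs an objective to select one solution; the objective the authors used is not recoverable from this transcript.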

  10. Gibbs Sampling • D: observed packet transmissions and losses at the clients • Θ: ensemble of loss rates of links in the network • Goal: determine the posterior distribution P(Θ|D) • Approach • Use Markov Chain Monte Carlo with Gibbs sampling to obtain samples from P(Θ|D) • Draw conclusions based on the samples

  11. Gibbs Sampling (Cont.) • Applying Gibbs sampling to network tomography • 1) Initialize link loss rates arbitrarily • 2) For j = 1 : warmup, for each link i, compute P(li|D, {li’}) and draw a new li from it, where li is the loss rate of link i and {li’} = ∪_{k≠i} lk (the loss rates of all other links) • 3) For j = 1 : realSamples, for each link i, compute P(li|D, {li’}) and draw a new li from it • Use all the samples obtained at step 3 to approximate P(Θ|D)
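The three steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a uniform prior on each link's loss rate, independent (Bernoulli) packet losses, and a discretized grid of candidate loss rates for evaluating each conditional. The paths, packet counts, and names (`PATHS`, `GRID`, `gibbs`) are hypothetical.

```python
import math
import random

# Hypothetical client paths and observations: client -> (packets delivered, packets lost).
PATHS = {"c1": ["l1", "l2", "l4"], "c2": ["l1", "l2", "l5"], "c5": ["l1", "l3", "l8"]}
DATA = {"c1": (900, 100), "c2": (990, 10), "c5": (990, 10)}
LINKS = sorted({link for path in PATHS.values() for link in path})
GRID = [(k + 0.5) / 100 for k in range(100)]  # candidate loss rates strictly inside (0, 1)

def log_likelihood(loss, data):
    """log P(D | loss rates) under independent per-packet losses."""
    total = 0.0
    for client, (delivered, lost) in data.items():
        success = math.prod(1 - loss[link] for link in PATHS[client])
        total += delivered * math.log(success) + lost * math.log(1 - success)
    return total

def gibbs(data, warmup, real_samples, rng):
    loss = {link: rng.choice(GRID) for link in LINKS}  # step 1: arbitrary start
    samples = []
    for j in range(warmup + real_samples):             # steps 2 and 3 share one loop
        for link in LINKS:
            # Conditional of this link given the data and all other links,
            # evaluated on the grid (uniform prior, so it is proportional to the likelihood).
            log_w = []
            for candidate in GRID:
                loss[link] = candidate
                log_w.append(log_likelihood(loss, data))
            top = max(log_w)
            weights = [math.exp(w - top) for w in log_w]
            loss[link] = rng.choices(GRID, weights=weights)[0]
        if j >= warmup:                                 # keep only post-warmup sweeps
            samples.append(dict(loss))
    return samples

samples = gibbs(DATA, warmup=100, real_samples=300, rng=random.Random(0))
means = {link: sum(s[link] for s in samples) / len(samples) for link in LINKS}
print(means)
```

With these made-up counts, only client c1 sees heavy loss, and the links it shares with c2 are constrained to be nearly loss-free, so the posterior mass concentrates on l4. The retained samples also give a per-link distribution, not just a point estimate.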

  12. Performance Evaluation • Simulation experiments • Trace-driven validation

  13. Simulation Experiments • Advantage: no uncertainty about link loss rate! • Methodology • Topologies used: • randomly-generated: 20 - 3000 nodes, max degree = 5-50 • real topology obtained by tracing paths to microsoft.com clients • randomly-generated packet loss events at each link • A fraction f of the links are good, and the rest are “bad” • LM1: good links: 0 – 1%, bad links: 5 – 10% • LM2: good links: 0 – 1%, bad links: 1 – 100% • Link loss processes: Bernoulli and Gilbert • Goodness metrics: • Coverage: # correctly inferred lossy links • False positive: # incorrectly inferred lossy links
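A minimal sketch of how the loss models and goodness metrics above could be set up in a simulation. The LM1 and LM2 ranges come from the slide; the function names, the exact way the good fraction f is applied, and the threshold that separates "lossy" from "not lossy" are assumptions made for illustration.

```python
import random

# Loss-rate ranges from the slide: (low, high) for good and bad links.
LOSS_MODELS = {
    "LM1": {"good": (0.0, 0.01), "bad": (0.05, 0.10)},
    "LM2": {"good": (0.0, 0.01), "bad": (0.01, 1.00)},
}

def assign_loss_rates(links, f, model, rng):
    """Mark a fraction f of the links good, the rest bad, and draw loss rates."""
    order = list(links)
    rng.shuffle(order)
    n_good = round(f * len(order))
    ranges = LOSS_MODELS[model]
    rates = {}
    for index, link in enumerate(order):
        low, high = ranges["good"] if index < n_good else ranges["bad"]
        rates[link] = rng.uniform(low, high)
    return rates

def score(true_rates, inferred_rates, threshold):
    """Count correctly and incorrectly inferred lossy links.

    A link is treated as lossy when its rate exceeds `threshold`
    (a hypothetical cutoff, not specified in the slides).
    """
    truly_lossy = {l for l, r in true_rates.items() if r > threshold}
    inferred_lossy = {l for l, r in inferred_rates.items() if r > threshold}
    return {
        "correct": len(truly_lossy & inferred_lossy),
        "false_positive": len(inferred_lossy - truly_lossy),
        "truly_lossy": len(truly_lossy),
    }

links = [f"link{i}" for i in range(100)]
truth = assign_loss_rates(links, f=0.9, model="LM1", rng=random.Random(0))
print(score(truth, truth, threshold=0.03))
```

Dividing `correct` by `truly_lossy` gives a coverage fraction comparable to the percentages quoted in the results.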

  14. Comparative Results • Random sampling generates many false positives • LP has low coverage (30-60%) • Gibbs sampling performs very well (80% coverage with a 5% false positive rate)

  15. Random topologies • The confidence estimate for Gibbs sampling works well and can be used to rank-order the inferred lossy links.

  16. Trace-driven Validation • Validation approach • Divide client traces into two: tomography and validation • Tomography data set → loss inference • Validation set → check if clients downstream of the inferred lossy links experience high loss • Experimental setup • Real topologies and loss traces collected from traceroute and tcpdump at microsoft.com during Dec. 20, 2000 and Jan. 11, 2002 • Results • For the small subset of inferences that could be validated, all the inferences are correct • Likely candidates for lossy links: • links crossing an inter-AS boundary • links having a large delay (e.g. transcontinental links) • links that terminate at clients
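The validation approach can be sketched as a split-and-check procedure. This is an illustration under assumptions: the random half-and-half split, the loss threshold, the rule that every downstream validation client must show high loss, and all names are hypothetical and not taken from the slides.

```python
import random

def split_clients(clients, rng):
    """Randomly divide clients into a tomography set and a validation set."""
    shuffled = list(clients)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def validate(inferred_lossy, validation_loss, paths, threshold):
    """Check each inferred lossy link against held-out clients.

    Returns link -> True/False for links that have at least one validation
    client downstream; links with none cannot be validated and are omitted.
    """
    result = {}
    for link in inferred_lossy:
        downstream = [c for c in validation_loss if link in paths[c]]
        if downstream:
            result[link] = all(validation_loss[c] > threshold for c in downstream)
    return result

# Hypothetical example: one held-out client sits behind link l4.
paths = {"c1": ["l1", "l2", "l4"]}
print(validate(["l4", "l8"], {"c1": 0.12}, paths, threshold=0.05))
```

Here l8 is dropped from the output because no validation client lies behind it, which mirrors the slide's remark that only a small subset of inferences could be validated.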

  17. Summary • Passive network tomography is feasible • Gibbs sampling yields high coverage (over 80%) and a low false positive rate (below 5-10%) • Future work: perform loss inference in real time
