
End-to-End Detection of Shared Bottlenecks


Presentation Transcript


  1. End-to-End Detection of Shared Bottlenecks Sridhar Machiraju and Weidong Cui Sahara Winter Retreat 2003

  2. Problem Statement • Given 2 end-to-end flows f1 and f2, do they share a bottleneck (a congested link, i.e., a link with packet drops)? (OR) • Given 2 routes R1 and R2 on the Internet, do they share a bottleneck link?

  3. Why is this hard? • No information from the network • The only information available is delays and drops • Lots of noise: delay from intermediate links and drops on other links • Bottlenecks may change over time

  4. Why solve this problem? • Overlays • RON – decide whether rerouting flows bypasses congestion points • RON – does such rerouting affect existing flows? Which ones? • Cooperative overlays – an overlay does not want to share a bottleneck with a “friendly overlay” • OverQoS – useful for clustering overlay links based on shared bottlenecks

  5. Why solve this problem (cont.)? • Other applications • Massive backups of data from different servers – do them in parallel? • Content distribution – will the use of multipath improve performance? • Kazaa – parallel downloads from peers • Multihomed ASes can evaluate their “orthogonality” in terms other than fault tolerance

  6. Related Work • Past work addressed only Y or inverted-Y topologies, using Poisson probes, packet pairs, and inter-arrival times [Figure: Y and inverted-Y topologies with senders and receivers]

  7. Goals • Provide a general solution for the double-Y topology • Work with multiple bottlenecks and provide an indicator of shared congestion • Be able to use active probe flows as well as passively observed (TCP) flows • Address the complexity of clustering flows

  8. Motivation of Our Techniques • Drop-tail queues + TCP: queues exhibit bursty loss periods alternating with loss-free periods • Queues build up until bursty losses occur, then decrease in size before increasing again • This motivates correlating the periods of drops and delays (delays are proportional to queue sizes) • But…

  9. Synchronization Lag [Figure: timing diagram of two flows sent at a constant interval T (Sender 1 / Flow 1, Sender 2 / Flow 2) with one-way delays d1 and d2; in this example the synchronization lag is 3T. Note: the synchronization lag is bounded by RTTmax/2.]

  10. Overview of Our Techniques • We propose 2 techniques: • the Probability Distribution (PD) technique • the Cross-Correlation (CC) technique • PD is based on finding the peak of the discrete probability distribution of the minimum time between a drop of one flow and a drop of the other • CC is based on finding the maximum cross-correlation over a range of assumed synchronization lags

  11. PD Technique • For each dropped packet of a flow, plot the PD of the minimum time difference between its sending time and the sending times of the other flow’s dropped packets • With a shared bottleneck we expect (ideally) a probability of 1 at d2 - d1 plus the synchronization lag; all flows may not see drops during the same burst, so use a threshold < 1 for the peak • We may see more than one drop per burst; cluster drops into bursts and use the time differences between burst starts
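A minimal Python sketch of the PD computation above, assuming the drop (or burst-start) sending times have already been extracted from the traces and each flow saw at least two drops; the name pd_peak and the bin width are illustrative choices, not from the slides.

    import numpy as np

    def pd_peak(drops1, drops2, bin_ms=5.0):
        # For each drop of flow 1, take the signed time difference to the
        # nearest drop of flow 2, discretize the differences into bins, and
        # return the peak probability mass. With a shared bottleneck the
        # mass should concentrate in one bin; compare against a threshold < 1.
        t1 = np.sort(np.asarray(drops1, dtype=float))
        t2 = np.sort(np.asarray(drops2, dtype=float))  # needs len(t2) >= 2
        idx = np.clip(np.searchsorted(t2, t1), 1, len(t2) - 1)
        before = t2[idx - 1] - t1               # nearest drop at or before
        after = t2[idx] - t1                    # nearest drop after
        diffs = np.where(np.abs(before) < np.abs(after), before, after)
        bins = np.round(diffs / bin_ms).astype(int)
        _, counts = np.unique(bins, return_counts=True)
        return counts.max() / len(diffs)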

  12. PD Technique (cont.) [Figure: packet-loss and delay traces (Delay1, Delay2) for the two flows] • Robustness issue: the synchronization lag must be smaller than the time difference between consecutive drops of a flow

  13. Cross-Correlation (CC) Technique • Key ideas • Two “back-to-back” packets from two different flows will experience similar drops/delays at the bottleneck • If we can generate two sequences of “back-to-back” packets from two different flows, we can calculate the cross-correlation coefficient of their losses or delays to measure their “similarity” • If the cross-correlation coefficient is greater than some threshold, the two flows share a bottleneck
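A minimal sketch of this decision rule, assuming two equal-length probe sequences (0/1 loss indicators or one-way delays) already paired probe-by-probe and not constant over the window; the 0.1 threshold is the value used later in the experiments, and the function name is illustrative.

    import numpy as np

    def cc_shares_bottleneck(seq1, seq2, threshold=0.1):
        # Pearson cross-correlation coefficient of the paired sequences;
        # declare a shared bottleneck when it exceeds the threshold.
        x = np.asarray(seq1, dtype=float)
        y = np.asarray(seq2, dtype=float)
        coeff = np.corrcoef(x, y)[0, 1]         # value in [-1, 1]
        return coeff > threshold, coeff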

  14. Questions about the CC Technique • How do we generate two sequences of “back-to-back” packets? • UDP probes with a constant interval T • The average pairing interval is then <= T/2 • Shift the sequence to overcome the synchronization lag • How long should the two sequences be to get a significant result? • Long enough for the CC coefficient to become relatively stable • But no less than a minimum period of time • What should the threshold be? • We use 0.1 in the experiments • Why 0.1?

  15. Overcoming the Synchronization Problem [Figure: loss and delay traces (Delay1, Delay2) with one sequence shifted by 2 packets] • Find the maximum cross-correlation by shifting one of the two sequences within some range • The optimal shift is an estimate of the synchronization lag
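The shift search might look like the following sketch; max_shift, in units of the probe interval T, is an assumed parameter that should cover the RTTmax/2 bound on the synchronization lag.

    import numpy as np

    def max_shifted_cc(seq1, seq2, max_shift):
        # Slide the two equal-length sequences against each other and keep
        # the shift that maximizes the cross-correlation coefficient; that
        # shift, in probe intervals, estimates the synchronization lag.
        # Assumes neither sequence is constant over the compared windows.
        x = np.asarray(seq1, dtype=float)
        y = np.asarray(seq2, dtype=float)
        best_coeff, best_shift = -1.0, 0
        for s in range(-max_shift, max_shift + 1):
            a = x[max(s, 0):len(x) + min(s, 0)]
            b = y[max(-s, 0):len(y) + min(-s, 0)]
            coeff = np.corrcoef(a, b)[0, 1]
            if coeff > best_coeff:
                best_coeff, best_shift = coeff, s
        return best_coeff, best_shift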

  16. Wide-Area Experiments • Challenges • How do we get access to globally distributed hosts? • How do we verify our experimental results? • Solutions • PlanetLab (http://www.planet-lab.org) • Set up an overlay network with a double-Y topology • Application-level routers monitor losses and delays

  17. Topology with Shared Bottleneck (I) [Figure: double-Y overlay over PlanetLab nodes at Vancouver, Bologna, Seattle, Wisc, Atlanta, and Sydney in which the two flows traverse a shared link]

  18. Topology without Shared Bottleneck (II) [Figure: the same PlanetLab nodes (Vancouver, Bologna, Seattle, Wisc, Atlanta, Sydney) arranged so the two flows share no link]

  19. Experimental Setup • Active probing • 40 bytes per packet • One probe every 10 ms • Log packet arrival times on every node • Loss information can also be derived from these logs • Traces from 10 minutes to 60 minutes • Threshold = 0.1 for the PD and CC techniques
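A minimal sketch of an active-probing sender matching this setup (40-byte UDP probes every 10 ms); the endpoint arguments and payload layout are assumptions for illustration, since the slides do not specify them.

    import socket
    import time

    def send_probes(host, port, duration_s, interval_s=0.010, size=40):
        # Emit fixed-size UDP probes at a constant interval. Each probe
        # carries a sequence number and send timestamp so the receiver's
        # arrival log also reveals which probes were lost.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        seq, t_next = 0, time.monotonic()
        deadline = t_next + duration_s
        while time.monotonic() < deadline:
            payload = seq.to_bytes(8, "big") + time.time_ns().to_bytes(8, "big")
            sock.sendto(payload.ljust(size, b"\x00"), (host, port))
            seq += 1
            t_next += interval_s
            time.sleep(max(0.0, t_next - time.monotonic()))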

  20. Overall Results [Table: per-technique results, including the failed cases discussed on the following slides]

  21. Why does the Delay CC Technique fail? • Delay spikes in the non-shared part of the path

  22. Why does the PD Technique fail? • Large synchronization lag • Too few drops at the bottleneck

  23. Open Issues • Parameter selection • What should the thresholds be? • Active vs. passive probing • Active probing wastes network resources • Passive probing cannot control the size/rate of the probing sequences • Multiple bottlenecks • Our techniques are not limited to single-bottleneck cases • But they need more quantitative evaluation • Probability of sharing a bottleneck • How often should we generate probing sequences to detect whether two flows share a bottleneck? • Can we give a probability rather than a 0-1 decision?

  24. Conclusions • Problem • Detect whether 2 end-to-end flows share a bottleneck • Challenge • Synchronization lag in the double-Y topology • Techniques • The Probability Distribution (PD) technique • The Loss/Delay Cross-Correlation (CC) technique • Experimental results • The Loss CC technique succeeds in all experiments • The Delay CC technique fails in some experiments due to delay spikes in the non-shared part of the path • The PD technique fails in some experiments due to large synchronization lag and too few losses at the bottleneck
