
Magellan: A Tool for Unicast Fault Isolation

Cengiz Alaettinoglu (Packet Design LLC), Ramesh Govindan (Information Sciences Institute), John Mehringer (Information Sciences Institute)


Presentation Transcript


  1. Magellan: A Tool for Unicast Fault Isolation Cengiz Alaettinoglu Packet Design LLC Ramesh Govindan Information Sciences Institute John Mehringer Information Sciences Institute

  2. Motivation • Why can't I reach www.cnn.com? • Why is the Internet soooo slow today? • It was fine yesterday!

  3. Goals • User's perspective • What is of interest to the user • Internet-wide routing monitoring • not just an AS • History of route changes • not just a snapshot • Fault diagnosis • link/router failure/repair

  4. Challenges • Scaling • Directed search by correlating destinations • Shared learning • Automated heuristics for fault isolation • Route change • Location of link/router failure/repair • Oscillations • Others?

  5. Data Collection • Select targets of interest to the user • tcpdump/libpcap • Weighting / aging (not implemented) • Initial path to targets • traceroute • Monitoring paths • Carefully constructed ICMP probes

  6. Snapshot

  7. Monitoring • Construct a routing graph • Nodes: routers • Links: (to, from, source, destination, hop, statistics...) • Probe each link • Send two ICMP Echo Request packets to the destination • With ttl = hop - 1 and ttl = hop, verify the incident routers (from, to)
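The link record and the two-probe check on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Link` class, its field names, and the helper functions are hypothetical, and the sketch only plans and checks probes (sending real ICMP packets would need raw sockets).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Link:
    """One edge of the routing graph on the path to one destination.

    Field names are hypothetical; they mirror the tuple on the slide.
    """
    from_router: str   # router expected to answer at ttl = hop - 1
    to_router: str     # router expected to answer at ttl = hop
    source: str
    destination: str
    hop: int           # hop count at which to_router appears


def probe_plan(link: Link) -> List[Tuple[int, str]]:
    """Return (ttl, expected_router) pairs for the probes of one link.

    Assumption: a TTL below 1 is never sent, so a first-hop link
    gets a single probe.
    """
    plan = [(link.hop - 1, link.from_router), (link.hop, link.to_router)]
    return [(ttl, router) for ttl, router in plan if ttl >= 1]


def link_verified(link: Link, responders: List[str]) -> bool:
    """True if the routers that answered match the plan, in order."""
    expected = [router for _, router in probe_plan(link)]
    return responders == expected
```

For a link at hop 3, `probe_plan` yields one probe with TTL 2 and one with TTL 3; a mismatch in either responder counts as a negative test result.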

  8. Scheduling Probes • Schedule a probe for each link using weighted round-robin (WRR) • Limits the rate of probe packets • Weights: some links are more important/interesting • Distance to link • Number of destinations using it • History of volatility • Exponentially averaged
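A weighted round-robin schedule with an exponentially averaged volatility term might look like the sketch below. The slide does not say which WRR variant Magellan uses or how the three weight factors are combined, so this example assumes the "smooth" WRR variant, takes the final integer weights as given, and uses a hypothetical smoothing factor `alpha`.

```python
from typing import Dict, List


def ewma(previous: float, sample: float, alpha: float = 0.25) -> float:
    """Exponentially weighted moving average; alpha is an assumed value."""
    return alpha * sample + (1.0 - alpha) * previous


def wrr_order(weights: Dict[str, int], ticks: int) -> List[str]:
    """Pick one link per tick using smooth weighted round-robin.

    Each tick every link gains its weight as credit; the link with the
    most credit is probed and pays back the sum of all weights.
    """
    credit = {link: 0 for link in weights}
    total = sum(weights.values())
    order = []
    for _ in range(ticks):
        for link, weight in weights.items():
            credit[link] += weight
        chosen = max(credit, key=credit.get)
        credit[chosen] -= total
        order.append(chosen)
    return order
```

Because exactly one probe is issued per tick, the tick interval bounds the probe rate, and over `sum(weights)` ticks each link is probed as many times as its weight.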

  9. Test Result • Positive • Do nothing • Negative • Determine new path • Incremental traceroute from the link upstream and downstream • Determine cause • Based on automated heuristics

  10. Active Fault Isolation • Link failure • Probe the link using other destinations that use it • Correlate results • Router failure • Generalize on link failure • Oscillations • History of old routes • Back and forth between a set of routes
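The correlation ideas on this slide can be illustrated with a small sketch. The slides do not give the actual heuristics, so the rules below are assumptions: a link counts as failed only if probes through it fail for every destination that uses it, a router counts as failed if at least two of its links failed and none survived, and a route oscillates if an earlier route reappears after a change. All names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Route = Tuple[str, ...]  # a route as an ordered tuple of router names


def link_failed(ok_by_destination: Dict[str, bool]) -> bool:
    """Assumed rule: failed only if probes via every destination failed."""
    return bool(ok_by_destination) and not any(ok_by_destination.values())


def classify(suspect: str,
             router_links: List[str],
             results: Dict[str, Dict[str, bool]]) -> str:
    """Classify a negative test on `suspect`.

    router_links lists all links incident to the suspect's far-end router
    (including the suspect); results maps link -> {destination: probe ok}.
    """
    if not link_failed(results[suspect]):
        return "unknown"
    if len(router_links) > 1 and all(link_failed(results[l]) for l in router_links):
        return "router failure"
    return "link failure"


def is_oscillating(history: Sequence[Route]) -> bool:
    """True if some route reappears after a different route was in use."""
    changes = [r for i, r in enumerate(history) if i == 0 or r != history[i - 1]]
    return len(changes) != len(set(changes))
```

A mixed result, where the link works for some destinations but not others, is left as "unknown" here; it may indicate a route change for one destination instead of a failure.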

  11. Magellan Components • Magellan • Perl script • Nam • Visualization • Offline or real-time • Great for debugging/tuning

  12. Snapshot • Link or router failure • I want the nam buttons, etc...

  13. Effectiveness through Measurement • Picked 500 popular web sites • Yahoo, MSN, AOL, CNN, ... • www.web100.com • Monitored routes to these destinations for 7 days

  14. Measurements • Number of link probes: 839694 • Probes per second: 1.39 • Total failures: 2078 • Router failures: 334 • Link failures: 951 • Unknown cause: 793 • Transients • Number of oscillations: 541
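The figures on this slide are internally consistent, which a short calculation confirms. The only assumption is that monitoring ran continuously for exactly the 7 days stated on the previous slide; the variable names are illustrative.

```python
# Figures copied from the Measurements slide.
probes = 839_694
failures = {"router": 334, "link": 951, "unknown": 793}

# Assumption: continuous monitoring for exactly 7 days.
seconds = 7 * 24 * 3600

rate = probes / seconds                 # probes per second
total_failures = sum(failures.values())
unknown_share = failures["unknown"] / total_failures

print(f"{rate:.2f} probes/s, {total_failures} failures, "
      f"{unknown_share:.1%} of unknown cause")
```

The rate rounds to the 1.39 probes per second reported, and the three failure categories sum to the reported total of 2078.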

  15. Number of Path Changes

  16. Effect of Path Length

  17. Dominant Path

  18. Cumulative Dominant Path

  19. Future work: Distributed Magellan • Weight to probe inversely proportional to ratio of distances • Shared learning • (Diagram: Magellan 1, Magellan 2)

  20. Related Work • Topology Maps • Router/AS level interconnections • Mercator, skitter, AT&T • Not all links are usable (routing policy/metrics) • Routing Topology • Effect of policy/metrics • NPD (Vern Paxson's work) • Focus is on measurement

  21. Conclusions • Unicast fault isolation • User's perspective • Automated heuristics • History of changes • http://www.isi.edu/scan
