Internet Routing Instability

Internet Routing Instability. Craig Labovitz, G. Robert Malan, Farnam Jahanian. Appeared: SIGCOMM ’97. Presenters: Supranamaya Ranjan, Mohammed Ahamed. Internet Structure. Many small ISPs at the lowest level. Small number of big ISPs at the core. The Core of the Internet.


Presentation Transcript


  1. Internet Routing Instability. Craig Labovitz, G. Robert Malan, Farnam Jahanian. Appeared: SIGCOMM ’97. Presenters: Supranamaya Ranjan, Mohammed Ahamed

  2. Internet Structure • Many small ISPs at the lowest level • Small number of big ISPs at the core

  3. The Core of the Internet (figure: Sprint, Verio, UUNet, rice.edu) • Routing done using BGP at the core • Intra-domain routing could be RIP, OSPF, etc.

  4. BGP Overview (figure: Sprint, Verio, and UUNet, labeled with the prefixes 92.92.x.x, 128.42.x.x, 196.29.x.x, and 100.100.x.x)

  5. BGP Overview (contd.) • Path vector protocol • Similar to distance vector routing • Loop detection done using the AS_PATH field • Peering session between routers R1 and R2 runs over TCP • Full routing table exchanged at start • Updates sent incrementally
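
The AS_PATH loop check mentioned on this slide can be sketched in a few lines. This is a minimal illustration, not router code: the function names and the AS numbers in the usage note are hypothetical, and real BGP speakers handle many more cases (AS sets, confederations, policy).

```python
def accept_route(local_as, as_path):
    """Reject a route whose AS_PATH already contains our own AS number.

    Seeing our own AS in the path means the advertisement has already
    passed through us, so accepting it would create a loop.
    """
    return local_as not in as_path


def advertise(local_as, as_path):
    """Prepend our AS number before advertising the route to an external peer."""
    return [local_as] + list(as_path)
```

For example, if hypothetical AS 65002 learns a route with path [65001], it advertises [65002, 65001]; when that path later arrives back at AS 65001, `accept_route(65001, [65002, 65001])` returns False and the route is dropped.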

  6. Key Point: The volume of BGP messages exchanged is abnormally high • Most messages are redundant or unnecessary and do not correspond to any topology or policy changes

  7. Consequence: Instability • Normal data packets handled by dedicated hardware • BGP packet processing consumes CPU time • Severe CPU processing overhead takes the router offline. Route Flap Storm (figure: router A peering with routers B and C): • Router A temporarily fails • When A comes back up, B and C send it their full routing tables • B and C fail in turn, a cascading effect. How do we avoid or lessen the impact of these problems?

  8. Route Dampening • Router does not accept frequent route updates to a destination • Frequent updates might signal that the network has erratic connectivity • Increment a counter for the destination when its route changes • When the counter exceeds a threshold, stop accepting updates • Decrement the counter with time. Problem: • Future legitimate announcements are accepted only after a delay
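
The counter scheme on this slide can be sketched as follows. This is a simplified illustration that follows the slide's wording (a per-destination counter that decays linearly); the class name, threshold, and decay rate are assumptions chosen for the example. Deployed route flap damping (RFC 2439) uses exponentially decaying penalties with separate suppress and reuse limits.

```python
class Dampener:
    """Per-prefix flap counter with linear decay (hypothetical parameters)."""

    def __init__(self, threshold=2.0, decay_per_second=0.1):
        self.threshold = threshold
        self.decay_per_second = decay_per_second
        self.state = {}  # prefix -> (counter value, time of last update)

    def _decayed(self, prefix, now):
        # Counter value after subtracting the decay accrued since the last update.
        count, last = self.state.get(prefix, (0.0, now))
        return max(0.0, count - self.decay_per_second * (now - last))

    def on_update(self, prefix, now):
        """Record a route change; return True if the update is accepted."""
        count = self._decayed(prefix, now) + 1.0
        self.state[prefix] = (count, now)
        return count <= self.threshold
```

With these parameters, three updates one second apart push the counter past the threshold and the third is refused. An update arriving long afterwards is accepted again, which also illustrates the slide's stated problem: a legitimate announcement that arrives while the counter is still high has to wait.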

  9. Prefix Aggregation/Supernetting • Core router advertises a less specific network prefix • Reduces the size of the routing tables exchanged. Problems: Prefix aggregation is not effective because: - Internet addresses are largely non-hierarchically assigned - Domain renumbering is not done when changing ISPs - 25% of prefixes are multi-homed - Multi-homed prefixes should be exposed at the core
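
A small sketch of what aggregation does, using Python's standard `ipaddress` module. The prefixes are illustrative only; the helper name `aggregate` is hypothetical.

```python
import ipaddress


def aggregate(prefixes):
    """Collapse adjacent or overlapping prefixes into the fewest covering prefixes."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


# Two adjacent /17s hierarchically assigned under one /16 collapse to one entry.
contiguous = aggregate(["128.42.0.0/17", "128.42.128.0/17"])

# Prefixes from unrelated address blocks cannot be merged at all.
scattered = aggregate(["128.42.0.0/17", "196.29.0.0/16"])
```

Here `contiguous` is a single /16 while `scattered` keeps both entries, which mirrors the slide's point: aggregation only pays off when addresses are assigned hierarchically.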

  10. Route Servers • Without a route server: O(N) peering sessions per router • With a route server: 1 peering session per router. In spite of all these measures, the BGP message overhead is unexpectedly high
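
The session counts behind this slide are simple arithmetic, sketched below. The value N = 60 is only an illustrative size for an exchange point, and the function names are hypothetical.

```python
def full_mesh_sessions(n):
    """Total sessions when each of n routers peers directly with every other."""
    return n * (n - 1) // 2


def route_server_sessions(n):
    """Total sessions when each of n routers peers only with the route server."""
    return n


n = 60
per_router_mesh = n - 1                    # sessions each router maintains in a full mesh
total_mesh = full_mesh_sessions(n)         # 60 * 59 / 2
total_server = route_server_sessions(n)    # one session per router
```

For 60 routers this gives 59 sessions per router and 1770 in total for the full mesh, against 1 per router and 60 in total with a route server.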

  11. Evaluation Methodology • Data from the route server at the Mae-East (Washington, D.C.) peering point • Peering point for more than 60 major ISPs • Nine-month log • Time series analysis of message exchange events

  12. Observation: Lots of redundant updates • Duplicate route withdrawals
      ISP   Number of withdrawals   Unique   Ratio
      A     23276                   4344     5
      F     86417                   12435    7
      I     2479023                 14112    175
      One reason: - Stateless BGP - No state of previous withdrawals maintained
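
The effect of keeping no state can be sketched with a toy model. The assumptions here are mine: a router processes a stream of announce ("A") and withdraw ("W") events for one peer, a stateless router forwards every withdrawal, and a stateful one forwards a withdrawal only if it had previously announced that prefix to the peer. The function name and the event data are hypothetical.

```python
def forwarded_withdrawals(events, stateful):
    """Return the withdrawals sent to a peer for a list of (kind, prefix) events."""
    announced = set()  # prefixes currently announced to the peer
    sent = []
    for kind, prefix in events:
        if kind == "A":
            announced.add(prefix)
        elif kind == "W":
            # A stateless router cannot tell whether the peer ever heard
            # about this prefix, so it sends the withdrawal regardless.
            if not stateful or prefix in announced:
                sent.append(prefix)
            announced.discard(prefix)
    return sent


events = [("A", "p1"), ("W", "p1"), ("W", "p1"), ("W", "p2"), ("W", "p1")]
stateless = forwarded_withdrawals(events, stateful=False)
with_state = forwarded_withdrawals(events, stateful=True)
```

In this toy stream the stateless router sends four withdrawals covering two unique prefixes, while the stateful one sends a single withdrawal.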

  13. Observation: Instability Proportional to Activity • After removing duplicate messages (figure: instability density by time of day, with axis ticks at 6:00, 12:00, 18:00, and 24:00; annotations mark "10:00 AM", "ISP infrastructure upgrade", and two regions with fewer messages)

  14. Evidence from Fine-Grained Structure (figure: power spectral density of the number of instability events versus frequency (1/hour), with peaks marked at 7 days and 24 hours) Conjecture: BGP packets are competing with data packets during high-bandwidth activity.

  15. Observation: Instability and size uncorrelated (figure: three panels, WADiff, AADiff, and WADup, each plotting proportion of announcements against proportion of routing table) • ISPs serving more network prefixes may not contribute more to instability

  16. Observation: Instability distributed over routes (figure: cumulative proportion versus number of announcements per prefix+AS; annotations: "75% median", "10") • 20% to 90% of routes change 10 times or fewer • No single route contributes significantly to instability

  17. Observation: Synchronized updates (figure: inter-arrival time distribution for AADiffs, proportion versus inter-arrival time, with peaks marked at 30 s and 1 min) • Inter-arrival times of updates show periodicity • 30 s and 1 minute patterns • Some routers collect and send updates once every 30 s. Possible reasons: • Routers get synchronized • Interaction between border router and internal router misconfigured?
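
One way to look for the periodicity described on this slide is to histogram the gaps between consecutive updates. The sketch below is an assumption about how such an analysis could be set up, not the authors' tooling; the function name, bin width, and timestamps are made up for illustration.

```python
from collections import Counter


def inter_arrival_histogram(timestamps, bin_seconds=5):
    """Count gaps between consecutive updates, bucketed by bin_seconds."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # Label each bucket by its lower edge in seconds.
    return Counter(int(gap // bin_seconds) * bin_seconds for gap in gaps)


# Synthetic arrival times (seconds) from a router that batches updates on a timer.
synthetic = [0, 30, 60, 90, 150]
histogram = inter_arrival_histogram(synthetic)
```

For this synthetic input the histogram has three gaps in the 30 s bucket and one in the 60 s bucket, the kind of spiked distribution the slide attributes to routers sending on a fixed timer.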

  18. End-to-End Perspective. Chinoy: “Dynamics of Internet routing information” (SIGCOMM 93). Measurements on NSFNET showed: - Processing and forwarding latency of a BGP update is 3 orders of magnitude more than the latency incurred in forwarding data packets - Will lead to packet drops during the intervening period. Paxson: “End-to-End routing behavior in the Internet” (SIGCOMM 96) • Routing loops introduce loops into other routers’ routing tables • An end-to-end route changes every 1.5 hours on average

  19. End-to-End Perspective (Paxson)
      Pathology type               Probability in 1995   Probability in 1996
      Long-lived routing loops                           same
      Short-lived routing loops                          same
      Outage > 30 s                0.96%                 2.2%
      Total                        1.5%                  3.4%
      Values shown beside the routing-loop rows on the slide: ~0.14%, ~0.065%

  20. Summary and Conclusions • Redundant routing information flows in core • Instability distributed across autonomous systems Possible reasons for instability: • Stateless BGP updates • Misconfigured routers • Synchronization • Clocks driving the links not synchronized (link “flaps”)

  21. Follow-up work & impact: “Origins of Internet Routing Instability” (1999) • Migration from stateless to stateful BGP decreased duplicate withdrawals by an order of magnitude • But duplicate announcements (AADup) doubled • Reason: Non-transitive attribute filtering not implemented - BGP specification: “never propagate non-transitive attributes” - AS_PATH is a transitive attribute - MED (Multi Exit Discriminator) is NOT transitive
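
One reading of this slide is that updates differing only in a non-transitive attribute such as MED become identical once that attribute is stripped, so a router that does not compare against what it last sent re-announces the same route. The sketch below illustrates that reading under stated assumptions: the attribute names are plain dictionary keys, and `strip_non_transitive` and `make_announcer` are hypothetical helpers, not part of any BGP implementation.

```python
NON_TRANSITIVE = {"MED"}  # attributes that must not be propagated to other ASes


def strip_non_transitive(attrs):
    """Return a copy of the attributes without the non-transitive ones."""
    return {name: value for name, value in attrs.items() if name not in NON_TRANSITIVE}


def make_announcer():
    """Build an announce() that suppresses announcements identical to the last one sent."""
    last_sent = {}  # prefix -> attributes most recently announced

    def announce(prefix, attrs):
        outgoing = strip_non_transitive(attrs)
        if last_sent.get(prefix) == outgoing:
            return None  # would be a duplicate announcement (AADup)
        last_sent[prefix] = outgoing
        return outgoing

    return announce


announce = make_announcer()
first = announce("p1", {"AS_PATH": [65001, 65002], "MED": 10})
second = announce("p1", {"AS_PATH": [65001, 65002], "MED": 20})  # only MED changed
third = announce("p1", {"AS_PATH": [65001, 65003], "MED": 20})   # path changed
```

Here `second` is None because nothing that may be propagated has changed, while `first` and `third` are sent without the MED.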

  22. Propagating MEDs Causes Oscillations
