
Network Sensitivity to Hot-Potato Disruptions

Renata Teixeira (UC San Diego), http://www-cse.ucsd.edu/~teixeira, with Aman Shaikh (AT&T), Tim Griffin (Intel), and Geoff Voelker (UCSD). SIGCOMM’04 – Portland, OR.


Presentation Transcript


  1. Network Sensitivity to Hot-Potato Disruptions • Renata Teixeira (UC San Diego), http://www-cse.ucsd.edu/~teixeira • with Aman Shaikh (AT&T), Tim Griffin (Intel), and Geoff Voelker (UCSD) • SIGCOMM’04 – Portland, OR

  2. Internet Routing Architecture • Interdomain routing (BGP) • Intradomain routing (OSPF, IS-IS) • End-to-end performance depends on all ASes along the path • Changes in one AS may impact traffic and routing in other ASes [Figure: User and Web Server connected through ASes labeled UCSD, AT&T, AOL, Verio, and Sprint]

  3. Hot-Potato Routing • Hot-potato routing = route to the closest egress point when there is more than one route to the destination • Applies to all traffic from customers to peers • Applies to all traffic to customer prefixes with multiple connections [Figure: ISP network with routers in San Francisco, New York, and Dallas; multiple connections to the same peer lead to dst; distance labels 9 and 10]
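As a minimal sketch of the rule on this slide, the snippet below picks the egress point with the smallest internal (IGP) distance. The function name `closest_egress` and the assignment of the distances 9 and 10 to San Francisco and New York are assumptions made for illustration; the slide only shows the two numbers.

```python
def closest_egress(igp_distance):
    """Return the egress point with the smallest IGP distance.

    Ties are broken alphabetically so the result is deterministic.
    """
    return min(sorted(igp_distance), key=igp_distance.get)


# Assumed distances from the Dallas router to two egress points that
# both have a route to dst (which city gets 9 and which gets 10 is a guess).
distances_from_dallas = {"San Francisco": 9, "New York": 10}
print(closest_egress(distances_from_dallas))  # San Francisco
```

Raising the San Francisco distance above 10 in this dictionary makes the same call return New York, which is the kind of egress switch the next slide describes.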

  4. Hot-Potato Disruption • Internal changes: failure, planned maintenance, traffic engineering • Routes to thousands of destinations switch exit point [Figure: ISP network with routers in San Francisco, New York, and Dallas and destination dst; distance labels 9, 10, and 11]

  5. Consequences of Hot-Potato Disruptions • Transient forwarding instability • Up to three minutes convergence delay • Normal internal changes take a couple of seconds • Traffic shift • Responsible for largest traffic matrix variations • Interdomain routing changes • Around 2 – 5% of a router’s external BGP updates

  6. What to do about it? • Engineer network to minimize disruptions • Network operator: operational practices to avoid changes • Network designer: designs that minimize sensitivity • Need a vocabulary and metrics to evaluate impact of internal changes • Compare possible network designs • Identify critical events • Take special care during maintenance or traffic engineering

  7. Modeling Hot-Potato Routing • Model of egress selection in backbone networks • Internal topology and link weights • Set of egress routers for each destination prefix • Apply topology changes • Link or router failures • Link weight changes • Evaluate impact of topology changes • For a router what fraction of prefixes shifts • Most critical link failure • …

  8. Modeling Egress Selection • Egress set for a destination prefix (dst) = set of border nodes that learn routes to dst (here {A, B}) • Region of egress node A = nodes that are closer to A than to B [Figure: weighted graph with nodes A, B, C, D, E, F, and G and destination dst, split into the region of A and the region of B]

  9. Modeling Topology Changes • Topology change = edge or node deletion, or link weight change • Example: C shifts from the region of A to the region of B [Figure: the weighted graph of slide 8 before and after the change, with the regions of A and B marked]
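The region and topology-change definitions on slides 8 and 9 can be sketched as follows, assuming plain shortest-path distances computed with Dijkstra's algorithm. The topology `GRAPH`, its weights, and the helper names are hypothetical; the figure's actual weights cannot be recovered from the transcript.

```python
import heapq

INF = float("inf")


def shortest_distances(graph, source):
    """Dijkstra: shortest-path distance from source to each reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, INF):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def regions(graph, egress_set):
    """Map every node to its closest egress (ties broken by egress name)."""
    dist = {e: shortest_distances(graph, e) for e in egress_set}
    return {
        node: min(sorted(egress_set), key=lambda e: dist[e].get(node, INF))
        for node in graph
    }


def delete_edge(graph, u, v):
    """Return a copy of the undirected graph without the edge u-v."""
    return {
        n: {m: w for m, w in nbrs.items() if {n, m} != {u, v}}
        for n, nbrs in graph.items()
    }


# Hypothetical undirected topology; these are NOT the figure's weights.
GRAPH = {
    "A": {"D": 4},
    "B": {"D": 8, "F": 3},
    "C": {"D": 3, "F": 5},
    "D": {"A": 4, "B": 8, "C": 3},
    "F": {"B": 3, "C": 5},
}
EGRESS = {"A", "B"}

before = regions(GRAPH, EGRESS)
after = regions(delete_edge(GRAPH, "C", "D"), EGRESS)
shifted = sorted(n for n in GRAPH if before[n] != after[n])
print(shifted, before["C"], "->", after["C"])  # ['C'] A -> B
```

In this made-up topology, C reaches A at distance 7 and B at distance 8, so it starts in the region of A; deleting the C-D link leaves B as the closer egress, and C is the only node that shifts.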

  10. Generalizing to All Prefixes • Fraction of prefixes at a router that change egresses after a single topology change • Routing-shift function (H_RM) • Example: routing shift at C when link CF is deleted = 10,000/15,000 (i.e. 2/3) [Figure: the example graph annotated with prefix sets X (10,000 prefixes), Y (1,000 prefixes), and Z (4,000 prefixes)]
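The arithmetic on this slide can be reproduced with a short sketch. The prefix counts for X, Y, and Z come from the slide; the egress assignments before and after the deletion of CF are assumptions chosen so that only the prefixes in X change egress, which is what the 10,000/15,000 figure implies.

```python
from fractions import Fraction


def routing_shift(prefix_counts, egress_before, egress_after):
    """Fraction of prefixes whose egress differs before and after a change."""
    total = sum(prefix_counts.values())
    shifted = sum(
        count
        for group, count in prefix_counts.items()
        if egress_before[group] != egress_after[group]
    )
    return Fraction(shifted, total)


# Prefix counts from the slide.
PREFIXES = {"X": 10_000, "Y": 1_000, "Z": 4_000}

# Assumed egress choices at router C (illustrative only).
before = {"X": "B", "Y": "A", "Z": "A"}
after = {"X": "A", "Y": "A", "Z": "A"}

print(routing_shift(PREFIXES, before, after))  # 2/3
```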

  11. All Prefixes, Routers, and Topology Changes • Routing-shift function: fraction of prefixes at C that change egress after the failure of link CF = 2/3 [Figure: grid indexed by routers and topology changes, with the entry for router C and the failure of CF marked]

  12. Node Routing Sensitivity Metrics (RM) • Node routing sensitivity • Expected fraction of route shifts experienced by a node • Worst case • Maximum route shift experienced by a node [Figure: routers-by-topology-changes grid with the entries for router C marked]
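A minimal sketch of the two per-node metrics, assuming the routing shift at a node is already known for each topology change. Only the value 2/3 for the failure of CF comes from the slides; the other changes (`CD`, `AD`), their zero shifts, and the function names are hypothetical. The uniform distribution matches the one the case study later uses.

```python
from fractions import Fraction


def node_routing_sensitivity(shift_by_change, probability):
    """Expected routing shift at one node over a set of topology changes."""
    return sum(probability[c] * s for c, s in shift_by_change.items())


def worst_case_sensitivity(shift_by_change):
    """Largest routing shift the node experiences under any single change."""
    return max(shift_by_change.values())


# Routing shifts at router C; only the CF value (2/3) comes from the slides.
shifts_at_c = {"CF": Fraction(2, 3), "CD": Fraction(0), "AD": Fraction(0)}

# Uniform probability over the topology changes considered.
uniform = {c: Fraction(1, len(shifts_at_c)) for c in shifts_at_c}

print(node_routing_sensitivity(shifts_at_c, uniform))  # 2/9
print(worst_case_sensitivity(shifts_at_c))  # 2/3
```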

  13. Routing Impact of a Graph Transformation (RM) • Impact of graph transformations • Average fraction of route shifts across all nodes • Worst case • Maximum route shift caused by each graph transformation [Figure: routers-by-topology-changes grid with the entries for the failure of CF marked]
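The per-transformation view can be sketched the same way: average the routing shift across all nodes for each graph transformation, take the maximum as the worst case, and rank transformations by average impact. The table `SHIFT` and the function names are hypothetical; apart from the 2/3 shift at C for the failure of CF, the values are made up for illustration.

```python
from fractions import Fraction

# Hypothetical routing-shift table: SHIFT[transformation][router].
SHIFT = {
    "CF": {"C": Fraction(2, 3), "D": Fraction(0), "F": Fraction(1, 3)},
    "CD": {"C": Fraction(0), "D": Fraction(0), "F": Fraction(0)},
}


def routing_impact(shift_by_router):
    """Average routing shift across all nodes for one transformation."""
    return sum(shift_by_router.values()) / len(shift_by_router)


def worst_case_impact(shift_by_router):
    """Largest routing shift any node experiences for one transformation."""
    return max(shift_by_router.values())


# Order transformations from most to least disruptive on average.
ranked = sorted(SHIFT, key=lambda c: routing_impact(SHIFT[c]), reverse=True)

print(routing_impact(SHIFT["CF"]))  # 1/3
print(worst_case_impact(SHIFT["CF"]))  # 2/3
print(ranked)  # ['CF', 'CD']
```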

  14. Case Study: A Large ISP Backbone Network • Obtaining input for the model • Topology – intradomain routing messages • Egress sets – collection of BGP tables • Set of graph transformations • Single link failures • Single router failures • Probability distribution for graph transformations • Uniform

  15. Routing Impact of Failures • Which failures are most disruptive? Order failures according to average impact • Most failures cause no hot-potato disruptions • Operators can focus on the most disruptive failures [Figure: plot with curves for router failures and link failures; axis label: fraction of failures]

  16. Node Routing Sensitivity • Which routers are most sensitive? Order routers according to average sensitivity • Very few hot-potato changes on average, but there are many failures that cause no shift • High variance among routers [Figure: plot with curves for router failures and link failures; axis label: fraction of routers]

  17. Worst Case Node Routing Sensitivity • What is the largest routing shift for each router? Order routers according to worst-case sensitivity, over single router failures or single link failures • Very disruptive failures for some routers [Figure: plot; axis label: fraction of routers]

  18. Conclusion • Contributions • Model of hot-potato disruptions • Basis for a sensitivity analysis tool • Robustness should be a first-order metric • As important as traditional performance metrics • Network should have small reactions to small changes • Two approaches • Engineer the system: our model • Redesign routing interaction: on-going work

  19. Single Link vs. Single Router Failures [Figure: weighted graph with destination dst and nodes A, B, C, D, and E; link weights 10, 20, and 1000]

  20. Single Link vs. Single Router Failures [Figure: same graph as slide 19]

  21. Minimizing Disruptions • Reconfiguration of routing protocols • Link and node redundancy • Selection of peering locations [Figure: small weighted graph with link weights 4, 5, and 10]
