Delayed Internet Routing Convergence due to Flap Dampening
Z. Morley Mao, Ramesh Govindan, Randy Katz, George Varghese
zmao@eecs.berkeley.edu

Slow Internet routing convergence
• BGP is a path-vector protocol
• Convergence can be O(n!)
• Multi-homed fail-over time is linear in the length of the longest backup path
• Can take up to 15 minutes
• Why so slow?
  • Protocol effects: path-vector protocol
  • Flap damping can delay convergence!
• Unexpected interference between two mechanisms of the routing protocol
• Study this interaction and propose a solution to eliminate it

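One way to get a feel for the O(n!) bullet is to count the loop-free paths a node could explore toward a destination in a full mesh. The sketch below is only an illustration under that assumption (a clique of n nodes, no policy filtering); the function name `loop_free_paths` and the node numbering are hypothetical and not taken from the slides.

```python
from itertools import permutations

def loop_free_paths(n, src, dst):
    """Enumerate every loop-free path from src to dst in a clique of nodes 1..n."""
    others = [v for v in range(1, n + 1) if v not in (src, dst)]
    paths = []
    # A path visits any ordered subset of the other nodes, then reaches dst.
    for k in range(len(others) + 1):
        for middle in permutations(others, k):
            paths.append((src, *middle, dst))
    return paths

if __name__ == "__main__":
    for n in range(3, 9):
        print(n, len(loop_free_paths(n, src=3 if n > 3 else 2, dst=1)))
```

The count grows factorially with n, which is consistent with path exploration being expensive in densely connected topologies; it is not a model of actual BGP timing.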
What is route flap dampening?
• RFC2439, widely deployed
• Goals:
  • Reduce router processing load caused by instability
  • Prevent sustained routing oscillations
  • Without sacrificing convergence times for well-behaved routes
• Parameters:
  • Penalty, half-life, suppress-limit, reuse limit, maximum suppressed time

How does flap dampening work?
[Figure: penalty vs. time; the penalty is exponentially decayed, with the suppress limit and reuse limit marked on the penalty axis]
• RIPE-229 recommendation:
  • Don’t damp until fourth flap
  • /24 or longer prefixes: max outage = min outage = 60 min
  • /22, /23 prefixes: max outage = 45 min, min outage = 30 min
  • Other prefixes: max outage = 30 min, min outage = 10 min

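The exponential decay in the figure can be written down in a few lines. This is a minimal sketch, assuming the penalty halves every half-life and the route becomes usable again once the penalty falls to the reuse limit; the function names and the numbers in the usage example are illustrative assumptions, not values from the slides or from any router implementation.

```python
import math

def decayed_penalty(penalty, elapsed_s, half_life_s):
    """Penalty remaining after elapsed_s seconds of exponential decay."""
    return penalty * 0.5 ** (elapsed_s / half_life_s)

def time_until_reuse(penalty, reuse_limit, half_life_s):
    """Seconds until a suppressed route decays down to the reuse limit."""
    if penalty <= reuse_limit:
        return 0.0
    # Solve penalty * 0.5 ** (t / half_life) == reuse_limit for t.
    return half_life_s * math.log2(penalty / reuse_limit)

if __name__ == "__main__":
    # Hypothetical values: 15-minute half-life, penalty 3000, reuse limit 750.
    print(decayed_penalty(3000.0, 900.0, 900.0))
    print(time_until_reuse(3000.0, 750.0, 900.0) / 60.0, "minutes")
```

In a real deployment the maximum suppressed time would additionally cap the outage; that cap is left out of this sketch.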
Route withdraw convergence process
Example topology (figure): nodes 1, 2, 3, 4
Assuming node 1 has a route to a destination, it withdraws the route:

Stage (msg processed)   Msg queued
0:                      1->[2,3,4]W
1: (1->2W)              1->[3,4]W, 2->[3,4]A[241]
2: (1->3W)              1->4W, 2->[3,4]A[241], 3->[2,4]A[341]
3: (1->4W)              2->[3,4]A[241], 3->[2,4]A[341], 4->[2,3]A[431]
4: (4->2A[431])         2->[3,4]A[241], 3->[2,4]A[341], 4->[3]A[431]
5: (4->3A[431])         2->[3,4]A[241], 3->[2,4]A[341]
6: (3->2A[341])         2->[3,4]A[241], 3->[4]A[341]
7: (3->4A[341])         2->[3,4]A[241]
8: (2->3A[241])         2->[4]A[241]
9: (2->4A[241])
MinRouteAdver timer expires: 4->[2,3]W, 3->[2,4]A[3241], 2->[3,4]A[2431]
… (omitted)

Note: In responding to withdrawal from 1, node 3 sends out 3 messages:
3->[2,4]A[341], 3->[2,4]A[3241], 3->[2,4]W

Interaction between flap damping and convergence
Example topology (figure): nodes 1, 2, 3, 4 as before, plus node 5 attached to node 3
• Assume a node 5 is attached to 3, and after node 1 withdraws the route, node 1 announces it again
• Node 5 can suppress the route from node 3!
• A single flap is multiplied by 3, triggering route suppression
• Convergence is further delayed!

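The multiplication effect can be sketched with a small per-route damping state as node 5 might keep it for the route learned from node 3. Everything numeric here is an assumption made for illustration: a uniform penalty of 1000 per update, a suppress limit of 3000 (chosen so that damping starts at the fourth update), a 15-minute half-life, and updates spaced 30 seconds apart. Real implementations weight withdrawals, announcements, and attribute changes differently.

```python
from dataclasses import dataclass

@dataclass
class DampedRoute:
    # All defaults are hypothetical values chosen for this sketch.
    half_life_s: float = 900.0
    penalty_per_update: float = 1000.0
    suppress_limit: float = 3000.0
    penalty: float = 0.0
    last_update_s: float = 0.0
    suppressed: bool = False

    def on_update(self, now_s):
        """Decay the stored penalty to now_s, charge one update, report suppression."""
        elapsed = now_s - self.last_update_s
        self.penalty *= 0.5 ** (elapsed / self.half_life_s)
        self.penalty += self.penalty_per_update
        self.last_update_s = now_s
        if self.penalty > self.suppress_limit:
            self.suppressed = True
        return self.suppressed

# Updates node 5 receives from node 3 for one withdraw/re-announce at node 1:
# three convergence messages, then the re-announcement.
EVENTS = [(0.0, "A[341]"), (30.0, "A[3241]"), (60.0, "W"), (90.0, "A (re-announce)")]

route = DampedRoute()
history = [(msg, route.on_update(t), round(route.penalty)) for t, msg in EVENTS]

if __name__ == "__main__":
    for msg, suppressed, penalty in history:
        print(msg, "penalty", penalty, "suppressed" if suppressed else "ok")
```

Under these assumptions the re-announced, perfectly valid route is the update that crosses the suppress limit, so node 5 ignores it until the penalty decays.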
Data analysis
• Is the toy topology realistic?
  • Exchange points often have clique topologies
  • There are usually multiple backup paths
• Evidence found in data analysis of real BGP updates
• Example (from RIPE):
BGP4MP|1009757425|A|202.12.29.64|4608|199.5.187.0/24|4608 1221 4637 701|IGP|202.12.29.64|0|0||NAG||
BGP4MP|1009757478|A|202.12.29.64|4608|199.5.187.0/24|4608 1221 4637 1 701|IGP|202.12.29.64|0|0||NAG||
BGP4MP|1009757505|A|202.12.29.64|4608|199.5.187.0/24|4608 1221 4637 7176 1 701|IGP|202.12.29.64|0|0||NAG||
BGP4MP|1009757531|W|202.12.29.64|4608|199.5.187.0/24

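The pattern in the example (progressively longer AS paths, then a withdrawal) can be pulled out of such a trace mechanically. The sketch below assumes the pipe-separated field positions that are visible in the four lines above (timestamp, A/W flag, peer address, peer AS, prefix, AS path); it ignores the remaining fields and does not handle AS sets. The names `Update` and `parse_update` are hypothetical.

```python
from typing import NamedTuple, Tuple

class Update(NamedTuple):
    timestamp: int
    kind: str               # "A" for announcement, "W" for withdrawal
    peer_ip: str
    peer_as: int
    prefix: str
    as_path: Tuple[int, ...]

def parse_update(line):
    """Parse one pipe-separated update line (field positions assumed from the sample)."""
    fields = line.strip().split("|")
    kind = fields[2]
    # Withdrawals carry no AS path; announcements have it in the seventh field.
    as_path = tuple(int(a) for a in fields[6].split()) if kind == "A" else ()
    return Update(int(fields[1]), kind, fields[3], int(fields[4]), fields[5], as_path)

TRACE = [
    "BGP4MP|1009757425|A|202.12.29.64|4608|199.5.187.0/24|4608 1221 4637 701|IGP|202.12.29.64|0|0||NAG||",
    "BGP4MP|1009757478|A|202.12.29.64|4608|199.5.187.0/24|4608 1221 4637 1 701|IGP|202.12.29.64|0|0||NAG||",
    "BGP4MP|1009757505|A|202.12.29.64|4608|199.5.187.0/24|4608 1221 4637 7176 1 701|IGP|202.12.29.64|0|0||NAG||",
    "BGP4MP|1009757531|W|202.12.29.64|4608|199.5.187.0/24",
]

updates = [parse_update(line) for line in TRACE]

if __name__ == "__main__":
    start = updates[0].timestamp
    for u in updates:
        what = "withdraw" if u.kind == "W" else f"AS path length {len(u.as_path)}"
        print(f"+{u.timestamp - start:>3}s {u.prefix} {what}")
```

Run on the four lines above, this shows three announcements with growing path length followed by a withdrawal, all for the same prefix from the same peer within about two minutes.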
Simulations/Analysis
• Simulation using SSFnet
• Topologies
  • Toy topologies, e.g., cliques
  • Real AS graphs with commercial relationships
• Analysis
  • Impact of flap damping on convergence
  • Properties of topologies to trigger this effect
  • Effect of policies
  • Decisions of provider selections and connectivity

Proposed solution
• Redefine what counts as a flap
  • Currently any route change is considered a flap
  • New definition: a flap has to change the direction of the route's degree of preference (dop) value, relative to the previous flap
• Keep two additional bits (about dop comparison)
  • 00: undefined, 01: equal, 10: better, 11: worse
• Convergence flap properties
  • Increasing AS path lengths
  • Route value keeps increasing
• Solution is currently being evaluated using trace-driven simulation!

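A rough sketch of the redefined flap, as one possible reading of the bullets above: each update is compared with the previous route, the comparison is stored in two bits, and a flap is charged only when the direction reverses. The details are assumptions for illustration and are not confirmed by the slides: dop is modelled as the negated AS path length, a withdrawal is treated as the worst possible route, and "equal" or "undefined" comparisons never count as a flap.

```python
from enum import IntEnum

class Direction(IntEnum):
    UNDEFINED = 0b00
    EQUAL = 0b01
    BETTER = 0b10
    WORSE = 0b11

def dop(as_path):
    """Assumed degree of preference: shorter AS path is better, None means withdrawn."""
    return float("-inf") if as_path is None else -float(len(as_path))

def compare(prev_dop, new_dop):
    if new_dop > prev_dop:
        return Direction.BETTER
    if new_dop < prev_dop:
        return Direction.WORSE
    return Direction.EQUAL

def count_flaps(routes, selective):
    """Count flaps over successive routes; selective=True uses the direction rule."""
    flaps = 0
    last = Direction.UNDEFINED  # the two extra bits kept per route
    for prev, new in zip(routes, routes[1:]):
        direction = compare(dop(prev), dop(new))
        if not selective:
            flaps += 1  # current behaviour: every update is a flap
        elif direction in (Direction.BETTER, Direction.WORSE):
            if last in (Direction.BETTER, Direction.WORSE) and direction != last:
                flaps += 1  # direction reversed relative to the previous change
        if direction in (Direction.BETTER, Direction.WORSE):
            last = direction
    return flaps

# Routes node 5 sees from node 3: original path, two longer paths, withdrawal,
# then the original path again after node 1 re-announces.
SEQUENCE = [(3, 1), (3, 4, 1), (3, 2, 4, 1), None, (3, 1)]

if __name__ == "__main__":
    print("any change:", count_flaps(SEQUENCE, selective=False))
    print("direction change only:", count_flaps(SEQUENCE, selective=True))
```

With this reading, the monotonically worsening updates seen during convergence are not charged, and only the reversal at the re-announcement counts as a flap.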
Conclusion/Future work
• Route flap damping can interfere with BGP route convergence
  • Trades off convergence for stability
• Interesting thought exercises:
  • Tradeoffs between convergence and stability
  • Flap Damping
    • How to infer the causes of flaps
    • How to prevent damping legitimate updates
• Challenges:
  • Internet topology is less hierarchical
  • Multi-homing is growing