Internet Routing: BGP Routing Convergence Jennifer Rexford Princeton University http://www.cs.princeton.edu/~jrex/bgp-tutorial
Goals of This Section • BGP routing changes • Detecting failures • Path exploration • Reducing convergence time • Route flap damping and lower timer values • Favoring stability, root-cause tags, extra routes • BGP stability • Stable paths problem and policy conflicts • Policy guidelines that ensure stability • Active research areas on Internet routing • Location/identifier separation, routing servers, multipath routing, overlays and network virtualization
Causes of BGP Routing Changes • Topology changes • Equipment going up or down • Deployment of new routers or sessions • BGP session failures • Due to equipment failures, maintenance, etc. • Or, due to congestion on the physical path • Changes in routing policy • Changes in preferences in the routes • Changes in whether the route is exported • Persistent protocol oscillation • Conflicts between policies in different ASes
BGP Session Failure • BGP runs over TCP • BGP only sends updates when changes occur • TCP doesn’t detect lost connectivity on its own • Detecting a failure • Keep-alive: 60 seconds • Hold timer: 180 seconds • Reacting to a failure • Discard all routes learned from the neighbor • Send new updates for any routes that change [Figure: BGP session between AS1 and AS2]
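The keep-alive and hold-timer interplay can be sketched as follows. This is a minimal illustration, not a real BGP implementation: the `Session` class and `detection_delay` helper are hypothetical names, and only the 60-second and 180-second values come from the slide.

```python
from dataclasses import dataclass

KEEPALIVE_INTERVAL = 60   # seconds, from the slide
HOLD_TIME = 180           # seconds, from the slide


@dataclass
class Session:
    """Hypothetical per-neighbor state for one BGP session."""
    last_heard: float = 0.0   # time of the last keep-alive or update
    up: bool = True

    def on_message(self, now: float) -> None:
        # Any keep-alive or update from the neighbor restarts the hold timer.
        self.last_heard = now

    def check(self, now: float) -> bool:
        # Declare the session down once HOLD_TIME passes in silence.
        if self.up and now - self.last_heard >= HOLD_TIME:
            self.up = False
        return self.up


def detection_delay(failure_time: float, last_keepalive: float) -> float:
    """Seconds between the failure and the hold timer expiring."""
    return (last_keepalive + HOLD_TIME) - failure_time
```

Under these assumptions, a failure that happens just after a keep-alive is detected roughly 180 seconds later, while one that happens just before the next keep-alive is detected after roughly 120 seconds.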
Routing Change: Before and After • Before: AS 1 uses (1,0), AS 2 uses (2,0), AS 3 uses (3,1,0) • After: AS 1 uses (1,2,0), AS 2 uses (2,0), AS 3 uses (3,2,0)
Routing Change: Path Exploration • AS 1 • Delete the route (1,0) • Switch to next route (1,2,0) • Send route (1,2,0) to AS 3 • AS 3 • Sees (1,2,0) replace (1,0) • Compares to route (2,0) • Switches to using AS 2 [Figure: after the change, AS 1 uses (1,2,0), AS 2 uses (2,0), AS 3 uses (3,2,0)]
Routing Change: Path Exploration • Initial situation • Destination 0 is alive • All ASes use direct path • When destination dies • All ASes lose direct path • All switch to longer paths • Eventually withdrawn • E.g., AS 2 • (2,0) → (2,1,0) • (2,1,0) → (2,3,0) • (2,3,0) → (2,1,3,0) • (2,1,3,0) → null [Figure: ASes 1, 2, 3 fully connected, each attached to destination 0; paths listed per AS: AS 1: (1,0), (1,2,0), (1,3,0); AS 2: (2,0), (2,1,0), (2,3,0), (2,1,3,0); AS 3: (3,0), (3,1,0), (3,2,0)]
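The AS 2 sequence on this slide can be replayed with a small sketch. Several things here are assumptions rather than content from the slides: AS 2 ranks paths by length with ties going to the lowest neighbor number, it discards any path that already contains itself, and the messages arrive in the order listed in `EVENTS` (one ordering that reproduces the slide's sequence; other orderings explore different paths).

```python
def select(my_as, rib):
    """Pick the shortest loop-free path; ties go to the lowest neighbor number."""
    candidates = [
        (my_as,) + path
        for neighbor, path in sorted(rib.items())
        if path is not None and my_as not in path   # drop withdrawn and looping paths
    ]
    # min() returns the first minimal element, so sorted order breaks ties.
    return min(candidates, key=len) if candidates else None


def explore(my_as, rib, events):
    """Replay (neighbor, path) messages; path None means a withdrawal."""
    rib = dict(rib)
    history = [select(my_as, rib)]
    for neighbor, path in events:
        rib[neighbor] = path
        history.append(select(my_as, rib))
    return history


# AS 2's view while destination 0 is alive: latest path heard from each neighbor.
INITIAL_RIB = {0: (0,), 1: (1, 0), 3: (3, 0)}

# Assumed message order after destination 0 dies.
EVENTS = [
    (0, None),        # direct route to 0 is lost
    (1, (1, 3, 0)),   # AS 1 has switched to its path through AS 3
    (3, (3, 2, 0)),   # AS 3 announces a path through AS 2 itself (a loop for AS 2)
    (1, None),        # AS 1 finally withdraws
]

history = explore(2, INITIAL_RIB, EVENTS)
```

With these inputs, `history` walks through (2,0), (2,1,0), (2,3,0), (2,1,3,0), and finally no route, matching the AS 2 example.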
BGP Converges Slowly • Path vector avoids count-to-infinity • But, ASes still must explore many alternate paths • … to find the highest-ranked path that is still available • Fortunately, in practice • Most popular destinations have very stable BGP routes • And most instability lies in a few unpopular destinations • Still, lower BGP convergence delay is a goal • Can be tens of seconds to tens of minutes • High for important interactive applications • … or even conventional applications, like Web browsing
Existing Solution: Tune MRAI Timer • Minimum route advertisement interval (MRAI) • Minimum spacing between announcements • For a particular (prefix, peer) pair • Advantages of large MRAI • Provides a rate limit on BGP updates • Allows grouping of updates within the interval • Disadvantages of large MRAI • Adds delay to the convergence process • E.g., 30 seconds for each step • Trade-off: overhead versus convergence time
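A minimal sketch of an MRAI-style rate limiter, keyed by (prefix, peer) as the slide describes. The `MraiLimiter` class and its method names are hypothetical, and real routers differ in detail (for example, in how timers are jittered or whether withdrawals are rate-limited).

```python
class MraiLimiter:
    """Hypothetical rate limiter: one announcement per (prefix, peer) per interval."""

    def __init__(self, mrai=30.0):
        self.mrai = mrai
        self.last_sent = {}   # (prefix, peer) -> time of the last announcement
        self.pending = {}     # (prefix, peer) -> latest route waiting to be sent

    def announce(self, prefix, peer, route, now):
        """Return the route if it may go out now; otherwise hold it and return None."""
        key = (prefix, peer)
        last = self.last_sent.get(key)
        if last is None or now - last >= self.mrai:
            self.last_sent[key] = now
            self.pending.pop(key, None)
            return route
        # Later changes overwrite earlier ones, which groups updates in the interval.
        self.pending[key] = route
        return None

    def flush(self, now):
        """Send every held route whose interval has expired."""
        sent = {}
        for key, route in list(self.pending.items()):
            if now - self.last_sent[key] >= self.mrai:
                self.last_sent[key] = now
                sent[key] = route
                del self.pending[key]
        return sent
```

In this sketch, two route changes inside one 30-second window collapse into a single announcement of the latest route, which shows both the reduced overhead and the added delay per convergence step.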
Existing Solution: Route-Flap Damping • Identify (prefix, next-hop) that changes often • Suppress route until stable for a period of time • Problematic in practice • Path exploration can inadvertently trigger RFD • May suppress all routes, leaving no route left [Figure: penalty (0 to 4000) versus time (0 to 25), with the suppress limit and reuse limit marked, as the network is announced, not announced, and re-announced]
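The penalty mechanism behind route-flap damping can be sketched as below. All numeric values (1000 per flap, suppress limit 3000, reuse limit 1000, 15-minute half-life) are illustrative assumptions, not values taken from the slides or from any particular router, and `FlapDamper` is a hypothetical name.

```python
class FlapDamper:
    """Hypothetical per-(prefix, next-hop) damping state; times are in minutes."""

    def __init__(self, penalty_per_flap=1000.0, suppress_limit=3000.0,
                 reuse_limit=1000.0, half_life=15.0):
        self.penalty_per_flap = penalty_per_flap
        self.suppress_limit = suppress_limit
        self.reuse_limit = reuse_limit
        self.half_life = half_life
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # The penalty halves every half_life minutes without a flap.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def flap(self, now):
        """Record one route change and suppress the route if the limit is reached."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress_limit:
            self.suppressed = True

    def is_suppressed(self, now):
        """Reuse the route once the decayed penalty falls below the reuse limit."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse_limit:
            self.suppressed = False
        return self.suppressed
```

With these assumed values, four flaps one minute apart push the penalty over the suppress limit, and the route stays suppressed for roughly half an hour afterward. A burst of updates caused by path exploration would look the same to this counter, which is the practical problem the slide points out.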
Proposed: Preferring More Stable Routes • Alternative to route-flap damping • Score routes on how stable they are • E.g., time elapsed since the last change • Incorporate into the path-selection decision • Prefer more stable routes over less stable routes • Advantages • Always select a route, if one is available • Prevents excessive routing changes • Creates incentives for greater stability • Disadvantages • Leads to non-determinism in route selection • Requires state for each route
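One way to fold stability into route selection is sketched here. The scoring rule (time since the last change, with shorter AS paths breaking ties) and the `pick_most_stable` helper are assumptions for illustration; the slide does not specify how the score is combined with the other decision steps.

```python
def pick_most_stable(routes, now):
    """Choose among (as_path, last_change_time) pairs.

    Prefers the route that has gone longest without changing; among equally
    stable routes, prefers the shorter AS path. Returns None if no route exists.
    """
    if not routes:
        return None
    best = max(routes, key=lambda r: (now - r[1], -len(r[0])))
    return best[0]


# Example: the longer path wins because it has been stable for longer.
candidates = [((1, 0), 90.0), ((1, 2, 0), 10.0)]
chosen = pick_most_stable(candidates, now=100.0)
```

Unlike damping, this rule always returns a route when one is available. The per-route `last_change_time` is the extra state the slide lists as a disadvantage, and the result depends on the history of changes, hence the non-determinism.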
Proposed: Root-Cause Tagging • Identify reason for changing the route • E.g., which node or edge failed • Allow routers to skip routes with same fate • E.g., routes with same node or edge in AS path • Practical challenges • Multiple routers or links per AS • Incremental deployment [Figure: topology with source s, destination d, and ASes 1 to 5]
Proposed: Disseminating Backup Routes • Disseminating extra (backup) routes • So a route is available after a failure • To enable faster forwarding convergence • Announce alternate route to neighbor • AS 3 makes “3 4 5 d” available to AS 2 • AS 2 makes “2 3 4 5 d” available to AS 1 • So ASes can switch immediately [Figure: ASes 1 to 5 and destination d; routes shown: AS 3 “3 2 1 d” and “3 4 5 d”, AS 2 “2 1 d”, AS 1 “1 d”]
Stable Paths Problem (SPP) Model • Model of routing policy • Each AS has a ranking of the permissible paths • Model of path selection • Pick the highest-ranked path consistent with neighbors • Flexibility is not free • Global system may not converge to a stable assignment • Depending on the way the ASes rank their paths [Figure: ASes 1, 2, 3 and destination d, with ranked paths per AS: AS 1: “1 2 d”, “1 d”; AS 2: “2 3 d”, “2 d”; AS 3: “3 1 d”, “3 d”]
Permanent Oscillation: “Bad Gadget” • Pick the highest-ranked path consistent with your neighbors’ choices. [Figure: ASes 1, 2, 3 and destination 0, with ranked paths per AS: AS 1: “1 2 0”, “1 0”; AS 2: “2 3 0”, “2 0”; AS 3: “3 1 0”, “3 0”; animation callouts mark each step as “Only choice!”, “Better choice!”, or “Top choice!”]
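That the Bad Gadget has no stable assignment can be checked by brute force. The sketch below assumes the rankings read from the slide's figure (AS 1 prefers “1 2 0” over “1 0”, AS 2 prefers “2 3 0” over “2 0”, AS 3 prefers “3 1 0” over “3 0”); the function names are hypothetical.

```python
from itertools import product

# Ranked paths per AS, most preferred first (as read from the figure).
BAD_GADGET = {
    1: [(1, 2, 0), (1, 0)],
    2: [(2, 3, 0), (2, 0)],
    3: [(3, 1, 0), (3, 0)],
}


def best_path(asn, ranking, choice):
    """Highest-ranked path consistent with the neighbors' current choices."""
    for path in ranking[asn]:
        next_hop = path[1]
        # A direct path to destination 0 is always available; an indirect path
        # is available only if the next hop currently uses the rest of it.
        if next_hop == 0 or choice.get(next_hop) == path[1:]:
            return path
    return None


def stable_assignments(ranking):
    """All assignments in which every AS already holds its best available path."""
    ases = sorted(ranking)
    stable = []
    # Every AS here always has its direct path, so "no route" need not be tried.
    for combo in product(*(ranking[a] for a in ases)):
        choice = dict(zip(ases, combo))
        if all(best_path(a, ranking, choice) == choice[a] for a in ases):
            stable.append(choice)
    return stable


solutions = stable_assignments(BAD_GADGET)
```

All eight combinations fail the stability check, so `solutions` is empty: whatever the ASes pick, at least one of them has a better choice and switches, which is the permanent oscillation.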
Two Stable Solutions: Disagree • Each AS prefers the path through the other • Two stable states • AS 2 picks “2 0”, and AS 1 picks “1 2 0” • AS 1 picks “1 0”, and AS 2 picks “2 1 0” • Outcome depends on timing/ordering of messages [Figure: ASes 1 and 2 and destination 0, with ranked paths per AS: AS 1: “1 2 0”, “1 0”; AS 2: “2 1 0”, “2 0”]
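The dependence on ordering can be seen in a small sketch that activates the two ASes in a fixed order until nothing changes. The activation model (one AS at a time, each seeing the other's latest choice immediately) and the function names are simplifying assumptions; the rankings are those described on the slide.

```python
# Ranked paths per AS, most preferred first: each prefers the path via the other.
DISAGREE = {
    1: [(1, 2, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
}


def best_path(asn, ranking, choice):
    """Highest-ranked path consistent with the neighbors' current choices."""
    for path in ranking[asn]:
        next_hop = path[1]
        if next_hop == 0 or choice.get(next_hop) == path[1:]:
            return path
    return None


def run(ranking, order, max_rounds=10):
    """Activate ASes in the given order until no choice changes."""
    choice = {asn: None for asn in ranking}
    for _ in range(max_rounds):
        changed = False
        for asn in order:
            new = best_path(asn, ranking, choice)
            if new != choice[asn]:
                choice[asn] = new
                changed = True
        if not changed:
            return choice
    return None   # did not settle within max_rounds


as1_first = run(DISAGREE, order=[1, 2])
as2_first = run(DISAGREE, order=[2, 1])
```

When AS 1 moves first it settles on “1 0” and AS 2 gets its preferred “2 1 0”; when AS 2 moves first the roles reverse. Both end states are stable, so the outcome is decided purely by timing.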
Ways to Achieve Global Stability • Detect conflicting rankings of paths? • Computationally intractable (NP-hard) • Requires global coordination • Restrict the policy programming languages? • In what way? How to require this globally? • What if the world should change, and the protocol can’t? • Rely on economic incentives? • Policies typically driven by business relationships • E.g., customer-provider and peer-peer relationships • Sufficient conditions to guarantee unique, stable solution
Bilateral Business Relationships • Provider-Customer • Customer pays provider for access to the Internet • Peer-Peer • Peers carry traffic between their respective customers [Figure: example topology with ASes 1 to 8 and destination d, with provider-customer and peer-peer links marked. Callouts: valid paths “1 2 d” and “7 d”, invalid path “5 8 d”; valid paths “6 4 3 d” and “8 5 d”, invalid paths “6 5 d” and “1 4 3 d”]
Act Locally, Prove Globally • Global topology • Provider-customer relationship graph is acyclic • Peer-peer relationships between any pairs of ASes • Route export • Do not export routes learned from a peer or provider • … to another peer or provider • Route selection • Prefer routes through customers • … over routes through peers and providers • Guaranteed to converge to unique, stable solution
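The export and selection rules above are local checks that each AS can apply on its own. A minimal sketch follows; the `Rel` enum and function names are hypothetical, and breaking ties by AS-path length is an added assumption (the slide only requires that customer routes win over peer and provider routes).

```python
from enum import Enum


class Rel(Enum):
    """Relationship of a neighbor, from this AS's point of view."""
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


def may_export(learned_from, export_to):
    """Export rule: peer and provider routes go only to customers."""
    return learned_from is Rel.CUSTOMER or export_to is Rel.CUSTOMER


def select(routes):
    """Selection rule: prefer customer routes; shorter AS path breaks ties.

    routes is a list of (relationship, as_path) pairs.
    """
    return min(
        routes,
        key=lambda r: (r[0] is not Rel.CUSTOMER, len(r[1])),
        default=None,
    )


# A longer customer route still beats a shorter peer route.
chosen = select([(Rel.PEER, (4, 3)), (Rel.CUSTOMER, (6, 5, 2, 1))])
```

Because an AS never re-exports a peer or provider route to another peer or provider, it never carries transit traffic between two neighbors that do not pay it.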
Rough Sketch of the Proof • Two phases • Walking up the customer-provider hierarchy • Walking down the provider-customer hierarchy [Figure: example topology with ASes 1 to 8 and destination d, with provider-customer and peer-peer links marked]
Trade-offs Between Assumptions • Three kinds of assumptions • Route export, route selection, global topology • Relax one assumption, need to tighten other two • Extensions for other kinds of relationships • Backups, siblings, … • But, many questions remain • Complete understanding of the trade-offs • Business practices may change over time • ASes may lie about their paths • Protocol extensions for multi-path routing
Why Change Routing? • Better performance • Scalability, security, convergence, reliability, flexibility, stability, … • Simpler management • For network operators • For folks deploying services • Greater extensibility • To enable experimentation • To enable new services
What to Change, and Where? • Add another layer above network routing • Routing functionality in overlay networks • Change the routing protocols • To improve scalability, security, convergence, … • Change the division of functionality • Data, control, and management planes • Change the division of responsibility • End users, third parties, and service providers • ???
Theme: Location/Identity Separation • Scalability problems with BGP • 300,000 prefixes and growing • Difficulty in handling mobility • Idea: separate location and identity • Identity associated with a host or group of hosts • Location is “looked up” when sending packets • Examples • Route packets based on destination AS • Route packets based on “label” found in DNS • Establish e2e paths and associate with labels
Theme: Separating Routing From Routers • Today’s routers do many things • Compute routes, forward packets, monitoring • Separate service for computing routes • Better scalability, network-wide view, … • Several deployment scenarios • Within an AS • Incrementally deployable • Use BGP to instruct the routers • Across multiple ASes • Routing as a Service • Provided by third parties [Figure: two servers and AS 2]
Theme: Multipath Routing • Benefits of multipath routing • Efficiency, performance, reliability, and security • Greater control to users and edge ASes • Many ways to construct multiple paths • Multipath extensions to BGP • Overlays on top of BGP • Stitching together sub-paths • Source routing • Many new challenges • Scalability • Stable load balancing • Incentives for participation
Theme: Overlays and Virtualization • Build end-to-end topologies • Overlays by tunneling from one node to another • Virtual networks by “hosting” overlays on the routers • Separation of “interdomain” issues • Instantiate a (virtual) topology over the infrastructure • Run (intradomain) routing protocols on this topology [Figure captions: “Competing ISPs with different goals must coordinate” versus “Single service provider controls end-to-end path”]
Conclusion • Internet routing • A competitive cooperation of ~40,000 networks • Policy-based path-vector routing protocol on prefixes • Tension between local autonomy and global properties • Many important practical challenges • Scalability, stability, flexibility, performance, reliability, … • Many interesting research directions • Understanding today’s BGP • Extensions and enhancements to BGP • Entirely new Internet routing architectures • Please, please help us fix interdomain routing!!!