Understanding and Limiting BGP Instabilities Zhi-Li Zhang ( zhzhang@cs.umn.edu ) Jaideep Chandrashekar ( jaideepc@cs.umn.edu ) Kuai Xu (kxu@cs.umn.edu). BGP: Internet Glue. “Path-vector” routing protocol.
Understanding and Limiting BGP Instabilities Zhi-Li Zhang (zhzhang@cs.umn.edu) Jaideep Chandrashekar (jaideepc@cs.umn.edu) Kuai Xu (kxu@cs.umn.edu)
BGP: Internet Glue • “Path-vector” routing protocol. • Allows networks to tell other networks about destinations that they are “responsible” for and how to reach them • Using “route advertisements”, also called “NLRI” or “network-layer reachability information”
BGP: Internet Glue (cont’d) • Policy-based: allows ISPs to richly express their routing policy, both in selecting outbound paths and in announcing internal routes • Relatively “simple” protocol, but configuration is complex, and the entire world can see, and be impacted by, mis-configurations.
ASes & AS Numbers (ASNs) • An autonomous system is an independent routing domain that has been assigned an Autonomous System Number (ASN). • Currently over 15,000 in use. • 64512 through 65535 are “private” • Examples • AS57 U of Minnesota GigaPoP • AS217 U of Minnesota • AS701 UUNET • AS1239 Sprint • ASNs represent atoms of BGP routing policy.
Internet Connectivity of University of Minnesota [Figure: AS 11537 Internet2, AS 217 UMN, AS 1998 State of Minnesota, AS 1 Genuity, AS 7911 Wiltel, AS 57 UMN GigaPoP; prefix 128.101.0.0/16]
Architecture of Internet Routing • IGP = Interior Gateway Protocol. Metric based: OSPF, IS-IS, RIP • EGP = Exterior Gateway Protocol. Policy based: BGP [Figure labels: AS 1, AS 2, IS-IS, OSPF, BGP]
Simplified BGP Operations • Establish session on TCP port 179 • Exchange all active routes • Exchange incremental updates: while connection is ALIVE, exchange route UPDATE messages [Figure: BGP session between AS1 and AS2]
Types of BGP Messages • Open: Establish a peering session. • Keep Alive: Handshake at regular intervals. • Notification: Shuts down a peering session. • Update: Announce new routes or withdraw previously announced routes. • Announcement: prefix + attribute values • Withdrawal: prefix only
BGP Attributes
Value  Code                               Reference
-----  ---------------------------------  ---------
1      ORIGIN                             [RFC1771]
2      AS_PATH                            [RFC1771]
3      NEXT_HOP                           [RFC1771]
4      MULTI_EXIT_DISC                    [RFC1771]
5      LOCAL_PREF                         [RFC1771]
6      ATOMIC_AGGREGATE                   [RFC1771]
7      AGGREGATOR                         [RFC1771]
8      COMMUNITY                          [RFC1997]
9      ORIGINATOR_ID                      [RFC2796]
10     CLUSTER_LIST                       [RFC2796]
11     DPA                                [Chen]
12     ADVERTISER                         [RFC1863]
13     RCID_PATH / CLUSTER_ID             [RFC1863]
14     MP_REACH_NLRI                      [RFC2283]
15     MP_UNREACH_NLRI                    [RFC2283]
16     EXTENDED COMMUNITIES               [Rosen]
...
255    reserved for development
Not all attributes need to be present in every announcement
Two Types of BGP Neighbor Relationships • External Neighbor (eBGP) in a different Autonomous System • Internal Neighbor (iBGP) in the same Autonomous System • iBGP is routed (using IGP!) [Figure: AS1 and AS2, with eBGP sessions between ASes and iBGP sessions inside an AS]
iBGP Peers Must be Fully Meshed • iBGP is needed to avoid routing loops within an AS • Injecting external routes into IGP does not scale and causes BGP policy information to be lost • BGP does not provide “shortest path” routing • iBGP neighbors do not announce routes received via iBGP to other iBGP neighbors. [Figure labels: eBGP update, iBGP updates]
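Because iBGP neighbors do not re-announce iBGP-learned routes, every pair of iBGP routers needs its own session. The sketch below only counts those pairs; the function name `ibgp_full_mesh` and the router labels are illustrative assumptions, not part of the slides.

```python
from itertools import combinations

def ibgp_full_mesh(routers):
    """Return every unordered router pair; each pair needs one iBGP session."""
    return list(combinations(routers, 2))

# Hypothetical AS with four BGP routers.
sessions = ibgp_full_mesh(["r1", "r2", "r3", "r4"])
print(len(sessions))  # 4 routers need 6 sessions, i.e. n*(n-1)/2
```

The quadratic growth in session count is the practical cost of the full-mesh requirement.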
AS PATH Attribute • Prefix 135.207.0.0/16 is originated by AS 6341 (AT&T Research) with AS Path = 6341. • AS 7018 (AT&T) announces AS Path = 7018 6341. • AS 1239 (Sprint) announces AS Path = 1239 7018 6341; AS 3549 (Global Crossing) announces AS Path = 3549 7018 6341. • AS 1755 (Ebone) announces AS Path = 1755 1239 7018 6341. • AS 1129 (Global Access) announces AS Path = 1129 1755 1239 7018 6341. [Figure: the announcements as seen toward AS 12654, RIPE NCC RIS project]
Inter-domain Loop Prevention • BGP at AS YYY will never accept a route with ASPATH containing YYY. • Example: AS 7018 does not accept 12.22.0.0/16 with ASPATH = 1 333 7018 877. [Figure: AS 1 and AS 7018, with the route marked “Don’t Accept!”]
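The loop-prevention rule amounts to a membership test on the AS path. A minimal sketch, using the ASPATH from the slide; the function name `accept_route` is an assumption made for illustration.

```python
from typing import Sequence

def accept_route(local_asn: int, as_path: Sequence[int]) -> bool:
    """Reject any route whose AS path already contains our own AS number."""
    return local_asn not in as_path

as_path = [1, 333, 7018, 877]        # ASPATH for 12.22.0.0/16 on the slide
print(accept_route(7018, as_path))   # False: AS 7018 sees itself in the path
print(accept_route(217, as_path))    # True: AS 217 does not appear in the path
```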
BGP Best Path Selection • Ignore if exit point unreachable • Highest local preference • Lowest AS path length • Lowest origin type • Lowest MED (with same next hop AS) • Lowest IGP cost to next hop • Lowest router ID of BGP speaker
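The selection steps above can be read as a lexicographic comparison. The sketch below is a simplification under stated assumptions: the `Route` fields and defaults are hypothetical, and MED is compared across all candidates rather than only among routes with the same next-hop AS.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Route:
    as_path: Tuple[int, ...]
    local_pref: int = 100
    origin: int = 0          # lower is preferred
    med: int = 0
    igp_cost: int = 0
    router_id: int = 0
    reachable: bool = True   # is the exit point reachable?

def preference_key(r: Route):
    # Highest local preference first, then the "lowest wins" criteria in slide order.
    return (-r.local_pref, len(r.as_path), r.origin, r.med, r.igp_cost, r.router_id)

def best_path(routes: List[Route]) -> Optional[Route]:
    usable = [r for r in routes if r.reachable]   # ignore unreachable exit points
    return min(usable, key=preference_key) if usable else None

a = Route(as_path=(7018, 6341), router_id=2)
b = Route(as_path=(3549, 7018, 6341), local_pref=200, router_id=1)
c = Route(as_path=(6341,), local_pref=300, reachable=False)
print(best_path([a, b, c]).as_path)  # (3549, 7018, 6341): local preference beats path length
```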
In a nutshell • BGP = Path Vector Protocol + Policies. • The Path vector protocol is very simple • Distribute Reachability. • Prevent Loops. • All the complexity is introduced by locally administered policies. • Determine which paths are selected. • And which neighbors they are exported to.
What is Path Exploration? • When a link fails (or is repaired), routers “go through” a sequence of paths before selecting a “converged” path. • Results from dependencies in advertised “path vectors”. • Router’s best path is an extension of a neighbor’s best path. • Which extends a best path from one of its own neighbors. • And so on……
What is Path Exploration (cont’d) • When a link fails, a set of dependent paths becomes invalid (or obsolete). • Invalid paths are removed one by one from the system. • Router selects an invalid path and propagates it. • Receives withdrawal. • Selects next best path (possibly invalid). • Receives withdrawal; repeats until no more invalid paths remain.
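The select, withdraw, select-again cycle can be sketched for a single router. This is a toy model, not BGP itself: the function name `explore` is an assumption, paths are written as digit strings such as 74210, and withdrawals are assumed to arrive in the router's preference order.

```python
def explore(candidates, withdrawals):
    """candidates: paths in preference order; withdrawals: arrival order.
    Returns the sequence of best paths the router selects over time."""
    table = list(candidates)
    selected = [table[0]]                 # steady-state best path
    for w in withdrawals:
        table.remove(w)                   # neighbor withdrew this path
        best = table[0] if table else None
        if best != selected[-1]:
            selected.append(best)         # router announces a new choice
    return selected

paths = ["74210", "75210", "76310"]       # every one of them uses edge 1-0
print(explore(paths, paths))
# ['74210', '75210', '76310', None]: two invalid paths are tried before giving up
```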
Path Exploration Example • Network in a steady state. [Figure: nodes 0–9 with their paths to node 0: node 1: 10; node 2: 210; node 3: 310; node 4: 4210; node 6: 6310; node 7: 74210, 75210, 76310]
Path Exploration Example (cont’d) [Figure: same network as the previous slide, with a withdrawal (W) and the labels 75210 and 76310 added]
Path Exploration Example (cont’d) • Paths 75210 and 76310 both contain the “problem edge” 10. • 2 additional messages to force 7 to flush “bad paths”. • Number of “spurious messages” increases with the “richness” of connectivity. [Figure: richer topology over nodes 0–9; node 7’s candidate paths include 7210, 74210, 7510, 75210, 72510, ……]
Impact of Path Exploration • In general, convergence time is O(LΔ) • ‘L’ is the longest simple path in the network. • ‘Δ’ is the time between successive announcements. • From measurements: up to 15 minutes to converge (after link failure).
Impact of Path Exploration (cont’d) • Delays a router from picking valid, alternate paths. • Have to first go through all the invalid paths. • Large scale packet losses in a short duration. • Core routers process millions of packets a second. • In the absence of path exploration, convergence time is Ω(Dh). • ‘D’ is “diameter” of the network (D << L) • ‘h’ is message processing time at a node.
Causes for Path Exploration • Invalid paths are selected, propagated, then withdrawn. • Routers waste time processing “stale information” • Delay convergence to valid, perhaps less preferred, alternate paths • Key Issue: How to distinguish invalid paths from valid paths • Difficult in BGP: AS Paths are high level and abstract
AS PATHS: High Level Connectivity • AS 217 and AS 3 receive the same AS PATH [11536 1239 81]. • Underlying physical paths are disjoint. [Figure: AS 3, AS 11536, AS 81, AS 1239, AS 217]
Naïve Solutions Fail • TAG withdrawals: when a router generates a withdrawal, tag it with cause/location, e.g. WDRAW: (2,1) failed. [Figure: nodes 0–7; paths 74210, 75210, 76310]
Naïve Solutions Fail (cont’d) • AS Paths do not describe (or reflect) internal AS topology. • When an internal edge fails, which AS Path is affected: [10] or [210]? [Figure: nodes 0–7]
Naïve Solutions Fail (cont’d) • Link between 3.2 and 6.1 fails. • 6.1 generates a withdrawal and tags it with <3,6> • Should 6.3 remove all paths containing <3,6>? [Figure: AS 3 with routers 3.2, 3.3; AS 6 with routers 6.1, 6.2, 6.3]
EPIC --- A Simple Solution • Exploit Path dependencies to Invalidate Paths. • To avoid Path Exploration: • When link fails, a set of dependent paths becomes invalid. • All the dependent paths must be removed from the system. • Dependent paths cannot be described using only AS Paths. • AS Paths are annotated with additional information (forward edge sequence numbers). • Can capture path dependencies. • Can distinguish valid and invalid paths.
Forward Edge Sequence Numbers • When an AS Path is advertised to an external AS neighbor, include the fesn of the “forward” external edge. • fesn = edge identifier + sequence number [Figure: edge <X,Y> between AS X and AS Y]
Forward Edge Sequence Numbers (cont’d) • Defined per destination, for every AS-AS edge. • When AS X sends a route to AS Y, the fesn (X:Y, n) is attached; • If the route already has a previously attached fesn, the new fesn is prepended to it, forming the fesnList. [Figure labels: AS X, AS Y, AS Z, AS W; (X:Y, n), (X:Z, m)]
fesn Management • When a link fails, its fesn does not change. • Same value carried in withdrawals. • When <X,Y> is repaired: • AS X increments the sequence number. • Subsequent route announcements carry “updated” fesn. • So a larger fesn always corresponds to “newer” information
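The attach and increment rules can be sketched as a small bookkeeping class. Everything here is an illustrative assumption rather than the authors' implementation: the class name `EdgeCounters`, the tuple encoding `(from_as, to_as, seq)`, the newest-first ordering, and tracking a single destination only (the slides define fesn's per destination).

```python
from collections import defaultdict
from typing import Dict, Tuple

Fesn = Tuple[int, int, int]   # (from_as, to_as, sequence number)

class EdgeCounters:
    """Sequence numbers for AS-AS edges, for one destination (assumption)."""

    def __init__(self) -> None:
        self.seq: Dict[Tuple[int, int], int] = defaultdict(int)

    def attach(self, x: int, y: int, fesn_list: Tuple[Fesn, ...]) -> Tuple[Fesn, ...]:
        # AS x sends a route to AS y: prepend (x:y, n) to the existing fesnList.
        return ((x, y, self.seq[(x, y)]),) + tuple(fesn_list)

    def repair(self, x: int, y: int) -> None:
        # Edge <x,y> came back up: later announcements carry a larger number.
        self.seq[(x, y)] += 1

counters = EdgeCounters()
counters.seq[(1, 2)] = 7                       # value taken from the slides' example
before = counters.attach(1, 2, ((0, 1, 7),))   # ((1, 2, 7), (0, 1, 7))
counters.repair(1, 2)
after = counters.attach(1, 2, ((0, 1, 7),))    # ((1, 2, 8), (0, 1, 7))
print(before, after)
```

Note that a failure leaves the counter untouched, so a withdrawal carries the same value as the announcement it cancels.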
fesnList Propagation • Same AS Path, distinct fesnLists: [10] {(0:1, 7)(1:2, 7)} and [10] {(0:1, 7)(1:3, 3)} • [0] {(0:1, 7)} • [4210] {(0:1, 7)(1:2, 7)(2:4, 14)(4:7, 11)} [Figure: nodes 0–7 with per-edge sequence numbers]
fesnList Propagation • After the routes are processed at all nodes [Figure: nodes 0–7; routing table at AS 7]
Invalidating Paths upon Failure • When router generates a withdrawal: • The fesnList of withdrawn route (“path stem”) is attached to the withdrawal. • When router receives a withdrawal: • Invalidates all routes containing the fesnList • Selects a new best path • If best path has changed, it sends new best route to its neighbors, and the withdrawal is piggybacked. • If no valid path, only withdrawal is forwarded.
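One way to read “containing the fesnList” is as a match on the path stem. The sketch below assumes newest-first fesnLists, so a dependent route ends with the withdrawn stem. The function names are hypothetical, and the sequence numbers on edges 5:7, 2:5, 6:7 and 3:6 are invented because the slides do not give them.

```python
def extends_stem(fesn_list, stem):
    """True if fesn_list (newest first) ends with the withdrawn stem."""
    n = len(stem)
    return n > 0 and tuple(fesn_list[-n:]) == tuple(stem)

def process_withdrawal(table, stem):
    """table maps AS path -> fesnList, in preference order.
    Returns (surviving routes, new best path or None)."""
    surviving = {p: f for p, f in table.items() if not extends_stem(f, stem)}
    best = next(iter(surviving), None)
    return surviving, best

# Routing table at AS 7; each entry is (from_as, to_as, seq).
table = {
    "74210": ((4, 7, 11), (2, 4, 14), (1, 2, 7), (0, 1, 7)),
    "75210": ((5, 7, 1), (2, 5, 1), (1, 2, 7), (0, 1, 7)),   # 5:7 and 2:5 values invented
    "76310": ((6, 7, 1), (3, 6, 1), (1, 3, 3), (0, 1, 7)),   # 6:7 and 3:6 values invented
}
stem = ((1, 2, 7), (0, 1, 7))        # W: {(1:2, 7), (0:1, 7)}
surviving, best = process_withdrawal(table, stem)
print(best)                          # 76310: both paths through edge 1:2 go at once
```

A single withdrawal removes 74210 and 75210 together, so the router moves directly to the valid path instead of trying 75210 first.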
Invalidating Paths: Example [Figure: nodes 0–7; the withdrawal W: {(1:2, 7), (0:1, 7)} is shown on two edges; path 76310 is shown]
Handling Link Repairs • When <X,Y> is repaired: • AS X increments the fesn for the edge • Generates a new route announcement to send to AS Y (reflects updated fesn) • At AS Y, the route is installed into routing table and a subsequent route update may be generated. • After all updates have been processed, every fesnList containing (X:Y, n) will reflect the updated value.
What about Multiple Edges? • Each edge is associated with a minor fesn • Contrast with major fesn for “logical” AS-AS edge. • All edges between ASes share the same major fesn, but have distinct minor fesn’s. • Minor fesn is incremented with corresponding edge. • Major fesn incremented only if all edges are affected.
Minor fesn’s • Minor fesn’s are only used between adjacent ASes. • All routers in AS 6 include minor fesn in route updates. • When the updates are exported externally (to AS 7), minor fesn is removed. [Figure: AS 3 (routers 3.2, 3.3) and AS 6 (routers 6.1, 6.2, 6.3); the edges share a common major fesn (7) and carry distinct minor fesn’s (11, 13)]
fesn – Key Properties • Sequence number is monotonic --- new events will have higher values. • Imposes a partial ordering on the fesnLists. • Old information can be easily detected, and discarded. • Allows compact, correct description of invalid paths, i.e. the fesnList in a withdrawal captures all obsolete paths.
EPIC Properties • No router will select an invalid path after receiving any update triggered by a single failure event. • No router will select an invalid path after receiving at least one update triggered by each of a set of multiple failure events. • Achieves optimal bounds for a path vector protocol. • Routers may still explore paths. • But these paths are all valid.
EPIC Performance (vs BGP) [Figure: BGP vs EPIC, for Fail Down and Fail Over scenarios]
BGP Routing Dynamics • BGP routing instabilities • BGP routing suffers from many problems, e.g., mis-configurations, link failures, policy changes, slow convergence, etc. • BGP update streams are visible from all BGP-monitoring vantage points. • Open research problems • What are the common characteristics of BGP dynamics? • What are primary causes of BGP routing dynamics? • How to visualize BGP dynamics?
BGP Routing Update (per second) • View: UMN • Time: 2003/12/07 – 2003/12/14 [Figure: time vs. number of BGP updates at prefix level; annotations: BGP Update Burst, BGP Update Noise]
BGP Routing Update (per second) (cont.) • View: UMN • Time: 2003/12/07 – 2003/12/14 [Figure: time vs. number of BGP updates at AS level; annotations: BGP Update Burst, BGP Update Noise]
Modeling BGP Routing Dynamics • Modeling BGP dynamics on all prefixes/ASes is challenging. • ~120,000 prefixes, ~16,000 ASes • High-dimensional time-series • BGP updates are temporally and spatially correlated