More on BGP • Check out the links on politics: ICANN and net neutrality • To read for next time • Path selection big example • Scaling of BGP
Decision Process • Step 1: each border router • Selects which of the paths advertised by its external neighbors can be imported • Some may violate import policy • Some may contain a routing loop • Picks the best ones • Optionally assigns a LOCAL_PREF • Sends its best paths to all its internal neighbors • Step 2: all border routers compare all the paths learned through their neighbors and pick the single best path to each prefix
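The import step above can be sketched in a few lines. This is only an illustration, not how any particular router implements it: the `Path` record, the `import_paths` function, the `policy` callback and the default LOCAL_PREF of 100 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Path:
    prefix: str                       # the NLRI, e.g. "10.0.0.0/8"
    as_path: List[int]                # ASes traversed, nearest first
    local_pref: Optional[int] = None  # only meaningful inside our AS


def import_paths(paths: List[Path], local_as: int,
                 policy: Callable[[Path], bool],
                 default_pref: int = 100) -> List[Path]:
    """Step 1 of the decision process for one border router (hypothetical)."""
    accepted = []
    for p in paths:
        # Our own AS already in the AS_PATH means the path loops back to us.
        if local_as in p.as_path:
            continue
        # Drop anything the (assumed) import policy rejects.
        if not policy(p):
            continue
        # Optionally assign a LOCAL_PREF before telling internal neighbors.
        accepted.append(Path(p.prefix, list(p.as_path), default_pref))
    return accepted
```

For example, with `local_as=65000`, a path whose AS_PATH is `[65001, 65000]` is dropped as a loop before the policy is even consulted.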
Path Selection Process • For each of the paths to the same NLRI • Ignore if next-hop is down • Prefer higher local preference • If same, prefer locally originated route • If same, prefer shortest AS path • If same, prefer lowest origin (IGP better than EGP) • If same, prefer lowest MED • If same, prefer eBGP over iBGP • If same, prefer path with lowest IGP cost to next-hop • If same, prefer lowest router id • If same, prefer lowest neighbor IP address • Different vendors may have some slight variations! • The RFC does not specify all the details (e.g. the preference for shorter AS paths)
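One way to read the tie-breaking list above is as a lexicographic comparison. The sketch below encodes it as a sort key. The `Candidate` fields and function names are invented for the illustration, and it is simplified: real implementations usually compare MED only between paths from the same neighboring AS, and vendors differ in the details.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    local_pref: int
    locally_originated: bool
    as_path_len: int
    origin: int          # 0 = IGP, 1 = EGP, 2 = INCOMPLETE (lower is better)
    med: int
    is_ebgp: bool
    igp_cost: int        # IGP cost to the next hop
    router_id: int
    neighbor_ip: int     # neighbor address as an integer
    next_hop_up: bool = True


def preference_key(c: Candidate) -> tuple:
    # Tuples compare element by element, so each entry is only consulted
    # when all earlier entries tie. Smaller keys are preferred.
    return (
        -c.local_pref,             # higher local preference wins
        not c.locally_originated,  # locally originated wins
        c.as_path_len,             # shorter AS path wins
        c.origin,                  # lower origin wins
        c.med,                     # lower MED wins (simplified)
        not c.is_ebgp,             # eBGP beats iBGP
        c.igp_cost,                # closest exit wins
        c.router_id,               # lowest router id
        c.neighbor_ip,             # lowest neighbor address
    )


def best_path(candidates: List[Candidate]) -> Optional[Candidate]:
    usable = [c for c in candidates if c.next_hop_up]  # ignore dead next hops
    return min(usable, key=preference_key) if usable else None
```

Note that a path with a longer AS path still wins if its local preference is higher, because local preference is compared first.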
After path selection • Export routes to external neighbors • According to export policy • Add the local AS to the AS_PATH • Remove LOCAL_PREF and optionally add MULTI_EXIT_DISC • Originate routes inside the domain • Through the appropriate IGP mechanisms • OSPF has AS-external routes • And can use the IGP cost to next-hop to select which one is better…
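The attribute rewriting done on export might look roughly like this. Attributes are modeled as a plain dictionary and `export_path` is a made-up helper, so treat it as a sketch of the three bullets above rather than a real implementation. Export policy filtering is assumed to have happened already.

```python
from typing import Optional


def export_path(attrs: dict, local_as: int, med: Optional[int] = None) -> dict:
    """Return the attributes to advertise to an external neighbor."""
    out = dict(attrs)  # work on a copy, leave our own RIB entry untouched
    # Add the local AS at the front of the AS_PATH.
    out["AS_PATH"] = [local_as] + list(attrs.get("AS_PATH", []))
    # LOCAL_PREF is only meaningful inside our AS.
    out.pop("LOCAL_PREF", None)
    # MED is non-transitive: drop any value learned from another AS,
    # then optionally attach our own.
    out.pop("MULTI_EXIT_DISC", None)
    if med is not None:
        out["MULTI_EXIT_DISC"] = med
    return out
```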
OSPF external routes • Certain routers in OSPF are ASBRs • Inject external routes (type-5 LSAs) • With themselves as the next hops • OSPF routers compute next hops for these routes using the cost to reach the ASBR that originated them • And install these in the RIB • BUT: clearly I cannot inject all the routes in the Internet into OSPF, it will not scale • We will see solutions to this later on
BGP implementation details • Uses TCP to talk to neighbors • Provides reliability • Important since I have to exchange a lot of information • Allows sending only changes instead of periodically resending all the information, as EGP did • But TCP has its own mechanisms for flow control and connectivity loss detection that may cause surprises
BGP messages • Since TCP is a stream protocol I need a way to delimit messages • BGP header • Messages • OPEN • Used to negotiate parameters when establishing adjacency • UPDATE • Contains the reachability information • NOTIFICATION • Sent when an error occurs • KEEPALIVE • To check the health of the adjacency
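To make the framing concrete: in RFC 4271 the BGP header is 19 bytes (a 16-byte marker, a 2-byte total length, and a 1-byte type), and the length field is what lets a receiver cut the TCP byte stream into messages. The function below is a minimal sketch of that idea; `split_messages` is a name chosen for this example, and it skips the validation a real speaker would do (marker contents, maximum length).

```python
import struct

HEADER_LEN = 19  # 16-byte marker + 2-byte length + 1-byte type
MSG_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def split_messages(buf: bytes):
    """Cut a TCP byte stream into (type, body) pairs.

    Returns the complete messages plus any trailing bytes that belong to a
    message that has not fully arrived yet.
    """
    messages = []
    offset = 0
    while len(buf) - offset >= HEADER_LEN:
        _marker, length, msg_type = struct.unpack_from("!16sHB", buf, offset)
        if length < HEADER_LEN:
            raise ValueError("length field smaller than the header")
        if len(buf) - offset < length:
            break  # partial message, wait for more data
        body = buf[offset + HEADER_LEN:offset + length]
        messages.append((MSG_TYPES.get(msg_type, "UNKNOWN"), body))
        offset += length
    return messages, buf[offset:]
```

A KEEPALIVE is just the header with no body, so its length field is 19.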
Paths • BGP manages paths • Path consists of • Network Layer Reachability Information (NLRI), e.g. 12.50.45/24 • A sequence of PATH attributes that give info related to this destination • PATH attributes • Each has a Flags field • Optional or well known (well known must be supported by all routers) • Transitive or local (transitive gets propagated, local, i.e. non-transitive, does not) • Partial or not (partial means some router along the path did not recognize this optional transitive attribute and passed it on unprocessed)
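The Flags field is a single byte. In RFC 4271 the top bit marks the attribute as optional, the next as transitive, the next as partial, and the fourth says whether the attribute length is two bytes instead of one (a detail not on the slide). A small decoder, with `decode_flags` as a name made up for this sketch:

```python
def decode_flags(flags: int) -> dict:
    """Decode the one-byte Flags field of a BGP path attribute."""
    return {
        "optional": bool(flags & 0x80),         # 0 means well known
        "transitive": bool(flags & 0x40),       # 0 means local (non-transitive)
        "partial": bool(flags & 0x20),
        "extended_length": bool(flags & 0x10),  # length field is 2 bytes
    }
```

For instance, `0x40` decodes as a well-known transitive attribute (such as AS_PATH), while `0x80` decodes as optional and non-transitive (such as MULTI_EXIT_DISC).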
Important path attributes • ORIGIN (well known) • Was this path learned from IGP, EGP, or some other way (INCOMPLETE) • AS_PATH • The list of ASes (well known) • NEXT_HOP • Next hop to reach the prefix (well known) • MULTI_EXIT_DISC (MED) • Helps selection of paths (local, optional) • LOCAL_PREF • Helps selection of paths (well known)
Operation • Establish a TCP connection with the neighbor • Negotiate options and establish adjacency • Send all the path info to the neighbor, receive its path info and store it • Then only send updates/changes • If an update is received that results in a different route, process it and update neighbors • Send keep-alives to monitor the health of the link • Does not depend on the TCP keep-alive mechanism
Prefix aggregation • Need to be able to aggregate prefixes • AS_PATH contains a number of AS_SET and/or AS_SEQUENCE • AS_SET: unordered set of ASes traversed • AS_SEQUENCE: ordered list of ASes • When aggregating: • Longest common prefix of the AS_SEQUENCE becomes the aggregated AS_SEQUENCE • Union of all the rest becomes the new AS_SET • Can repeat recursively • EXAMPLE: why it is good for the network
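The aggregation rule above can be written out directly. This sketch assumes each input path is a single AS_SEQUENCE given as a Python list (existing AS_SETs are ignored for simplicity), and `aggregate_as_paths` is a hypothetical helper name.

```python
from typing import List, Set, Tuple


def aggregate_as_paths(sequences: List[List[int]]) -> Tuple[List[int], Set[int]]:
    """Return (aggregated AS_SEQUENCE, new AS_SET) for the given paths."""
    if not sequences:
        return [], set()
    # Longest common prefix of all the sequences.
    common = []
    for hops in zip(*sequences):
        if all(h == hops[0] for h in hops):
            common.append(hops[0])
        else:
            break
    # Everything after the common prefix is merged into an unordered set.
    rest = set()
    for seq in sequences:
        rest.update(seq[len(common):])
    return common, rest
```

Aggregating the paths `[1, 2, 3]` and `[1, 2, 4, 5]` keeps `[1, 2]` as the ordered part and puts 3, 4 and 5 in the AS_SET, so loop detection still sees every AS that was traversed.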
Scaling • Route refresh • Route dampening • Route reflectors (RR)
Route Refresh • Each time policy changes I need to find out about all the paths from the neighbor • A path that was not valid before (and was discarded) may be valid under the new policy • I could do a reset (i.e. re-establish adjacency with the peer to relearn all the paths) • Expensive and disruptive to the network • Instead, initiate a route refresh • Tell neighbor to resend me all its paths so I can re-evaluate policy • “Route Refresh” capability, must be supported by both peer routers
Route Reflectors • All BRs need to talk iBGP to all others • Full mesh does not scale • Use Route Reflectors (RR) • Multiple RRs in an AS • An RR has clients and may be connected (through iBGP sessions) to other RRs • Far fewer iBGP sessions • RR computes paths based on information it receives • If path is learned from a client, send it to all other clients and RRs • If path is learned from another RR, send it only to clients
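The two reflection rules, and the session count that motivates them, fit in a short sketch. Peers are identified by arbitrary strings here, and both function names are assumptions for the example.

```python
from typing import Set


def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions needed to fully mesh n routers."""
    return n * (n - 1) // 2


def reflect_targets(source: str, clients: Set[str], other_rrs: Set[str]) -> Set[str]:
    """Peers a route reflector forwards a path to, given who sent it."""
    if source in clients:
        # Learned from a client: all other clients and the other RRs.
        return (clients - {source}) | other_rrs
    if source in other_rrs:
        # Learned from another RR: clients only.
        return set(clients)
    raise ValueError(f"unknown peer: {source}")
```

With 100 border routers a full mesh needs 4950 sessions, which is the scaling problem the slide refers to.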
More on RRs • Essentially split the network into clusters • One RR per cluster and multiple clients • All RRs are fully meshed • Use a cluster-id • Need to have RR redundancy • Not a good idea to have two RRs per cluster • Each client talks to two RRs
Extensions to BGP over time • Communities • Logically group paths together for easier administration • Can apply policies to a community and not to individual paths • Simpler management, no need to add new policy rules for new paths • Multi-protocol • Carry different NLRIs • IPv4, IPv6, multicast • VPNs (we will see them later) • Graceful restart (we will see it later) • 4-byte ASNs
Issues with BGP • BGP dynamics are complex! • We will see it through some papers • Understand the structure of the internet and the properties of the topology • How to measure it? • Hot potato routing • Multi-homing and provider selection • Understand the convergence properties • Role of policies in convergence • Policy consistency • Effect of the various mechanisms used
Convergence Dynamics • 4 types of advertisements • Withdraw, new, longer, shorter • Withdraw and longer converge more slowly than new and shorter. Why? • BGP explores alternate paths in the withdraw and longer cases • EXAMPLE • Worst case can be n! steps • In theory, and in fully connected graphs of course • Also in BGP I can have a timer to collect multiple paths before I process them
Route Dampening • Route flaps can be a pain • A link that is flapping can cause ripples around the whole Internet • Penalize flapping paths • Keep track of updates per peer and per path • If it flaps too much stop announcing it • Parameters: additive penalty, half-life, suppress threshold, reuse threshold • Can be different on a per-prefix or per-peer basis, etc. • It was considered a good idea until 2000 or so • Then people discovered that it was not working that well • A single path withdrawal could appear like a flap as it propagates in the network • Makes the path unavailable for a long time (tens of minutes)
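The four parameters interact as an exponentially decaying counter. The class below is a rough sketch of that mechanism only. The class name and the default values (1000 per flap, 15-minute half-life, suppress at 2000, reuse at 750) are illustrative assumptions, not values taken from the slides. Times are in seconds and are passed in explicitly, so the example stays deterministic.

```python
class DampenedPath:
    """Penalty bookkeeping for one path from one peer (illustrative only)."""

    def __init__(self, penalty_per_flap=1000.0, half_life=900.0,
                 suppress=2000.0, reuse=750.0):
        self.penalty_per_flap = penalty_per_flap
        self.half_life = half_life
        self.suppress = suppress
        self.reuse = reuse
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now: float) -> None:
        # The penalty halves every half_life seconds.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def flap(self, now: float) -> None:
        # Each flap adds a fixed penalty on top of the decayed value.
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress:
            self.suppressed = True

    def is_suppressed(self, now: float) -> bool:
        # A suppressed path is announced again once the penalty has
        # decayed below the reuse threshold.
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False
        return self.suppressed
```

With these assumed numbers, three flaps in quick succession suppress the path, and the penalty then needs about two half-lives (roughly 30 minutes) to fall below the reuse threshold. That matches the "tens of minutes" of unavailability noted above.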