Inter-Domain Routing
Routing Hierarchies • Flat routing doesn’t scale • Each node cannot be expected to have routes to every destination (or destination network) • Key observation • Need less information with increasing distance to destination • Two radically different approaches for routing • The area hierarchy • The landmark hierarchy
The Area Hierarchy [Figure: nested areas labeled 1, 1.1, 1.2, 2, 2.1, 2.2, 2.2.1, 3, 3.1, 3.2]
Areas • Technique for hierarchically addressing nodes in a network • Divide network into areas • Areas can overlap • Areas can have nested sub-areas • Constraint: no path between two sub-areas of an area can exit that area • sub-areas must be completely contained within area
Addressing • Address areas hierarchically • sequentially number top-level areas • sub-areas of area are labeled relative to that area • nodes are numbered relative to the smallest containing area • nodes can have multiple addresses
Routing • Within area • each node has routes to every other node • Outside area • each node has routes for other top-level areas only • inter-area packets are routed to nearest border router • Can result in sub-optimal paths
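A minimal sketch of the lookup rule on this slide, assuming "area" here means the top-level area and that addresses are written as tuples of labels (both are assumptions for illustration; `lookup_key` is a hypothetical helper, not part of any protocol):

```python
def lookup_key(my_addr, dest_addr):
    """Return the part of dest_addr a node would look up in its routing table.

    Addresses are tuples of labels, e.g. (2, 2, 1) for "2.2.1".
    """
    if my_addr[0] == dest_addr[0]:
        # Same top-level area: the node holds a route to the exact destination.
        return dest_addr
    # Different top-level area: only a route to that top-level area is known,
    # so the packet heads for the nearest border router towards it.
    return dest_addr[:1]
```

Because a node outside area 2 only knows "area 2" as a whole, it cannot pick the border router closest to the final destination, which is where the sub-optimal paths come from.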
Path Suboptimality • 3 hop red path vs 2 hop green path [Figure: the area topology (areas 1, 1.1, 1.2, 2, 2.1, 2.2, 2.2.1, 3, 3.1, 3.2) with the two paths highlighted]
Landmark hierarchy • Details about things nearby and less information about things far away • Not defined by arbitrary boundaries • (therefore not well suited to the real world that does have administrative boundaries)
Landmark Overview • Landmark routers have “height” which determines how far away they can be seen (visibility) • Routers within radius n can see a landmark router LM(n) • “See” means that those routers have LM(n)’s address and know the next hop to reach it • Router x has an entry for router y if x is within the radius of y • Distance vector style routing with simple metric • Routing table: Landmark (LM2(d)), Level (2), Next hop
LM Hierarchy Definition • Each LM (Li) associated with level (i) and radius (ri) • Every node is an L0 landmark • Recursion: some Li are also Li+1 • every Li sees at least one Li+1 • Terminating state when all level j LMs see entire network
LM addresses • LM(2).LM(1).LM(0) (x.a.b and y.a.b) • LM level maps to radius (part of configuration) • LM level 0: radius 2 • LM level 1: radius 4 • LM level 2: radius 8 • If the destination is more than two hops away, a router will not have complete routing information for it, so it forwards on the LM(1) portion of the address; if LM(1) is not visible either, it forwards on the LM(2) portion (e.g., c would forward based on y in y.a.b) [Figure: routers x, y, a, b, c]
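The forwarding rule above can be sketched as follows. This is only an illustration: the routing table is assumed to be a plain dict from visible landmark name to next hop, and `pick_forwarding_key` and the next-hop names are hypothetical.

```python
def pick_forwarding_key(dest_addr, routing_table):
    """Pick the lowest-level landmark of dest_addr that this router can see.

    dest_addr is (LM2, LM1, LM0), e.g. ("y", "a", "b") for y.a.b.
    routing_table maps visible landmark names to next hops.
    """
    # Try LM(0) first, then LM(1), then LM(2).
    for landmark in reversed(dest_addr):
        if landmark in routing_table:
            return landmark, routing_table[landmark]
    raise LookupError("no landmark of the destination address is visible")


# Router c only sees the level-2 landmark y, so it forwards based on y.
print(pick_forwarding_key(("y", "a", "b"), {"y": "n1"}))
# A router closer to the destination also sees a and prefers it.
print(pick_forwarding_key(("y", "a", "b"), {"y": "n1", "a": "n2"}))
```

Each router along the way repeats the same lookup, so the packet switches to a lower-level landmark as soon as one becomes visible.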
LM Routing • LM does not imply hierarchical forwarding • It is NOT a source route • En route to LM(1) may encounter router that is within LM(0) radius of destination address (like longest match) • Paths may be asymmetric
LM self-configuration • Bottom-up hierarchy construction algorithm • goal to bound number of children • Every router is L0 landmark • All routers advertise themselves over a distance • All Li landmarks run election to self-promote one or more Li+1 landmarks • Dynamic algorithm to adapt to topology changes • Efficient hierarchy
Sources • RFC 1771: main BGP RFC • RFC 1772, RFC 1773, RFC 1774: application, experiences, and analysis of BGP • RFC 1965: AS confederations for BGP • Christian Huitema, “Routing in the Internet”, chapters 8 and 9
Autonomous systems • What is an AS? • A set of routers under a single technical administration, using an interior gateway protocol (IGP) and common metrics to route packets within the AS and using an exterior gateway protocol (EGP) to route packets to other AS’s. • sometimes AS’s use multiple IGPs and metrics, but appear as single AS’s to other AS’s.
Example [Figure: five AS’s (1 to 5) with internal areas; an IGP runs inside each AS and EGP runs on the links between AS’s]
History • Mid-80s: EGP • reachability protocol (no shortest path) • did not accommodate cycles (tree topology) • evolved when all networks connected to ARPANET • Limited size network topology • Result: BGP introduced as routing protocol
Choices • Link state or distance vector? • no universal metric - policy decisions • Problems with distance-vector: • Bellman-Ford algorithm may not converge • Problems with link state: • metric used by routers not the same - loops • LS database too large - entire Internet • may expose policies to other AS’s
Solution: Path Vectors • Each routing update carries the entire path • Loops are detected as follows: • when an AS receives a route, it checks whether it is already in the path • if yes, reject route • if no, add self and advertise route further • Advantage: • metrics are local - AS chooses path, protocol ensures no loops
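The loop check is small enough to sketch directly. The function name and the use of plain integer AS numbers in a list are assumptions for illustration:

```python
def process_route(my_as, prefix, as_path):
    """Apply the path-vector loop check to a received route.

    Returns None if the route is rejected, otherwise the (prefix, path)
    pair this AS would advertise further.
    """
    if my_as in as_path:
        # Our own AS is already on the path: accepting would create a loop.
        return None
    # No loop: add ourselves at the front and pass the route on.
    return prefix, [my_as] + as_path


print(process_route(3, "10.0.0.0/8", [2, 1]))  # accepted, path becomes [3, 2, 1]
print(process_route(2, "10.0.0.0/8", [2, 1]))  # rejected
```

Note that the check needs no shared metric: each AS is free to choose among loop-free paths by its own local criteria.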
Problems • Routing table size • need an entry for all paths to all networks • Required memory = O((N + M*A) * K) • N: number of networks • M: mean AS distance • A: number of AS’s • K: number of BGP peers • Problem reduced with CIDR
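To get a feel for how the estimate scales, the expression can be evaluated for made-up inputs. The numbers below are hypothetical, not measurements, and the result is in abstract "units" because the O() bound hides constant factors:

```python
def bgp_memory_units(n_networks, mean_as_distance, n_ases, n_peers):
    """Evaluate (N + M*A) * K from the slide, ignoring constant factors."""
    return (n_networks + mean_as_distance * n_ases) * n_peers


# Hypothetical example: 1000 networks, mean AS distance 5, 100 AS's, 3 peers.
print(bgp_memory_units(1000, 5, 100, 3))  # (1000 + 500) * 3 = 4500
```

The network count N usually dominates, which is why aggregating prefixes with CIDR reduces the problem.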
Interior BGP peers • IGP cannot propagate all the information required by BGP • External routers in an AS use interior BGP (IBGP) connections to communicate • External routers agree on routes and inform IGP [Figure: IBGP connections]
Interconnecting BGP peers • BGP uses TCP to connect peers • Advantages: • BGP much simpler • no need for periodic refresh • incremental updates • Disadvantages • congestion control on a routing protocol?
Hop-by-hop model • BGP advertises to neighbors only those routes that it uses • consistent with the hop-by-hop Internet paradigm • e.g., AS1 cannot tell AS2 to route to other AS’s in a manner different than what AS2 has chosen (need source routing for that)
AS categories • Stub: an AS that has only a single connection to one other AS - carries only local traffic. • Multihomed: an AS that has connections to more than one AS, but refuses to carry transit traffic • Transit: an AS that has connections to more than one AS, and carries both transit and local traffic (under certain policy restrictions)
Policy with BGP • BGP provides capability for enforcing various policies • Policies are not part of BGP: they are provided to BGP as configuration information • BGP enforces policies by choosing paths from multiple alternatives and controlling advertisement to other AS’s
Examples of BGP policies • A multihomed AS refuses to act as transit • limit path advertisement • A multihomed AS can become transit for some AS’s • only advertise paths to some AS’s • An AS can favor or disfavor certain AS’s for traffic transit from itself
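These policies amount to filtering what is advertised. A minimal sketch, assuming a Loc-RIB modeled as a dict from prefix to AS path (empty path meaning locally originated) and a hypothetical `transit_customers` set supplied as configuration:

```python
def export_routes(loc_rib, neighbor_as, transit_customers=frozenset()):
    """Select the routes a multihomed AS advertises to one neighbor.

    Locally originated routes are always advertised. Routes learned from
    other AS's are advertised only to neighbors we agree to carry transit
    traffic for.
    """
    exported = {}
    for prefix, as_path in loc_rib.items():
        locally_originated = not as_path
        if locally_originated or neighbor_as in transit_customers:
            exported[prefix] = as_path
    return exported


loc_rib = {"10.1.0.0/16": [], "192.0.2.0/24": [65002, 65003]}
print(export_routes(loc_rib, 65001))           # no transit: own prefix only
print(export_routes(loc_rib, 65001, {65001}))  # transit for 65001: everything
```

A neighbor that never hears a route through this AS will never send it traffic for that destination, which is how advertisement control enforces the policy.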
BGP-4 • Latest version of BGP • BGP-4 supports CIDR
Routing information bases (RIB) • Routes are stored in RIBs • Adj-RIBs-In: routing info that has been learned from other routers (unprocessed routing info) • Loc-RIB: local routing information selected from Adj-RIBs-In (routes selected locally) • Adj-RIBs-Out: info to be advertised to peers (routes to be advertised)
BGP common header • Marker (security and message delineation) • Length • Type • Types: OPEN, UPDATE, NOTIFICATION, KEEPALIVE
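As an illustration, the common header can be unpacked with Python's `struct` module. The field sizes (16-byte marker, 2-byte length, 1-byte type) and type codes follow RFC 1771; `parse_header` itself is a hypothetical helper with minimal validation:

```python
import struct

# Marker (16 bytes), Length (2 bytes), Type (1 byte), network byte order.
BGP_HEADER = struct.Struct("!16sHB")
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def parse_header(data):
    """Return (marker, length, type_name) for the message starting at data[0]."""
    if len(data) < BGP_HEADER.size:
        raise ValueError("need at least 19 bytes for a BGP header")
    marker, length, msg_type = BGP_HEADER.unpack_from(data)
    if not 19 <= length <= 4096:
        raise ValueError("bad message length")
    if msg_type not in MESSAGE_TYPES:
        raise ValueError("unknown message type")
    return marker, length, MESSAGE_TYPES[msg_type]


# A KEEPALIVE is just the header: all-ones marker, length 19, type 4.
keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
print(parse_header(keepalive)[1:])
```

Since BGP runs over a TCP byte stream, the Length field is what lets a receiver find where one message ends and the next begins.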
BGP OPEN message • Marker (security and message delineation) • Length • Type: OPEN • Version • My autonomous system • Hold time • BGP identifier • Optional parameters length • Optional parameters <type, length, value> • My AS: id assigned to that AS • Hold timer: max interval between KEEPALIVE or UPDATE messages • BGP ID: address of one interface (same for all messages)
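A sketch of decoding the OPEN fields that follow the common header, assuming the RFC 1771 layout (1-byte version, 2-byte AS number, 2-byte hold time, 4-byte BGP identifier, 1-byte optional parameters length). `parse_open_body` is a hypothetical helper and leaves the optional parameters undecoded:

```python
import ipaddress
import struct

# Version, My AS, Hold time, BGP identifier, Optional parameters length.
OPEN_FIXED = struct.Struct("!BHH4sB")


def parse_open_body(body):
    """Decode the fixed part of an OPEN message (the bytes after the header)."""
    version, my_as, hold_time, bgp_id, opt_len = OPEN_FIXED.unpack_from(body)
    optional = body[OPEN_FIXED.size:OPEN_FIXED.size + opt_len]
    return {
        "version": version,
        "my_as": my_as,
        "hold_time": hold_time,
        "bgp_id": str(ipaddress.IPv4Address(bgp_id)),
        "optional_parameters": optional,  # raw <type, length, value> bytes
    }


body = struct.pack("!BHH4sB", 4, 65001, 90, bytes([192, 0, 2, 1]), 0)
print(parse_open_body(body))
```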
BGP UPDATE message • Marker (security and message delineation) • Length • Type: UPDATE • Withdrawn routes length • Withdrawn routes (variable) • Path attribute length • Path attributes (variable) • Network layer reachability information (NLRI) (variable) • UPDATE message reports information on a SINGLE path, but can report multiple withdrawn routes
NLRI • Network Layer Reachability Information • list of IP address prefixes, each encoded as: Length (1 byte, prefix length in bits), Prefix (variable)
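The encoding can be sketched as a pair of helpers, assuming IPv4 prefixes and the RFC 1771 rule that the prefix occupies the minimum number of whole bytes needed for the given bit length. The function names are hypothetical:

```python
import ipaddress


def encode_nlri(prefixes):
    """Encode prefixes such as "197.8.0.0/23" as <length, prefix> pairs."""
    out = bytearray()
    for text in prefixes:
        net = ipaddress.ip_network(text)
        nbytes = (net.prefixlen + 7) // 8  # round bits up to whole bytes
        out.append(net.prefixlen)
        out += net.network_address.packed[:nbytes]
    return bytes(out)


def decode_nlri(data):
    """Decode a sequence of <length, prefix> pairs back into prefix strings."""
    prefixes = []
    i = 0
    while i < len(data):
        bits = data[i]
        nbytes = (bits + 7) // 8
        raw = data[i + 1:i + 1 + nbytes]
        if len(raw) < nbytes:
            raise ValueError("truncated NLRI")
        addr = ipaddress.IPv4Address(raw + bytes(4 - nbytes))  # zero-pad to 4 bytes
        prefixes.append(str(ipaddress.IPv4Network((addr, bits), strict=False)))
        i += 1 + nbytes
    return prefixes


print(encode_nlri(["10.0.0.0/8"]).hex())  # 080a: length 8, then one byte
print(decode_nlri(encode_nlri(["197.8.0.0/23", "10.0.0.0/8"])))
```

The variable-length prefix field is what lets BGP-4 carry CIDR prefixes of any length compactly.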
BGP NOTIFICATION message • Marker (security and message delineation) • Length • Type: NOTIFICATION • Error code • Error sub-code • Data • Used for error notification
BGP KEEPALIVE message • Marker (security and message delineation) • Length • Type: KEEPALIVE • Sent periodically to peers to ensure connectivity • If hold_time is zero, messages are not sent
Policy routing • Assume Y forbids T’s traffic • T cannot reach X, but X can reach T! [Figure: AS’s T, U, V, X, Y, Z]
Selecting the Best Path • May be entirely manual or automated based on some configuration data • Full set of attributes allows more sophisticated path selection • Example Strategies: • hop count • assign weights (same in our border routers) • avoid EGP AS
Path Selection Criteria • Information based on path attributes • Attributes + external (policy) information • Examples: • hop count • policy considerations • presence or absence of certain AS • path origin • link dynamics
Path Attributes • Categories: • well-known mandatory (passed on) • well-known discretionary (passed on) • optional transitive (passed on) • optional non-transitive (if unrecognized, not passed on)
Attribute Message Format • Attribute flags (O T P E 0), followed by attribute type code • O: optional or well-known • T: transitive or local • P: partial • E: extended length (attribute length in 1 or 2 bytes) • Type codes: ORIGIN, AS_PATH, NEXT_HOP, etc.
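A sketch of decoding the flags byte and applying the pass-on rules for unrecognized attributes. The bit positions (O in the high-order bit, then T, P, E) follow RFC 1771; the function names and the action strings are assumptions for illustration:

```python
OPTIONAL, TRANSITIVE, PARTIAL, EXTENDED_LENGTH = 0x80, 0x40, 0x20, 0x10


def decode_flags(flags):
    """Split the attribute flags byte into its four defined bits."""
    return {
        "optional": bool(flags & OPTIONAL),
        "transitive": bool(flags & TRANSITIVE),
        "partial": bool(flags & PARTIAL),
        "extended_length": bool(flags & EXTENDED_LENGTH),
    }


def handle_unrecognized(flags):
    """Decide what to do with an attribute this router does not recognize.

    Returns (action, flags_to_use).
    """
    if not flags & OPTIONAL:
        # Well-known attributes must be recognized by every implementation.
        return "error", flags
    if flags & TRANSITIVE:
        # Optional transitive: pass it on, marking it as Partial.
        return "forward", flags | PARTIAL
    # Optional non-transitive: quietly ignore, do not pass on.
    return "drop", flags


print(decode_flags(0x40))         # well-known, transitive
print(handle_unrecognized(0xC0))  # optional transitive: forwarded with P set
```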
AS_PATH Attribute • Well-known, mandatory • Important components: • list of traversed AS’s • If forwarding to internal peer: • do not modify AS_PATH attribute • If forwarding to external peer: • prepend self into the path
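The two forwarding rules can be sketched in a few lines, assuming the AS path is a list of AS numbers and that a peer is internal exactly when its AS number equals ours (`as_path_for_peer` is a hypothetical helper):

```python
def as_path_for_peer(as_path, my_as, peer_as):
    """Return the AS_PATH to send to a peer, following the slide's two rules."""
    if peer_as == my_as:
        # Internal peer: AS_PATH is left unmodified.
        return list(as_path)
    # External peer: prepend our own AS number.
    return [my_as] + list(as_path)


print(as_path_for_peer([65002, 65003], 65001, 65001))  # internal: unchanged
print(as_path_for_peer([65002, 65003], 65001, 65010))  # external: 65001 prepended
```

Leaving the path untouched inside the AS means the path length counts AS's traversed, not routers.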
Other Path Attributes • ORIGIN: indicates the origin of the path information (IGP, EGP, or INCOMPLETE); set by the AS that originated the information and included in update messages of all who propagate this information • NEXT_HOP: IP address of border router to be used as next hop
NEXT_HOP • Suppose C, D are peers. C may want to indicate to D some paths for which next hop is either A or B [Figure: routers A, B, C in AS X and D, E, F in AS Y, on either side of the AS border]
More Path Attributes • LOCAL_PREF: provided by a BGP router to all other internal BGP routers. Denotes degree of preference for each destination.
UPDATE Message Handling • Unrecognized, optional non-transitive attributes are ignored. Unrecognized optional transitive attributes cause the Partial bit to be set. • WITHDRAWN routes are processed first • Feasible routes are placed in Adj-RIB-In, replacing old ones if any.
Decision Process • Calculate degree of preference for each route in Adj-RIB-In • Choose best route, install in Loc-RIB • Disseminate routes to peers, update Adj-RIB-Out
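The three steps can be sketched over simple dict-based RIBs. The preference function here (shorter AS path is better) is only a stand-in for the locally configured degree of preference, and the sketch applies no export policy when filling Adj-RIBs-Out; all names are hypothetical:

```python
def shortest_as_path(route):
    """Example degree of preference: fewer AS hops is better (assumption)."""
    return -len(route["as_path"])


def run_decision_process(adj_ribs_in, peers, preference=shortest_as_path):
    """adj_ribs_in maps peer -> {prefix: route}. Returns (loc_rib, adj_ribs_out)."""
    loc_rib = {}
    # Steps 1 and 2: rate every route and keep the best one per prefix.
    for routes in adj_ribs_in.values():
        for prefix, route in routes.items():
            best = loc_rib.get(prefix)
            if best is None or preference(route) > preference(best):
                loc_rib[prefix] = route
    # Step 3: copy the selected routes into each peer's Adj-RIB-Out.
    adj_ribs_out = {peer: dict(loc_rib) for peer in peers}
    return loc_rib, adj_ribs_out


adj_in = {
    "peerA": {"10.0.0.0/8": {"as_path": [1, 2, 3]}},
    "peerB": {"10.0.0.0/8": {"as_path": [4, 3]}},
}
loc, out = run_decision_process(adj_in, ["peerA", "peerB"])
print(loc["10.0.0.0/8"]["as_path"])  # the shorter path via peerB wins
```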
CIDR and BGP • What should T announce to Z? [Figure: AS X (197.8.2.0/24), AS T (provider, 197.8.0.0/23), AS Y (197.8.3.0/24), and AS Z]
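One way to explore the question is to check whether the three prefixes collapse into a single aggregate, using Python's standard `ipaddress` module. This only shows the arithmetic; whether T should actually announce the aggregate depends on policy and on how X and Y are connected, which the slide leaves open. `aggregate` is a hypothetical helper:

```python
import ipaddress


def aggregate(prefixes):
    """Collapse a list of prefixes into the smallest covering set."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]


print(aggregate(["197.8.0.0/23", "197.8.2.0/24", "197.8.3.0/24"]))
# ['197.8.0.0/22']
```

If X and Y are reachable only through T, a single 197.8.0.0/22 announcement gives Z the same reachability with one routing table entry instead of three.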