690 likes | 820 Views
Global Internet & BGP. Acknowledgement(texts & figs). Kurose Govindan Zahid Peterson & Davie Kevin. Hierarchies. What? Logical structure overlaid on collections of nodes Why? Together with information abstraction, the only known solution to scaling issues. Routing Hierarchies.
E N D
Acknowledgement(texts & figs) • Kurose • Govindan • Zahid • Peterson & Davie • Kevin
Hierarchies • What? • Logical structure overlaid on collections of nodes • Why? • Together with information abstraction, the only known solution to scaling issues
Routing Hierarchies • Flat routing doesn’t scale • Each node cannot be expected to have routes to every destination (or destination network) • Key observation • Need less information with increasing distance to destination • Two radically different approaches for routing • The area hierarchy • The landmark hierarchy
Autonomous systems • What is an AS? • A set of routers under a single technical administration, using an interior gateway protocol (IGP) and common metrics to route packets within the AS and using an exterior gateway protocol (EGP) to route packets to other AS’s. • sometimes AS’s use multiple IGPs and metrics, but appear as single AS’s to other AS’s.
Global Addresses • Properties • IPv4 uses 32 bit address space • globally unique • hierarchical: network + host • Dot Notation • 10.3.2.4 • 128.96.33.81 • 192.12.69.77 • Assigning authority • Jon Postel ran IANA ‘til ‘98 • Assigned by ICANN 7 24 A: 0 Network Host 14 16 B: 1 0 Network Host 21 8 C: 1 1 0 Network Host Multicast D: 0 1 1 1 Experimental E: 1 1 1 1
How to Make Routing Scale • Flat (Ethernet) versus Hierarchical (Internet) Addresses • All hosts attached to same network have same network address • Problem: inefficient use of Hierarchical Address Space • class C with 2 hosts (2/255 = 0.78% efficient) • class B with 256 hosts (256/65535 = 0.39% efficient) • Problem: still Too Many Networks • routing tables do not scale • Big tables make routers expensive • route propagation protocols do not scale
Today’s Internet • Consists of ISP’s (Internet Service Providers) who run AS’s (Autonomous Systems) • All you need to become an ISP is some address space, an AS number and a peer or two • Easier said than done • Getting addresses and AS number is the tricky part • There are public peering points (MAE East, Central and West) • NAP’s run by MCI where peering can take place • Most peering points are private • Number of connections have been doubling for some time – how do we deal with this kind of scaling?
Network number Host number Class B address 111111111111111111111111 00000000 Subnet mask (255.255.255.0) Network number Subnet ID Host ID Subnetted address Subnetting - 1985 • Original intent was for network to identify one physical network • Lots of small networks are what we actually have – how do we handle this? • Solution: add another level to address/routing hierarchy: subnet • Subnet masks define variable partition of host part • 1’s identify subnet, 0’s identify hosts within the subnet • Mechanism for sharing a single network number among multiple networks • Subnets visible only within a site
Subnetting • Subnetting is the process of creating multiple segments within a single IP network address space • From the perspective of a node outside the network, all nodes on any of the subnetworks appear to be on the original single network • Internet routing tables are not affected by subnetting, I.e. routing tables need not be overloaded with information about routes to all internal subnets, just information to the access router/gateway
Subnetting (Continued) • Classes A, B and C in IP addressing are designed with two levels of hierarchy (netid & hostid) • Problem: An organization with a class B address can not have more than network and all 216 hosts are attached to that network A nightmare in managing this network, single broadcast domain, security issues, etc… • Subnetting create another level of hierarchy (netid, subnetid and hostid). Delivery of IP packets involves three steps; delivery to the site router, delivery to the subnet router, delivery to the host
Subnet mask: 255.255.255.128 Subnet number: 128.96.34.0 128.96.34.15 128.96.34.1 H1 R1 Subnet mask: 255.255.255.128 128.96.34.130 Subnet number: 128.96.34.128 128.96.34.139 128.96.34.129 H2 R2 H3 128.96.33.1 128.96.33.14 Subnet mask: 255.255.255.0 Subnet number: 128.96.33.0 Subnet Example Forwarding table at router R1 Subnet Number Subnet Mask Next Hop 128.96.34.0 255.255.255.128 interface 0 128.96.34.128 255.255.255.128 interface 1 128.96.33.0 255.255.255.0 R2
141.14.2.105 141.14.2.21 141.14.2.20 A network with two levels of hierarchy 141.14.7.96 To the internet 141.14.0.0 R 141.14.7.95 141.14.22.64 141.14.7.44 141.14.22.8
A network with three levels of hierarchy 141.14.2.20 141.14.2.21 141.14.2.105 141.14.2.0 R 141.14.7.96 141.14.7.0 To the internet 141.14.4.45 141.14.22.0 The previous network is divided into 3 subnets 141.14.22.9 141.14.22.64
Subnet Masking • Subnetting is achieved by “stealing” some bits from the hostid field to represent the subnet portion of the address • Those bits that are used for the subnetid are identified through the use of a subnet mask • Masking is the process of extracting the address of the physical network (if subnetting is not used) or the subnet address (if subnetting is used) from an IP address • A subnet mask is a 32-bit pattern having a “1” in every netid and subnetid locations and a “0” in every hostid location
Mask 255.255.0.0 IP address 141.14.2.21 Network address 141.14.0.0 Subnet masking (Continued) • Subnet masking is performed (both at the host and at the router) by applying “bit-wise-and” operation between the IP address and the subnet mask • Example 1: Class B network without subnetting • 141.14.2.21 is 10001101.00001110.00000010.00010101 • 255.255.0.0 is 11111111.11111111.00000000.00000000 • “Bit-wise and”10001101.00001110.00000000.00000000
Mask 255.255.255.0 IP address 141.14.2.21 Subnet address 141.14.2.0 Subnet Masking (Continued) • Example 2: Class B network with subnetting • 141.14.2.21 is 10001101.00001110.00000010.00010101 • 255.255.255.0 is 11111111.11111111.11111111.00000000 • “Bit-wise and” is 10001101.00001110.00000010.00000000 • The subnet address is hence141.14.2.0
Forwarding Algorithm D = destination IP address for each entry (SubnetNum, SubnetMask, NextHop) D1 = SubnetMask & D if D1 = SubnetNum if NextHop is an interface deliver datagram directly to D else deliver datagram to NextHop
Forwarding • Use a default router if nothing matches • Not necessary for all 1s in subnet mask to be contiguous • Can put multiple subnets on one physical network • Subnets not visible from the rest of the Internet • This is a simple, toy example!!
Subnets • Subnetting is not the only way to solve scalability problems • Additional router support is necessary to include netmask and forwarding functionality • Non-contiguous netmask numbers can be used • They make administration more difficult • Multiple subnets can reside on a single network • Requires routers within the network • Subnets help solve scalability problems • Do not require us to use class B or C address for each physical network • Help us to aggrigate information • Chief advantage of IP addresses: routers could keep one entry per network instead of one per destination host
Continued Problems with IPv4 Addresses • Problem: • Potential exhaustion of IPv4 address space (due to inefficiency) • Class B network numbers are highly prized • Lots of class C addresses but no one wants them • Growth of back bone routing tables • We don’t want lots of small networks since this causes large routing tables • Solution: • Allow addresses assigned to a single entity to span multiple classed prefixes • Enhance route aggregation
Supernetting • Assign block of contiguous network numbers to nearby networks • Called CIDR: Classless Inter-Domain Routing • Breaks rigid boundries between address classes • If ISP needs 16 class C addresses, make them contiguous • Eg.192.4.16 to 192.4.31 enables a 20-bit network number • Represent blocks (number of class C networks) with a single pair (first_network_address, count) • Restrict block sizes to powers of 2 • Use a bit mask (CIDR mask) to identify block size • All routers must understand CIDR addressing
host part network part 11001000 0001011100010000 00000000 200.23.16.0/23 IP addressing: CIDR • CIDR:Classless InterDomain Routing • network portion of address of arbitrary length • address format: a.b.c.d/x, where x is # bits in network portion of address
GlobalInternetRoutingMesh 204.71.0.0 Service Provider 204.71.1.0 204.71.2.0 204.71.0.0/16 204.71.3.0 204.71.4.0 CIDR (continued) • Why? • Reduce amount of global routing information via aggregation
CIDR Addresses • Identifying a CIDR block requires both an address and a mask • Slash notation • 128.211.168.0/21 for addresses 128.211.168.0 – 128.211.175.255 • Here the /21 indicates a 21 bit mask • All possible CIDR masks can easily be generated • /8, /16, /24 correspond to traditional class A, B, C categories • IP addresses are now arbitrary integers, not classes • Raises interesting questions about lookups • Routers cannot determine the division between prefix and suffix just by looking at the address • Hashing does not work well • Interesting lookup algorithms have been developed and analyzed
CIDR – A Couple Details • ISP’s can further subdivide their blocks of addresses using CIDR • Some prefixes are reserved for private addresses • 10/8, 172.16/12, 192.168/16, 169.254/16 • These are not routable in the Internet
AS Numbers (ASNs) ASNs are 16 bit values. 64512 through 65535 are “private” Currently over 11,000 in use. • Genuity: 1 • MIT: 3 • Harvard: 11 • UC San Diego: 7377 • AT&T: 7018, 6341, 5074, … • UUNET: 701, 702, 284, 12199, … • Sprint: 1239, 1240, 6211, 6242, … • … ASNs represent units of routing policy
How Many ASNs are there? Thanks to Geoff Huston. http://www.telstra.net/ops on June 23, 2001
When will we run out of ASNs? 64,511 2005? 2007?
Internet inter-AS routing: BGP • BGP (Border Gateway Protocol):the de facto standard • Path Vector protocol: • similar to Distance Vector protocol • each Border Gateway broadcast to neighbors (peers) entire path (I.e, sequence of ASs) to destination • E.g., Gateway X may send its path to dest. Z: Path (X,Z) = X,Y1,Y2,Y3,…,Z
Internet inter-AS routing: BGP Suppose: gateway X send its path to peer gateway W • W may or may not select path offered by X • cost, policy (don’t route via competitors AS), loop prevention reasons. • If W selects path advertised by X, then: Path (W,Z) = w, Path (X,Z) • Note: X can control incoming traffic by controling it route advertisements to peers: • e.g., don’t want to route traffic to Z -> don’t advertise any routes to Z
Internet inter-AS routing: BGP • BGP messages exchanged using TCP. • BGP messages: • OPEN: opens TCP connection to peer and authenticates sender • UPDATE: advertises new path (or withdraws old) • KEEPALIVE keeps connection alive in absence of UPDATES; also ACKs OPEN request • NOTIFICATION: reports errors in previous msg; also used to close connection
Why different Intra- and Inter-AS routing ? Policy: • Inter-AS: admin wants control over how its traffic routed, who routes through its net. • Intra-AS: single admin, so no policy decisions needed Scale: • hierarchical routing saves table size, reduced update traffic Performance: • Intra-AS: can focus on performance • Inter-AS: policy may dominate over performance
Architecture of Dynamic Routing OSPF BGP AS 1 EIGRP IGP = Interior Gateway Protocol Metric based: OSPF, IS-IS, RIP, EIGRP (cisco) AS 2 EGP = Exterior Gateway Protocol Policy based: BGP The Routing Domain of BGP is the entire Internet
OSPF Process OSPF Routing tables RIP Process RIP Routing tables BGP Process BGP Routing tables Many Routing Processes Can Run on a Single Router BGP OS kernel RIP Domain OSPF Domain Forwarding Table Manager Forwarding Table
AS categories • Stub: an AS that has only a single connection to one other AS - carries only local traffic. • Multihomed: an AS that has connections to more than one AS, but refuses to carry transit traffic • Transit: an AS that has connections to more than one AS, and carries both transit and local traffic (under certain policy restrictions)
IP traffic Nontransit vs. Transit ASes Internet Service providers (often) have transit networks ISP 2 ISP 1 NET A Nontransit AS might be a corporate or campus network. Could be a “content provider” Traffic NEVER flows from ISP 1 through NET A to ISP 2 (At least not intentionally!)
IP traffic Selective Transit NET B NET C NET A provides transit between NET B and NET C and between NET D and NET C NET A DOES NOT provide transit Between NET D and NET B NET A NET D Most transit networks transit in a selective manner…
Choices for global routing • Link state or distance vector? • no universal metric - policy decisions • Problems with distance-vector: • Bellman-Ford algorithm may not converge • Problems with link state: • metric used by routers not the same - loops • LS database too large - entire Internet • may expose policies to other AS’s
Solution: Path Vectors • Each routing update carries the entire path • Loops are detected as follows: • when AS gets route check if AS already in path • if yes, reject route • if no, add self and advertise route further • Advantage: • metrics are local - AS chooses path, protocol ensures no loops
AS 1239 Sprint ASPATH Attribute AS 1129 135.207.0.0/16 AS Path = 1755 1239 7018 6341 Global Access AS 1755 135.207.0.0/16 AS Path = 1239 7018 6341 135.207.0.0/16 AS Path = 1129 1755 1239 7018 6341 Ebone AS 12654 RIPE NCC RIS project 135.207.0.0/16 AS Path = 7018 6341 AS7018 135.207.0.0/16 AS Path = 3549 7018 6341 135.207.0.0/16 AS Path = 6341 AT&T AS 3549 AS 6341 135.207.0.0/16 AS Path = 7018 6341 AT&T Research Global Crossing 135.207.0.0/16 Prefix Originated
Interdomain Loop Prevention AS 7018 BGP at AS YYY will never accept a route with ASPATH containing YYY. Don’t Accept! 12.22.0.0/16 ASPATH = 1 333 7018 877 AS 1
Problems • Routing table size • need an entry for all paths to all networks • Required memory= O(N + M*A) * K) • N: number of networks • M: mean AS distance • A: number of AS’s • K: number of BGP peers • Problem reduced with CIDR
Routing information bases (RIB) • Routes are stored in RIBs • Adj-RIBs-In: routing info that has been learned from other routers (unprocessed routing info) • Loc-RIB: local routing information selected from Adj-RIBs-In (routes selected locally) • Adj-RIBs-Out: info to be advertised to peers (routes to be advertised)
Conceptual Mode of Operation • RIB = Routing information base Adj-RIB-In Adj-RIB-Out Loc-RIB per BGP neighbor per BGP neighbor
Policy with BGP • BGP provides capability for enforcing various policies • Policies are not part of BGP: they are provided to BGP as configuration information • BGP enforces policies by choosing paths from multiple alternatives and controlling advertisement to other AS’s