Interdomain Routing and The Border Gateway Protocol (BGP)
Today’s Big Picture
[Figure: interconnected networks labeled Large ISP, Small ISP, Stub, GPRS, and Access Network]
Large number of diverse networks
Autonomous System (AS)
• The Internet is not a single network
• Collection of networks controlled by different administrations
• An autonomous system is a network under a single administrative control
• IANA
• An AS owns unique IP prefixes
• Every AS has a unique AS number
• ASes need to inter-network themselves to form a single virtual global network
• Need a common protocol for communication
Who speaks Inter-AS routing?
[Figure: routers R, R1, R2, R3 in AS1 and AS2; legend: BGP, border router, internal router]
• Two types of routers
• Border router (Edge)
• Internal router (Core)
• Two border routers of different ASes will have a BGP session
Autonomous Systems (ASes)
“… the administration of an AS appears to other ASes to have a single coherent interior routing plan and presents a consistent picture of what networks are reachable through it.”
RFC 1930: Guidelines for creation, selection, and registration of an Autonomous System
• An autonomous system is an autonomous routing domain that has been assigned an Autonomous System Number (ASN).
• All parts within an AS remain connected.
IP Address Allocation and Assignment: Internet Registries
IANA: www.iana.org
APNIC: www.apnic.org
ARIN: www.arin.org
RIPE: www.ripe.org
Allocate to national and local registries and ISPs
Addresses assigned to customers by ISPs
RFC 2050 - Internet Registry IP Allocation Guidelines
RFC 1918 - Address Allocation for Private Internets
RFC 1518 - An Architecture for IP Address Allocation with CIDR
Whois servers (AS, IP)
• http://www.ripe.net/perl/whois
  • AS2588
• http://ws.arin.net/cgi-bin/whois.pl
  • AS701
• http://www.apnic.net/apnic-bin/whois.pl
  • AS4808
AS Numbers (ASNs)
ASNs are 16-bit values. 64512 through 65535 are “private”.
Currently over 20,000 in use.
• Genuity: 1
• MIT: 3
• JANET: 786
• UC San Diego: 7377
• AT&T: 7018, 6341, 5074, …
• UUNET: 701, 702, 284, 12199, …
• Sprint: 1239, 1240, 6211, 6242, …
• …
ASNs represent units of routing policy
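The numeric ranges on this slide can be checked mechanically. The following is a minimal sketch that assumes the 16-bit ASN space described above; the function names are illustrative and do not come from any library.

```python
# Ranges taken from the slide: ASNs are 16-bit values and
# 64512 through 65535 are reserved for private use.
PRIVATE_ASN_FIRST = 64512
PRIVATE_ASN_LAST = 65535


def is_valid_16bit_asn(asn: int) -> bool:
    """Return True if the value fits in an unsigned 16-bit field."""
    return 0 <= asn <= 0xFFFF


def is_private_asn(asn: int) -> bool:
    """Return True if the ASN falls in the private range from the slide."""
    return PRIVATE_ASN_FIRST <= asn <= PRIVATE_ASN_LAST


if __name__ == "__main__":
    for asn in (701, 7018, 65000):
        kind = "private" if is_private_asn(asn) else "public"
        print(f"AS {asn}: {kind}")
```

A private ASN such as 65000 can be used inside a customer network without registration, which is why it appears in later examples of this deck.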
Partial View of www.cl.cam.ac.uk (128.232.0.20) Neighborhood
• AS 20757 Hanse
• AS 5089 NTL Group
• AS 3356 Level 3
• AS 3257 Tiscali
• AS 6461 AboveNet
• AS 1239 Sprint
• AS 702 UUNET
• AS 13127 Versatel
• AS 4637 REACH
• AS 20965 GEANT
• AS 786 ja.net
• AS 5459 LINX
• AS 1213 HEAnet (Irish academic and research)
• AS 4373 Online Computer Library Center
• AS 7 UK Defense Research Agency
Originates > 180 prefixes, including 128.232.0.0/16
How Many ASNs are there today?
12,940 origin only (no transit)
18,217
Thanks to Geoff Huston. http://bgp.potaroo.net on October 26, 2004
IP network assignment process
[Figure: IETF, IANA, RIR, ISP, and BGP, with steps labeled Delegation, Allocation, Allocation, and Announcement]
RIR Allocations - Current Allocated
How many prefixes today?
179,903
Note: the numbers actually depend on point of view…
Thanks to Geoff Huston. http://bgp.potaroo.net on October 26, 2004
The Gang of Four
        Link State    Vectoring
IGP     OSPF          RIP, EIGRP
EGP                   BGP
BGP-4
• BGP = Border Gateway Protocol
• Is a Policy-Based routing protocol
• Is the de facto EGP of today’s global Internet
• Relatively simple protocol, but configuration is complex and the entire world can see, and be impacted by, your mistakes.
• 1989: BGP-1 [RFC 1105]
  • Replacement for EGP (1984, RFC 904)
• 1990: BGP-2 [RFC 1163]
• 1991: BGP-3 [RFC 1267]
• 1995: BGP-4 [RFC 1771]
  • Support for Classless Interdomain Routing (CIDR)
The Border Gateway Protocol (BGP)
BGP = RFC 1771
+ “optional” extensions: RFC 1997 (communities), RFC 2439 (damping), RFC 2796 (reflection), RFC 3065 (confederation), …
+ routing policy configuration languages (vendor-specific)
+ Current Best Practices in management of Interdomain Routing
BGP was not DESIGNED. It EVOLVED.
BGP Operations (Simplified)
[Figure: BGP session between AS1 and AS2]
Establish session on TCP port 179
Exchange all active routes
Exchange incremental updates: while connection is ALIVE, exchange route UPDATE messages
Four Types of BGP Messages
• Open: Establish a peering session.
• Keep Alive: Handshake at regular intervals.
• Notification: Shuts down a peering session.
• Update: Announcing new routes or withdrawing previously announced routes.
announcement = prefix + attribute values
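To make the four message types concrete, here is a rough sketch of decoding the fixed BGP message header. It assumes the BGP-4 layout (16-byte marker, 2-byte length, 1-byte type) and the type codes 1 to 4 from the BGP-4 specification; `parse_header` is a name chosen for this example, not part of any library.

```python
import struct

# Type codes as defined by the BGP-4 specification.
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}

# Fixed header: 16-byte marker, 2-byte length, 1-byte type (network byte order).
HEADER = struct.Struct("!16sHB")


def parse_header(data: bytes):
    """Return (message type name, total message length) from a BGP header."""
    if len(data) < HEADER.size:
        raise ValueError("a BGP header is 19 bytes long")
    _marker, length, type_code = HEADER.unpack_from(data)
    if not 19 <= length <= 4096:
        raise ValueError(f"invalid BGP message length: {length}")
    return MESSAGE_TYPES.get(type_code, "UNKNOWN"), length


if __name__ == "__main__":
    # A KEEPALIVE consists of the header only, so its length is 19.
    keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
    print(parse_header(keepalive))
```

The sketch stops at the header; decoding the prefixes and attributes carried by an UPDATE is considerably more involved.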
BGP Attributes
Value  Code                       Reference
-----  -------------------------  ---------
1      ORIGIN                     [RFC1771]
2      AS_PATH                    [RFC1771]
3      NEXT_HOP                   [RFC1771]
4      MULTI_EXIT_DISC            [RFC1771]
5      LOCAL_PREF                 [RFC1771]
6      ATOMIC_AGGREGATE           [RFC1771]
7      AGGREGATOR                 [RFC1771]
8      COMMUNITY                  [RFC1997]
9      ORIGINATOR_ID              [RFC2796]
10     CLUSTER_LIST               [RFC2796]
11     DPA                        [Chen]
12     ADVERTISER                 [RFC1863]
13     RCID_PATH / CLUSTER_ID     [RFC1863]
14     MP_REACH_NLRI              [RFC2283]
15     MP_UNREACH_NLRI            [RFC2283]
16     EXTENDED COMMUNITIES       [Rosen]
...
255    reserved for development
Most important attributes
Not all attributes need to be present in every announcement
From IANA: http://www.iana.org/assignments/bgp-parameters
Attributes are Used to Select Best Routes
[Figure: several candidate routes for 192.0.2.0/24, each saying “pick me!”]
Given multiple routes to the same prefix, a BGP speaker must pick at most one best route (Note: it could reject them all!)
BGP Route Processing
Receive BGP Updates -> Apply Import Policies -> Best Route Selection (based on attribute values) -> Best Route Table -> Apply Export Policies -> Transmit BGP Updates
Apply Policy = filter routes & tweak attributes (at both the import and the export stage)
Open ended programming. Constrained only by vendor configuration language
Install forwarding entries for best routes in the IP Forwarding Table
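Route Selection Summary
Highest Local Preference (enforce relationships)
Shortest ASPATH (traffic engineering)
Lowest MED (traffic engineering)
i-BGP < e-BGP, that is, prefer routes learned via e-BGP (traffic engineering)
Lowest IGP cost to BGP egress (traffic engineering)
Lowest router ID (throw up hands and break ties)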
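The ordering in this summary can be expressed as a single sort key. The sketch below is illustrative only: the `Route` fields are hypothetical names, and it simplifies real behavior by comparing MED across all candidates, whereas implementations normally compare MED only between routes from the same neighboring AS.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Route:
    """Hypothetical, simplified view of a BGP route for one prefix."""
    prefix: str
    local_pref: int
    as_path: Tuple[int, ...]
    med: int
    learned_via_ebgp: bool
    igp_cost: int
    router_id: int  # advertising router's ID as an integer


def preference_key(route: Route):
    """Smaller key means more preferred, following the summary's order."""
    return (
        -route.local_pref,                    # highest local preference
        len(route.as_path),                   # shortest AS path
        route.med,                            # lowest MED (simplified)
        0 if route.learned_via_ebgp else 1,   # e-BGP preferred over i-BGP
        route.igp_cost,                       # lowest IGP cost to egress
        route.router_id,                      # lowest router ID breaks ties
    )


def best_route(candidates: Iterable[Route]) -> Optional[Route]:
    """Pick at most one best route; None if there are no candidates."""
    candidates = list(candidates)
    return min(candidates, key=preference_key) if candidates else None


if __name__ == "__main__":
    long_but_preferred = Route("192.0.2.0/24", 100, (1, 2, 3), 0, False, 10, 5)
    short_path = Route("192.0.2.0/24", 90, (1,), 0, True, 1, 1)
    print(best_route([long_but_preferred, short_path]))
```

Because local preference is compared first, the route with the longer AS path wins in this example, which is exactly how policy overrides path length.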
BGP Routing Tables
show ip bgp
BGP table version is 111849680, local router ID is 203.62.248.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network           Next Hop         Metric LocPrf Weight Path
. . .
*>i192.35.25.0       134.159.0.1                 50      0 16779 1 701 703 i
*>i192.35.29.0       166.49.251.25               50      0 5727 7018 14541 i
*>i192.35.35.0       134.159.0.1                 50      0 16779 1 701 1744 i
*>i192.35.37.0       134.159.0.1                 50      0 16779 1 3561 i
*>i192.35.39.0       134.159.0.3                 50      0 16779 1 701 80 i
*>i192.35.44.0       166.49.251.25               50      0 5727 7018 1785 i
*>i192.35.48.0       203.62.248.34               55      0 16779 209 7843 225 225 225 225 225 i
*>i192.35.49.0       203.62.248.34               55      0 16779 209 7843 225 225 225 225 225 i
*>i192.35.50.0       203.62.248.34               55      0 16779 3549 714 714 714 i
*>i192.35.51.0/25    203.62.248.34               55      0 16779 3549 14744 14744 14744 14744 14744 14744 14744 14744 i
. . .
• Use “whois” queries to associate an ASN with “owner” (for example, http://www.arin.net/whois/arinwhois.html)
• 7018 = AT&T Worldnet, 701 = UUNET, 3561 = Cable & Wireless, …
Thanks to Geoff Huston. http://www.telstra.net/ops on July 6, 2001
Policy: Transit vs. Nontransit
A transit AS allows traffic with neither source nor destination within the AS to flow across the network.
A nontransit AS allows only traffic originating from the AS or traffic with destination within the AS.
[Figure labels: AT&T CBB (AS 7018), UUNET (AS 701), AS144, Bell Labs, IP traffic]
Customers and Providers
[Figure: provider-customer links carrying IP traffic]
Customer pays provider for access to the Internet
The “Peering” Relationship
[Figure: peer-peer and provider-customer links, marking which traffic is allowed and which is NOT allowed]
Peers provide transit between their respective customers
Peers do not provide transit between peers
Peers (often) do not exchange $$$
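These relationship rules are commonly implemented as an export filter. The sketch below is one conventional encoding (often called "valley-free" export), offered as an assumption rather than something the slides spell out: routes learned from customers, or originated locally, go to everyone, while routes learned from peers or providers go to customers only. The relationship labels are hypothetical strings chosen for this example.

```python
def should_export(learned_from: str, send_to: str) -> bool:
    """Decide whether a route may be announced to a neighbor.

    Both arguments are relationship labels from this AS's point of view:
    "customer", "peer", or "provider". learned_from may also be "self"
    for prefixes this AS originates.
    """
    if learned_from in ("customer", "self"):
        # Customer and own routes are announced to every neighbor.
        return True
    # Peer and provider routes are announced to customers only, so the
    # AS never carries traffic between two peers or providers.
    return send_to == "customer"


if __name__ == "__main__":
    for src in ("self", "customer", "peer", "provider"):
        for dst in ("customer", "peer", "provider"):
            print(f"{src:>8} -> {dst:<8}: {should_export(src, dst)}")
```

With this rule a peer hears about your customers but not about your other peers, matching "peers do not provide transit between peers".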
Peering Provides Shortcuts
[Figure: peer-peer and provider-customer links]
Peering also allows connectivity between the customers of “Tier 1” providers.
Peering Wars
Peer:
• Reduces upstream transit costs
• Can increase end-to-end performance
• May be the only way to connect your customers to some part of the Internet (“Tier 1”)
Don’t Peer:
• You would rather have customers
• Peers are usually your competition
• Peering relationships may require periodic renegotiation
Peering struggles are by far the most contentious issues in the ISP world!
Peering agreements are often confidential.
Policy-Based vs. Distance-Based Routing?
[Figure: Host 1 and Host 2 connected through Cust1, Cust2, Cust3, ISP1, ISP2, and ISP3; one path is marked YES and another NO]
Minimizing “hop count” can violate commercial relationships that constrain inter-domain routing.
What is Routing Policy
• Policy refers to arbitrary preference among a menu of available routes (based upon routes’ attributes)
• Public description of the relationship between external BGP peers
• Can also describe internal BGP peer relationship
• E.g.: Who are my BGP peers
• What routes are
  • Originated by a peer
  • Imported from each peer
  • Exported to each peer
  • Preferred when multiple routes exist
• What to do if no route exists?
Routing Policy Example
• AS1 originates prefix “d”
• AS1 exports “d” to AS2, AS2 imports
• AS2 exports “d” to AS3, AS3 imports
• AS3 exports “d” to AS5, AS5 imports
Routing Policy Example (cont)
• AS5 also imports “d” from AS4
• Which route does it prefer?
• Does it matter?
• Consider case where
  • AS3 = Commercial Internet
  • AS4 = Internet2
Import and Export Policies
• Inbound filtering controls outbound traffic
  • filters route updates received from other peers
  • filtering based on IP prefixes, AS_PATH, community
• Outbound filtering controls inbound traffic
  • forwarding a route means others may choose to reach the prefix through you
  • not forwarding a route means others must use another router to reach the prefix
• Attribute manipulation
  • Import: LOCAL_PREF (manipulate trust)
  • Export: AS_PATH and MEDs
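As a rough illustration of inbound filtering plus attribute manipulation, the sketch below accepts a route only if its prefix falls inside an allowed block and its AS_PATH avoids a set of unwanted ASNs, then sets LOCAL_PREF. The dictionary layout, function name, and parameters are assumptions made for this example; real routers express this in vendor-specific policy languages.

```python
import ipaddress
from typing import Iterable, Optional, Set


def apply_import_policy(
    route: dict,
    allowed_blocks: Iterable[str],
    rejected_asns: Set[int],
    local_pref: int,
) -> Optional[dict]:
    """Return the accepted route with LOCAL_PREF set, or None if filtered.

    route is a hypothetical dict with "prefix" (CIDR string) and
    "as_path" (list of ASNs).
    """
    prefix = ipaddress.ip_network(route["prefix"])
    blocks = [ipaddress.ip_network(block) for block in allowed_blocks]

    # Prefix filter: the announced prefix must sit inside an allowed block.
    if not any(
        prefix.version == block.version and prefix.subnet_of(block)
        for block in blocks
    ):
        return None

    # AS_PATH filter: drop routes that traverse an unwanted AS.
    if rejected_asns & set(route["as_path"]):
        return None

    # Attribute manipulation on import: set LOCAL_PREF.
    accepted = dict(route)
    accepted["local_pref"] = local_pref
    return accepted


if __name__ == "__main__":
    route = {"prefix": "192.0.2.0/24", "as_path": [7018, 6341]}
    print(apply_import_policy(route, ["192.0.0.0/8"], {65001}, 90))
```

Filtering on communities would follow the same pattern, with one more membership test before the route is accepted.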
ASPATH Attribute
[Figure: propagation of prefix 135.207.0.0/16, originated by AS 6341 (AT&T Research), toward AS 12654 (RIPE NCC RIS project)]
AS Path as announced by each AS:
AS 6341 (AT&T Research): 135.207.0.0/16 AS Path = 6341
AS 7018 (AT&T): 135.207.0.0/16 AS Path = 7018 6341
AS 1239 (Sprint): 135.207.0.0/16 AS Path = 1239 7018 6341
AS 3549 (Global Crossing): 135.207.0.0/16 AS Path = 3549 7018 6341
AS 1755 (Ebone): 135.207.0.0/16 AS Path = 1755 1239 7018 6341
AS 1129 (Global Access): 135.207.0.0/16 AS Path = 1129 1755 1239 7018 6341
Shorter Doesn’t Always Mean Shorter
[Figure: AS 1, AS 2, AS 3, AS 4]
Mr. BGP says that path 4 1 is better than path 3 2 1
Duh!
In fairness: could you do this “right” and still scale?
Exporting internal state would dramatically increase global instability and amount of routing state
Tweak Tweak Tweak (TE)
• For inbound traffic
  • Filter outbound routes
  • Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection
• For outbound traffic
  • Filter inbound routes
  • Tweak attributes on inbound routes to influence best route selection
[Figure labels: outbound routes, inbound traffic; inbound routes, outbound traffic]
In general, an AS has more control over outbound traffic
LOCAL PREFERENCE
[Figure: AS 1, AS 2, AS 3, AS 4 and prefix 13.13.0.0/16, with routes marked local pref = 80, 90, and 100]
Local preference used ONLY in iBGP
Higher Local preference values are more preferred
Implementing Backup Links with Local Preference (Outbound Traffic)
[Figure: AS 65000 connected to AS 1 by a primary link and a backup link]
Primary link: Set Local Pref = 100 for all routes from AS 1
Backup link: Set Local Pref = 50 for all routes from AS 1
Forces outbound traffic to take primary link, unless link is down.
We’ll talk about inbound traffic soon …
Multihomed Backups (Outbound Traffic)
[Figure: AS 2 connected to provider AS 1 by a primary link and to provider AS 3 by a backup link]
Primary link: Set Local Pref = 100 for all routes from AS 1
Backup link: Set Local Pref = 50 for all routes from AS 3
Forces outbound traffic to take primary link, unless link is down.
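The failover behavior can be seen in a tiny model: among the links that are up, outbound traffic follows the one whose routes carry the highest local preference. The link records and the `pick_egress` helper below are hypothetical and ignore every other selection step.

```python
from typing import List, Optional


def pick_egress(links: List[dict]) -> Optional[str]:
    """Return the name of the usable link with the highest local preference.

    Each link is a hypothetical dict with "name", "local_pref", and "up".
    Routes over a link that is down are withdrawn, so they are not candidates.
    """
    usable = [link for link in links if link["up"]]
    if not usable:
        return None
    return max(usable, key=lambda link: link["local_pref"])["name"]


if __name__ == "__main__":
    links = [
        {"name": "primary via AS 1", "local_pref": 100, "up": True},
        {"name": "backup via AS 3", "local_pref": 50, "up": True},
    ]
    print(pick_egress(links))      # primary while both links are up
    links[0]["up"] = False
    print(pick_egress(links))      # backup once the primary fails
```

Local preference only steers traffic leaving the AS; the inbound direction needs the techniques on the following slides.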
ASpath prepending
[Figure: customer AS 2 announces 192.0.2.0/24 to providers AS 1 and AS 3]
Primary: 192.0.2.0/24 ASPATH = 2
Backup: 192.0.2.0/24 ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2
Padding in this way is often used as a form of load balancing
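A short sketch of why padding works: a remote AS that compares only AS path length will favor the unpadded announcement. The `prepend` helper and the path values are illustrative, and the comparison ignores local preference, which a remote AS may set to override path length entirely.

```python
from typing import List


def prepend(as_path: List[int], asn: int, times: int) -> List[int]:
    """Return a copy of as_path with asn added `times` extra times in front."""
    return [asn] * times + list(as_path)


if __name__ == "__main__":
    # AS 2 announces its prefix unpadded on the primary link and padded
    # on the backup link.
    primary = [2]
    backup = prepend([2], 2, 3)

    # Paths as a remote AS might see them after providers AS 1 and AS 3
    # add their own ASNs.
    via_as1 = [1] + primary
    via_as3 = [3] + backup

    shortest = min((via_as1, via_as3), key=len)
    print("preferred path:", shortest)
```

Since neighbors are free to ignore path length, prepending is a hint rather than a guarantee, which is what motivates the community-based approach on the next slide.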
COMMUNITY Attribute to the Rescue!
[Figure: customer AS 2 announces 192.0.2.0/24 to providers AS 1 and AS 3]
AS 3: normal customer local pref is 100, peer local pref is 90
Primary: 192.0.2.0/24 ASPATH = 2
Backup: 192.0.2.0/24 ASPATH = 2 COMMUNITY = 3:70
Customer import policy at AS 3:
If 3:90 in COMMUNITY then set local preference to 90
If 3:80 in COMMUNITY then set local preference to 80
If 3:70 in COMMUNITY then set local preference to 70
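The import policy at AS 3 amounts to a lookup from community value to local preference. Below is a minimal sketch using the values from the slide; the function name and the string representation of communities are assumptions for illustration.

```python
from typing import Iterable

# Community values and local preferences as listed on the slide,
# checked in the order the slide gives them.
COMMUNITY_RULES = (("3:90", 90), ("3:80", 80), ("3:70", 70))

# The slide states that AS 3 normally gives customer routes local pref 100.
DEFAULT_CUSTOMER_LOCAL_PREF = 100


def customer_import_local_pref(communities: Iterable[str]) -> int:
    """Return the local preference AS 3 would assign to a customer route."""
    tagged = set(communities)
    for community, local_pref in COMMUNITY_RULES:
        if community in tagged:
            return local_pref
    return DEFAULT_CUSTOMER_LOCAL_PREF


if __name__ == "__main__":
    print(customer_import_local_pref([]))         # untagged customer route
    print(customer_import_local_pref(["3:70"]))   # backup announcement
```

By tagging the backup announcement with 3:70, the customer drops its route below AS 3's peer preference of 90, so AS 3 itself prefers to reach the prefix through a peer path toward the primary link.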
BGP Summary
• BGP4 is the protocol used on the Internet to exchange routing information between providers, and to propagate external routing information through networks.
• Each autonomous network is called an Autonomous System.
• ASs which inject routing information on their own behalf have ASNs.
BGP Peering
• BGP-speaking routers peer with each other over TCP sessions, and exchange routes through the peering sessions.
• Providers typically try to peer at multiple places. Because an AS may peer with the same AS at several points, and because some ASs are multi-homed, a typical network will have many candidate paths to a given prefix.
The BGP Route
• The BGP route is, conceptually, a “promise” to carry data to a section of IP space. The route is a “bag” of attributes.
• The section of IP space is called the “prefix” attribute of the route.
• As a BGP route travels from AS to AS, the ASN of each AS is stamped on it when it leaves that AS. The resulting list is called the AS_PATH attribute, or “as-path” in Cisco-speak.
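The stamping described here can be sketched as a function applied each time a route leaves an AS. The dictionary layout and `export_route` name are assumptions for illustration; the ASNs reuse the 135.207.0.0/16 example from the ASPATH slide.

```python
def export_route(route: dict, local_asn: int) -> dict:
    """Return a copy of the route with local_asn stamped on its AS_PATH.

    route is a hypothetical dict with "prefix" and "as_path"; the newest
    ASN goes at the front, so the originating AS ends up last.
    """
    return {**route, "as_path": [local_asn] + list(route["as_path"])}


if __name__ == "__main__":
    # AS 6341 originates the prefix; the path is empty until it leaves.
    route = {"prefix": "135.207.0.0/16", "as_path": []}
    for asn in (6341, 7018, 1239):
        route = export_route(route, asn)
        print(f"announced by AS {asn}: AS Path = {route['as_path']}")
```

Reading the final path from right to left gives the order in which the announcement traveled.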
BGP Route Attributes
• In addition to the prefix, the as-path, and the next-hop, the BGP route has other attributes, affectionately known as “knobs and twiddles”:
  • weight, rarely used - “sledgehammer”
  • local-pref, sometimes used - “hammer”
  • origin code, rarely used
  • MED (“metric”) - a gentle nudge