450 likes | 555 Views
Today’s Big Picture. Large ISP. Large ISP. Stub. Small ISP. Dial-Up ISP. Access Network. Stub. Stub. Large number of diverse networks. Internet AS Map: caida.org. Autonomous System(AS). Internet is not a single network Collection of networks controlled by different administrations
E N D
Today’s Big Picture Large ISP Large ISP Stub Small ISP Dial-Up ISP Access Network Stub Stub Large number of diverse networks
Autonomous System(AS) • Internet is not a single network • Collection of networks controlled by different administrations • An autonomous system is a network under a single administrative control • An AS owns an IP prefix • Every AS has a unique AS number • ASes need to inter-network themselves to form a single virtual global network • Need a common protocol for communication
R R2 R1 R3 Who speaks Inter-AS routing? AS2 AS1 BGP border router internal router • Two types of routers • Border router(Edge), Internal router(Core) • Two border routers of different ASes will have a BGP session
Intra-AS vs Inter-AS • An AS is a routing domain • Within an AS: • Can run a link-state routing protocol • Trust other routers • Scale of network is relatively small • Between ASes: • Lack of information about other AS’s network (Link-state not possible) • Crossing trust boundaries • Link-state protocol will not scale • Routing protocol based on route propagation
… the administration of an AS appears to other ASes to have a single coherent interior routing plan and presents a consistent picture of what networks are reachable through it. RFC 1930: Guidelines for creation, selection, and registration of an Autonomous System Autonomous Systems (ASes) • An autonomous system is an autonomous routing domain that has been assigned an Autonomous System Number (ASN). • All parts within an AS remain connected.
IP Address Allocation and Assignment: Internet Registries IANA www.iana.org APNIC www.apnic.org ARIN www.arin.org RIPE www.ripe.org Allocate to National and local registries and ISPs Addresses assigned to customers by ISPs RFC 2050 - Internet Registry IP Allocation Guidelines RFC 1918 - Address Allocation for Private Internets RFC 1518 - An Architecture for IP Address Allocation with CIDR
AS Numbers (ASNs) ASNs are 16 bit values. 64512 through 65535 are “private” Currently over 11,000 in use. • Genuity: 1 • MIT: 3 • Harvard: 11 • UC San Diego: 7377 • AT&T: 7018, 6341, 5074, … • UUNET: 701, 702, 284, 12199, … • Sprint: 1239, 1240, 6211, 6242, … • … ASNs represent units of routing policy
Nontransit vs. Transit ASes Internet Service providers (ISPs) have transit networks ISP 2 ISP 1 NET A Nontransit AS might be a corporate or campus network. Could be a “content provider” Traffic NEVER flows from ISP 1 through NET A to ISP 2
Selective Transit NET B NET C NET A provides transit between NET B and NET C and between NET D and NET C NET A NET A DOES NOT provide transit Between NET D and NET B NET D Most transit ASes allow only selective transit key impact of commercialization
provider customer IP traffic Customers and Providers provider customer Customer pays provider for access to the Internet
Customer-Provider Hierarchy IP traffic provider customer
peer peer provider customer The Peering Relationship Peers provide transit between their respective customers Peers do not provide transit between peers Peers (often) do not exchange $$$ traffic allowed traffic NOT allowed
BGP-4 • BGP = Border Gateway Protocol • Is a Policy-Based routing protocol • Is the de facto EGP of today’s global Internet • Relatively simple protocol, but configuration is complex and the entire world can see, and be impacted by, your mistakes. • 1989 : BGP-1 [RFC 1105] • Replacement for EGP (1984, RFC 904) • 1990 : BGP-2 [RFC 1163] • 1991 : BGP-3 [RFC 1267] • 1995 : BGP-4 [RFC 1771] • Support for Classless Interdomain Routing (CIDR)
BGP Operations (Simplified) Establish session on TCP port 179 AS1 BGP session Exchange all active routes AS2 While connection is ALIVE exchange route UPDATE messages Exchange incremental updates
Four Types of BGP Messages • Open : Establish a peering session. • Keep Alive : Handshake at regular intervals. • Notification : Shuts down a peering session. • Update : Announcing new routes or withdrawing previously announced routes. announcement = prefix + attributes values
What is Routing Policy • Policy refers to arbitrary preference among a menu of available routes (based upon routes’ attributes) • Public description of the relationship between external BGP peers • Can also describe internal BGP peer relationship • Eg: Who are my BGP peers • What routes are • Originated by a peer • Imported from each peer • Exported to each peer • Preferred when multiple routes exist • What to do if no route exists?
Routing Policy Example • AS1 originates prefix “d” • AS1 exports “d” to AS2, AS2 imports • AS2 exports “d” to AS3, AS3 imports • AS3 exports “d” to AS5, AS5 imports
Routing Policy Example (cont) • AS5 also imports “d” from AS4 • Which route does it prefer? • Does it matter? • Consider case where • AS3 = Commercial Internet • AS4 = Internet2
Import and Export Policies • Inbound filtering controls outbound traffic • filters route updates received from other peers • filtering based on IP prefixes, AS_PATH, community • Outbound Filtering controls inbound traffic • forwarding a route means others may choose to reach the prefix through you • not forwarding a route means others must use another router to reach the prefix • Attribute Manipulation • Import: LOCAL_PREF (manipulate trust) • Export: AS_PATH and MEDs
Attributes are Used to Select Best Routes 192.0.2.0/24 pick me! 192.0.2.0/24 pick me! 192.0.2.0/24 pick me! Given multiple routes to the same prefix, a BGP speaker must pick at most one best route (Note: it could reject them all!) 192.0.2.0/24 pick me!
BGP Policy Knob: Attributes Value Code Reference ----- --------------------------------- --------- 1 ORIGIN [RFC1771] 2 AS_PATH [RFC1771] 3 NEXT_HOP [RFC1771] 4 MULTI_EXIT_DISC [RFC1771] 5 LOCAL_PREF [RFC1771] 6 ATOMIC_AGGREGATE [RFC1771] 7 AGGREGATOR [RFC1771] 8 COMMUNITY [RFC1997] 9 ORIGINATOR_ID [RFC2796] 10 CLUSTER_LIST [RFC2796] 11 DPA [Chen] 12 ADVERTISER [RFC1863] 13 RCID_PATH / CLUSTER_ID [RFC1863] 14 MP_REACH_NLRI [RFC2283] 15 MP_UNREACH_NLRI [RFC2283] 16 EXTENDED COMMUNITIES [Rosen] ... 255 reserved for development We will cover a subset of these attributes Not all attributes need to be present in every announcement From IANA: http://www.iana.org/assignments/bgp-parameters
BGP Route Processing Apply Policy = filter routes & tweak attributes Apply Policy = filter routes & tweak attributes Receive BGP Updates Based on Attribute Values Best Routes Transmit BGP Updates Apply Import Policies Best Route Selection Best Route Table Apply Export Policies Install forwarding Entries for best Routes. IP Forwarding Table
Import and Export Policies • For inbound traffic • Filter outbound routes • Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection • For outbound traffic • Filter inbound routes • Tweak attributes on inbound routes to influence best route selection outbound routes inbound traffic inbound routes outbound traffic In general, an AS has more control over outbound traffic
Adj RIB In Main BGP RIB Adj RIB Out Incom- ing Outgo- ing IGPs Main RIB/ FIB Static & HW Info Policy Implementation Flow
Conceptual Model of BGP Operation • RIB : Routing Information Base • Adj-RIB-In: Prefixes learned from neighbors. As many Adj-RIB-In as there are peers • Loc-RIB: Prefixes selected for local use after analyzing Adj-RIB-Ins. This RIB is advertised internally. • Adj-RIB-Out : Stores prefixes advertised to a particular neighbor. As many Adj-RIB-Out as there are neighbors
Path Attributes: ORIGIN • ORIGIN: • Describes how a prefix came to BGP at the origin AS • Prefixes are learned from a source and “injected” into BGP: • Directly connected interfaces, manually configured static routes, dynamic IGP or EGP • Values: • IGP (EGP): Prefix learnt from IGP (EGP) • INCOMPLETE: Static routes
Path Attributes: AS-PATH • List of ASsthru which the prefix announcement has passed. AS on path adds ASN to AS-PATH • Eg: 138.39.0.0/16 originates at AS1 and is advertised to AS3 via AS2. • Eg: AS-SEQUENCE: “100 200” • Used for loop detection and path selection AS1 (100) AS3 (15) 138.39.0.0/16 AS2 (200)
Traffic Often Follows ASPATH 135.207.0.0/16 ASPATH = 3 2 1 AS 1 AS 3 AS 4 AS 2 135.207.0.0/16 IP Packet Dest = 135.207.44.66
… But It Might Not AS 2 filters all subnets with masks longer than /24 135.207.0.0/16 ASPATH = 1 135.207.0.0/16 ASPATH = 3 2 1 135.207.44.0/25 ASPATH = 5 AS 1 AS 3 AS 4 AS 2 135.207.0.0/16 IP Packet Dest = 135.207.44.66 From AS 4, it may look like this packet will take path 3 2 1, but it actually takes path 3 2 5 AS 5 135.207.44.0/25
Shorter AS-PATH Doesn’t Mean Shorter # Hops BGP says that path 4 1 is better than path 3 2 1 Duh! AS 4 AS 3 AS 2 AS 1
ASPATH Padding: Shed inbound traffic AS 1 provider 192.0.2.0/24 ASPATH = 2 2 2 192.0.2.0/24 ASPATH = 2 Padding will (usually) force inbound traffic from AS 1 to take primary link primary backup customer 192.0.2.0/24 AS 2
Padding May Not Shut Off All Traffic AS 1 AS 3 provider provider 192.0.2.0/24 ASPATH = 2 192.0.2.0/24 ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2 AS 3 will send traffic on “backup” link because it prefers customer routes and local preference is considered before ASPATH length! Padding in this way is often used as a form of load balancing primary backup customer 192.0.2.0/24 AS 2
Deaggregation + Multihoming If AS 1 does not announce the more specific prefix, then most traffic to AS 2 will go through AS 3 because it is a longer match 12.2.0.0/16 12.2.0.0/16 12.0.0.0/8 AS 3 AS 1 provider provider AS 2 customer 12.2.0.0/16 AS 2 is “punching a hole” in the CIDR block of AS 1=> subverts CIDR
BGP Table Growth Thanks: Geoff Huston. http://www.telstra.net/ops/bgptable.html
Large BGP Tables Considered Harmful • Routing tables must store best routes and alternate routes • Burden can be large for routers with many alternate routes (route reflectors for example) • Routers have been known to die • Increases CPU load, especially during session reset
ASNs Growth From: Geoff Huston. http://www.telstra.net/ops
Dealing with ASN growth… • Make ASNs larger than 16 bits • How about 32 bits? • See Internet Draft: “BGP support for four-octet AS number space” (draft-ietf-idr-as4bytes-03.txt) • Requires protocol change and wide deployment • Change the way ASNs are used • Allow multihomed, non-transit networks to use private ASNs • Uses ASE (AS number Substitution on Egress ) • See Internet Draft: “Autonomous System Number Substitution on Egress” (draft-jhaas-ase-00.txt) • Works at edge, requires protocol change (for loop prevention)
A Few Bad Apples … Most prefixes are stable most of the time. On this day, about 83% of the prefixes were not updated. Typically, 80% of the updates are for less than 5% Of the prefixes. Percent of BGP table prefixes Thanks to Madanlal Musuvathi for this plot. Data source: RIPE NCC
Route Flap Dampening (RFC 2439) Routes are given a penalty for changing. If penalty exceeds suppress limit, the route is dampened. When the route is not changing, its penalty decays exponentially. If the penalty goes below reuse limit, then it is announced again. • Can dramatically reduce the number of BGP updates • Requires additional router resources • Applied on eBGP inbound only
route dampened for nearly 1 hour Route Flap Dampening Example penalty for each flap = 1000
How Long Does BGP Take to Adapt to Changes? From: Abha Ahuja and Craig Labovitz
Two Main Factors in Delayed Convergence • Rate limiting timer slows everything down • BGP can explore many alternate paths before giving up or arriving at a new path • No global knowledge in vectoring protocols
Implementation Does Matter! stateless withdraws widely deployed stateful withdraws widely deployed