Crafting Confederations: An overview of the Confederation POP Approach to Network Architecture • Dan Golding, NetRail, Inc., dan@netrail.net • Miguel Dimayuga, Earthlink, Inc., mdimayuga@corp.earthlink.net
The Old Way… Conventional Network Routing Architectures…. • Full Mesh iBGP or Route Reflectors • A fully meshed Network via ATM PVCs.
What’s Wrong With The Old Way? • It’s not adapted to the New Optical Network! • POS is here in force, ATM’s value in the core is receding. • It is far more fragile, and far less agile than newer methods of Inter-domain Routing. • The Old Way was prone to user-error. The E-Commerce Revolution demands a New Way!
A Better Way • Emphasizes Large Scale, IP Based, Fiber Ring Networks • Optimized for Service Provider Needs • Utilizes cutting edge routing technologies to provide far greater fault tolerance and usable traffic engineering. • Implemented via advanced BGP techniques: Communities and Confederations.
How the Old Way worked… (Full Mesh iBGP) • Every router must be fully meshed with all others. • Works well in small systems. • The number of sessions grows quadratically: n routers need n(n-1)/2 iBGP sessions. • Eventually consumes all CPU, memory, and engineering resources. [Diagram: Full iBGP Mesh, quadratic growth!]
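The growth of a full mesh can be checked with a short calculation. This is a minimal sketch; the function name `full_mesh_sessions` is made up for illustration, and it only counts sessions (it says nothing about per-session CPU or memory cost).

```python
def full_mesh_sessions(routers: int) -> int:
    """Number of iBGP sessions needed to fully mesh `routers` routers."""
    if routers < 0:
        raise ValueError("router count must be non-negative")
    # Every pair of routers needs exactly one session: n * (n - 1) / 2.
    return routers * (routers - 1) // 2


for n in (5, 10, 50, 100):
    print(f"{n:>3} routers -> {full_mesh_sessions(n):>5} iBGP sessions")
```

Doubling the router count roughly quadruples the session count, which is why a full mesh stops being manageable well before the network gets large.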
How the Old Way worked… (Route Reflectors) • Scaled Well • Well suited to fully meshed ATM Networks (Star Topology), but... • Not Survivable in a Fiber Ring Network. [Diagram: Peer Isolation with BGP Route Reflection, showing Peers, RR Client, RR Server]
How the Old Way worked…(Filtering) • List of IP Prefixes and/or AS numbers set on all border routers to other ISPs. Only the access-list contents would be advertised. • Worked well when most customers were single-homed and didn’t run BGP. • Changes were VERY manpower intensive. • With multi-homed e-commerce shops, no longer feasible.
How the New Way works… (Confederations) • Routers peer with neighbors • Highly Survivable • Very Scalable • Easily Configured • Aids Troubleshooting [Diagram: BGP Confederations, routers peer with neighbors]
Confederation Overview • BGP allows three types of peer relationships: • iBGP (Full iBGP mesh) • eBGP (External Peering or Transit) • Confederation eBGP (it's iBGP with an eBGP look!) • Confederation eBGP is like regular eBGP, except: • Next Hop, Local Preference and MEDs are preserved • Confederation elements in the AS-PATH are not counted for route selection purposes
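The last point, that confederation elements do not count toward AS-PATH length, can be sketched as follows. The segment representation (a list of `(type, [ASNs])` tuples) and the function name are assumptions made for illustration; the counting rule itself follows standard BGP behavior, where each AS in a sequence counts once, an AS set counts as one, and confederation segments count as zero.

```python
from typing import List, Tuple

# Hypothetical in-memory form of an AS_PATH: (segment type, AS numbers).
Segment = Tuple[str, List[int]]


def as_path_length(path: List[Segment]) -> int:
    """AS_PATH length as used for route selection."""
    length = 0
    for seg_type, asns in path:
        if seg_type == "AS_SEQUENCE":
            length += len(asns)   # each AS counts once
        elif seg_type == "AS_SET":
            length += 1           # a whole set counts as one
        elif seg_type in ("AS_CONFED_SEQUENCE", "AS_CONFED_SET"):
            continue              # sub-ASes are invisible to path length
        else:
            raise ValueError(f"unknown segment type: {seg_type}")
    return length


# A route that crossed three sub-ASes before reaching us from AS701.
path = [
    ("AS_CONFED_SEQUENCE", [65401, 65012, 65000]),
    ("AS_SEQUENCE", [701, 3703]),
]
print(as_path_length(path))  # 2
```

However many POPs a route crosses inside the confederation, it competes on the same AS-PATH length it arrived with.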
Confederation Overview • Confederations allow groups of routers to form “sub-autonomous systems” to eliminate scaling problems with full mesh iBGP • All routers within a sub-AS must be fully meshed (or optionally in a route reflector cluster configuration) • Confederations are most advantageous when there are few routers per sub-AS. There is no reason to limit the number of sub-ASes you have; nothing is gained by doing so.
Confederation Overview • Most confederation designs start out with only two or three sub-ASes. This offers few advantages over full mesh iBGP in a ring network topology. • The more sub-ASes you add, the greater the advantage. • The final result: one sub-AS per POP. • The upper limit is roughly 1,000 sub-ASes, bounded by the size of the private AS number range (64512 and up) from which sub-AS numbers are drawn.
The Advantages of a Confederation of POPs • The routers within each POP need only peer with each other, utilizing iBGP • Neighboring POPs are peered with via POP border routers speaking confederation eBGP • Next Hop, Local Pref and MEDs are preserved • More survivable than Route Reflectors • Far more scalable than full iBGP mesh
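A rough session count shows the scaling difference. This sketch assumes a ring of POPs with exactly one confederation eBGP session between each pair of neighboring POPs and a full iBGP mesh inside each POP; real designs usually add redundant sessions. The function names are made up for illustration.

```python
def full_mesh_sessions(routers: int) -> int:
    """iBGP sessions needed to fully mesh `routers` routers."""
    return routers * (routers - 1) // 2


def confederation_sessions(pops: int, routers_per_pop: int) -> int:
    """Sessions in a ring of POPs, one sub-AS per POP (assumed topology)."""
    intra_pop = pops * full_mesh_sessions(routers_per_pop)
    # A ring of 3 or more POPs has as many links as POPs; 2 POPs share 1 link.
    inter_pop = pops if pops >= 3 else max(pops - 1, 0)
    return intra_pop + inter_pop


pops, per_pop = 20, 4
print("full mesh:    ", full_mesh_sessions(pops * per_pop))      # 3160
print("confederation:", confederation_sessions(pops, per_pop))   # 140
```

Adding a POP to the confederation adds a handful of sessions at that POP and its neighbors; adding the same routers to a full mesh touches every router in the network.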
How to Make It Work • Thoughtful use of sub-AS numbers • Local Preference Hierarchy • Useful and Descriptive Community Strings • Meaningful MEDs • Use of various policies – via access lists, community lists, etc – as building blocks • Use of Peer Groups whenever implementation allows.
Sub-AS Assignment • Sub-ASes become useful tools for debugging (show ip bgp, show route) • Suggested assignment is geographical • Always remember to keep room for expansion! • Put plenty of extra sub-ASes in your configs; don't count on adding them later!
Geographical Region as sub-AS • Southeast 65000-65099 • Northeast 65100-65199 • Northcentral 65200-65299 • Southcentral 65300-65399 • Western 65400-65499 • Canadian 65500-65535 • Latin/South American 64512-64599 • European 64600-64699 • Asian 64700-64799 • Reserved 64800-64999
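With a geographical plan like this, a sub-AS number seen in `show ip bgp` output maps straight back to a region. A minimal lookup sketch, using the ranges from the slide; the table and function names are hypothetical.

```python
# (first sub-AS, last sub-AS, region), taken from the slide above.
REGION_RANGES = [
    (64512, 64599, "Latin/South American"),
    (64600, 64699, "European"),
    (64700, 64799, "Asian"),
    (64800, 64999, "Reserved"),
    (65000, 65099, "Southeast"),
    (65100, 65199, "Northeast"),
    (65200, 65299, "Northcentral"),
    (65300, 65399, "Southcentral"),
    (65400, 65499, "Western"),
    (65500, 65535, "Canadian"),
]


def region_for_sub_as(asn: int) -> str:
    """Return the region a sub-AS number belongs to."""
    for low, high, region in REGION_RANGES:
        if low <= asn <= high:
            return region
    raise ValueError(f"{asn} is outside the sub-AS numbering plan")


# Reading a confederation path such as (65401 65012 65000):
for sub_as in (65401, 65012, 65000):
    print(sub_as, "->", region_for_sub_as(sub_as))
```

A confederation path then reads as a sequence of regions, which is what makes sub-AS numbers useful when debugging.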
Community Strings are the Key • Communities are “tags” or “post-it notes” attached to routes to help identify them. • There can be more than one community attached to a route. • Communities should be set at the ingress point. • Communities need be applied only once, so administrative burden and complexity are greatly reduced. • When routes egress, filtering can be based on one or more community strings. • Sample Communities: Regional, by Peer, Customer, Internal, Transit
Communities Set at Ingress
    router bgp 4355
     network 207.69.0.0 mask 255.255.0.0 route-map make-green
     network 199.174.166.0 mask 255.255.255.0 route-map make-red
    !
    router bgp 4355
     neighbor a.a.a.a remote-as 701
     neighbor a.a.a.a route-map make-blue in
[Diagram: AS701 (transit) originates 4.0.0.0/8 i and 5.0.0.0/8 i. AS4355 holds 207.69.0.0/16 i, 198.99.146.0/24 i, 4.0.0.0/8 701 i, 5.0.0.0/8 701 i.]
Communities Used to Filter on Egress
    router bgp 4355
     neighbor b.b.b.b remote-as 3703
     neighbor b.b.b.b route-map blue-green out
[Diagram: AS701 (transit) originates 4.0.0.0/8 i and 5.0.0.0/8 i. AS4355 holds 207.69.0.0/16 i, 198.99.146.0/24 i, 4.0.0.0/8 701 i, 5.0.0.0/8 701 i. AS3703 (customer) receives 4.0.0.0/8 4355 701 i, 5.0.0.0/8 4355 701 i, 207.69.0.0/16 4355 i.]
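The tag-at-ingress, filter-at-egress idea can be modeled in a few lines. This is only a sketch: the colors stand in for community values, and what the `make-green`, `make-red`, `make-blue`, and `blue-green` route-maps actually contain is an assumption, since the slides show only their names.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Route:
    prefix: str
    communities: Set[str] = field(default_factory=set)


def tag_on_ingress(route: Route, community: str) -> Route:
    """Attach a community once, where the route enters the network."""
    route.communities.add(community)
    return route


def egress_filter(routes: List[Route], allowed: Set[str]) -> List[Route]:
    """Send only routes carrying at least one allowed community."""
    return [r for r in routes if r.communities & allowed]


table = [
    tag_on_ingress(Route("207.69.0.0/16"), "green"),    # internal, exported
    tag_on_ingress(Route("199.174.166.0/24"), "red"),   # internal, kept private
    tag_on_ingress(Route("4.0.0.0/8"), "blue"),         # learned from AS701
    tag_on_ingress(Route("5.0.0.0/8"), "blue"),         # learned from AS701
]

sent_to_customer = egress_filter(table, {"blue", "green"})
print([r.prefix for r in sent_to_customer])
# ['207.69.0.0/16', '4.0.0.0/8', '5.0.0.0/8']
```

The egress policy never mentions a prefix. New routes are handled correctly as long as they were tagged when they entered.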
Community Categories – Route Type • Customer Routes 4006:65150 • Private Peering 4006:65140 • Transit 4006:65130 • Public Peering 4006:65120 • Internal Routes (OPN-visible) 4006:65110 • Internal Routes (Global-visible) 4006:65100
Other People's Networks (OPNs) • To expand our national coverage, MindSpring utilized third-party networks' dialup facilities. These networks are what we term OPNs. • Prefixes for Core Services which we want restricted to MindSpring customers and not visible to the rest of the world (e.g. news, radius, smtp) are announced to our OPNs alone. • This has the added advantage of protecting against abuse of our services by non-customers. • With communities, we can tag routes for export to OPNs alone.
Community Categories – Route Ingress Location • Field Peering 4006:65020 • Exchange Point Peer 4006:65010 • Northeast Region Peering (DC) 4006:65030 • Southeast Region Peering (Atlanta) 4006:65040 • Northcentral Region Peering (Chicago) 4006:65050 • West Peering Region (Palo Alto) 4006:65060 • Southcentral Region Peering (Dallas) 4006:65070
Community Categories – Specials • No Export to any external BGP peer: No-Export • Do Not Advertise to any peer (Well Known): No-Advertise • Always Prefer (proposed Well Known): Prefer-Me (65535:65519) • Always Avoid (proposed Well Known): Avoid-Me (65535:65504)
Community Categories – Origin AS • Also add a community string for the origin AS. • If the route comes from UUNet, add 4006:701. • If the route comes from Sprint, add 4006:1239.
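Putting the community categories together, each route picks up one community per category at ingress. The sketch below uses the values from the slides; the dictionary keys and the function name are hypothetical labels, not anything a router understands.

```python
from typing import List

# Values copied from the "Route Type" and "Route Ingress Location" slides.
ROUTE_TYPE = {
    "customer": "4006:65150",
    "private-peering": "4006:65140",
    "transit": "4006:65130",
    "public-peering": "4006:65120",
}
INGRESS_LOCATION = {
    "exchange-point": "4006:65010",
    "field": "4006:65020",
    "northeast": "4006:65030",
    "southeast": "4006:65040",
    "northcentral": "4006:65050",
    "west": "4006:65060",
    "southcentral": "4006:65070",
}


def ingress_communities(route_type: str, location: str, origin_as: int,
                        local_as: int = 4006) -> List[str]:
    """Communities to attach to a route as it enters the network."""
    if not 0 < origin_as < 65536:
        raise ValueError("a standard community holds only a 16-bit AS number")
    return [
        ROUTE_TYPE[route_type],
        INGRESS_LOCATION[location],
        f"{local_as}:{origin_as}",   # e.g. 4006:701 for routes from UUNet
    ]


print(ingress_communities("transit", "northeast", 701))
# ['4006:65130', '4006:65030', '4006:701']
```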
Local Preference
    router bgp 4355
     neighbor a.a.a.a remote-as 3703
     neighbor a.a.a.a route-map setlocpref100 in
    !
    router bgp 4355
     neighbor c.c.c.c remote-as 701
     neighbor c.c.c.c route-map setlocpref60 in
    !
    router bgp 4355
     neighbor b.b.b.b remote-as 4006
     neighbor b.b.b.b route-map setlocpref90 in
[Diagram: transit AS701 and peering AS4006, with route entries 165.200.1.0/24 1239 3703 i and 165.200.1.0/24 1 3703 i. AS4355 holds, as prefix, local preference, AS path: 165.200.1.0/24 100 3703 i; 165.200.1.0/24 90 4006 3703 i; 165.200.1.0/24 60 701 3703 i. Customer AS3703 originates 165.200.1.0/24 i.]
Local Preference Hierarchy • The higher the Local Preference, the more desirable the route. • Customers ALWAYS come first – we never want to send their traffic to a peer, regardless of AS-Path padding • Private Peering is always more desirable than Public Peering • Transit is less desirable than private peering for economic reasons
Local Preference Hierarchy • Always Preferred 250 • Customer Routes 100 • Customer Backup Routes 90 • Private Peering 80 • Less Preferred Private Peering (congested) 70 • Paid Transit 60 • Less Preferred Paid Transit (congested) 50 • Public Peering (ATM NAPs) 40 • Less Preferred Public Peering (FDDI NAPs) 30 • Never Preferred 1
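Because local preference is compared before AS-path length, the hierarchy guarantees that a customer route wins even when the customer prepends. A simplified sketch: it models only the first two steps of route selection (highest local preference, then shortest AS path), and the category names are hypothetical labels for the values on the slide.

```python
from dataclasses import dataclass
from typing import List, Tuple

LOCAL_PREF = {
    "always-preferred": 250,
    "customer": 100,
    "customer-backup": 90,
    "private-peering": 80,
    "private-peering-congested": 70,
    "paid-transit": 60,
    "paid-transit-congested": 50,
    "public-peering-atm": 40,
    "public-peering-fddi": 30,
    "never-preferred": 1,
}


@dataclass(frozen=True)
class Candidate:
    category: str
    as_path: Tuple[int, ...]


def best_route(candidates: List[Candidate]) -> Candidate:
    """Highest local preference wins; shorter AS path breaks ties."""
    return max(candidates,
               key=lambda c: (LOCAL_PREF[c.category], -len(c.as_path)))


candidates = [
    Candidate("customer", (3703, 3703, 3703)),   # customer prepends twice
    Candidate("private-peering", (4006, 3703)),
    Candidate("paid-transit", (701, 3703)),
]
print(best_route(candidates).category)  # customer
```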
Peer Types • Local sub-AS Peer (within a POP) • Confederation Peers (other POPs or sub-ASes) • Transit Peers (we buy transit from them) • Public/Private Peering • Customer Peers
Local sub-AS Peers • All peers within a POP are members of this group. • The update source for these BGP sessions will be the loopback address of the router. • Communities must be recognized. • Option to use full-mesh or route-reflectors.
For Each Local Sub-AS Peer
    neighbor <neigh-ip A> remote-as <neighbor-as A>
    neighbor <neigh-ip A> description otherlocalroutername
    neighbor <neigh-ip A> update-source loopback0
    neighbor <neigh-ip A> send-community
    neighbor <neigh-ip A> version 4
Update-Source Loopback Address • The routers will use their loopback addresses as the source of the BGP packets. • Only one session needs to be created, even with multiple paths between routers. • Peering between loopback addresses increases the stability of the BGP sessions, since loopback interfaces don't go down. [Diagram: two routers with loopbacks 192.168.128.1/32 and 192.168.128.2/32, joined by two links: 207.69.132.1/24 to 207.69.132.2/24 and 207.69.133.1/24 to 207.69.133.2/24]
Confederation Peers • All peers that are POP border routers are members of this group. • The update source for these BGP sessions will be the facing interface of the router. • Inbound Soft Reconfiguration is not necessary. • Outbound soft reconfiguration can be done at the remote end • Communities must be recognized. • Filtering is done on egress, MEDs are set on ingress.
Soft Reconfiguration • “clear ip bgp” drops the TCP session. Soft reconfiguration is much friendlier. • “clear ip bgp <neighbor-ip> soft out” issues withdrawals for all advertised routes, recomputes and then resends the routes (low cpu) • “clear ip bgp <neighbor-ip> soft in” reevaluates routes received from its peers stored in memory. (high memory requirements)
Confederation Peer Configuration
Peer-Group
    neighbor internal peer-group
    neighbor internal version 4
    neighbor internal send-community
For Each Peer
    neighbor <neigh-ip A> remote-as <neighbor-as A>
    neighbor <neigh-ip A> description remotesitename
    neighbor <neigh-ip A> route-map <site>-recv-<remotesite> in
    neighbor <neigh-ip A> route-map <site>-send-<remotesite> out
    neighbor <neigh-ip A> peer-group internal
    !
    route-map <site>-recv-<remotesite> permit 10
     set metric +<metric>
    !
    route-map <site>-send-<remotesite> permit 10
     match community <send-all-except-no-advertise-routes>
Confederation Peer Routes • Don’t Send: No Advertise • Send: Customer, Peer, Transit, Internal
Additive MEDs • Why • Allows a tiebreaker based on optimum routing • Allows an alternate method to de-prefer routes in case of transit/peering congestion • Possible Values – • Mileage • delay in ms • fixed value per hop • Supported by - • Cisco IOS • Feature Request in JUNOS, Riverstone, Foundry IronWare
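Additive MEDs in Confederations [Diagram: 207.69.0.0/16 as seen in successive sub-ASes, shown as prefix, MED, (confederation AS path); numeric labels on the diagram: 580, 120, 40, 600] • 207.69.0.0/16 0 (originated here) • 207.69.0.0/16 120 (65000) • 207.69.0.0/16 700 (65012 65000) • 207.69.0.0/16 720 (65012 65000) • 207.69.0.0/16 740 (65400 65012 65000) • 207.69.0.0/16 760 (65401 65012 65000)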
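The accumulation itself is simple arithmetic: each confederation border adds its link metric to the MED it received (the `set metric +<metric>` step in the confederation peer route-map). The sketch below assumes one particular path with link metrics 120, 580, and 40, which reproduces the MEDs 120, 700, and 740 listed above; the exact topology of the original diagram is not recoverable, so treat the path as an illustration.

```python
from itertools import accumulate
from typing import List


def propagate_med(link_metrics: List[int], origin_med: int = 0) -> List[int]:
    """MED seen after each confederation hop, adding one metric per hop."""
    # accumulate() yields the running sum, starting from the origin MED.
    return list(accumulate(link_metrics, initial=origin_med))[1:]


# Assumed path: originating POP -> 120 -> next POP -> 580 -> next -> 40 -> next
print(propagate_med([120, 580, 40]))  # [120, 700, 740]
```

A router that hears the same prefix over two confederation paths sees a lower MED on the path with the lower total metric, which gives the tiebreaker described on the previous slide.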
Transit Peers • The update source for these BGP sessions will be the facing interface address of the router. • Soft Reconfiguration should be used. • Communities must be recognized. • Send out only customer and internal routes. • Apply an import ACL to the routes that prevents reception of martian routes, and assigns proper communities and local preference. • Allows prepending certain subsets of routes with additional AS numbers.
Transit Peer Config
    neighbor <neighbor-ip> remote-as <neighbor-as C>
    neighbor <neighbor-ip> description transitprovidername
    neighbor <neighbor-ip> version 4
    neighbor <neighbor-ip> send-community
    neighbor <neighbor-ip> next-hop-self
    neighbor <neighbor-ip> soft-reconfiguration inbound
    neighbor <neighbor-ip> distribute-list martians in
    neighbor <neighbor-ip> route-map <site>-recv-<provider> in
    neighbor <neighbor-ip> route-map <site>-send-<provider> out
    !
    route-map <site>-send-<provider> deny 10
     match community 4
    route-map <site>-send-<provider> permit 20
     match community 1
     set as-path prepend 4006 4006
    !
    route-map <site>-recv-<provider> permit 10
     set local-preference 60
     ! set metric 0 only if you don't want to listen to others' MEDs
     set metric 0
     set community 4006:30 additive
     set community 4006:20 additive
     set community 4006:500 additive
     set community 4006:<AS#> additive
Transit Peer Config • Don't Send: No-Export, No-Advertise, Peers, or Transit • Send: Customers, Internal
Transit Tricks • De-prefer routes for congested outbound: • Set Local Pref normally for routes with AS-Path length 1 or 2 • Set Local Pref lower for all other routes • Effect: only the most direct routes flow through that connection. Others flow through other transit, if available. • OPNs and sending OPN routes: • Send special routes (usually for servers and services) only to your own network and OPNs • Have a special community list or policy specifying the routes.
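The congestion trick amounts to choosing local preference from AS-path length on the congested transit link. A minimal sketch, assuming the values 60 and 50 from the Local Preference Hierarchy slide (Paid Transit and Less Preferred Paid Transit); the function name and the threshold parameter are made up for illustration.

```python
from typing import Sequence


def congested_transit_local_pref(as_path: Sequence[int],
                                 normal: int = 60,
                                 depref: int = 50,
                                 max_direct_len: int = 2) -> int:
    """Keep normal preference for short paths, de-prefer everything else."""
    return normal if len(as_path) <= max_direct_len else depref


# Routes learned from a congested transit provider (AS701):
print(congested_transit_local_pref([701]))              # 60: provider's own route
print(congested_transit_local_pref([701, 3703]))        # 60: provider's customer
print(congested_transit_local_pref([701, 1239, 3703]))  # 50: better sent elsewhere
```

Only destinations close to that provider keep using the congested link. Everything else falls back to another transit path if one carries a higher preference.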
Private/Public Peers • The update source for these BGP sessions will be the facing interface address of the router. • Soft Reconfiguration should be used. • Communities must be recognized. • Send out only customer and internal routes. • Apply an import ACL to the routes that prevents reception of martian routes, and assigns proper communities and local preference. • Option to use local preference to prefer unconditionally all or only some routes coming from a free peer.
Peer Configuration
Peer-Group
    neighbor free-peering peer-group
    neighbor free-peering send-community
    neighbor free-peering version 4
    neighbor free-peering next-hop-self
    neighbor free-peering soft-reconfiguration inbound
    neighbor free-peering distribute-list martians in
    neighbor free-peering route-map <peername>-in in
    neighbor free-peering route-map cust-routes out
    !
    route-map cust-routes deny 5
     match community 4
    route-map cust-routes permit 10
     match community 1
    !
    route-map <peername>-in permit 10
     set local-preference 80
     set community 4006:30 additive
     set community 4006:20 additive
     set community 4006:700 additive
     set community 4006:<AS#> additive
Per-Peer
    neighbor <neighbor-ip> remote-as <neighbor-as D>
    neighbor <neighbor-ip> peer-group free-peering
    neighbor <neighbor-ip> description Peer Name
Free Peering Routes • Don't Send: No-Export, No-Advertise, Peers, or Transit • Send: Customers, Internal
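The send and don't-send lists for each peer type follow the same pattern as the route-maps: a deny step checked first, then a permit step. The sketch below encodes the lists from the Confederation, Transit, and Free Peering slides; the community names are symbolic stand-ins for the numeric values, and the structure is an illustration, not router configuration.

```python
from typing import Dict, Set

EGRESS_POLICY: Dict[str, Dict[str, Set[str]]] = {
    "confederation": {
        "deny": {"no-advertise"},
        "permit": {"customer", "peer", "transit", "internal"},
    },
    "transit": {
        "deny": {"no-export", "no-advertise", "peer", "transit"},
        "permit": {"customer", "internal"},
    },
    "free-peering": {
        "deny": {"no-export", "no-advertise", "peer", "transit"},
        "permit": {"customer", "internal"},
    },
}


def should_send(communities: Set[str], peer_type: str) -> bool:
    """Deny is evaluated first, like the lower-numbered route-map entry."""
    policy = EGRESS_POLICY[peer_type]
    if communities & policy["deny"]:
        return False
    return bool(communities & policy["permit"])


print(should_send({"customer"}, "transit"))               # True
print(should_send({"customer", "no-export"}, "transit"))  # False
print(should_send({"transit"}, "free-peering"))           # False
print(should_send({"transit"}, "confederation"))          # True
```

Routes with no recognized community are not sent anywhere, so an untagged route fails closed.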
Customer Peers • The update source for these BGP sessions will be the facing interface address of the router. • Soft Reconfiguration should be used. • Communities must be recognized. This includes communities sent from customers. • Send out selected routes, based on customer request. • Apply an import ACL to the routes that prevents reception of martian routes and assigns proper communities and local preference. • The import filter must also accept only specific customer routes. • We recommend using RtConfig to query the RADB and generate the ACLs.
What Type of Routes Can We Send? • Full Routes • Customer, Peers, Internals, Transit. • AKA “A Full View” • Customer Routes • Customer and Internal Routes. • Good for weaker routers (Cisco) • AKA “A Partial View” • Default Route • Send only a default route - 0.0.0.0/0, pointed to the router interface • Limited utility
Special Considerations for Customers • Carefully filter routes: the farther downstream you get, the less clueful the operators (generally) • Filtering can be based on AS or Prefix • The generally accepted practice is to filter by IP Access List at ingress (use RADB tools if possible) • Customers do not have to advertise the same routes everywhere; peers do!
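Prefix-based customer filtering is an exact-match test against a registered list. The sketch below uses the two prefixes from the customer configuration slides as the allowed set; the function name is hypothetical, and a production filter would be generated from registry data rather than typed by hand.

```python
import ipaddress

# Prefixes this customer is registered to announce (from the config slides).
ALLOWED_PREFIXES = {
    ipaddress.ip_network("209.49.143.0/24"),
    ipaddress.ip_network("199.5.0.0/16"),
}


def accept_customer_route(prefix: str) -> bool:
    """Accept only exact matches; more-specifics and strangers are rejected."""
    return ipaddress.ip_network(prefix) in ALLOWED_PREFIXES


print(accept_customer_route("199.5.0.0/16"))   # True: registered prefix
print(accept_customer_route("199.5.1.0/24"))   # False: more specific, not exact
print(accept_customer_route("4.0.0.0/8"))      # False: not the customer's route
```

Exact matching is the strict end of the range. It stops a customer from leaking someone else's routes or deaggregating their own block.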
Customer Configuration – Full Routes
    bgp {
        group <location-customername> {
            type external;
            description <peer-name>;
            peer-as <neighbor AS #>;
            neighbor <ip address>;
            import <customername>-in;
        }
    }
    policy-options {
        policy-statement <customername>-in {
            term term1 {
                from policy <location-customername>;
                then {
                    local-preference 100;
                    next-hop self;
                    community + customer;
                    community + field;
                    community + ATL;
                    community + <customername>;
                }
            }
        }
        policy-statement atl-myco {
            from {
                route-filter 209.49.143.0/24 exact accept;
                route-filter 199.5.0.0/16 exact accept;
            }
            then reject;
        }
    }
Customer Configuration – Partial Routes
    bgp {
        group <location-customername> {
            type external;
            description <peer-name>;
            peer-as <neighbor AS #>;
            neighbor <ip address>;
            import <customername>-in;
            export custroutes;
        }
    }
    policy-options {
        policy-statement <customername>-in {
            term term1 {
                from policy <location-customername>;
                then {
                    local-preference 100;
                    next-hop self;
                    community + customer;
                    community + field;
                    community + ATL;
                    community + <customername>;
                }
            }
        }
        policy-statement atl-myco {
            from {
                route-filter 209.49.143.0/24 exact accept;
                route-filter 199.5.0.0/16 exact accept;
            }
            then reject;
        }
        policy-statement custroutes {
            term term1 {
                from community [no-export no-advertise];
                then reject;
            }
            term term2 {
                from community [internal customer custback];
                then accept;
            }
        }
    }
Default Route Only • Cisco: neighbor a.b.c.d default-originate • Juniper: a little more complex...
    bgp {
        group <location-customername> {
            type external;
            description <peer-name>;
            peer-as <neighbor AS #>;
            neighbor <ip address>;
            import <customername>-in;
            export default-originate;
        }
    }
    routing-options {
        static {
            route 0.0.0.0/0 {
                next-hop <loopback address>;
                no-install;
            }
        }
    }
    policy-options {
        policy-statement default-originate {
            from route-filter 0.0.0.0/0 exact;
            then {
                next-hop self;
                accept;
            }
        }
    }
Question and Answer • Confederations • General BGP Questions