Learn about the impact and solutions for managing Internet routing table size growth with minimal customer impact and enhanced global reachability using prefix filtering strategies.
BGP filtering
André Chapuis, chapuis@ip-plus.net
Internet routing table size
• Do we really need these 120’000 routes?
• Number of contiguous prefixes with same origin/path:

    *  4.0.0.0          209.10.12.125    8204   0 4513 3356 i
    *> 4.6.0.0/22       66.185.128.48     514   0 1668 3356 10753 i
    *> 4.6.4.0/23       66.185.128.48     514   0 1668 3356 10753 i
    … 50 prefixes with same origin…
    *> 4.6.172.0/22     66.185.128.48     514   0 1668 3356 10753 i
    *> 4.6.176.0/22     66.185.128.48     514   0 1668 3356 10753 i
    *  65.37.128.0/18   134.222.85.45       0   0 286 209 3356 4355 i
    *  65.37.136.0/23   134.222.85.45       0   0 286 209 3356 4355 i
    … 20 prefixes with same origin…
    *  65.37.220.0/23   134.222.85.45       0   0 286 209 3356 4355 i
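One way to see why many of these routes add no information is to look for more-specifics that carry exactly the same AS path as a covering prefix. The sketch below is only an illustration, not part of the original slides: it uses the rows shown in the table, assumes the classful `4.0.0.0` entry means `4.0.0.0/8`, and the function name `redundant_more_specifics` is hypothetical.

```python
import ipaddress

# (prefix, AS path) pairs copied from the table above.
# Assumption: the classful entry "4.0.0.0" stands for 4.0.0.0/8.
ROUTES = [
    ("4.0.0.0/8", "4513 3356"),
    ("4.6.0.0/22", "1668 3356 10753"),
    ("4.6.4.0/23", "1668 3356 10753"),
    ("4.6.172.0/22", "1668 3356 10753"),
    ("4.6.176.0/22", "1668 3356 10753"),
    ("65.37.128.0/18", "286 209 3356 4355"),
    ("65.37.136.0/23", "286 209 3356 4355"),
    ("65.37.220.0/23", "286 209 3356 4355"),
]


def redundant_more_specifics(routes):
    """Return prefixes covered by a shorter prefix with an identical AS path."""
    nets = [(ipaddress.ip_network(p), path) for p, path in routes]
    redundant = []
    for net, path in nets:
        for other, other_path in nets:
            if other_path == path and net != other and net.subnet_of(other):
                redundant.append(str(net))
                break
    return redundant


print(redundant_more_specifics(ROUTES))
```

In this small sample only 65.37.136.0/23 is strictly covered by a same-path prefix (65.37.128.0/18); the 4.6.x.x routes are adjacent rather than nested, so they would need aggregation by the origin rather than simple suppression.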
Impact of Internet routing table size growth
• Router memory (with 125’000 routes)
  • BGP table memory (21MB)
  • Routing table memory (21MB)
  • CEF table memory (21MB)
    • Distributed on every line card (limit = smallest card)
  • Second BGP feed (+10M – 20M)
  • Still many Cisco 7206 with NPE-150: 128MB RAM is the maximum
  • Crash experienced with 128MB and two full feeds on a CPE
• Router CPU
  • More updates -> more activity
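The memory figures above can be added up as a rough budget. This is a back-of-the-envelope sketch, not a measurement: it assumes the three tables are simply additive, reads "+10M – 20M" as 10 to 20 MB for the second feed, and ignores the IOS image and all other processes that also live in the 128MB.

```python
# Figures from the slide (125'000 routes); all names here are illustrative.
TABLE_MB = 21                        # per table: BGP, routing, CEF
TABLES = ("BGP", "routing", "CEF")
SECOND_FEED_MB = (10, 20)            # assumption: "+10M – 20M" means 10-20 MB
RAM_MB = 128                         # NPE-150 maximum


def memory_estimate_mb():
    """Return (one feed, two feeds low, two feeds high) in MB."""
    base = TABLE_MB * len(TABLES)
    return base, base + SECOND_FEED_MB[0], base + SECOND_FEED_MB[1]


one_feed, low, high = memory_estimate_mb()
print(f"one feed: {one_feed} MB, two feeds: {low}-{high} MB of {RAM_MB} MB")
```

Under these assumptions the tables alone take about half of the RAM with one feed and up to roughly two thirds with two feeds, which is consistent with the crash reported on a 128MB CPE.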
Requirements
• Solution with minimal (no) impact on customers
• No routing holes = global reachability is guaranteed
• Multihomed customers must keep all BGP resiliency
• Minimal manual tuning wanted
  • No frequent changes
Solution chosen
• Prefix filtering
  • RIR minimal allocation sizes
  • Historical classful addresses (A and B)
  • Ad-hoc filters based on size / region
• Semi-default routes
  • To guarantee reachability in case of misconfiguration
• Exceptions
  • Customer prefixes
  • Chosen prefixes (private peerings)
  • Swiss peerings
Prefix filtering (1)
• RIR minalloc:
  • http://www.apnic.net/db/min-alloc.html
  • http://www.arin.net/statistics/index.html#cidr
  • http://www.ripe.net/ripe/docs/smallest-alloc-sizes.html
  • Ex: /19 within 62/8
  • Changes needed only when IANA allocates a new block to a RIR -> not too frequent (every 3-6 months)
• Historical ‘classful’ address space:
  • Class B: /22
  • Class A: /21
Prefix filtering (2)
• Ad-hoc:
  • 199/8, ARIN region, default /22 with exceptions
  • 200/7, LACNIC region, default /22 with exceptions
  • 202/7, APNIC region, default /22 but 202/10 is /24
  • 204/6, ARIN region, default /22 with exceptions
• Current table size within AS3303:
  • 60’793 as seen from Oregon-IX
  • 63’147 as seen internally (customer more-specifics)
  • 125’000 average for ISPs not filtering
Prefix filtering (3)
• Filter example

    …
    ip prefix-list martians seq 40000 permit 40.0.0.0/8 le 21
    ip prefix-list martians seq 43000 permit 43.0.0.0/8 le 21
    ip prefix-list martians seq 44000 permit 44.0.0.0/6 le 21
    ip prefix-list martians seq 48000 permit 48.0.0.0/5 le 21
    ip prefix-list martians seq 56000 permit 56.0.0.0/7 le 21
    ip prefix-list martians seq 60000 permit 60.0.0.0/7 le 20
    ip prefix-list martians seq 62000 permit 62.0.0.0/7 le 19
    …
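To make the `le` semantics concrete: an entry `permit X/len le N` matches any prefix that falls inside `X/len` and whose length is between `len` and `N`. The following is a minimal model of that matching rule, using only the entries visible in the excerpt above (the elided entries are not guessed); the function name `permitted` is hypothetical and the model does not cover `ge` or `deny` entries.

```python
import ipaddress

# (covering prefix, "le" value) pairs copied from the filter excerpt above.
ENTRIES = [
    ("40.0.0.0/8", 21),
    ("43.0.0.0/8", 21),
    ("44.0.0.0/6", 21),
    ("48.0.0.0/5", 21),
    ("56.0.0.0/7", 21),
    ("60.0.0.0/7", 20),
    ("62.0.0.0/7", 19),
]


def permitted(prefix, entries=ENTRIES):
    """True if some entry covers the prefix and its length is <= the le value."""
    net = ipaddress.ip_network(prefix)
    for cover, le in entries:
        if net.subnet_of(ipaddress.ip_network(cover)) and net.prefixlen <= le:
            return True
    return False  # implicit deny at the end of a prefix-list


print(permitted("62.61.192.0/18"))  # allocation-sized block in 62/7
print(permitted("62.61.192.0/23"))  # more-specific, longer than /19
```

With these entries a /18 inside 62/7 is accepted while a /23 from the same block is dropped, which is the behaviour the minimum-allocation filter is meant to produce.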
Semi-default routes (1): the problem
• Some end users (or ISPs) get an allocated block from a RIR (say a /18) but announce only part of it (say a /23), without the aggregate!
• Example:
  • 62.61.192.0/23 4513 701 6453 i
  • Allocated PA block is 62.61.192.0/18 -> not routed
  • Network not reachable
• Responsibility lies with the owner of the block / the source ISP
  • But there are so many cases like that.
• Therefore we use semi-default routes
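The routing hole can be reproduced in a few lines. This sketch assumes the filter entry `62.0.0.0/7 le 19` from the filter example and a router with no default route; `accepted` and `reachable` are illustrative names, and the host address is just an arbitrary address inside the announced /23.

```python
import ipaddress

COVER = ipaddress.ip_network("62.0.0.0/7")
MAX_LEN = 19  # from "permit 62.0.0.0/7 le 19"


def accepted(announcements):
    """Keep only announcements that pass the minimum-allocation filter."""
    nets = [ipaddress.ip_network(a) for a in announcements]
    return [n for n in nets if n.subnet_of(COVER) and n.prefixlen <= MAX_LEN]


def reachable(host, announcements):
    """True if any accepted route covers the host (no default route assumed)."""
    addr = ipaddress.ip_address(host)
    return any(addr in net for net in accepted(announcements))


# Only the /23 is announced: it is filtered, so the host falls into a hole.
print(reachable("62.61.192.10", ["62.61.192.0/23"]))
# If the owner also announced the allocated /18, the hole would disappear.
print(reachable("62.61.192.10", ["62.61.192.0/23", "62.61.192.0/18"]))
```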
Semi-default routes (2)
• Aggregates created to cover RIR space:
  • 62/8, 80/7, 212/7, 217/8 routed towards EU transit ISP
  • ARIN/APNIC/LACNIC space towards US transit
• Class A/B
  • Class B: 128/3, 160/5 and 168/6 towards US transit
  • No semi-default for class A
• Aggregates announced to customers
  • Tagged with a special community (3303:9999)
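Because forwarding uses longest-prefix match, a semi-default only attracts traffic when no more-specific route survived the filter. The sketch below illustrates that with the aggregates named on this slide; the ARIN/APNIC/LACNIC aggregates are left out because the slide does not list them, and the next-hop labels and the `lookup` helper are assumptions for illustration.

```python
import ipaddress

# Aggregates listed on the slide; the transit labels are illustrative.
SEMI_DEFAULTS = {
    "62.0.0.0/8": "EU transit",
    "80.0.0.0/7": "EU transit",
    "212.0.0.0/7": "EU transit",
    "217.0.0.0/8": "EU transit",
    "128.0.0.0/3": "US transit",
    "160.0.0.0/5": "US transit",
    "168.0.0.0/6": "US transit",
}


def lookup(host, specific_routes):
    """Longest-prefix match over accepted routes plus the semi-defaults."""
    addr = ipaddress.ip_address(host)
    table = {**SEMI_DEFAULTS, **specific_routes}
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in table.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None  # no route at all
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# The filtered /23 from the previous slide now follows the 62/8 semi-default.
print(lookup("62.61.192.10", {}))
# A real route that passed the filter still wins over the semi-default.
print(lookup("62.61.192.10", {"62.61.192.0/18": "peer A"}))
```

Class A destinations return no match in this model, mirroring the "no semi-default for class A" decision on the slide.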
Semi-default routes (3)
• = Static routes redistributed into BGP

    ip route 62.0.0.0 255.0.0.0 POS3/1
    router bgp 65000
     network 62.0.0.0 route-map semi-default

• Original idea was to ask our transit ISPs to send them to us via BGP
  • Upstream ISPs were reluctant (particularly the US ones…)
• We provide them to our customers
Exceptions
We don’t filter for:
• Some private peerings with a fair amount of traffic
  • Google, Yahoo, Hotmail
• Customer prefixes
  • Accept anything from customers (up to /24)
  • Prefixes with an origin AS included in our as-set must be accepted to guarantee reachability
• Swiss routes (= routes received on CH-peerings in CH)
  • Routes received from CH peers are not subject to the filters
  • Because there are few of them
  • And we are a Swiss ISP
Customer prefixes (configuration)

    route-map set-ipp-peer permit 10
     match as-path 198
    !
    route-map set-ipp-peer permit 20
     match ip address prefix-list martians
    !
    ip as-path access-list 198 permit _(AS-SWCMGLOBAL)$
    !
    ip prefix-list martians seq 3000 permit 3.0.0.0/8 le 21
    ip prefix-list martians seq 4000 permit 4.0.0.0/8 le 21
    ip prefix-list martians seq 6000 permit 6.0.0.0/8 le 21
    ip prefix-list martians seq 8000 permit 8.0.0.0/7 le 21
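The order of the two route-map clauses matters: a route whose origin AS belongs to the customer as-set is accepted by sequence 10 before the length filter in sequence 20 is ever consulted. The sketch below models that evaluation order only. The AS numbers in `CUSTOMER_ORIGINS` are hypothetical private-range placeholders for the expansion of AS-SWCMGLOBAL, which the slide does not show, and only the four prefix-list entries visible above are included.

```python
import ipaddress

# Hypothetical expansion of the as-set; the real member ASNs are not given.
CUSTOMER_ORIGINS = {64512, 64513}

# Prefix-list entries copied from the configuration above.
PREFIX_LIST = [
    ("3.0.0.0/8", 21),
    ("4.0.0.0/8", 21),
    ("6.0.0.0/8", 21),
    ("8.0.0.0/7", 21),
]


def route_map_permits(prefix, as_path):
    """Model of route-map set-ipp-peer: sequence 10, then 20, then implicit deny."""
    # permit 10: origin AS (last AS in the path) is in the customer as-set
    if as_path and as_path[-1] in CUSTOMER_ORIGINS:
        return True
    # permit 20: prefix passes the length filter
    net = ipaddress.ip_network(prefix)
    for cover, le in PREFIX_LIST:
        if net.subnet_of(ipaddress.ip_network(cover)) and net.prefixlen <= le:
            return True
    return False  # implicit deny at the end of the route-map


print(route_map_permits("4.6.4.0/23", [1668, 3356, 10753]))  # too specific
print(route_map_permits("4.6.4.0/23", [1668, 64512]))        # customer origin
```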
Results (1)
• BGP updates/min before and after the filter
Results (2)
• Stability improved
  • Number of updates/minute reduced by 40%
  • Last month: de-aggregation of Bellsouth
    • About 1000 more prefixes injected
    • Transparent for AS3303
• Traffic engineering done by ISPs outside CH with more-specifics from PA blocks is ignored by AS3303
• Forced ‘traffic engineering’ negligible
  • Small amount of traffic following the semi-default routes
  • 204.0.0.0/6 has less than 500kb/s average traffic
    • For a total of 10’000 prefixes
Other ISPs filtering
• Verio AS2914
  • Class A space (i.e., 0/1), accept /22 and shorter
  • Class B space (i.e., 128/2), accept /22 and shorter
  • Class C space (i.e., 192/3), accept /24 and shorter
• SWITCH AS559
  • RIR minalloc + /19 in Class A/B
• Jippi (Eunet Finland) AS6667
  • 192/7: accept /24 and shorter
  • Rest: accept /21 and shorter
Conclusions
• Less memory needed (and CPU)
• No reachability issues with semi-default routes
• BGP customers satisfied
  • …lots of ‘useless’ routes in the Internet…
• Need to have at least one transit provider
  • Method does not work for Tier-1 (transit-free ISPs)
• Good solution for (small) ISPs with limited memory budget