
BGP Wedgies ---- Bad Policy Interactions that Cannot be Debugged

Timothy G. Griffin, Intel Research, Cambridge UK. tim.griffin@intel.com, http://www.cambridge.intel-research.net/~tgriffin/. NANOG 31, May 23-25, 2004.


Presentation Transcript


  1. BGP Wedgies ---- Bad Policy Interactions that Cannot be Debugged Timothy G. Griffin Intel Research, Cambridge UK tim.griffin@intel.com http://www.cambridge.intel-research.net/~tgriffin/ NANOG 31 May 23-25, 2004

  2. BGP Wedgie • BGP policies make sense locally • Interaction of local policies allows multiple global solutions • Some solutions are consistent with intended policies, and some are not • Manual intervention is required to kick the system back to an intended solution • When unintended solutions are installed, no single AS has enough global knowledge to effectively debug the problem

  3. Shedding Inbound Traffic with ASPATH Prepending. Prepending will (usually) force inbound traffic from AS 1 to take the primary link. [Figure: customer AS 2 announces 192.0.2.0/24 to provider AS 1 over the primary link with ASPATH = 2 and over the backup link with ASPATH = 2 2 2.] Yes, this is a Glorious Hack …

  4. … But Padding Does Not Always Work. [Figure: customer AS 2 announces 192.0.2.0/24 to provider AS 1 over the primary link with ASPATH = 2, and to provider AS 3 over the backup link with ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2.] AS 3 will send traffic on the “backup” link because it prefers customer routes, and local preference is considered before ASPATH length! Padding in this way is often used as a form of load balancing.
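A minimal sketch of why the padding fails, assuming a simplified decision process that compares only local preference and then ASPATH length (real BGP has further tie-breakers). The `Route` class, the `best_route` helper, the local preference values 100 and 90, and the assumption that AS 3 also hears the prefix from AS 1 over a peering session are illustrative and not taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    as_path: tuple       # AS numbers, nearest AS first
    local_pref: int      # assigned by the receiving AS's import policy
    learned_from: str    # label for the neighbor that sent the route


def best_route(routes):
    """Simplified selection: highest local preference, then shortest ASPATH."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


# AS 3's two candidate routes for 192.0.2.0/24 (values are assumptions).
via_backup = Route(as_path=(2,) * 14, local_pref=100, learned_from="customer AS 2")
via_peer = Route(as_path=(1, 2), local_pref=90, learned_from="peer AS 1")

best = best_route([via_backup, via_peer])
print(best.learned_from)  # prints "customer AS 2"
```

The padded path is twelve hops longer, but length is never consulted because the customer route already wins on local preference.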

  5. COMMUNITIES to the Rescue! AS 3: normal customer local pref is 100, peer local pref is 90. [Figure: customer AS 2 announces 192.0.2.0/24 to provider AS 1 over the primary link with ASPATH = 2, and to provider AS 3 over the backup link with ASPATH = 2, COMMUNITY = 3:70.] Customer import policy at AS 3: • If 3:90 in COMMUNITY then set local preference to 90 • If 3:80 in COMMUNITY then set local preference to 80 • If 3:70 in COMMUNITY then set local preference to 70
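The import policy at AS 3 can be sketched as a lookup from community to local preference. This is an illustration only: the function name `customer_import_local_pref` is hypothetical, and the behavior when several of these communities are attached at once (first match in slide order wins) is an assumption the slide does not settle.

```python
CUSTOMER_LOCAL_PREF = 100   # AS 3's normal customer preference (from the slide)
PEER_LOCAL_PREF = 90        # AS 3's peer preference (from the slide)

# Community -> local preference, in the order listed on the slide.
COMMUNITY_LOCAL_PREF = (("3:90", 90), ("3:80", 80), ("3:70", 70))


def customer_import_local_pref(communities):
    """Local preference AS 3 assigns to a customer route.

    Assumption: if several depref communities are present, the first
    match in slide order wins.
    """
    for community, pref in COMMUNITY_LOCAL_PREF:
        if community in communities:
            return pref
    return CUSTOMER_LOCAL_PREF


backup_pref = customer_import_local_pref({"3:70"})
# With 3:70 the backup route ranks below a peer route, so AS 3 would
# prefer a path learned from a peer over the customer's backup link.
prefers_peer_route = backup_pref < PEER_LOCAL_PREF
```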

  6. Don’t Celebrate Just Yet…. [Figure: Provider A (Tier 1) and Provider B (Tier 1), joined by a peering link; two provider/customer links; Provider C (Tier 2); customer.] Now, customer wants a backup link to C….

  7. Customer installs a “backup link” … [Figure: Provider A (Tier 1), Provider B (Tier 1), Provider C (Tier 2), customer; links labeled primary and backup.] Customer sends a community that lowers local preference below a provider’s.

  8. Disaster Strikes! [Figure: Provider A (Tier 1), Provider B (Tier 1), Provider C (Tier 2), customer; links labeled primary and backup.] Customer is happy that backup was installed …

  9. The primary link is repaired, yet routing does not repair! [Figure: Provider A (Tier 1), Provider B (Tier 1), Provider C (Tier 2), customer; links labeled primary and backup.] This is a stable BGP routing! One “solution” --- reset BGP session on backup link! Better --- C should translate its depref communities to those of Provider A when exporting routes to A.
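The fail, repair, and reset sequence can be replayed in a toy path-vector model. This is a sketch under stated assumptions, not the author's model: the topology (the customer's primary link goes to B, C is a customer of A, A and B peer) is inferred from the fix suggested on this slide, the local preference values are hypothetical, export follows the usual rule that customer routes go to everyone while peer and provider routes go only to customers, and selection uses only local preference and path length. All names (`REL`, `converge`, and so on) are invented for the example.

```python
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"
LOCAL_PREF = {CUSTOMER: 100, PEER: 90, PROVIDER: 80}  # hypothetical values
BACKUP_PREF = 70  # depref community: below the provider preference

# REL[(x, y)] is how x sees its neighbor y (assumed topology).
REL = {
    ("B", "cust"): CUSTOMER, ("cust", "B"): PROVIDER,  # primary link
    ("C", "cust"): CUSTOMER, ("cust", "C"): PROVIDER,  # backup link
    ("A", "C"): CUSTOMER, ("C", "A"): PROVIDER,
    ("A", "B"): PEER, ("B", "A"): PEER,
}
PRIMARY, BACKUP = frozenset({"B", "cust"}), frozenset({"C", "cust"})


def local_pref(node, neighbor):
    if frozenset({node, neighbor}) == BACKUP:
        return BACKUP_PREF  # C honors the customer's depref community
    return LOCAL_PREF[REL[(node, neighbor)]]


def exports(neighbor, path, node):
    """Would `neighbor` advertise its best `path` to `node`?"""
    if len(path) == 1:  # the origin announces its own prefix
        return True
    learned_from = REL[(neighbor, path[1])]
    return learned_from == CUSTOMER or REL[(neighbor, node)] == CUSTOMER


def converge(best, down):
    """Re-run route selection at A, B, C until no choice changes.

    Termination is not guaranteed for arbitrary policies; it holds here.
    """
    changed = True
    while changed:
        changed = False
        for node in ("A", "B", "C"):
            candidates = []
            for (x, nbr) in REL:
                if x != node or frozenset({x, nbr}) in down:
                    continue
                path = best.get(nbr)
                if path and node not in path and exports(nbr, path, node):
                    candidates.append(
                        (-local_pref(node, nbr), len(path), (node,) + path))
            new = min(candidates)[2] if candidates else None
            if new != best.get(node):
                best[node], changed = new, True
    return best


best = {"cust": ("cust",)}
converge(best, down={BACKUP})    # only the primary link exists
converge(best, down=set())       # backup link installed
intended = dict(best)
converge(best, down={PRIMARY})   # primary fails
converge(best, down=set())       # primary repaired
wedged = dict(best)
converge(best, down={BACKUP})    # reset backup session: take it down ...
converge(best, down=set())       # ... and bring it back up
```

In this model `intended["A"]` is the path through B, `wedged["A"]` is the path through C over the backup link, and after the reset `best` equals `intended` again. Both states are fixed points of the same configuration, which is what makes the wedgie stable.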

  10. Ouch. [Figure: BELL NET, CIRCUIT NET, HappyPackets (Tier 2), NetNet (Tier 2), LoadBalancer; two primary and two backup links; P1, P2.]

  11. What the heck is going on? • There is no guarantee that a BGP configuration has a unique routing solution. • When multiple solutions exist, the (unpredictable) order of updates will determine which one wins. • There is no guarantee that a BGP configuration has any solution! • And checking configurations is NP-complete. • Complex policies (weights, communities setting preferences, and so on) increase chances of routing anomalies. • … yet this is the current trend!

  12. More fun with communities …. [Figure: Provider C (Tier 2), Provider D (Tier 2), Provider A (Tier 1), Provider B (Tier 1), customer; links labeled primary, backup I, backup II.] Backup II: customer sends a community that lowers preference below peer’s but above provider’s.

  13. Primary goes down! [Figure: Provider C (Tier 2), Provider D (Tier 2), Provider A (Tier 1), Provider B (Tier 1), customer; links labeled primary, backup I, backup II.]

  14. Primary repaired. [Figure: Provider C (Tier 2), Provider D (Tier 2), Provider A (Tier 1), Provider B (Tier 1), customer; links labeled primary, backup I, backup II.]

  15. Reset Backup I session? First, take it down…. [Figure: Provider C (Tier 2), Provider D (Tier 2), Provider A (Tier 1), Provider B (Tier 1), customer; links labeled primary, backup I, backup II.]

  16. Now Bring it up … [Figure: Provider C (Tier 2), Provider D (Tier 2), Provider A (Tier 1), Provider B (Tier 1), customer; links labeled primary, backup I, backup II.] “Solution” --- reset BGP session on BOTH backup links simultaneously!

  17. BGP Wedgie • BGP policies make sense locally • Interaction of local policies allows multiple global solutions • Some solutions are consistent with intended policies, and some are not • Manual intervention is required to kick the system back to an intended solution • When unintended solutions are installed, no single AS has enough global knowledge to effectively debug the problem

  18. Recommendations

  19. Recommendations • Interdomain communities that can tweak a route’s preference should be defined with care and consistently implemented.
