Ethernet Data Center Routing Challenges and 802.1aq/SPB new work PETER ASHWOOD-SMITH peterashwoodsmith@huawei.com

Presentation Transcript


  1. Ethernet Data Center Routing Challenges and 802.1aq/SPB new work PETER ASHWOOD-SMITH peterashwoodsmith@huawei.com

  2. [Figure: labels S1 … S16, with callouts A) “Tweak Bridge Priorities Here” and B)] 802.1aq’s 16 ECT can give a perfect spread going 2 hops over 16 uplinks. However:
     A) Need to tweak 2nd layer switch priorities to guarantee all 16 are used.
     B) Need at least 16 subnets (C/S-VLANs) to assign one per 802.1aq B-VID.

  3. Can we eliminate ‘tweaking*’?
     • David Allan et al. have a presentation on this, so I won’t spend much time on it.
     • In general, a network with N equal cost paths from ‘some source’ to ‘some destination’ requires #ECT about 25-40% greater than N (to statistically capture them all).
     • Therefore, when #ECT == N some ‘tweaking’ is usually required (for a DC it’s trivial to do, however).
     • Dave et al. suggest non-independence between ECT algorithms as a way to address this (maximize diversity) …
     *Tweaking = adjusting Bridge Priorities up/down from defaults.

  4. “Example” 802.1aq switching cluster – assume 100GE NNI links/groups. Good numbers: “16” & “2” levels.
     [Figure: upper layer A1 … A16, lower layer B1 … B32, servers S1,1 … S32,160; annotations: 32 x 100GE per An, 16 x 100GE and 160 x 10GE per Bn, 5120 x 10GE in total, 16 x 32 x 100GE = 51.2T using 48 x 2T switches]
     • 48 switch non-blocking 2 layer L2 fabric
     • 16 at “upper” layer A1..A16
     • 32 at “lower” layer B1..B32
     • 16 uplinks per Bn, & 160 UNI links per Bn
     • 32 downlinks per An
     • (16 x 100GE per Bn) x 32 = 512 x 100GE = 51.2T
     • 160 x 10GE server links (UNI) per Bn
     • (32 x 160)/2 = 2560 servers @ 2 x 10GE per server
     • uFIB = 16 x 48 B-MAC = 768 entries
     • mFIB = 16 subnets x 48 src = 768 entries
     • 1536 FIB entries/node
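The sizing figures on this slide can be re-derived with a few lines of arithmetic. The sketch below only recomputes the slide's own numbers; the function and variable names are illustrative and not part of 802.1aq.

```python
# Re-derive the example fabric's sizing from the per-switch figures on the slide.
UPPER = 16           # A1..A16
LOWER = 32           # B1..B32
UPLINKS_PER_B = 16   # 100GE NNI uplinks per lower-layer switch
UNI_PER_B = 160      # 10GE server-facing (UNI) links per lower-layer switch
NNI_GBPS = 100
UNI_GBPS = 10
ECT = 16             # ECT algorithms / B-VIDs in use
SUBNETS = 16         # one subnet per B-VID


def fabric_summary():
    switches = UPPER + LOWER
    nni_links = UPLINKS_PER_B * LOWER
    uni_links = UNI_PER_B * LOWER
    ufib = ECT * switches       # unicast entries: one B-MAC per switch per ECT
    mfib = SUBNETS * switches   # multicast entries: one per subnet per source
    return {
        "switches": switches,
        "nni_links": nni_links,
        "fabric_tbps": nni_links * NNI_GBPS / 1000,
        "uni_links": uni_links,
        "servers": uni_links // 2,  # dual-homed at 2 x 10GE each
        # Non-blocking: uplink capacity equals server-facing capacity per Bn.
        "non_blocking": UPLINKS_PER_B * NNI_GBPS == UNI_PER_B * UNI_GBPS,
        "ufib": ufib,
        "mfib": mfib,
        "fib_per_node": ufib + mfib,
    }


print(fabric_summary())
```

Changing `ECT` or `SUBNETS` shows how directly the per-node FIB size follows the number of ECT algorithms, which is the cost the later slides worry about.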

  5. [Figure: labels “ECT-ALG#12SourceNode (1)” and S1 … S16] For a given ECT-ALGk, Aj is a member of every SPF-TREE(B*, ECT-ALGk). Properly tuned, no two ECT algorithms will use the same Aj as a fork point.

  6. Subnet Ni maps to I-SIDj and then to a unique A(j mod 16). [Figure: A1 … A16 over B1 … B32, with I-SIDi and I-SIDj endpoints attached] So load spreading allows each Ai to transit a complete subnet. Problem #1: unable to further spread such that Ai and Aj (i != j) each handle a subset of the flows in I-SIDj.
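A minimal sketch of the mapping described on this slide, showing why the spreading granularity is a whole subnet. The function name is hypothetical, and numbering the result as A1..A16 (index plus one) is an assumption; the slide only states "j mod 16".

```python
NUM_UPPER = 16  # A1..A16


def transit_node(isid, num_upper=NUM_UPPER):
    """Return the upper-layer node that carries ALL traffic of this I-SID.

    Assumption: index 0 corresponds to A1, index 15 to A16.
    """
    return f"A{isid % num_upper + 1}"


# The flow identity never enters the calculation, so every flow inside one
# I-SID lands on the same A node (Problem #1 on the slide).
for isid in (5, 21, 16):
    print(isid, "->", transit_node(isid))
```

Sixteen consecutive I-SIDs cover all sixteen A nodes, but no single I-SID can be split across two of them.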

  7. This is an issue under failure of Aj. [Figure: A1 … A16 over B1 … B32, with I-SIDi and I-SIDj endpoints attached] Recovery will move the entire subnet’s traffic to another Ai node. A preferable solution is to spread the affected load over the remaining A*.

  8. Possible solution – head end hashing (unicast only). [Figure: A1 … A16 over B1 … B32, with I-SIDi and I-SIDj endpoints attached; legend: Unicast, Mcast] Allow unicast I-SIDi and I-SIDj traffic to be hashed, based on smaller flows, to different B-VIDs (ECT algorithms). This breaks the symmetry and congruence rules but allows edge balancing at smaller granularity. No changes to multicast. Requires learning <C-DA, B-DA>, independent of B-VID.

  9. Interconnection of fabrics creates more than 16 paths (exponential). [Figure: labels A1 … A16 and B1 … B32 (shown twice), C1, C2; path counts annotated per level: O(16), O(16x2), O(16x2x16)] The number of paths can grow exponentially with increasing levels. A constant number of ECT paths is always << the number of paths available in many networks. Growing 802.1aq ECT to, say, 32 or even 100 ECMP causes larger unicast FIBs.

  10. Horizontal growth – not too bad, but need more ECT algorithms. [Figure: labels A1 … A16 plus A17; B1 … B32 plus B33, B34] Horizontal growth by 1 just increases the number of ECT by 1. Not too big a problem, but we would need to define new ECT (via Opaque).

  11. General Issue. [Figure: S to D; “Choose path from N x B-VID” at the head end; annotations O(degree) and O(diameter)] #paths ~= O(degree^diameter). So head end ECT in worst case requires O(exp(# B-VIDs)).
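The growth argument can be made concrete with a small calculation. This sketch assumes the slide's garbled formula reads as degree raised to the power of the diameter, and it reuses the per-level fan-outs 16, 2 and 16 annotated on the fabric-interconnection slide; both function names are illustrative.

```python
def worst_case_paths(degree, diameter):
    """Upper bound on distinct paths: `degree` choices at each of `diameter` hops."""
    return degree ** diameter


def paths_by_level(fanouts):
    """Cumulative number of paths after each level, given per-level fan-out."""
    counts, total = [], 1
    for fanout in fanouts:
        total *= fanout
        counts.append(total)
    return counts


print(paths_by_level([16, 2, 16]))              # [16, 32, 512]
print(worst_case_paths(degree=16, diameter=3))  # 4096
```

A head end that pins each path to its own B-VID would need one B-VID per entry in that count, which is why a fixed set of 16 stops being enough as levels are added.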

  12. A feasible solution … [Figure: S to D over a single B-VID; at each hop “Choose path from N x nxt hop”; “Re-assign traffic to path at each hop”]
     • Tandem “ECMP” just like IP.
     • Need to keep O(degree) number of next hops.
     • Only need one B-VID .. removes O(diameter) from state cost.
     • Flip side is you have no control – just hope for fine scale statistical distribution.
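A minimal sketch of tandem (hop-by-hop) hashing as described on this slide. The toy topology, the per-node seeds and the use of CRC32 are all assumptions made for illustration; the slide does not specify a hash function or a seed scheme.

```python
import zlib
from typing import Dict, List, Sequence

# Hypothetical topology: for each node, the equal-cost next hops toward D.
NEXT_HOPS: Dict[str, List[str]] = {
    "S": ["A1", "A2"],
    "A1": ["B1", "B2"],
    "A2": ["B1", "B2"],
    "B1": ["D"],
    "B2": ["D"],
}
# Assumed per-node seeds so that successive hops do not make correlated choices.
SEEDS = {"S": 1, "A1": 2, "A2": 3, "B1": 4, "B2": 5}


def pick_next_hop(flow_key: bytes, next_hops: Sequence[str], seed: int) -> str:
    """Each node independently hashes the flow onto one of its O(degree) next hops."""
    return next_hops[zlib.crc32(flow_key, seed) % len(next_hops)]


def forward(flow_key: bytes, src: str = "S", dst: str = "D") -> List[str]:
    """Walk a frame hop by hop; no node knows or controls the end-to-end path."""
    path, node = [src], src
    while node != dst:
        node = pick_next_hop(flow_key, NEXT_HOPS[node], SEEDS[node])
        path.append(node)
    return path


print(forward(b"C-SA=00:01|C-DA=00:02|C-VID=10"))
```

Each node stores only its own next-hop list, which matches the slide's point that state is O(degree) per node with a single B-VID; the price is that the resulting path is whatever the per-hop hashes happen to produce.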

  13. What about loops in this mode? The 802.1aq Ingress Check is very strong in the case of a single next hop, and hence a single possible ingress for an SA. The Ingress Check is weakened in the case of multiple next hops, and hence multiple possible ingresses for an SA. However, the 802.1aq Agreement Protocol functions correctly in the context of multiple possible next hops for the same B-VID (refer to Mick’s proof). But …

  14. Agreement Protocol Concerns
     • Is it too complex? It is clearly non-trivial; we need implementation/emulation experience.
     • Is it overly Draconian? For example, the bounds on movement are what is required for a mathematical proof by induction. However, there are probably many cases where further movement would not loop. What is the degree of ‘overkill’?
     • Is it marketable? This is unfortunately a legitimate concern!
     • 802.1aq can be deployed without AP until we introduce hash based forwarding, at which point we require either a symmetric AP and/or an on-data-path loop detection/drop mechanism.
     • Believe that an on-data-path loop detection mechanism is required for hash based ECMP until we have more experience with AP.
     • Recommend we standardize a TTL TAG, either stand-alone or as a new form of I-TAG.
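To illustrate what an on-data-path loop detection/drop mechanism could look like, here is a minimal sketch of TTL handling. The slide only recommends standardizing a TTL tag; the decrement-per-hop and drop-at-expiry behaviour below is an assumption borrowed from IP, and the `Frame` type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    b_da: str  # backbone destination address
    ttl: int   # remaining hop budget carried in the (assumed) TTL tag


def forward_once(frame: Frame) -> Optional[Frame]:
    """Decrement the TTL at one hop; return None when the frame must be dropped."""
    if frame.ttl <= 1:
        return None
    return Frame(frame.b_da, frame.ttl - 1)


def hops_survived_in_loop(initial_ttl: int) -> int:
    """Count how many times a frame caught in a forwarding loop is re-forwarded."""
    frame, hops = Frame("D", initial_ttl), 0
    while True:
        nxt = forward_once(frame)
        if nxt is None:
            return hops
        frame, hops = nxt, hops + 1


print(hops_survived_in_loop(8))  # 7
```

Under this assumed behaviour a looping frame is bounded by its initial TTL regardless of whether the control plane has converged, which is the property that makes it a fallback while experience with AP is limited.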

  15. View of New Work Requirements
     R1) New ECT algorithms with improved spreading properties.
     R2) Allow optional head end hash assignment of 802.1aq SPBM UNI known unicast traffic to one of multiple next hop interfaces/B-VIDs. Very similar to Link Ag. Minimally HASH (seed, C.SA, C.DA, C-VID, [ IP.SA, IP.DA, IP.PROTO ]).
     R3) Allow optional tandem hash assignment of 802.1aq SPBM B-VID NNI unicast traffic to one of multiple next hop interfaces. Essentially a new SPBM ECT-ALG with its own B-VID (i.e. new ECT algorithms, all usable at the same time). Minimally HASH (seed, B-VID, C.SA, C.DA, C-VID, [ IP.SA, IP.DA, IP.PROTO ]).
     R4) Minor OA&M changes in support of R2 and R3, because symmetry/congruence is broken.
     R5) More experience with AP, emulations, simulations etc., plus addition of TTL to a new I-TAG or a TTL-TAG.
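The head end assignment in R2 can be sketched as follows. The field list follows the slide's HASH(seed, C.SA, C.DA, C-VID, [IP.SA, IP.DA, IP.PROTO]); the choice of SHA-256, the string encoding of the key, the B-VID values and the function names are assumptions for illustration only.

```python
import hashlib


def flow_hash(seed, c_sa, c_da, c_vid, ip_sa=None, ip_da=None, ip_proto=None):
    """HASH(seed, C.SA, C.DA, C-VID, [IP.SA, IP.DA, IP.PROTO]) as a 32-bit integer."""
    fields = [seed, c_sa, c_da, c_vid]
    # The IP fields are optional on the slide; use them only when all are present.
    if None not in (ip_sa, ip_da, ip_proto):
        fields += [ip_sa, ip_da, ip_proto]
    key = "|".join(str(f) for f in fields).encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")


def pick_b_vid(b_vids, seed, **flow):
    """Head end: map one customer flow onto one of the available B-VIDs."""
    return b_vids[flow_hash(seed, **flow) % len(b_vids)]


B_VIDS = list(range(4001, 4017))  # 16 hypothetical B-VID values
flow = dict(c_sa="00:00:00:00:00:01", c_da="00:00:00:00:00:02", c_vid=10)
print(pick_b_vid(B_VIDS, seed=7, **flow))
```

For R3 the same construction would apply at each transit node, with the B-VID added to the key and the result selecting a next hop interface instead of a B-VID.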
