TRILL Routing Scalability Considerations
Alex Zinin <zinin@psg.com>
TRILL BOF
General scalability framework
• About growth functions for
  • Data overhead (Adj's, LSDB, MAC entries)
  • BW overhead (Hellos, Updates, Refr's/sec)
  • CPU overhead (comp complexity, frequency)
• Scaling parameters
  • N—total number of stations
  • L—number of VLANs
  • F—relocation frequency
• Types of devices
  • Edge switch (attached to a fraction of N and L)
  • Core switch (most of L)
Scenarios for analysis
• Single stationary bcast domain
  • No practical station mobility
  • N = O(1K) by natural bcast limits
• Bcast domain with mobile stations
• Multiple stationary VLANs
  • L = O(1K) total, O(100) visible to switch
  • N = O(10K) total
• Multiple VLANs with mobile stations
Protocol params of interest
• What
  • Amount of data (topology, leaf entries)
  • Number of LSPs
  • LSP refresh rate
  • LSP update rate
  • Flooding complexity
  • Route calculation complexity & frequency
• Why
  • Required memory [increase] as network grows
  • Required mem & CPU to keep up with protocol dynamics
  • Link BW overhead to control the network
• How
  • Absolute: big-O notation
  • Relative: compare to e.g. bridging & IP routing
Why is this important
• If data-inefficient:
  • Increased memory requirements
  • Frequent memory upgrades as network grows
  • Much more info to flood
• If computationally inefficient:
  • Substantial comp power increase == marginal network size increase
  • High CPU utilization
  • Inability to keep up with protocol dynamics
Link-state Protocol Dynamics
• Network events are visible everywhere
• Main assumption for stationary networks:
  • Network change is temporary
  • Topology stabilizes within finite T
• For each node:
  • Rinp—input update rate (network event frequency)
  • Rprc—update processing rate
• Long-term convergence condition: Rprc >> Rinp
• What if Rprc < Rinp?
  • Micro bursts are buffered by queues
  • Short-term (normal for stationary nets): update drops, rexmits, slower convergence
  • Long-term/permanent: net never converges, CPU upgrade needed
• Rprc = f (proto design, CPU, implementation)
• Rinp = f (proto design, network)
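A minimal sketch of the convergence condition, assuming a simple single-queue model (the rates, burst size, and queue bound are illustrative, not from the deck):

```python
# Single-queue model of Rprc vs Rinp (illustrative numbers only).
def simulate(r_inp, r_prc, burst=50, seconds=60, qmax=200):
    """Track update queue depth given input rate r_inp and processing
    rate r_prc (updates/sec), starting from a micro-burst."""
    queue = burst
    for _ in range(seconds):
        queue = max(0, queue + r_inp - r_prc)
        if queue > qmax:
            return "queue overflows: drops, rexmits, net never converges"
    return "converges" if queue == 0 else f"{queue} updates still queued"

print(simulate(r_inp=1, r_prc=30))   # Rprc >> Rinp: burst drains quickly
print(simulate(r_inp=40, r_prc=30))  # Rprc < Rinp: queue grows without bound
```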
Data-plane parameters
• Data overhead
  • Number of MAC entries in CAM table
• Why worry?
  • CAM table is expensive
    • 1-8K entries for small switches
    • 32K-128K for core switches
  • Shared among VLANs
  • Entries expire when stations go silent
Single bcast domain (CP)
• Total of O(1K) MAC addresses
• Each address: 12-bit VLAN tag + 48-bit MAC = 60 bits
• IS-IS update packing:
  • 4 addr's per TLV (TLV is 255B max)
  • 20 addr's per LSP fragment (1470B default)
  • ~5K addr's per node (256 frags total)
• LSP refresh rate:
  • 1K MACs = 50 LSPs
  • 1h renewal = 1 update every 72 secs
• MAC update rate:
  • Depends on MAC learning & dead-detection procedure
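The refresh-rate figures follow directly from the packing constants; a back-of-envelope check in Python (constants are the slide's own):

```python
# Reproduce the slide's arithmetic for the single-bcast-domain case.
ADDRS_PER_FRAG = 20    # 20 addr's per 1470B LSP fragment
MAX_FRAGS = 256        # IS-IS fragment limit per node
REFRESH_PERIOD = 3600  # 1h LSP renewal, in seconds

macs = 1000
frags = -(-macs // ADDRS_PER_FRAG)  # ceiling division: 50 LSP fragments
print(frags)                        # 50
print(REFRESH_PERIOD / frags)       # 72.0 sec between refresh updates
print(ADDRS_PER_FRAG * MAX_FRAGS)   # 5120, i.e. the ~5K addr/node ceiling
```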
MAC learning
• Traffic + expiration (5-15m):
  • Announces station activity
  • 1K stations, 30m fluctuations = 1 update every 1.8 seconds on average
  • Likely bursts due to "start-of-day" phenomenon
• Reachability-based:
  • Start announcing a MAC when first heard from the station
  • Assume it's there until there's evidence otherwise, even if silent (presumption of reachability)
  • Removes activity-sensitive fluctuations
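A minimal sketch of the reachability-based scheme; the class and its announce/withdraw printouts are hypothetical stand-ins, not a real IS-IS API:

```python
# Reachability-based MAC announcement (hypothetical edge-switch logic).
class MacAnnouncer:
    def __init__(self):
        self.known = set()

    def on_frame(self, src_mac):
        # Announce on first sight only; silence alone never withdraws,
        # so traffic fluctuations generate no updates.
        if src_mac not in self.known:
            self.known.add(src_mac)
            print(f"announce {src_mac}")  # stand-in for an LSP update

    def on_unreachable(self, src_mac):
        # Withdraw only on positive evidence the station is gone
        # (e.g. port down): the presumption of reachability.
        if src_mac in self.known:
            self.known.remove(src_mac)
            print(f"withdraw {src_mac}")  # stand-in for an LSP update
```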
Single bcast domain (DP)
• Number of entries
  • Bridges: f (traffic)
    • Limited by local config, location within network
  • RBridge: all attached stations
    • No big change for core switches (see most MACs)
    • May be a problem for smaller ones
Single bcast: summary
• With reachability-based MAC announcements…
  • CP is well within the limits of current link-state routing protocols
    • Can comfortably handle O(10K) routes
    • Dynamics are very similar
    • There's an existence proof that this works
  • CP data overhead is O(N)
    • Worse than IP routing: O(log N)
    • However, net size is upper-bounded by bcast limits
    • Small switches will need to store & compute more
• Data plane may require bigger MAC tables in smaller switches
Note: comfort limit
• Always possible to overload a neighbor with updates
  • Update flow control is employed
  • Dynamic flow control is possible, yet…
• Experience-based heuristic: pace updates at 30/sec max
  • Not a hard rule, just a ballpark
  • Limits burst Rinp for the neighbor
  • Prevents drops during flooding storms
• Given the (Rprc >> Rinp) condition, want the average to be an order of magnitude lower, e.g. O(1) upd/sec
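A minimal pacing sketch at the 30 upd/sec heuristic; the function and the queue/neighbor names are illustrative, not a real implementation:

```python
# Pace updates toward a neighbor at the experience-based 30/sec cap.
import time

PACE = 30  # updates/sec toward one neighbor (heuristic, not a hard rule)

def send_paced(updates, send):
    """Send updates no faster than PACE per second, spreading a
    flooding burst out so the neighbor's queue can absorb it."""
    interval = 1.0 / PACE
    for upd in updates:
        send(upd)
        time.sleep(interval)

# Usage (names hypothetical): send_paced(lsp_queue, neighbor.transmit)
# caps the burst Rinp the neighbor sees during a flooding storm.
```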
Note: protocol upper-bound
• LSP generation is paced: normally not more frequent than every 5 secs
• Each LSP frag has its own timer
• With equal distribution:
  • Max node origination rate == 51 upd/sec
• Does not address long-term stability
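Where the 51 upd/sec figure comes from (the constants are the slide's own):

```python
# 256 fragments, each paced to at most one regeneration per 5 seconds.
MAX_FRAGS = 256
MIN_REGEN_INTERVAL = 5.0  # seconds between regenerations of one fragment
print(MAX_FRAGS / MIN_REGEN_INTERVAL)  # 51.2 updates/sec per node, max
```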
Single bcast + mobility
• Same number of stations
  • Same data efficiency for CP and DP
• Different dynamics
• Take the IETF wireless network, worst case:
  • ~700 stations
  • New location within 10 minutes
  • Average: 1 MAC every 0.86 sec, or 1.16 MAC/sec
  • Note: every small switch in the VLAN will see the updates
• How does it work now?
  • Bridges (APs + switches) relearn MACs, expire old ones
• Summary: dynamics barely fit within the comfort range
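The worst-case rate above as arithmetic:

```python
# IETF wireless network, worst case: every station relocates within 10 min.
stations = 700
relocation_period = 10 * 60           # seconds
rate = stations / relocation_period   # MAC updates/sec, network-wide
print(rate)      # ~1.17 MAC/sec
print(1 / rate)  # ~0.86 sec between MAC updates
```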
Multiple VLANs
• Real networks have VLANs
• Assuming current proposal is used
  • Standard IS-IS flooding
• Two possibilities:
  • Single IS-IS instance for whole network
  • Separate IS-IS instance per VLAN
• Similar scaling challenges as with VR-based L3 VPNs
VLANs: single IS-IS
• Assuming reachability-based MAC announcement
• Adjacencies and convergence scale well
• However…
  • Easily hit the 5K MAC/node limit (solvable)
  • Every switch sees every MAC in every VLAN
    • Even if it doesn't need it
• Clear scaling issue
VLANs: multiple instances
• MAC announcements scale well
• Good resource separation
• However…
  • N adjacencies for a VLAN trunk
  • N times more processing for a single topological event
  • N times more data structures (neighbors, timers, etc.)
  • N = 100…1000 for a core switch
• Clear scaling issue for core switches
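A rough numeric contrast of the two designs, using the deck's own figures (illustrative only):

```python
# Single instance vs per-VLAN instances, at the deck's scale.
total_macs = 10_000    # N = O(10K) stations across all VLANs
vlans_on_trunk = 1000  # N = 100...1000 IS-IS instances on a core trunk

# Single IS-IS instance: every switch receives all MACs, needed or not.
print(total_macs)      # leaf entries pushed to every switch, however small

# Per-VLAN instances: state is separated, but one trunk flap triggers an
# adjacency event in every instance sharing the link.
print(vlans_on_trunk)  # topology events processed for a single flap
```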
VLANs: data plane
• Core switches
  • No big difference
  • Exposed to most MACs in VLANs anyway
• Smaller switches
  • Have to install all MACs even if only a single port on the switch belongs to a VLAN
  • May require bigger MAC tables than available today
VLANs: summary
• Control plane:
  • Currently available solutions have scaling issues
• Data plane:
  • Smaller switches may have to pay
VLANs + Mobility
• Assuming some VLANs will have mobile stations
• Data plane: same as stationary VLANs
• All scaling considerations for VLANs apply
• Mobility dynamics get multiplied:
  • Single IS-IS: updates hit the same adjacency
  • Multiple IS-IS: updates hit the same CPU
• Activity not bounded naturally anymore
  • Update rate easily goes outside comfort range
• Clear scaling issues
Resolving scaling concerns
• 5K MAC/node limit in IS-IS could be solved with RFC 3786
• Don't use per-VLAN (multi-instance) routing
• Use reachability-based MAC announcement
• Scaling MAC distribution requires VLAN-aware flooding:
  • Each node and link is associated with a set of VLANs
  • Only information needed by the remote nbr is flooded to it
  • Not present in current IS-IS framework
• Forget about mobility ;-)
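A minimal sketch of what VLAN-aware flooding could look like; as the bullets say, this filter is not part of the current IS-IS framework, so everything here is hypothetical:

```python
# Flood an update only over links whose VLAN set intersects the update's.
def flood(update, update_vlans, links):
    for link in links:
        if update_vlans & link["vlans"]:  # set intersection
            link["send"](update)          # nbr needs some of this state

# A MAC update for VLANs {10, 20} reaches nbr A (carries VLAN 10) but is
# suppressed toward nbr B (carries only VLAN 30).
links = [
    {"vlans": {10, 30}, "send": lambda u: print("to nbr A:", u)},
    {"vlans": {30},     "send": lambda u: print("to nbr B:", u)},
]
flood("MAC update", {10, 20}, links)
```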