This presentation will discuss design and deployment best practices to enable tight SLAs on an IP backbone, including validation results, operational guidelines, and deployment experience.
Deploying Tight-SLA services on an IP Backbone Clarence Filsfils – cf@cisco.com
Objective • To present design & deployment best practices to enable tight SLAs to be offered • when to use what, and how • validation results • operational guidelines • deployment experience • Focus on the backbone design
An overview of the Analysis • LLJ (Loss/Latency/Jitter): DiffServ, TE, DSTE • Convergence: ISIS Sub-Second, FRR Sub-100ms
Further information • “Engineering a Multiservice IP backbone to support tight SLAs”, Computer Networks Special Edition on the New Internet Architecture • Full-Day Tutorial • RIPE41, APRICOT 2002: www.ibb.net/~filsfils • Low-Level Design Guides, Validation Results
Agenda • Introduction and SLA • Sub-Second IGP Convergence • Backbone Diffserv Design • Conclusion
Typical Core Per-Class SLA Characteristics • Typically more classes at the Edge
One-Way Jitter • Delay variation, generally computed as the variation of the delay for two consecutive packets • Due to variation of • Propagation delay • Switching / processing delay • Queuing / scheduling delay • Jitter buffers remove variation but contribute to delay
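A minimal sketch of this computation, using hypothetical per-packet one-way delay samples (the values below are invented for illustration):

```python
# Minimal sketch: jitter as delay variation between consecutive packets.
# The delay samples (in ms) are hypothetical, for illustration only.
delays_ms = [40.0, 40.5, 39.8, 41.2, 40.1]

# One-way jitter samples: absolute delay difference of consecutive packets.
jitter_ms = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]

print(jitter_ms)       # per-pair jitter values
print(max(jitter_ms))  # peak jitter seen on this stream
```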
Backbone VoIP Jitter Budget • Typical jitter budget: • Mouth to ear budget 100ms • Backbone propagation – 30ms • Codec delay – ~35ms • Jitter Budget = 35ms • 30ms for the access • 5ms for the core • 10 hops => 500 µs/hop
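The budget arithmetic above can be checked directly (all figures taken from the slide):

```python
# Backbone VoIP jitter budget, figures from the slide (all in ms).
mouth_to_ear_budget = 100
propagation = 30
codec_delay = 35

jitter_budget = mouth_to_ear_budget - propagation - codec_delay  # 35 ms
core_share = jitter_budget - 30       # 30 ms reserved for the access -> 5 ms for the core
per_hop_us = core_share * 1000 / 10   # spread over 10 hops -> 500 us/hop

print(jitter_budget, core_share, per_hop_us)
```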
Per-flow sequence preservation • Best-practice IP design: per-flow load-balancing! • Re-ordering Impact on Service Perception • Long-lived TCP: degraded goodput • Real-time video: loss rate += OOS_rate • VoIP: jitter
Re-ordering Impact on Service • [LAOR01]: “Results show that packet reordering, by at least three packet locations, of only a small percentage of packets in the backbone link can cause a significant degradation of applications throughput. Long flows are affected the most. Due to the potential effect, minimizing packet reordering, as well as mitigating its effect algorithmically, should be considered”.
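One simple way to quantify reordering "by packet locations" is positional displacement: how far each packet arrives from its in-order position. This is an illustrative sketch, not the metric used in [LAOR01]; the arrival order below is hypothetical:

```python
# Illustrative reordering metric: displacement of each packet (by sequence
# number) from its in-order arrival position.
def displacements(arrival_order):
    """arrival_order: sequence numbers in the order packets arrived."""
    in_order = sorted(arrival_order)
    return [arrival_order.index(seq) - i for i, seq in enumerate(in_order)]

# Hypothetical arrival order: packet 1 was delayed by three positions.
arrivals = [2, 3, 4, 1, 5]
disp = displacements(arrivals)
reordered = [s for s, d in zip(sorted(arrivals), disp) if d >= 3]
print(reordered)  # packets displaced by >= 3 locations
```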
Loss of Connectivity / Convergence • Incentive to reduce the loss of connectivity (LoC) • Availability • 99.999% → 0.9 sec of downtime per day • VoIP • 40 msec LoC: glitch • 1–2 sec LoC: call drop
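The availability figure works out as follows:

```python
# Availability arithmetic: 99.999% allows ~0.9 s of downtime per day
# (86400 s * 0.00001 = 0.864 s).
seconds_per_day = 24 * 3600
availability = 0.99999
downtime_s = seconds_per_day * (1 - availability)
print(round(downtime_s, 3))
```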
How to specify the target for the metric • SLA statistical definitions do matter • min/avg/max versus percentile • Measured time interval… • SLA definitions today tend to be loose • averaged over a month • averaged over many POP-to-POP pairs (temptation to add short pairs to reduce the average…) • IP Performance Metrics IETF WG
Optimizing the IP Infrastructure • Loss, Latency, Jitter: OK iff Demand < Offer • Over-Provisioned Backbone • Differentiated Services • Capacity Planning • TE and DS-TE • Loss of connectivity due to link/node failure • IGP Convergence • MPLS FRR Protection
Agenda • Introduction and SLA • Sub-Second IGP Convergence • Backbone Diffserv Design • Conclusion
Loss of Connectivity • IGP Backbone Convergence: • the time it takes for connectivity to be restored upon link/node failure/addition for an IP flow starting on an edge access router and ending on another edge access router, excluding any variation of BGP routes. • For this session, IGP = ISIS
Historical ISIS Convergence • 10 to 30 seconds • Not excellent • In the past, the focus has been more on stability than on fast convergence • a typical trade-off
What this presentation will explain • ISIS convergence in 1 or 2 seconds is conservative
An example network • (Diagram: routers A–H connected by links with costs from 2 to 12; router A's interfaces are labelled S0–S3.)
The Final SPT rooted at A • A: oif null, Cost 0 • B: oif s0, Cost 3 • D: oif s3, Cost 3 • C: oif s0 & s3, Cost 6 • F: oif s0 & s3, Cost 8 • E: oif s0, Cost 11 • G: oif s0 & s3, Cost 13
(Diagram: the example network with alternative SPTs rooted at A, each listing every node's oif and cost.)
The RIB construction • C's leaves: Lo0: 1.1.1.1/32, C=0; Pos1: 2.0.0.1/30, C=2 • ISIS adds the following paths to the RIB: • 1.1.1.1/32: OIF = S0 or S3 with Metric 6 (6+0) • 2.0.0.1/30: OIF = S0 or S3 with Metric 8 (6+2)
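As a sketch, the per-node SPT costs of this example can be reproduced with Dijkstra's algorithm. The edge costs below are an approximation reconstructed from the slides' diagram, kept to the links the costs above imply:

```python
import heapq

# Graph approximating the slides' example (undirected, cost per link).
edges = {
    ("A", "B"): 3, ("A", "D"): 3, ("B", "C"): 3, ("B", "E"): 8,
    ("D", "C"): 3, ("C", "F"): 2, ("F", "G"): 5,
}
adj = {}
for (u, v), cost in edges.items():
    adj.setdefault(u, []).append((v, cost))
    adj.setdefault(v, []).append((u, cost))

def spt_costs(root):
    """Dijkstra: shortest-path cost from root to every node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, c in adj[u]:
            if d + c < dist.get(v, float("inf")):
                dist[v] = d + c
                heapq.heappush(heap, (d + c, v))
    return dist

costs = spt_costs("A")
print(costs)  # matches the slide: C=6, F=8, E=11, G=13
```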
LSDB, RIB and FIB • Control plane: ISIS LSDB (sh isis data), static routes and the BGP table feed their best routes into the RIB (sh ip route) • Data plane: FIB & dFIB (sh ip cef)
SPF Optimizations • Most Basic Implementation • Any change (link, node, leaf): recompute the whole SPT and the whole RIB • Optimization 1: decouple SPT and RIB • If any topology change (node, link): recompute the SPT and the RIB, called “SPF” • If only a leaf change (IP prefix): keep the SPT, just update the RIB for the nodes whose leaves have changed, called “PRC”
PRC • A node advertises a new leaf: Int lo0: 65.1.1.1/32 • PRC here consists in just adding 65.1.1.1/32 to the RIB. The SPT is not affected.
Incremental-SPF • Optimization 2 • When the topology has changed, instead of building the whole SPT from scratch just fix the part of the SPT that is affected • Only the leaves of the nodes re-analyzed during that process are updated in the RIB
Incremental-SPF • The C-G link goes down. The C-G link was not used in the SPT anyway, therefore there is no need to run SPF.
Incremental-SPF • F reports a new neighbor H. The SPT need only be extended behind F: there is no need for router A to recompute the whole SPT. Router A will compute SPF from node F.
Incremental-SPF • More information is kept in the SPT • Parents list • Neighbors list • Based on the changed information, the SPT is “modified” in order to reflect the changes
Incremental-SPF • The further away from the root the change, the higher the gain
SPF, PRC, I-SPF: summary • Only a leaf change • PRC • Graph impacted • normal-SPF: recomputes the full SPT and hence reinserts all the ISIS routes in the RIB • I-SPF: only recomputes the part of the SPT that is affected. Only the leaves from that part are updated.
Parallel point-to-point adjacencies • Only the best parallel adjacency is reported in the LSP
P2P mode for back-to-back GE • interface fastethernet1/0 / isis network point-to-point • No DIS election • No CSNP transmission • No pseudonode and extra link
Speeding up route installation • Limit the # of leaves in the IGP • only the BGP speakers are needed ( ) • rest: I-BGP • router isis • advertise passive-only
Backoff timer algorithm • IS-IS throttles its main events • SPF computation • PRC computation • LSP generation • Throttling slows down convergence • Not throttling can cause meltdowns • The goal is to react fast to the first events but, under constant churn, slow down to avoid collapse
Backoff timer algorithm • spf-interval <Max> [<Init> <Inc>] • Maximum interval: maximum amount of time the router will wait between consecutive executions • Initial delay: time the router will wait before starting execution • Incremental interval: time the router will wait between consecutive executions. This timer is variable and will increase until it reaches Maximum-interval
spf-interval 10 100 1000 • A burst of triggers (E1–E7) causes successive SPF runs, with waits of 100 ms, 1000 ms, 2000 ms, 4000 ms • Then 8000 ms • Then maxed at 10 sec • 20 s without a trigger is required before resetting the SPF wait to 100 ms
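As a rough sketch (ignoring the coalescing of triggers that land inside a wait window), the progression of waits for `spf-interval 10 100 1000` can be simulated; `backoff_waits` is a hypothetical helper, not an IOS feature:

```python
# Sketch of the exponential SPF backoff for "spf-interval 10 100 1000":
# max-wait 10 s, initial-wait 100 ms, incremental-wait 1000 ms, doubling
# on each subsequent trigger until capped at the maximum.
def backoff_waits(max_ms, init_ms, inc_ms, n_events):
    waits, cur = [], init_ms
    for _ in range(n_events):
        waits.append(cur)
        # after the first run, the wait starts at inc_ms and doubles
        cur = inc_ms if cur == init_ms else min(cur * 2, max_ms)
    return waits

waits = backoff_waits(10000, 100, 1000, 7)
print(waits)
```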
Default Values • Incremental-interval: • SPF: 5.5 seconds • PRC: 5 seconds • LSP-Generation: 5 seconds • Maximum-interval: • SPF: 10 seconds • PRC: 5 seconds • LSP-Generation: 5 seconds • Initial-wait: • SPF: 5.5 seconds • PRC: 2 seconds • LSP-Generation: 50 milliseconds
Two-Way Connectivity Check • For propagating bad news, a single LSP is enough
Timers for Fast Convergence router isis spf-interval 1 1 50 prc-interval 1 1 50 • Init Wait: 1ms • 5.5 sec faster than default reaction! • Optimized for the going down mode • Exp Increment ~ S ms • Max Wait ~ n * S ms • CPU utilization < 1/n
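A worked instance of the slide's CPU bound, taking S as the 50 ms incremental wait and Max Wait as the configured 1 s maximum (this just reproduces the slide's arithmetic, not a measured figure):

```python
# CPU bound sketch for "spf-interval 1 1 50":
# with Max Wait ~ n * S, sustained SPF CPU use is bounded by ~1/n.
S_ms = 50           # exponential increment
max_wait_ms = 1000  # maximum wait (1 s)
n = max_wait_ms / S_ms
cpu_bound = 1 / n
print(n, cpu_bound)  # n = 20 -> SPF CPU bounded at ~5%
```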
Timer for Fast Convergence router isis lsp-gen-interval 5 1 50 • The timers are designed to optimize the propagation of the information to other nodes. • Init-Wait = 1ms, 49ms faster than default • Exp-Inc = S, eg. 50ms
LSP Pacing and Flooding • int pos x/x / isis lsp-interval <> • Pacing: • Default: 33 msec inter-LSP gap • backoff protection • full database download • suggest to keep the default • Flooding • flood/SPF trade-off
Link Protocol Properties • Link Failure Detection • the faster and more reliable, the better • Dampening flapping links • Fast signalling of a Down information • Stable signalling of an UP information • Freeze a flapping link in Down status
POS – Detection of a link failure • pos delay trigger line: • hold time before reacting to a line alarm • default: immediate reaction • pos delay trigger path: • hold time before reacting to a path alarm • default: no reaction • carrier-delay • hold time between the end of the pos delay hold time and the bringing down of the IOS interface • default: 2000 msec
POS – Detection of a link failure int pos 1/0 carrier-delay msec 8 • Redundant for POS interfaces