
Backward Congestion Notification Version 2.0

Davide Bergamasco (davide@cisco.com), Rong Pan (ropan@cisco.com). Cisco Systems, Inc. IEEE 802.1 Interim Meeting, Garden Grove, CA (USA), September 22, 2005.


Presentation Transcript


  1. Backward Congestion Notification Version 2.0 Davide Bergamasco (davide@cisco.com) Rong Pan (ropan@cisco.com) Cisco Systems, Inc. IEEE 802.1 Interim Meeting Garden Grove, CA (USA) September 22, 2005

  2. Credits • Valentina Alaria (Cisco) • Andrea Baldini (Cisco) • Flavio Bonomi (Cisco) • Manoj K. Wadekar (Intel)

  3. BCN v2.0 • Desire from Mick to see an analytical study of BCN stability • BCN v2.0 improvements • Linear control loop allows analysis of stability • Simplified detection mechanism • Reduced signaling rate • Original BCN framework remains the same

  4. BCN Background

  5. Detection & Signaling

  6. Reaction

  7. Suggested BCN Message Format

      0                            15                              31
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +   DA = SA of sampled frame    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                               |                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+    SA = MAC Address of CP     +
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                   IEEE 802.1Q Tag or S-Tag                    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |        EtherType = BCN        |Version|       Reserved        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +                             CPID                              +
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |             Qoff              |            Qdelta             |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |           Timestamp           |                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
     |                                                               |
     |        First N bytes of sampled frame starting from DA        |
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                              FCS                              |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
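The field widths in the BCN message diagram can be read off as 48-bit DA and SA, a 32-bit tag, a 16-bit EtherType, a 4-bit Version, 12 reserved bits, a 64-bit CPID, 16-bit Qoff and Qdelta, and a 16-bit Timestamp, followed by the first N bytes of the sampled frame. A minimal Python sketch of packing that header is given here. The EtherType value (0x88B5, an IEEE local experimental EtherType used purely as a placeholder), the signed reading of Qoff and Qdelta, and the names `pack_bcn` and `BCN_ETHERTYPE` are assumptions for illustration; the slides do not define them. The FCS is assumed to be appended by the MAC and is omitted.

```python
import struct

# Placeholder only: the slides do not assign an EtherType value to BCN.
BCN_ETHERTYPE = 0x88B5

# DA(6) SA(6) tag(4) EtherType(2) Version+Reserved(2) CPID(8) Qoff(2) Qdelta(2) Timestamp(2)
_BCN_HEADER = struct.Struct("!6s6s4sHH8shhH")


def pack_bcn(sampled_frame: bytes, cp_mac: bytes, vlan_tag: bytes, cpid: bytes,
             q_off: int, q_delta: int, timestamp: int,
             version: int, n: int) -> bytes:
    """Build a BCN message (without FCS) for one sampled frame."""
    da = sampled_frame[6:12]                # DA of the BCN = SA of the sampled frame
    ver_reserved = (version & 0xF) << 12    # Version in the top 4 bits, Reserved = 0
    header = _BCN_HEADER.pack(da, cp_mac, vlan_tag, BCN_ETHERTYPE, ver_reserved,
                              cpid, q_off, q_delta, timestamp)
    return header + sampled_frame[:n]       # first N bytes, starting from the DA
```

With this layout the fixed part of the message is 34 bytes, so a message carrying N sampled bytes is 34 + N bytes long before the FCS.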

  8. Suggested RLT Tag Format

      0     3       7              15                              31
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +   DA of rate-limited frame    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                               |                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   SA of rate-limited frame    +
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |        IEEE 802.1Q Tag or S-Tag of rate-limited frame         |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |        EtherType = RLT        |Version|       Reserved        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +                             CPID                              +
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |           Timestamp           |EtherType of rate limited frame|
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +                 Payload of rate-limited frame                 +
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                              FCS                              |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

  9. Simulation Environment (1) [Figure: network topology with a Core Switch and edge switches ES1-ES6; node labels SJ, SR1, SR2, ST1-ST4, SU1-SU4, DT, DU, DR1, DR2; traffic labels "TCP Bulk" and "UDP On/Off"; congestion point marked]

  10. Simulation Environment (2) • Short Range, High Speed DC Network • Link Capacity = 10 Gbps • Switch latency = 1 µs • Link Length = 100 m (0.5 µs propagation delay) • Control loop • Delay ~ 3 µs • Parameters • W = 2 • Gi = 4 • Gd = 1/64 • Ru = 8 Mbps • Workload • ST1-ST4: 10 parallel TCP connections transferring 1 MB each continuously • SU1-SU4: 64 KB bursts of UDP traffic starting at t = 10 ms
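The delay figures on this slide can be checked with a little arithmetic. The sketch below assumes a propagation speed of about 2e8 m/s (roughly 5 ns per metre), which reproduces the 0.5 µs quoted for 100 m and the 100 µs quoted for the 20000 m links of the long-range scenario. The loop-delay estimate, one link plus one switch traversal in each direction, is only one reading that happens to land near the quoted ~3 µs and ~200 µs; the slides do not state how the loop delay was derived, and the function names are hypothetical.

```python
# Assumption: signals travel at about 2e8 m/s in the link medium (roughly 5 ns per metre).
PROPAGATION_SPEED_M_PER_S = 2.0e8
SWITCH_LATENCY_US = 1.0  # switch latency quoted on the slide


def propagation_delay_us(link_length_m: float) -> float:
    """One-way propagation delay in microseconds."""
    return link_length_m / PROPAGATION_SPEED_M_PER_S * 1e6


def loop_delay_estimate_us(link_length_m: float) -> float:
    """Assumed loop delay: one link and one switch traversal in each direction."""
    return 2.0 * (propagation_delay_us(link_length_m) + SWITCH_LATENCY_US)


for length_m in (100, 20_000):
    print(length_m, propagation_delay_us(length_m), loop_delay_estimate_us(length_m))
```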

  11. BCNv1.0

  12. BCNv2.0 • Faster Transient Response • Higher Stability @ Steady State

  13. Simulation Environment (3) • Long Range, High Speed DC Network • Link Capacity = 10 Gbps • Switch latency = 1 µs • Link Length = 20000 m (100 µs propagation delay) • Control loop • Delay ~ 200 µs • Parameters • W = 2 • Gi = 4 • Gd = 1/64 • Ru = 8 Mbps • Workload • ST1-ST4: 10 parallel TCP connections transferring 1 MB each continuously • SU1-SU4: 64 KB bursts of UDP traffic starting at t = 10 ms

  14. BCNv1.0

  15. BCNv2.0 Much higher stability @ steady state with larger loop delays

  16. Summary • BCN v2 has a number of advantages … • Can be studied analytically • Better protection of TCP flows in mixed TCP and UDP traffic scenarios • Detection algorithm independent of Switch implementation • Better Performance • Lower signaling frequency (from 10% to 1%) • Better stability • Increased tolerance to loop delays • … and one disadvantage • Slower convergence to fairness

  17. A Control-Theoretic Approach to BCN Design and Analysis

  18. Notation • N: Number of Flows • C: Link Capacity • τ: Round Trip Delay • w: Weight of the Derivative • Pm: Sampling Probability • Gi: Additive Increase Gain • Gd: Multiplicative Decrease Gain
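The gains in this notation act on a feedback value Fb that combines the queue offset with its weighted derivative. The equations themselves did not survive in this transcript, so the sketch below uses the form commonly given for BCN, taken here as an assumption: Fb = Qoff - w * Qdelta with Qoff = Qeq - Q, additive increase R + Gi * Fb * Ru when Fb > 0, and multiplicative decrease R * (1 + Gd * Fb) when Fb < 0. The sign convention and the function names `feedback` and `react` are illustrative, not taken from the slides.

```python
def feedback(q_off: float, q_delta: float, w: float) -> float:
    """Assumed feedback: queue offset minus the weighted queue change.

    q_off   = Qeq - Q (positive when the queue is below equilibrium)
    q_delta = change in Q since the previous sample
    """
    return q_off - w * q_delta


def react(rate: float, fb: float, gi: float, gd: float, ru: float) -> float:
    """Assumed rate update at the source for one received feedback value."""
    if fb > 0:
        return rate + gi * fb * ru               # additive increase
    if fb < 0:
        return max(0.0, rate * (1.0 + gd * fb))  # multiplicative decrease (fb is negative)
    return rate


# Example with the gains quoted for the simulations (Gi = 4, Gd = 1/64, Ru = 8 Mbps).
new_rate_mbps = react(rate=64.0, fb=-2.0, gi=4.0, gd=1.0 / 64.0, ru=8.0)
```

The weight w scales how strongly a growing queue (positive Qdelta) pulls the feedback negative before the queue itself has crossed Qeq.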

  19. Block Diagram of BCN Congestion Control [Figure: block diagram with signals C, ∆R, R and q, summing junctions, gain blocks Gd, Gi and Pm, and a Time Delay block]

  20. Non-linear Differential Equations [Equation images: Link Control; Source Control, with separate cases for Fb(t-τ) > 0 and Fb(t-τ) < 0]

  21. Linearization Around Operating Point • Using feedback control to analyze local stability • Operating point: • R = C/N; • q’ = qeq – q = 0; • Linearization • Difficulty: the system response differs depending on sgn(Fb(t-d)) • Fortunately, the system is piecewise linear • Details are in the appendix

  22. Block Diagram of BCN Feedback Control [Figure: feedback loop with signals R, q and Fb and separate Multiplicative Decrease and Additive Increase branches; annotations: "lose 90° margin", "add lead zero to compensate"]

  23. The Effect of the Zero, Seen in the Time Domain [Figure: plots of R and q; annotation: "zero: dq/dt"]

  24. Choosing Parameters: an example • Network conditions (10G link) • N = 50 • τ = 200us • Choose parameters such that the feedback loop is stable with a 35° margin • w = 4 • Gi = 2Mbps • Gd = 1/128 • Pm = 0.01

  25. Stability Result • With N = 50, delay = 200us, the system is stable • The phase margin allows for extreme network conditions of N -> 1000 flows or τ -> 1ms before oscillation [Figure annotation: "lost 90° margin"]

  26. Simulation Result Shows A Stable System for N = 50; Delay = 200us

  27. Simulation Result Shows the System Is Stable, but on the Verge of Oscillation: N = 50, Delay = 1ms

  28. Change w = 4 -> 1 • When w = 1, a system with N = 50, delay = 200us already runs out of margin and is on the verge of oscillation • With w = 1 the effect of the zero diminishes, so the system can’t cope with a wide range of network conditions

  29. Indeed, the system is stable but on the verge of oscillation even for N = 50, Delay = 200us when w = 1.0

  30. Requests to 802.1 • Start a Task Force on Congestion Management • Use BCN as a Baseline Proposal

  31. Appendix

  32. Linearizing…

  33. Linearizing Additive Increase Function

  34. Linearizing Additive Increase Function

  35. Linearizing Multiplicative Decrease Function

  36. Linearizing Multiplicative Decrease Function

  37. Issue #1: Non-linearity • ISSUE: Overshoots and undershoots accumulate over time • SOLUTION: Signal only when • Q > Qeq && dQ/dt > 0 • Q < Qeq && dQ/dt < 0 • Easy to implement in hardware: just an Up/Down counter • Increment @ every enqueue • Decrement @ every dequeue • Reduces signaling rate by 50%!! [Figure: Q versus t oscillating around Qeq, with +/- regions and the intervals labeled "Stop Generation of BCN Messages"]
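The signaling rule on this slide can be sketched in a few lines of Python. The sketch assumes the up/down counter is reset at each sample, so its sign stands in for the sign of dQ/dt over the last sampling interval; the slide does not say when the counter is read or reset. The class name `CongestionPointSketch` and its methods are hypothetical.

```python
class CongestionPointSketch:
    """Hypothetical congestion point that decides when to generate BCN messages."""

    def __init__(self, q_eq: int) -> None:
        self.q_eq = q_eq   # equilibrium queue length Qeq, in frames
        self.q = 0         # current queue length Q, in frames
        self.delta = 0     # up/down counter: enqueues minus dequeues since last sample

    def enqueue(self) -> None:
        self.q += 1
        self.delta += 1    # increment at every enqueue

    def dequeue(self) -> None:
        if self.q > 0:
            self.q -= 1
            self.delta -= 1  # decrement at every dequeue

    def sample(self) -> bool:
        """Return True if a BCN message should be generated for this sample."""
        above_and_growing = self.q > self.q_eq and self.delta > 0
        below_and_shrinking = self.q < self.q_eq and self.delta < 0
        self.delta = 0     # assumption: the counter restarts at each sample
        return above_and_growing or below_and_shrinking
```

When the queue is already heading back toward Qeq, the rule stays silent, which is where the reduction in signaling comes from.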

  38. Issue #2: Specific Detection Mechanism
