
SCinet Caltech-SLAC experiments

SC2002 Baltimore, Nov 2002. SCinet Caltech-SLAC experiments. Acknowledgments. netlab.caltech.edu/FAST. Prototype C. Jin, D. Wei Theory D. Choe (Postech/Caltech), J. Doyle, S. Low, F. Paganini (UCLA), J. Wang, Z. Wang (UCLA) Experiment/facilities



Presentation Transcript


  1. SCinet Caltech-SLAC experiments
     SC2002 Baltimore, Nov 2002
     netlab.caltech.edu/FAST
     Acknowledgments
     • Prototype: C. Jin, D. Wei
     • Theory: D. Choe (Postech/Caltech), J. Doyle, S. Low, F. Paganini (UCLA), J. Wang, Z. Wang (UCLA)
     • Experiment/facilities
       • Caltech: J. Bunn, C. Chapman, C. Hu (Williams/Caltech), H. Newman, J. Pool, S. Ravot (Caltech/CERN), S. Singh
       • CERN: O. Martin, P. Moroni
       • Cisco: B. Aiken, V. Doraiswami, R. Sepulveda, M. Turzanski, D. Walsten, S. Yip
       • DataTAG: E. Martelli, J. P. Martin-Flatin
       • Internet2: G. Almes, S. Corbato
       • Level(3): P. Fernes, R. Struble
       • SCinet: G. Goddard, J. Patton
       • SLAC: G. Buhrmaster, R. Les Cottrell, C. Logg, I. Mei, W. Matthews, R. Mount, J. Navratil, J. Williams
       • StarLight: T. deFanti, L. Winkler
     • Major sponsors: ARO, CACR, Cisco, DataTAG, DoE, Lee Center, NSF

  2. FAST Protocols for Ultrascale Networks
     netlab.caltech.edu/FAST
     • Theory
       • Internet: distributed feedback control system
       • TCP: adapts sending rate to congestion
       • AQM: feeds back congestion information
       • [Diagram: feedback loop with blocks Rf(s), TCP, AQM, Rb'(s) and signals x, y, q, p]
     • Implementation
       • [Diagram: equilibrium, slow start, FAST retransmit, time out, FAST recovery]
     • Experiment
       • WAN in Lab
       • Caltech research & production networks
       • Multi-Gbps, 50-200ms delay
       • [Map: Chicago, CERN, StarLight, Calren2/Abilene, Geneva, SURFNet, Amsterdam]
     • [Other figure labels: 155Mb/s, 10Gb/s]
     • People
       • Faculty: Doyle (CDS, EE, BE), Low (CS, EE), Newman (Physics), Paganini (UCLA)
       • Staff/Postdoc: Bunn (CACR), Jin (CS), Ravot (Physics), Singh (CACR)
       • Students: Choe (Postech/CIT), Hu (Williams), J. Wang (CDS), Z. Wang (UCLA), Wei (CS)
       • Industry: Doraiswami (Cisco), Yip (Cisco)
     • Partners: CERN, Internet2, CENIC, StarLight/UI, SLAC, AMPATH, Cisco

  3. Outline
     • Motivation
     • Theory
     • TCP/AQM
     • TCP/IP
     • Experimental results

  4. HEP high speed network … that must change

  5. HEP Network (DataTAG)
     • 2.5 Gbps Wavelength Triangle 2002
     • 10 Gbps Triangle in 2003
     [Map: New York, Abilene, ESnet, CalREN, STAR-TAP, StarLight, Geneva, GEANT, Wave Triangle; UK (SuperJANET4), NL (SURFnet), It (GARR-B), Fr (Renater)]
     Newman (Caltech)

  6. Network upgrade 2001-06
     [Chart labels: ’01: 155; ’02: 622; ’03: 2.5; ’04: 5; ’05: 10]

  7. Projected performance
     [Chart labels: ’01: 155; ’02: 622; ’03: 2.5; ’04: 5; ’05: 10]
     Ns-2: capacity = 155 Mbps, 622 Mbps, 2.5 Gbps, 5 Gbps, 10 Gbps
     100 sources, 100 ms round-trip propagation delay
     J. Wang (Caltech)

  8. Projected performance
     [Plots: TCP/RED and FAST]
     Ns-2: capacity = 10 Gbps
     100 sources, 100 ms round-trip propagation delay
     J. Wang (Caltech)

  9. Outline
     • Motivation
     • Theory
     • TCP/AQM
     • TCP/IP
     • Experimental results

  10. Congestion control
      [Diagram: source rates x_i(t) and link congestion measures p_l(t)]
      Example congestion measure p_l(t)
      • Loss (Reno)
      • Queueing delay (Vegas)

  11. TCP/AQM
      [Diagram: TCP sources with rates x_i(t), AQM links with congestion measures p_l(t)]
      • TCP: Reno, Vegas
      • AQM: DropTail, RED, REM/PI, AVQ
      • Congestion control is a distributed asynchronous algorithm to share bandwidth
      • It has two components
        • TCP: adapts sending rate (window) to congestion
        • AQM: adjusts & feeds back congestion information
      • They form a distributed feedback control system
        • Equilibrium & stability depend on both TCP and AQM
        • And on delay, capacity, routing, #connections

  12. Network model
      [Diagram: Rf(s), Rb'(s), TCP blocks F1 … FN, AQM blocks G1 … GL, signals x, y, q, p]

  13. Vegas model
      [Diagram: Rf(s), Rb'(s), TCP blocks F1 … FN, AQM blocks G1 … GL, signals x, y, q, p]

  14. Methodology
      Protocol (Reno, Vegas, RED, REM/PI…)
      • Equilibrium
        • Performance
          • Throughput, loss, delay
        • Fairness
        • Utility
      • Dynamics
        • Local stability
        • Cost of stabilization

  15. Summary: duality model
      • Flow control problem
      • Primal-dual algorithm
        • TCP: Reno, Vegas
        • AQM: DropTail, RED, REM
      • Theorem (Low 00): (x*, p*) primal-dual optimal iff [condition not captured in the transcript]
      • TCP/AQM
        • Maximize utility with different utility functions
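
The duality view on this slide can be illustrated with a small numerical sketch. It assumes the standard utility-maximization form (maximize the sum of U_i(x_i) subject to link capacities), log utilities, and a gradient-projection price update. The topology, capacities, step size `gamma`, and the function name are hypothetical choices for illustration, not the FAST prototype or any specific protocol from the talk.

```python
# Sketch of a dual (price-based) algorithm for utility maximization.
# Assumptions (not from the slides): log utilities U_i(x) = w_i * log(x),
# a 2-link / 3-source line network, and step size gamma.

def dual_algorithm(R, c, w, gamma=0.5, steps=2000):
    """R[l][i] = 1 if source i uses link l; c[l] = capacity; w[i] = weight."""
    n_links, n_src = len(R), len(w)
    p = [1.0] * n_links                      # link prices (congestion measures)
    x = [0.0] * n_src                        # source rates
    for _ in range(steps):
        # "TCP" step: each source sees the sum of the prices along its path
        q = [sum(R[l][i] * p[l] for l in range(n_links)) for i in range(n_src)]
        # and picks the rate at which marginal utility w_i / x_i equals q_i
        x = [w[i] / max(q[i], 1e-9) for i in range(n_src)]
        # "AQM" step: each link raises its price when demand exceeds capacity
        y = [sum(R[l][i] * x[i] for i in range(n_src)) for l in range(n_links)]
        p = [max(0.0, p[l] + gamma * (y[l] - c[l])) for l in range(n_links)]
    return x, p

R = [[1, 1, 0],   # link 0 carries sources 0 and 1
     [1, 0, 1]]   # link 1 carries sources 0 and 2
rates, prices = dual_algorithm(R, c=[1.0, 1.0], w=[1.0, 1.0, 1.0])
print(rates, prices)
```

In this toy network the flow crossing both links settles near one third of a link and each single-link flow near two thirds, which is the proportionally fair allocation for equal weights.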

  16. Equilibrium of Vegas
      • Network
        • Link queueing delays: p_l
        • Queue length: c_l p_l
      • Sources
        • Throughput: x_i
        • E2E queueing delay: q_i
        • Packets buffered: [expression not captured in the transcript]
        • Utility function: U_i(x) = α_i d_i log x
        • Proportional fairness
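
A small worked example may help connect these quantities. It assumes a single bottleneck link and the usual equilibrium condition that marginal utility equals end-to-end queueing delay, so that U_i'(x_i) = α_i d_i / x_i = q_i. The function name and the numerical values are hypothetical.

```python
def vegas_equilibrium(capacity, alpha_d):
    """Single-link equilibrium for utilities U_i(x) = alpha_i * d_i * log(x).

    capacity: link rate c in pkts/ms (illustrative value)
    alpha_d:  list of alpha_i * d_i, one entry per source, in packets
    """
    # With one link every source sees the same queueing delay q = p,
    # and x_i * q = alpha_i * d_i. Summing over sources: c * p = sum(alpha_d).
    backlog = sum(alpha_d)                    # queue length c * p, in packets
    delay = backlog / capacity                # queueing delay p, in ms
    rates = [ad / delay for ad in alpha_d]    # x_i = alpha_i * d_i / q_i
    return rates, delay, backlog

rates, delay, backlog = vegas_equilibrium(capacity=10.0, alpha_d=[10.0, 30.0])
print(rates, delay, backlog)
```

Under these assumptions the rates fill the link and split in proportion to α_i d_i, which is the weighted proportional fairness the slide refers to.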

  17. Persistent congestion
      • Vegas exploits buffer process to compute prices (queueing delays)
      • Persistent congestion due to
        • Coupling of buffer & price
        • Error in propagation delay estimation
      • Consequences
        • Excessive backlog
        • Unfairness to older sources
      Theorem (Low, Peterson, Wang ’02): A relative error of e_i in propagation delay estimation distorts the utility function to [expression not captured in the transcript]

  18. Validation (L. Wang, Princeton)

      Source rates (pkts/ms)
      #   src1          src2          src3          src4          src5
      1   5.98 (6)
      2   2.05 (2)      3.92 (4)
      3   0.96 (0.94)   1.46 (1.49)   3.54 (3.57)
      4   0.51 (0.50)   0.72 (0.73)   1.34 (1.35)   3.38 (3.39)
      5   0.29 (0.29)   0.40 (0.40)   0.68 (0.67)   1.30 (1.30)   3.28 (3.34)

      #   queue (pkts)   baseRTT (ms)
      1   19.8 (20)      10.18 (10.18)
      2   59.0 (60)      13.36 (13.51)
      3   127.3 (127)    20.17 (20.28)
      4   237.5 (238)    31.50 (31.50)
      5   416.3 (416)    49.86 (49.80)

  19. Methodology
      Protocol (Reno, Vegas, RED, REM/PI…)
      • Equilibrium
        • Performance
          • Throughput, loss, delay
        • Fairness
        • Utility
      • Dynamics
        • Local stability
        • Cost of stabilization

  20. TCP/RED stability
      • Small effect on queue
        • AIMD
        • Mice traffic
        • Heterogeneity
      • Big effect on queue
        • Stability!

  21. Stable: 20ms delay
      [Plot: window]
      Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

  22. Stable: 20ms delay
      [Plots: window, queue]
      Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

  23. Unstable: 200ms delay
      [Plot: window]
      Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

  24. Unstable: 200ms delay
      [Plots: window, queue]
      Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

  25. Other effects on queue
      [Plots: panels labeled 20ms and 200ms; annotations "30% noise, avg delay 16ms" and "30% noise, avg delay 208ms"]

  26. Stability: Reno/RED
      [Diagram: Rf(s), Rb'(s), TCP blocks F1 … FN, AQM blocks G1 … GL, signals x, y, q, p]
      Theorem (Low et al, Infocom’02): Reno/RED is stable if [condition not captured in the transcript]
      • TCP: small t, small c, large N
      • RED: small r, large delay

  27. Stability: scalable control
      [Diagram: Rf(s), Rb'(s), TCP blocks F1 … FN, AQM blocks G1 … GL, signals x, y, q, p]
      Theorem (Paganini, Doyle, Low, CDC’01): Provided R is full rank, feedback loop is locally stable for arbitrary delay, capacity, load and topology

  28. Stability: Vegas
      [Diagram: Rf(s), Rb'(s), TCP blocks F1 … FN, AQM blocks G1 … GL, signals x, y, q, p]
      Theorem (Choe & Low, Infocom’03): Provided R is full rank, feedback loop is locally stable if [condition not captured in the transcript]

  29. Stability: Stabilized Vegas
      [Diagram: Rf(s), Rb'(s), TCP blocks F1 … FN, AQM blocks G1 … GL, signals x, y, q, p]
      Theorem (Choe & Low, Infocom’03): Provided R is full rank, feedback loop is locally stable if [condition not captured in the transcript]

  30. Stability: Stabilized Vegas
      [Diagram: Rf(s), Rb'(s), TCP blocks F1 … FN, AQM blocks G1 … GL, signals x, y, q, p]
      Application
      • Stabilized TCP with current routers
      • Queueing delay as congestion measure has right scaling
      • Incremental deployment with ECN

  31. Fast AQM Scalable TCP
      • Equilibrium properties
        • Uses end-to-end delay and loss
        • Achieves any desired fairness, expressed by utility function
        • Very high utilization (99% in theory)
      • Stability properties
        • Stability for arbitrary delay, capacity, routing & load
        • Robust to heterogeneity, evolution, …
      • Good performance
        • Negligible queueing delay & loss (with ECN)
        • Fast response

  32. Implementation
      • Sender-side kernel modification
      • Build on
        • Reno, NewReno, SACK, Vegas
        • New insights
      • Difficulties due to
        • Effects ignored in theory
        • Large window size
      First demonstration in SuperComputing Conf, Nov 2002
      • Developers: Cheng Jin & David Wei
      • FAST Team & Partners

  33. Outline
      • Motivation
      • Theory
      • TCP/AQM
      • TCP/IP
      • Experimental results
      • WAN in Lab

  34. Network (Sylvain Ravot, Caltech/CERN)

  35. FAST BMPS
      [Bar chart: FAST with 1, 2, 7, 9, and 10 flows on the Geneva-Sunnyvale and Baltimore-Sunnyvale paths, compared with the Internet2 Land Speed Record]
      FAST
      • Standard MTU
      • Throughput averaged over > 1hr

  36. FAST BMPS
      Mbps = 10^6 b/s; GB = 2^30 bytes

  37. Aggregate throughput
      [Bar chart: average utilization; x-axis: 1 flow, 2 flows, 7 flows, 9 flows, 10 flows; utilization labels as extracted: 88%, 90%, 90%, 92%, 95%; duration labels as extracted: 1.1hr, 6hr, 6hr, 1hr, 1hr]
      FAST
      • Standard MTU
      • Utilization averaged over > 1hr

  38. SC2002 Baltimore, Nov 2002: SCinet Caltech-SLAC experiments
      [Diagram: Internet as a distributed feedback system, with blocks Rf(s), TCP, AQM, Rb'(s) and signals x, p; Theory and Experiment]
      [Map: Geneva, Chicago, Sunnyvale, Baltimore; distance labels 7000km, 3000km, 1000km]
      [Bar chart: FAST with 1, 2, 7, 9, and 10 flows on the Geneva-Sunnyvale and Baltimore-Sunnyvale paths, compared with I2 LSR]
      Highlights
      • FAST TCP
        • Standard MTU
        • Peak window = 14,255 pkts
        • Throughput averaged over > 1hr
        • 925 Mbps single flow/GE card
          • 9.28 petabit-meter/sec
          • 1.89 times LSR
        • 8.6 Gbps with 10 flows
          • 34.0 petabit-meter/sec
          • 6.32 times LSR
        • 21TB in 6 hours with 10 flows
      • Implementation
        • Sender-side modification
        • Delay based
      netlab.caltech.edu/FAST
      C. Jin, D. Wei, S. Low; FAST Team and Partners
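
The bit-meter-per-second and data-volume figures on this slide can be roughly cross-checked with a few lines of arithmetic. The sketch below makes two assumptions that the transcript does not state: the rounded map labels are read as Geneva-Chicago 7000 km, Chicago-Sunnyvale 3000 km, and Chicago-Baltimore 1000 km, and the single-flow result is taken to be on the Geneva-Sunnyvale path while the 10-flow result is taken to be on the Baltimore-Sunnyvale path. The function names are hypothetical.

```python
PETA = 1e15

def petabit_meters_per_sec(throughput_bps, distance_km):
    """Throughput-distance product in petabit-meters per second."""
    return throughput_bps * distance_km * 1e3 / PETA

def terabytes_moved(throughput_bps, hours):
    """Data volume in TB, taking 1 TB = 2**40 bytes (by analogy with GB = 2**30)."""
    return throughput_bps * hours * 3600 / 8 / 2**40

# Assumed path lengths, built from the rounded distance labels on the map.
geneva_sunnyvale_km = 7000 + 3000
baltimore_sunnyvale_km = 1000 + 3000

single_flow = petabit_meters_per_sec(925e6, geneva_sunnyvale_km)
ten_flows = petabit_meters_per_sec(8.6e9, baltimore_sunnyvale_km)
volume = terabytes_moved(8.6e9, 6)
print(single_flow, ten_flows, volume)
```

With these rounded distances the products come out close to, but not exactly at, the slide's 9.28 and 34.0 petabit-meter/sec, and the 6-hour volume comes out close to the quoted 21TB; the small differences are expected because the map distances are rounded.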

  39. FAST vs Linux TCP
      Mbps = 10^6 b/s; GB = 2^30 bytes; Delay = propagation delay
      Linux TCP expts: Jan 28-29, 2003

  40. Aggregate throughput
      [Bar chart: average utilization at 1G and 2G for Linux TCP (txq=100), Linux TCP (txq=10000), and FAST; percentage labels as extracted: 92%, 48%, 95%, 27%, 16%, 19%]
      FAST
      • Standard MTU
      • Utilization averaged over 1hr

  41. Effect of MTU
      Linux TCP (Sylvain Ravot, Caltech/CERN)

  42. Acknowledgments (same content as slide 1)

  43. FAST URLs
      • FAST website: http://netlab.caltech.edu/FAST/
      • Cottrell’s SLAC website: http://www-iepm.slac.stanford.edu/monitoring/bulk/fast

  44. Outline
      • Motivation
      • Theory
      • TCP/AQM
      • TCP/IP
      • Non-adaptive sources
      • Content distribution
      • Implementation
      • WAN in Lab

  45. [Diagram: optical testbed with wavelengths λ1 … λ20, fiber spool (500 km), EDFAs, OPM, electronic crossconnect (Cisco 15454), servers (S), and routers (R)]
      Max path length = 10,000 km
      Max one-way delay = 50ms
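
The two maxima on this slide are consistent with simple propagation arithmetic. The sketch below assumes a signal speed of about 2×10^8 m/s in fiber (roughly the vacuum speed of light divided by a refractive index near 1.5) and reads the diagram as 20 wavelengths each traversing a 500 km spool; both are assumptions, and the names are hypothetical.

```python
SPEED_IN_FIBER_M_PER_S = 2.0e8   # assumption: about c / 1.5 for silica fiber

def one_way_delay_ms(path_km):
    """Propagation delay in milliseconds over path_km of fiber."""
    return path_km * 1e3 / SPEED_IN_FIBER_M_PER_S * 1e3

# Assumed reading of the diagram: 20 wavelengths, 500 km of fiber per pass.
max_path_km = 20 * 500
max_delay_ms = one_way_delay_ms(max_path_km)
print(max_path_km, max_delay_ms)
```

Under these assumptions the longest path is 10,000 km and the one-way delay is about 50 ms, or about 100 ms round trip.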

  46. Unique capabilities
      [Diagram (a) Physical network: routers R1, R2, … R10 connected by wavelengths λ1 … λ20]
      • WAN in Lab
        • Capacity: 2.5 – 10 Gbps
        • Delay: 0 – 100 ms round trip
      • Configurable & evolvable
        • Topology, rate, delays, routing
        • Always at cutting edge
      • Risky research
        • MPLS, AQM, routing, …
      • Integral part of R&A networks
        • Transition from theory, implementation, demonstration, deployment
        • Transition from lab to marketplace
      • Global resource

  47. Unique capabilities
      [Diagram (b) Logical network: routers R1, R2, R3, … R10 connected by wavelengths λ1 … λ20]
      • WAN in Lab
        • Capacity: 2.5 – 10 Gbps
        • Delay: 0 – 100 ms round trip
      • Configurable & evolvable
        • Topology, rate, delays, routing
        • Always at cutting edge
      • Risky research
        • MPLS, AQM, routing, …
      • Integral part of R&A networks
        • Transition from theory, implementation, demonstration, deployment
        • Transition from lab to marketplace
      • Global resource

  48. Unique capabilities
      [Map: WAN in Lab and Caltech research & production networks; Chicago, CERN, StarLight, Calren2/Abilene, Geneva, SURFNet, Amsterdam; Multi-Gbps, 50-200ms delay; Experiment]
      • WAN in Lab
        • Capacity: 2.5 – 10 Gbps
        • Delay: 0 – 100 ms round trip
      • Configurable & evolvable
        • Topology, rate, delays, routing
        • Always at cutting edge
      • Risky research
        • MPLS, AQM, routing, …
      • Integral part of R&A networks
        • Transition from theory, implementation, demonstration, deployment
        • Transition from lab to marketplace
      • Global resource

  49. Coming together …
      • Clear & present Need
      • Resources

  50. Coming together …
      • Clear & present Need
      • Resources
