
QBSS Applications

QBSS Applications. Les Cottrell – SLAC. Presented at the Internet2 Working Group on QBone Scavenger Service (QBSS), October 2001. www.slac.stanford.edu/grp/scs/talk/qbss-i2-oct01.ppt


Presentation Transcript


  1. QBSS Applications
Les Cottrell – SLAC
Presented at the Internet2 Working Group on QBone Scavenger Service (QBSS), October 2001
www.slac.stanford.edu/grp/scs/talk/qbss-i2-oct01.ppt
Partially funded by the DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM); also supported by IUPAP

  2. High Speed Bulk Throughput
  • Driven by data-intensive science, e.g. data grids
  • HENP data rates, e.g.:
    • BaBar today has 500 TBytes of data, accumulating TBytes/day; by end of run in summer 2002: 3 TB/day, ~1 PB/yr, 40 MB/s
    • JLab similar; FNAL has 2 similar experiments turning on
    • CERN/LHC: 1000 PBytes
  • A Boeing 747 gives high throughput, BUT poor latency (~2 weeks) & is very people-intensive
[Chart: data volume growth vs. Moore's law]
  • So need high-speed networks and the ability to utilize them
    • High speed today = several hundred GBytes/day to TB/day (100 GB/d ~ 10 Mb/s)
  • Today's networks have crossed the threshold where it is now possible to share data effectively via the network
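The rule of thumb in the last bullets (100 GB/d ~ 10 Mb/s) is easy to check. A minimal sketch of the arithmetic (mine, not from the slides), with the 3 TB/day BaBar figure included for comparison:

```python
def gbytes_per_day_to_mbps(gbytes_per_day):
    """Average rate in Mbit/s needed to move a daily volume given in GBytes
    (decimal units: 1 GByte = 1e9 bytes)."""
    bits_per_day = gbytes_per_day * 1e9 * 8
    return bits_per_day / 86400 / 1e6  # 86400 seconds per day

print(round(gbytes_per_day_to_mbps(100), 1))   # 9.3  -> "100 GB/d ~ 10 Mb/s"
print(round(gbytes_per_day_to_mbps(3000), 0))  # 278.0 -> 3 TB/day needs ~280 Mb/s sustained
```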

  3. Throughput quality improvements
  • TCP BW < MSS / (RTT * sqrt(loss))
[Chart: throughput improving ~80% annually, i.e. ~a factor of 10 every 3 years; annotations: China; note E. Europe keeping up]
"The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm", Mathis, Semke, Mahdavi, Ott, Computer Communication Review 27(3), July 1997
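The Mathis bound above can be evaluated directly. The MSS, RTT and loss figures below are illustrative assumptions (typical of a trans-Atlantic path of that era), not measurements from the talk:

```python
import math

def mathis_bw_mbps(mss_bytes, rtt_s, loss):
    """Upper bound on steady-state TCP throughput (Mathis et al. 1997):
    BW < MSS / (RTT * sqrt(loss)), returned in Mbit/s."""
    return mss_bytes * 8 / (rtt_s * math.sqrt(loss)) / 1e6

# Assumed: 1460-byte MSS, 150 ms RTT, 0.1% packet loss
print(round(mathis_bw_mbps(1460, 0.150, 0.001), 1))  # ~2.5 Mbit/s per stream
```

The strong 1/sqrt(loss) dependence is why the annual loss improvements shown in the chart translate so directly into throughput gains.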

  4. Bandwidth changes with time
  • Short term: competing cross-traffic, other users – factors of 3-5 observed within 1 minute
  • Long term: link & route upgrades – factors of 3-16 in 12 months
  • All hosts had 100 Mbps NICs. Recently have measured 105 Mbps SLAC > IN2P3 and 340 Mbps Caltech > SLAC with GE

  5. Typical results today
  • High throughput usually = big windows & multiple streams
  • Improves ~linearly with the number of streams for small windows
  • Broke the 100 Mbps trans-Atlantic barrier
[Chart: throughput vs. streams for window sizes 8 kB, 16 kB, 32 kB, 64 kB (Solaris default), 100 kB]
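The "big windows & multiple streams" effect follows from each TCP stream having at most one window in flight per RTT. A naive estimate (my sketch; the 150 ms RTT is an assumed trans-Atlantic value, not from the slides):

```python
def window_limited_mbps(window_bytes, streams, rtt_s):
    """Naive window-limited throughput: each stream can keep at most one
    window in flight per round trip, so rate ~ streams * window / RTT."""
    return streams * window_bytes * 8 / rtt_s / 1e6

# 64 kB Solaris default window over an assumed 150 ms RTT
print(round(window_limited_mbps(64 * 1024, 1, 0.150), 1))  # 3.5  Mbit/s, one stream
print(round(window_limited_mbps(64 * 1024, 8, 0.150), 1))  # 28.0 Mbit/s, eight streams
```

This is why throughput improves roughly linearly with streams while the window is the binding limit.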

  6. Impact on others
  • Make ping measurements with & without iperf loading
  • Compare loss and RTT, loaded vs. unloaded
  • Looking at how to avoid the impact, e.g.: QBSS/LBE, application pacing, a control loop on stdev(RTT) that reduces streams; want to avoid scheduling
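The loaded-vs-unloaded comparison reduces to summarizing ping samples. A sketch of the bookkeeping (the sample values are invented for illustration, not the talk's data):

```python
import statistics

def ping_summary(rtts_ms, sent):
    """Summarize one ping run: loss fraction, mean RTT, and stdev of RTT
    (the jitter signal a pacing control loop could act on)."""
    loss = 1 - len(rtts_ms) / sent
    return loss, statistics.mean(rtts_ms), statistics.pstdev(rtts_ms)

unloaded = [0.30, 0.31, 0.29, 0.30]  # ms, hypothetical quiet path
loaded = [0.55, 4.8, 0.60, 5.1]      # ms, hypothetical path under iperf load
print(ping_summary(unloaded, 4))     # zero loss, low jitter
print(ping_summary(loaded, 5))       # one of five pings lost, high jitter
```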

  7. HENP Experiment Model
  • World-wide collaborations necessary for large undertakings
  • Regional computer centers in France, Italy, UK & US
    • Spending Euros on a data center at SLAC is not attractive
    • Leverage local equipment & expertise
    • Resources available to all collaborators
  • Requirements – bulk:
    • Bulk data replication (current goal > 100 MBytes/s)
    • Optimized cached read access to 10-100 GB from a 1 PB data set
  • Requirements – interactive:
    • Remote login, video conferencing, document sharing, joint code development, co-laboratory (remote operations, reduced travel, more humane shifts)
    • Modest bandwidth – often < 1 Mbps
    • Emphasis on quality of service & sub-second responses

  8. Applications
  • Main network application focus today is on replication at multiple sites worldwide (mainly N. America, Europe and Japan)
  • Need a fast, secure, easy to use, extendable way to copy data between sites
  • Also need interactive and real-time use at the same time, e.g. experiment control, video & voice conferencing
  • The HEP community has developed 2 major (freely available) applications to meet the replication need: bbftp and bbcp

  9. bbcp
[Diagram: data flowing between bbcp source and sink nodes, with a bbcp agent]
  • Peer-to-peer copy program with multiple (<=64) streams, large window support, secure password exchange (ssh control path, single-use passwords on the data path), similar syntax to scp
  • C++ component design allows testing new algorithms (relatively easy to extend)
  • Peer-to-peer:
    • No server – if you have the program, you have the service (usually no need for admins); any node can act as source or sink; 3rd-party copies
  • Provides sequential I/O (e.g. from /dev/zero, to pipe or tape or /dev/null) and progress reporting

  10. Application rate-limiting
  • bbcp has transfer rate limiting
  • Could use network information (e.g. from Web100 or independent pinging) to tell bbcp to reduce/increase its transfer rate, or change its number of parallel streams
[Charts: no rate limiting vs. 15 MB/s rate limiting, both with 64 KB window, 32 streams]
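Application-level rate limiting of this kind is commonly built on a token bucket. The sketch below illustrates the idea only; it is not bbcp's implementation, and the 15 MB/s figure simply mirrors the chart caption:

```python
import time

class TokenBucket:
    """Minimal token-bucket pacer: tokens (bytes) accrue at a fixed rate up
    to a burst capacity; a send must first drain that many tokens."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def wait_for(self, nbytes):
        """Block until nbytes may be sent; return the seconds slept."""
        slept = 0.0
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= nbytes:
                self.tokens -= nbytes
                return slept
            shortfall = (nbytes - self.tokens) / self.rate
            time.sleep(shortfall)
            slept += shortfall

# Pace 64 KB blocks at 15 MB/s, as in the rate-limited chart above
bucket = TokenBucket(15e6, 64 * 1024)
for _ in range(4):
    bucket.wait_for(64 * 1024)  # ~4.4 ms between blocks once the burst is spent
```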

  11. QBSS test bed with Cisco 7200s
  • Set up QBSS testbed
[Diagram: Cisco 7200s connected by 10 Mbps, 100 Mbps and 1 Gbps links]
  • Configure router interfaces
  • 3 traffic types: QBSS, BE, Priority
  • Define policy, e.g. QBSS > 1%, Priority < 30%
  • Apply policy to router interface queues

  12. Example of effects
[Chart]
  • Also tried: 1 stream for all, and Priority at 30%

  13. QBSS with Cisco 6500
  • 6500s + Policy Feature Card (PFC): routing by PFC2, policing on switch interfaces
  • 2 queues, 2 thresholds each
  • QBSS assigned to its own queue with 5% of bandwidth – guarantees QBSS gets something
  • BE & Priority traffic in the 2nd queue with 95% of bandwidth
  • Apply an ACL to the switch port to police Priority traffic to < 30%
[Diagram: Cisco 6500s + MSFC/Sup2 testbed with 100 Mbps and 1 Gbps links; chart of BE (100%), QBSS (~5%) and Priority (30%) shares vs. time]

  14. Impact on response time (RTT)
  • Run ping with iperf loading under various QoS settings; iperf ~ 93 Mbps
  • No iperf: ping avg RTT ~ 300 usec (regardless of QoS)
  • iperf = QBSS, ping = BE or Priority: RTT ~ 550 usec
    • 70% greater than unloaded
  • If iperf and ping have the same QoS (excluding Priority), then RTT ~ 5 msec
    • > factor of 10 larger RTT than unloaded
  • If both ping & iperf have QoS = Priority, then ping RTT is very variable, since iperf is limited to 30%
    • RTT quick while iperf is being limited, long while iperf transmits

  15. Possible usage
  • Apply Priority to lower-volume interactive voice/video-conferencing and real-time control
  • Apply QBSS to high-volume data replication
  • Leave the rest as Best Effort
  • Since 40-65% of bytes to/from SLAC come from a single application, we have modified it to enable setting of TOS bits
  • Need to identify bottlenecks and implement QBSS there
    • Bottlenecks tend to be at the edges, so hope to try with a few HEP sites
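Marking an application's traffic as described comes down to setting the IP TOS byte on its sockets. A hedged sketch (standard sockets API, not the modified SLAC application; TOS 32 is the QBSS value used in the measurements later in this talk, TOS 40 the Priority value):

```python
import socket

QBSS_TOS = 32      # TOS byte 001000 00 -> QBSS scavenger marking
PRIORITY_TOS = 40  # Priority marking used in the QBSS measurement runs

# Mark a socket's outgoing packets as scavenger traffic
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, QBSS_TOS)
print(s.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))  # 32 on Linux
s.close()
```

Routers along the path that implement QBSS then queue these packets behind Best Effort traffic.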

  16. SC2001 demo
  • Send data from SLAC/FNAL booth computers (emulating a tier 0 or 1 HENP site) to over 20 other sites with good connections in about 6 countries
  • Part of a bandwidth challenge proposal
  • Saturate the 2 Gbps connection to the floor network
  • Apply QBSS to some sites, Priority to a few, and the rest Best Effort
  • See how QBSS works at high speeds

  17. More Information
  • IEPM/PingER home site: www-iepm.slac.stanford.edu/
  • Bulk throughput site: www-iepm.slac.stanford.edu/monitoring/bulk/
  • Transfer tools:
    • http://dast.nlanr.net/Projects/Iperf/release.html
    • http://doc.in2p3.fr/bbftp/
    • www.slac.stanford.edu/~abh/bbcp/
    • http://hepwww.rl.ac.uk/Adye/talks/010402-ftp/html/sld015.htm
  • TCP tuning:
    • www.ncne.nlanr.net/training/presentations/tcp-tutorial.ppt
    • www-didc.lbl.gov/tcp-wan.html
  • QBSS measurements: www-iepm.slac.stanford.edu/monitoring/qbss/measure.html

  18. Extra slides with more detail

  19. Requirements
  • HENP formed a Trans-Atlantic Network committee charged with projecting requirements
  • The projections do not include university, trans-Pacific, or research needs

  20. bbftp
  • Implements an ftp-like user interface, with additions to allow large windows, multiple streams, and secure password exchange
  • Has been in production use for more than a year
  • Is supported and being extended
  • http://doc.in2p3.fr/bbftp/

  21. bbcp: algorithms
  • Data pipelining
    • Multiple streams pushed “simultaneously”
    • Automatically adapts to router traffic shaping
    • Can control the maximum rate
  • Can write to tape, read from /dev/zero, write to /dev/null, pipe
  • Check-pointing (resume a failed transmission)
  • Coordinated buffers
    • All buffers same-sized end-to-end
  • Page-aligned buffers
    • Allows direct I/O on many file systems (e.g. Veritas)

  22. bbcp: security
  • Low-cost, simple and effective security
    • Leverages widely deployed infrastructure: if you can ssh there, you can copy data
  • Sensitive data is encrypted
    • One-time passwords and control information
  • Bulk data is not encrypted
    • Privacy sacrificed for speed
  • Minimal sharing of information
    • Source and sink do not reveal their environment

  23. bbcp: user interface & features
  • Familiar syntax: bbcp [ options ] source [ source [ … ] ] target
  • Sources and target can be anything
    • [[username@]hostname:]path
    • /dev/zero or /dev/null
  • Easy but powerful
    • Can gather data from multiple hosts
    • Many usability and performance options
  • Features: read from /dev/zero; write to tape, /dev/null, or a pipe; check-pointing; MD5 checksums; compression; transfer rate limiting; progress reporting; marking QoS/TOS bits
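The `[[username@]hostname:]path` grammar above can be captured with a short parser. This is my illustrative sketch, not bbcp's C++ code, and the host names in the examples are hypothetical:

```python
import re

# One regex for "[[username@]hostname:]path": user and host are optional,
# and a leading '/' (as in /dev/zero) can never be mistaken for a host.
SPEC = re.compile(r'^(?:(?:(?P<user>[^@:]+)@)?(?P<host>[^:/]+):)?(?P<path>.+)$')

def parse_spec(spec):
    """Split a bbcp-style file spec into (user, host, path); user and host
    are None for a plain local path."""
    m = SPEC.match(spec)
    return m.group('user'), m.group('host'), m.group('path')

print(parse_spec('user@host.example.org:/data/file'))  # remote with user
print(parse_spec('host.example.org:/data/file'))       # remote, default user
print(parse_spec('/dev/null'))                          # local device path
```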

  24. Impact of cross-traffic on iperf between SLAC & NASA-GSFC
  • Best throughput about 44 Mbps
  • Throughput varies by a factor of 5 or more between weekday daytime and night
  • Congested path

  25. Using bbcp to make QBSS measurements
  • Run bbcp with src=/dev/zero and dst=/dev/null, reporting throughput at 1-second intervals, with TOS=32 (QBSS)
  • After 20 s, run bbcp with no TOS bits specified (BE)
  • After 20 s, run bbcp with TOS=40 (Priority)
  • After 20 more secs, turn off Priority
  • After 20 more secs, turn off BE
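The TOS byte values in this procedure map onto DiffServ codepoints via DSCP = TOS >> 2, since the DSCP occupies the top six bits of the TOS byte. A one-line check of the mapping (my arithmetic, consistent with QBSS's defined codepoint):

```python
def tos_to_dscp(tos_byte):
    """DiffServ codepoint carried in a TOS byte: the DSCP is its top six bits."""
    return tos_byte >> 2

print(tos_to_dscp(32))  # 8  -> QBSS scavenger codepoint (binary 001000)
print(tos_to_dscp(40))  # 10 -> the Priority marking used in this test
```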

  26. Optimizing streams
  • Choose the number of streams to optimize throughput/impact
  • Measure RTT from Web100
  • The application controls the number of streams
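One plausible shape for such a control loop, combining this slide with the stdev(RTT) idea from slide 6, is sketched below. This is a hypothetical illustration (the 1 ms jitter threshold and halving policy are my assumptions, not the talk's algorithm):

```python
import statistics

def adjust_streams(streams, rtt_samples_ms, jitter_limit_ms=1.0,
                   min_streams=1, max_streams=64):
    """One step of a stream-count control loop: back off sharply when RTT
    jitter shows we are impacting others, probe upward when the path is quiet."""
    jitter = statistics.pstdev(rtt_samples_ms)
    if jitter > jitter_limit_ms:
        return max(min_streams, streams // 2)  # multiplicative decrease
    return min(max_streams, streams + 1)       # cautious additive increase

print(adjust_streams(32, [0.3, 5.1, 0.4, 4.8]))  # jittery path -> 16
print(adjust_streams(8, [0.30, 0.31, 0.29]))     # quiet path   -> 9
```

The AIMD-style asymmetry (add one, halve on trouble) mirrors TCP's own congestion response, which keeps the aggregate friendly to competing traffic.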
