
Networking Research Overview

This overview discusses the goals and thrusts of the SciDAC Networking Research Projects, which aim to develop data movement tools, advanced network tools, and cyber security tools to support real-time, data-intensive applications in the Scientific Discovery through Advanced Computing (SciDAC) program.


Presentation Transcript


  1. Networking Research Overview Micah Beck Assoc. Prof., Computer Science Director, LoCI Laboratory University of Tennessee SciDAC PI Mtg 24 March 2004

  2. SciDAC Networking Research Projects: Goals • Goal: Phase I • Develop data movement tools and infrastructures to support real-time, data-intensive SciDAC applications • Develop advanced network tools that enable SciDAC applications to efficiently measure, predict, and diagnose end-to-end performance (2 projects) • Develop and deploy cyber security tools to support group collaborations in grid infrastructures • Goal: Phase II • Deploy the advanced tools developed in Phase I in production infrastructures to support network-intensive SciDAC projects

  3. Logistical Networking: Tools, Applications & Architecture Micah Beck, Jack Dongarra, James S. Plank (University of Tennessee); Rich Wolski (University of California, Santa Barbara) http://loci.cs.utk.edu/scidac

  4. Project Thrusts • Dongarra: Application Development Tools/Environments • NetSolve/GridSolve • Wolski: Network Monitoring/Prediction • Network Weather Service • Beck & Plank: Logistical Networking Infrastructure, Middleware & Support • Internet Backplane Protocol • Logistical Runtime System

  5. Internet Backplane Protocol • Overlay intermediate node providing services based on enriched resources • Storage: file system, RAM, disk • Transfer: TCP (std, compressed), UDP (SABUL, mcast), SAN/WAN • Processing: primitive operations (alpha) • 100s of IBP depots deployed worldwide • 1.4 alpha release: persistent sockets; optional authentication, usage logging

  6. Logistical Networking Tools • Logistical Runtime System (LoRS) • E2E Services: Fault tolerance (Reed-Solomon), encryption (AES), compression, high perf. data movement strategies • Library, command line, GUI, Web tools • Ported to all compute platforms (Cray OS problems) • Logistical Backbone (L-Bone) • depot monitoring, resource discovery • Logistical Distribution Network (LoDN) • directory services, content distribution • Java Web Start delivery of tools

  7. SciDAC Application Impact • Terascale Supernova Initiative (A. Mezzacappa, ORNL; J. Blondin, NCSU; D. Swesty, SUNY Stony Brook) • Five 1.6TB depots deployed at TSI sites • Energy Fusion Research (S. Klasky, PPPL) • Depots deployed on PPPL cluster nodes • Dataset transfers: O(1TB) @ 1-400 Mb/s • Simulations at NERSC and ORNL • Control/viz at ORNL, NCSU, Stony Brook, PPPL • Transfers span ESnet, Abilene • CS/Physics collaboration, science getting done!

  8. TSI Site Deployment: ORNL, NCSU, SUNY Stony Brook, NERSC, UCSD

  9. SciDAC Technology Impact • Spanning heterogeneous networks • Ultrascale (10 Gbps) wide area transfers require specialized systems • Optically switched networks (e.g., DOE Science UltraNet) do not peer with IP • Serving scalable communities • Staging and caching at intermediate nodes • Processing data “in transit” • Common services on distributed data

  10. Transit Networking Architecture [Diagram labels, in order: Application; Transport; …; Network; IP; common interface; Transit; link; Local; Physical; transfer, storage, processing]

  11. INCITE: Edge-based Traffic Processing for High-Performance Networks R. Baraniuk, E. Knightly, R. Nowak, R. Riedi (Rice University); L. Cottrell, J. Navratil, W. Mathews (SLAC); W. Feng, M. Gardner (LANL) web site: incite.rice.edu

  12. INCITE Project • InterNet Control and Inference from The Edge: on-line tools to characterize and map host and network performance as a function of time, space, application, protocol, and service

  13. INCITE Thrusts and Tools Thrust 1: Multiscale traffic analysis and modeling techniques • wavelet, multifractal, connection-level models Thrust 2: Inference and control algorithms for network paths, links, and routers • end-to-end path probing and modeling • network tomography and topology discovery • advanced high-speed protocols Thrust 3: Data collection tools • active measurement infrastructure • passive application-layer measurement

  14. pathChirp • Goal • estimate instantaneous available bandwidth (ABW) on an end-to-end network path • Basic probing paradigm • stream packets at some rate • no queuing delay ⇒ rate < ABW • queuing delay builds up ⇒ rate > ABW • Until now: tradeoff • high accuracy has required high-volume probing (inefficient) • Unique to pathChirp • variable-rate probe packet train (exponentially spaced chirp) • 10x more efficient than competing techniques
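
The exponentially spaced chirp on slide 14 can be illustrated with a small sketch. This is not the pathChirp implementation: the function names, the parameters `low_rate_bps` and `spread_factor`, and the simple "sustained delay increase" rule are assumptions made for illustration, and the real tool analyzes queuing delays in more detail.

```python
def chirp_schedule(packet_bytes, low_rate_bps, spread_factor, n_packets):
    """Return (gaps_s, rates_bps) for one chirp of n_packets probe packets.

    Gap k separates packet k from packet k+1. Each gap is shorter than the
    previous one by spread_factor, so a single chirp sweeps a range of rates.
    """
    bits = packet_bytes * 8
    rates = [low_rate_bps * spread_factor ** k for k in range(n_packets - 1)]
    gaps = [bits / rate for rate in rates]
    return gaps, rates


def estimate_abw(rates_bps, queuing_delays_s):
    """Rough ABW estimate: the probing rate where delay starts a sustained rise.

    queuing_delays_s holds one value per packet. Returns None when delay never
    builds up, i.e. the ABW lies above the highest rate probed.
    """
    onset = len(queuing_delays_s) - 1
    # Walk back from the last packet while delay is still strictly increasing.
    while onset > 0 and queuing_delays_s[onset] > queuing_delays_s[onset - 1]:
        onset -= 1
    if onset >= len(rates_bps):
        return None
    return rates_bps[onset]


gaps, rates = chirp_schedule(1000, 1e6, 2.0, 5)
delays = [0.0, 0.0, 0.0, 0.001, 0.003]  # hypothetical measured queuing delays
print(estimate_abw(rates, delays))
```

Because one chirp covers many rates, a handful of packets probes a range that a constant-rate stream would need many separate trains to cover, which is the efficiency argument the slide makes.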

  15. Network Tomography From end-to-end measurements… … infer internal topology and delay/loss characteristics

  16. TCP - Low Priority (TCP-LP) • Goal • utilize excess bandwidth in a non-intrusive fashion • Methodology • sender-side modification of TCP: delay-based approach • Applications • bulk data transfers • available bandwidth monitoring • P2P file sharing • High-speed TCP-LP • TCP-LP + HSTCP • implementation • Linux-2.4.22-web100 • experiments • Stanford - Ann Arbor • Stanford - Gainesville • Measured throughput: TCP alone 745.5 Kb/s; TCP plus TCP-LP: TCP 739.5 Kb/s, TCP-LP 109.5 Kb/s • TCP-LP is invisible to TCP
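
The slide only says that TCP-LP takes a delay-based approach at the sender. One plausible form of such a detector is sketched below: it flags early congestion when a smoothed one-way delay rises above a fraction of the observed delay range. The class name, the `threshold_fraction` and `smoothing` values, and the exact rule are assumptions for illustration, not the TCP-LP kernel code.

```python
class DelayCongestionDetector:
    """Flags early congestion from one-way delay samples (illustrative only)."""

    def __init__(self, threshold_fraction=0.15, smoothing=0.125):
        self.threshold_fraction = threshold_fraction  # assumed tuning value
        self.smoothing = smoothing                    # assumed EWMA weight
        self.d_min = None
        self.d_max = None
        self.smoothed = None

    def update(self, one_way_delay_s):
        """Feed one delay sample; return True if congestion is inferred."""
        if self.smoothed is None:
            self.d_min = self.d_max = self.smoothed = one_way_delay_s
            return False
        self.d_min = min(self.d_min, one_way_delay_s)
        self.d_max = max(self.d_max, one_way_delay_s)
        # Exponentially weighted moving average of the delay samples.
        self.smoothed += self.smoothing * (one_way_delay_s - self.smoothed)
        threshold = self.d_min + (self.d_max - self.d_min) * self.threshold_fraction
        return self.smoothed > threshold


detector = DelayCongestionDetector()
for sample in (0.010, 0.010, 0.050, 0.050):  # hypothetical delays in seconds
    print(sample, detector.update(sample))
```

Reacting to rising delay rather than to loss lets a low-priority flow back off before it takes bandwidth from regular TCP, which is consistent with the slide's claim that TCP-LP is invisible to TCP.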

  17. Changes in network topology (BGP) can result in dramatic changes in performance [Figure: snapshot of the traceroute summary table (remote host by hour), with samples of traceroute trees generated from the table] Notes: 1. Caltech misrouted via Los-Nettos 100 Mbps commercial net, 14:00-17:00 2. ESnet/GEANT working on routes from 2:00 to 14:00 [Figure: ABwE time series in Mbits/s, one measurement per minute for 24 hours, Thursday 9 October 9:00am to Friday 10 October 9:01am, showing dynamic BW capacity (DBC), cross-traffic (XT), and available BW = (DBC - XT); changes detected by IEPM-Iperf and ABwE] Annotations: drop in performance when the path changes from the original SLAC-CENIC-Caltech to SLAC-ESnet-LosNettos (100 Mbps)-Caltech; ESnet-LosNettos segment in the path (100 Mbits/s); back to original path
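
The relation on slide 17, available BW = DBC - XT, is simple enough to state as code. The function name and the sample cross-traffic readings are hypothetical. Clamping at zero is an added assumption to guard against noisy measurements.

```python
def available_bandwidth(dbc_mbps, cross_traffic_mbps):
    """Available BW = DBC - XT, clamped at zero for noisy measurements."""
    return max(0.0, dbc_mbps - cross_traffic_mbps)


# Hypothetical cross-traffic readings on the 100 Mbits/s segment from the slide.
for xt in (10.0, 60.0, 95.0):
    print(xt, available_bandwidth(100.0, xt))
```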

  18. Crossing the Application/Network Divide [Diagram: sending data over the network through the stack: Application; TCP (segmentation, flow & congestion control, checksums); IP (fragmentation); Data Link; Network] • Annotation: network monitors focus here • Implications for the application? • Insights for high-performance network protocols?

  19. TICKET and MAGNET+MUSE • TICKET: Traffic Information-Collecting Kernel with Exact Timing (tcpdump++) • MAGNeT: Monitor for Application-Generated Network Traffic • MUSE: MAGNET User-Space Environment [Diagram: stack for sending data over the network (Application; TCP segmentation, flow & congestion control, checksums; IP fragmentation; Data Link; Network), annotated with MUSE, MAGNET, and TICKET] For more information, go to www.lanl.gov/radiant/pubs.html

  20. MAGNeT → MAGNET: Monitoring Apparatus for General kerNel-Event Tracing (at nanoscale granularity) • Why not extend monitoring to kernel events in general? Software Oscilloscope for Clusters and Grids • Debugging • e.g., identified Linux OS bug in the scheduler for SMPs. • Can be used to deploy, debug, and monitor the DOE UltraNet (UltraScienceNet), e.g., dynamic provisioning. • Performance Optimization • Improved performance of 10GigE adapters by 300%. Can improve end-to-end performance of DOE UltraNet. • Monitoring Grid Applications • Integrated MAGNET with SciDAC’s PERC TAU and SciDAC’s PERC SvPablo/Autopilot.* • Adaptive Resource-Aware Applications • SciDAC Deployment: PERC, Supernova Science Ctr, Transit Network Fabric + Terascale Supernova Initiative + Fusion Energy (emerging), and Earth Systems Grid II (emerging). * For more information, see M. Gardner, W. Deng, T. Markham, C. Mendes, W. Feng, and D. Reed, “A High-Fidelity Software Oscilloscope for Globus,” GlobusWorld 2004, Jan. 2004.

  21. Bandwidth estimation: measurement methodologies and applications k claffy (CAIDA), Constantinos Dovrolis (Georgia Tech)

  22. Project goals • Develop estimation techniques and public-domain tools for the estimation of end-to-end: • Network capacity (bottleneck bandwidth) • Available bandwidth (residual capacity) • Focus 1: non-intrusive, fast, and accurate techniques • Focus 2: high-bandwidth paths (up to 1Gbps) • Compare and validate different tools in reproducible and realistic net conditions • Apply bandwidth estimation in transport and overlay routing problems • Disseminate research results at conferences and journals

  23. Main accomplishments • Pathrate: capacity estimation tool • Based on packet pairs and trains • Publication: Transactions on Networking, to appear in 2004, and Infocom 2001 • Pathload: available bandwidth estimation tool • Based on self-loading periodic streams • Publications at ACM SIGCOMM 2002 and PAM 2002 • Both tools are available at: www.pathrate.org • About 200 downloads per month (and increasing) • Able to measure up to 1Gbps paths, even in the presence of interrupt coalescence • See publication at PAM 2004 • 1st Bandwidth Estimation workshop at CAIDA, Dec ’03
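
The packet-pair idea behind capacity estimation can be sketched in a few lines: two back-to-back packets leave the bottleneck separated by the time it takes to transmit one of them, so capacity is approximately packet size divided by that dispersion. The code below is a simplified illustration, not Pathrate itself. The function names and sample dispersions are hypothetical, and the median is only a stand-in for the tool's more careful statistical treatment of many pairs and trains.

```python
from statistics import median


def packet_pair_capacity_bps(packet_bytes, dispersion_s):
    """Capacity implied by one packet pair: packet size over dispersion."""
    return packet_bytes * 8 / dispersion_s


def capacity_estimate_bps(packet_bytes, dispersions_s):
    """Summarize many packet-pair samples with the median (a simplification)."""
    estimates = [
        packet_pair_capacity_bps(packet_bytes, d) for d in dispersions_s if d > 0
    ]
    if not estimates:
        raise ValueError("need at least one positive dispersion sample")
    return median(estimates)


# Hypothetical dispersions in seconds for 1500-byte packets; the last pair
# was stretched by cross traffic and underestimates the capacity.
samples = [12e-6, 12e-6, 24e-6]
print(capacity_estimate_bps(1500, samples))
```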

  24. Main accomplishments (cont’d) • Created testbed at CAIDA with several high-bw routers and switches and realistic cross traffic • Tested all existing open-source bandwidth estimation tools • Showed that, although several such tools exist, very few are accurate and consistent • Developed estimation technique for passive capacity estimation • See publications at IMC 2003 and PAM 2004 • Showed that per-hop capacity estimation tools (pathchar-like) are not accurate in the presence of layer-2 switches • See publication at Infocom 2003 • Created ANEMOS, a distributed system for automated on-line monitoring of many network paths • See publication at PAM 2003

  25. Ongoing work • Created SOBAS, an automatic socket buffer sizing technique based on available bandwidth estimation • Basic idea: limit TCP window based on available bandwidth before the connection causes losses • Does not require changes in TCP • Develop estimation technique for the variation range of available bandwidth in different time scales • Variation range is crucial for some applications, including overlay routing • Evaluate the predictability of the available bandwidth process in Internet traffic • How far in the future can we predict the avail-bw with a given accuracy? • Use of bandwidth estimation in overlay network routing and in UltraScienceNet dynamic optical circuit bandwidth provisioning
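
The basic idea stated for SOBAS, limiting the TCP window according to available bandwidth, can be illustrated by sizing a socket buffer to the product of available bandwidth and round-trip time. This is a sketch under that assumption rather than the SOBAS algorithm. The function names and the example figures are hypothetical, and applying the limit through `SO_RCVBUF` is one possible choice.

```python
import socket


def buffer_size_bytes(avail_bw_bps, rtt_s):
    """Bytes in flight needed to fill avail_bw_bps over a path with rtt_s."""
    return max(1, round(avail_bw_bps * rtt_s / 8))


def apply_receive_buffer(sock, nbytes):
    """Cap the advertised window by setting the receive buffer (no TCP changes).

    The operating system may adjust or cap the requested value.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, nbytes)


# Hypothetical path: 100 Mb/s available bandwidth, 80 ms round-trip time.
size = buffer_size_bytes(100e6, 0.08)
print(size)
```

In use, `apply_receive_buffer` would be called on the socket before the connection is established, since buffer sizes set afterwards may not affect the negotiated window scaling.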

  26. Security and Policy for Group Collaboration http://www.mcs.anl.gov/dsl/scidac/security/ • PIs: • Steven Tuecke (ANL) • Carl Kesselman (USC/ISI) • Miron Livny (U. Wisconsin) • Technologies involved: • Globus Toolkit • Condor

  27. Problem • Scalable, fine-grain policy management for large, dynamic collaborations: • Large number of individually managed resources, each with own policies • Large number of users • Users and resources in different domains • Community policies on use of resources

  28. Goals of this Project • Design, develop and standardize tools for maintaining structure of a collaboration • Take into account collaboration policy, user privileges, site policies, resource policies, etc. • Significantly improve the integration of local security environments • E.g., Kerberos • Instantiate our research results into a framework that makes them usable by a wide range of collaborative tools • Globus Toolkit, Condor • Work within standards community to socialize and standardize our approaches • GGF, IETF, OASIS

  29. Our Process Engage with communities Get feedback Design and develop solutions Integrate into community software Standardize solutions for greater acceptance Evaluate and guide emerging standards

  30. Delivered Solutions • Fine-grained Policy R&D: • Community Authorization Service • Dynamic Policy Reconciliation • Site Security Integration: • KCA/Kx509 • Authorization Callouts • Grid Security Usability: • SimpleCA / Online CA / MiniCA • Online Credential Repository

  31. Standards and Implementations • X.509 Proxy Certificates • GSSAPI extensions • Policy work: SAML, XACML • Policy Reconciliation CAS
