Networking

Shawn McKee, University of Michigan. DOE/NSF Review, November 29, 2001.

Presentation Transcript


  1. Networking. Shawn McKee, University of Michigan. DOE/NSF Review, November 29, 2001.

  2. Why Networking?
     • Since the early 1980s, physicists have depended on leading-edge networks to enable ever-larger international collaborations.
     • Major HEP collaborations, such as ATLAS, require rapid access to event samples from massive data stores, not all of which can be stored locally at each computational site.
     • Evolving integrated applications, such as Data Grids, rely on seamless, transparent operation of the underlying LANs and WANs.
     • Networks are among the most basic Grid building blocks.
     Shawn McKee, UMich

  3. Hierarchical Computing Model (diagram)
     • CERN/Outside Resource Ratio ~1:2; Tier0/(Tier1)/(Tier2) ~1:1:1
     • Online System: ~PByte/sec, with ~100 MBytes/sec to the Offline Farm
     • Tier 0+1: Offline Farm, CERN Computer Ctr, ~25 TIPS
     • ~2.5 Gbits/sec to Tier 1: France, Italy, UK, BNL Center (HPSS storage shown at CERN and at the Tier 1 centers)
     • ~2.5 Gbps to Tier 2: Tier2 Centers
     • Tier 3: Institutes, ~0.25 TIPS each, with a physics data cache. Physicists work on analysis “channels”; each institute has ~10 physicists working on one or more channels.
     • 100 - 1000 Mbits/sec to Tier 4: Workstations

  4. MONARC Simulations
     • MONARC (Models of Networked Analysis at Regional Centres) has simulated Tier 0/Tier 1/Tier 2 data processing for ATLAS.
     • Networking implications: Tier 1 centers require ~140 MBytes/sec to Tier 0 and ~200 MBytes/sec to (each?) other Tier 1s, assuming 1/3 of the ESD (Event Summary Data) is stored at each Tier 1.
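To put the MONARC figures in the same units as the link speeds quoted in the hierarchical computing model, here is a minimal conversion sketch. It assumes 1 MByte = 10^6 bytes and treats the quoted rates as sustained averages; the function and constant names are illustrative only.

```python
LINK_GBPS = 2.5  # Tier 0 to Tier 1 link speed quoted in the hierarchical model slide


def mbytes_to_gbits(rate_mbytes_per_s: float) -> float:
    """Convert MBytes/sec to Gbits/sec, assuming 1 MByte = 10**6 bytes."""
    return rate_mbytes_per_s * 8 / 1000


# Rates taken from the MONARC slide; the labels are descriptive only.
for label, rate in [("Tier 1 -> Tier 0", 140), ("Tier 1 -> other Tier 1", 200)]:
    gbps = mbytes_to_gbits(rate)
    share = gbps / LINK_GBPS
    print(f"{label}: {gbps:.2f} Gbits/sec ({share:.0%} of a {LINK_GBPS} Gbits/sec link)")
```

Under these assumptions the two flows come to about 1.12 and 1.60 Gbits/sec, so each one alone would occupy roughly half or more of a 2.5 Gbits/sec link.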

  5. TCP WAN Performance
     Mathis et al., Computer Communication Review 27(3), July 1997, demonstrated how achievable TCP bandwidth depends on the following network parameters:
     • BW: bandwidth
     • MSS: maximum segment size
     • RTT: round-trip time
     • PkLoss: packet loss rate
     To get 90 Mbps via TCP/IP on a WAN link from LBL to IU (~70 ms RTT), you need a packet loss rate below 1.8e-6!
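A commonly quoted form of the Mathis et al. relation is BW ≤ C · MSS / (RTT · √PkLoss). The sketch below is one way to check the 90 Mbps example. It assumes C = 0.7 and a 1500-byte MSS; neither value is stated on the slide, but together they reproduce the quoted loss limit of about 1.8e-6. The function names are illustrative.

```python
import math


def mathis_bandwidth_bps(mss_bytes: float, rtt_s: float, loss: float, c: float = 0.7) -> float:
    """Upper bound on TCP throughput in bits/sec: C * MSS / (RTT * sqrt(loss)).

    c = 0.7 is an assumed constant, not a value given in the slide.
    """
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss))


def max_loss_for_target(target_bps: float, mss_bytes: float, rtt_s: float, c: float = 0.7) -> float:
    """Invert the bound: the largest loss rate that still allows target_bps."""
    return (c * mss_bytes * 8 / (rtt_s * target_bps)) ** 2


# Assumed inputs: 1500-byte MSS, 70 ms RTT, 90 Mbps target.
loss_limit = max_loss_for_target(90e6, 1500, 0.070)
print(f"Required packet loss < {loss_limit:.2e}")  # about 1.78e-06
```

Because the bound scales as 1/√PkLoss, cutting the loss rate by a factor of 100 raises the achievable bandwidth by only a factor of 10, which is why the loss requirement becomes so strict at long round-trip times.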

  6. Network Monitoring: Iperf (http://atgrid.physics.lsa.umich.edu/~cricket/cricket/grapher.cgi)
     • We have set up testbed network monitoring using Iperf (V1.2) (S. McKee (UMich), D. Yu (BNL)).
     • We test both UDP (sending at 90 Mbps) and TCP between all combinations of our 8 testbed sites.
     • Globus is used to initiate both the client and server Iperf processes.
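To illustrate what testing "all combinations" of 8 sites involves, here is a minimal sketch that enumerates the measurements in one monitoring cycle. It assumes each ordered (client, server) pair is tested once per protocol; the site labels are shorthand for the testbed institutions and are not real hostnames, and `build_test_plan` is a hypothetical helper, not part of Iperf or Globus.

```python
from itertools import permutations

# Shorthand labels for the 8 testbed sites (hypothetical, not hostnames).
SITES = ["UMich", "BU", "LBNL", "ANL", "BNL", "OU", "IU", "UTA"]


def directed_pairs(sites):
    """Every ordered (client, server) pair of distinct sites."""
    return list(permutations(sites, 2))


def build_test_plan(sites, udp_rate="90M"):
    """One UDP test at a fixed sending rate and one TCP test per directed pair."""
    plan = []
    for client, server in directed_pairs(sites):
        plan.append({"client": client, "server": server, "proto": "udp", "rate": udp_rate})
        plan.append({"client": client, "server": server, "proto": "tcp", "rate": None})
    return plan


plan = build_test_plan(SITES)
print(len(directed_pairs(SITES)), "directed pairs,", len(plan), "tests per cycle")
```

With 8 sites this gives 56 directed pairs and 112 tests per cycle, which suggests why the client and server processes are started remotely rather than by hand.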

  7. USATLAS Grid Testbed (map)
     • Sites: U Michigan, Boston University, UC Berkeley LBNL-NERSC, Argonne National Laboratory, Brookhaven National Laboratory, University of Oklahoma, Indiana University, University of Texas at Arlington
     • Networks labeled on the map: ESnet, MREN, NPACI, Abilene, CalREN, NTON
     • Legend: Prototype Tier 2s; HPSS sites

  8. Testbed Network Measurements

  9. Networking Requirements
     USATLAS requires more than simply adequate network bandwidth. We need:
     • A set of local, regional, national and international networks able to interoperate transparently, without bottlenecks.
     • Application software that works together with the network to provide high throughput and bandwidth management.
     • A suite of high-level collaborative tools that will enable effective data analysis between internationally distributed collaborators.
     The ability of USATLAS to participate effectively at the LHC is closely tied to our underlying networking infrastructure!

  10. Networking as a Common Project
     • A new Internet2 working group has formed from the LHC Common Projects initiative: HENP (High Energy/Nuclear Physics), co-chaired by Harvey Newman (CMS) and Shawn McKee (ATLAS).
     • Initial meeting hosted by IU in June; kick-off meeting in Ann Arbor on October 26th.
     • The issues this group is focusing on are the same ones that USATLAS networking needs to address.
     • USATLAS gains a larger resource pool dedicated to solving network problems, a “louder” voice in standards setting, and a better chance of realizing necessary networking changes.

  11. Network Coupling to Software
     • Our software and computing model will evolve as our network evolves; the two are coupled.
     • Very different computing models result from different assumptions about the capabilities of the underlying network (distributed vs. local).
     • We must be careful to keep our software “network aware” while we work to ensure our networks will meet the needs of the computing model.

  12. Achieving High Performance Networking
     • Server and client CPU, I/O and NIC throughput must be sufficient
        - Must consider firmware, hard disk interfaces, bus type/capacity
        - Knowledge base of hardware: performance, tuning issues, examples
     • TCP/IP stack configuration and tuning is absolutely required
        - Large windows, multiple streams
     • No local infrastructure bottlenecks
        - Gigabit Ethernet “clear path” between selected host pairs
        - To 10 Gbps Ethernet by ~2003
     • Careful router/switch configuration and monitoring
        - Enough router “horsepower” (CPUs, buffer size, backplane BW)
     • Packet loss must be ~zero (well below 0.1%)
        - i.e. no “commodity” networks (need ESnet, I2 type networks)
     • End-to-end monitoring and tracking of performance
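The "large windows, multiple streams" point follows from the bandwidth-delay product: a single TCP stream can have at most one window of data in flight per round trip. The sketch below estimates the window needed to fill a path and, alternatively, how many parallel streams a fixed window would require. It assumes a 65535-byte window (the maximum without TCP window scaling) and reuses the 90 Mbps, 70 ms example; the function names are illustrative.

```python
import math

UNSCALED_WINDOW_BYTES = 65535  # largest TCP window without the window scale option


def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return bandwidth_bps * rtt_s / 8


def streams_needed(bandwidth_bps: float, rtt_s: float, window_bytes: int) -> int:
    """Parallel streams needed if each stream is limited to window_bytes in flight."""
    return math.ceil(bdp_bytes(bandwidth_bps, rtt_s) / window_bytes)


for label, bw in [("90 Mbps", 90e6), ("1 Gbps", 1e9)]:
    window = bdp_bytes(bw, 0.070)
    n = streams_needed(bw, 0.070, UNSCALED_WINDOW_BYTES)
    print(f"{label} at 70 ms RTT: window of about {window / 1e6:.2f} MB, or {n} unscaled streams")
```

Under these assumptions a 90 Mbps flow at 70 ms needs roughly 0.79 MB in flight (13 unscaled streams) and a 1 Gbps flow needs roughly 8.75 MB (134 unscaled streams), so default window sizes fall well short on long WAN paths.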

  13. Local Networking Infrastructure
     • LANs used to lead WANs in performance, capabilities and stability, but this is no longer true.
     • WANs are deploying 10 Gigabit technology, compared with 1 Gigabit on leading-edge LANs.
     • New protocols and services are appearing on backbones such as ESnet and I2 (Diffserv, IPv6, multicast).
     • Ensuring that our ATLAS institutions have the level of LOCAL networking infrastructure required to participate effectively in ATLAS is a major challenge.

  14. Estimating Site Costs
     Source: Network Planning for US ATLAS Tier 2 Facilities, R. Gardner, G. Bernbom (IU)

  15. Networking Plan of Attack
     • Refine our requirements for the network
     • Survey existing work and standards
     • Estimate likely developments in networking and their timescales
     • Focus on gaps between expectations and needs
     • Adapt existing work for US ATLAS
     • Provide clear, compelling cases to funding agencies about the critical importance of the network

  16. Network Efforts
     • Survey of current/future network-related efforts
     • Determine and document US ATLAS network requirements
     • Problem isolation (finger-pointing tools)
     • Protocols (achieving high bandwidth and reliable connections)
     • Network testbed (implementation, Grid testbed upgrades)
     • Services (QoS, multicast, encryption, security)
     • Network configuration examples and recommendations
     • End-to-end knowledge base
     • Monitoring for both prediction and fault detection
     • Liaison to network-related efforts and funding agencies

  17. Network Related FTEs/Costs
     Network-related efforts to leverage and adapt existing efforts for ATLAS

  18. Support for Networking?
     • Traditionally, DOE and NSF have provided university networking support indirectly, through the overhead charged to grant recipients.
     • National labs have network infrastructure provided by DOE, but not at the level we are finding we require.
     • Unlike networking, computing for HEP has never been considered as simply infrastructure.
     • The Grid is blurring the boundaries of computing, and the network is taking on a much more significant, fundamental role in HEP computing.
     • It will be necessary for funding agencies to recognize the fundamental role the network plays in our computing model and to support it directly.

  19. What can we Conclude?
     • Networks will be vital to the success of our USATLAS efforts.
     • Network technologies and services are evolving, requiring us to test and develop with current networks while planning for the future.
     • We must raise and maintain awareness of networking issues among our collaborators, network providers and funding agencies.
     • We must clearly present network issues to the funding agencies to get the required support.
     • We need to determine what deficiencies exist in network infrastructure, services and support, and work to ensure those gaps are closed before they adversely impact our program.

  20. References
     • US ATLAS Facilities Plan: http://www.usatlas.bnl.gov/computing/mgmt/dit/
     • MONARC: http://monarc.web.cern.ch/MONARC/
     • HENP Working Group: http://www.usatlas.bnl.gov/computing/mgmt/lhccp/henpnet/
     • Iperf monitoring page: http://atgrid.physics.lsa.umich.edu/~cricket/cricket/grapher.cgi

  21. Network FTE Breakdown

  22. Network K$ Breakdown