LHC Community Network Performance Recommended BCP
Eric Boyd
Deputy Technology Officer, Internet2
Recap
• At the November 2007 LHC OPN meeting, the group asked Internet2 and ESnet to work on a straw man “Best Practices Guide” for deploying perfSONAR
What
• Straw man recommendation from US perfSONAR participants to US ATLAS and US CMS sites
• Working on a set of recommendations to help the US LHC community better react to network performance problems
• Plan to develop these recommendations with the Internet2 HENP-SIG, the US ATLAS and US CMS communities, participants from a BNL/FNAL-sponsored workshop this spring, and anyone else interested in developing a best practices guide
Recommended Goals
• Characterize and track network connectivity and performance to important peer sites
• Characterize and quantify network performance problems
• Differentiate between application and network performance problems
• Differentiate between local and remote network problems
• Identify, understand and respond effectively to changes in the underlying network
Recommended Primary Use Cases
• End scientist attempting to determine why data transfers to her lab are not fast enough
• Site validating/debugging transfers to/from other sites
• Site validating/debugging transfers to/from end scientist
Recommended Approach: Network Performance Troubleshooting
• End-to-end network performance analysis
  • TCP transfer throughput (reported by application/end-user)
  • Identify where transfer is limited
    • Application-related problems
    • Network end system problems (NDT)
    • Network path problems (perfSONAR OWAMP, perfSONAR BWCTL)
• Network performance analysis methodology
  • Problem identification
  • Step-by-step remediation of the detected problems
  • Packet trace analysis as last resort
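The "identify where transfer is limited" step can be read as a short decision procedure. The following is a minimal sketch of one possible reading, not a methodology taken from the slides: the function name, the inputs, the order of the checks and the 0.8 tolerance are all illustrative assumptions. It assumes an achievable-throughput figure for the path (for example from a BWCTL test), a yes/no result from an end-system check (for example from NDT), and the throughput the application reported.

```python
from enum import Enum


class Limit(Enum):
    NETWORK_PATH = "network path"
    END_SYSTEM = "network end system"
    APPLICATION = "application"
    NONE_FOUND = "no limit identified"


def locate_limit(app_mbps, path_mbps, expected_mbps, end_system_ok, tolerance=0.8):
    """Guess where a transfer is limited (hypothetical helper).

    app_mbps      -- throughput reported by the application or end user
    path_mbps     -- achievable throughput measured between test hosts
    expected_mbps -- throughput the path is expected to sustain
    end_system_ok -- True if the end-system check found no host problem
    tolerance     -- assumed fraction below which a shortfall is a problem
    """
    # The path itself falls short of what it should deliver.
    if path_mbps < tolerance * expected_mbps:
        return Limit.NETWORK_PATH
    # The path looks healthy, but the end host was flagged.
    if not end_system_ok:
        return Limit.END_SYSTEM
    # Path and host look healthy, yet the application is much slower.
    if app_mbps < tolerance * path_mbps:
        return Limit.APPLICATION
    return Limit.NONE_FOUND


if __name__ == "__main__":
    print(locate_limit(app_mbps=100, path_mbps=900, expected_mbps=1000,
                       end_system_ok=True).value)
```

In practice the thresholds and the order of checks would come out of the methodology that the community develops; the sketch only shows how the three problem classes separate once the measurements exist.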
Recommended Infrastructure
• Tools and archives will be made available with the perfSONAR infrastructure
• New deployments will be found using the perfSONAR Lookup Service
• New tools can be integrated into the infrastructure at any time
Basic Strategy
• Each site (T0, T1, T2, …) acting independently:
  • Exposes active measurement targets to support/control other sites' tests to them
  • Performs active tests to other participants
  • Collects and exposes passive metrics (SNMP, sFlow, etc.) using pS archives
  • Collects results from active tests and exposes metrics using pS archives
• Any participant:
  • Can then use analysis tools to interact with any available archives to examine performance problems
Analysis of Strategy
• Success of strategy scales with the degree of participation (Metcalfe's Law)
• New tools and analysis can be phased into the infrastructure as they become available
• Analysis that is specific to this community can be integrated into the infrastructure
Site Participation Levels
• No Participation (Or Worse):
  • Hostile: firewalls (blocked ICMP)
  • Non-cooperative: no tools, no data
• Limited Partner:
  • Willing target: daemons installed
• Active Partner:
  • Participant: daemons installed, active testing to peers
  • Data Provider: passive/active test results shared
• RECOMMENDED: Limited participation (T3s) or active participation (T1s and T2s)
Site Involvement Levels
• Not interested
• Hands-off
  • Delegate participation to a 3rd party
• Hands-on (any subset)
  • Manage hardware
  • Install software
  • Manage software
  • Manage data collection
  • Decide testing strategy
  • Decide data access policy
Site Deployment Options
• Target Options
  • Knoppix install
  • Tool installation
    • owampd/bwctld
  • Very limited configuration necessary; once tools are installed, very little maintenance is required
• Active Partner Options
  • Knoppix install
    • Add perfSONAR
  • Tool installation
    • owampd/bwctld
    • perfSONAR (CPAN install)
  • More extensive configuration
  • Identify services important to your site; monitor to those sites
Initial Useful Metrics and Tools
Network path characteristics
• Round trip time (perfSONAR PingER)
• Routers along the paths (traceroute)
• Path utilization/capacity (perfSONAR SNMP-MA)
• One-way delay, delay variance (perfSONAR OWAMP)
• One-way packet drop rate (perfSONAR OWAMP)
• Packet reordering (perfSONAR OWAMP)
• Achievable throughput (perfSONAR BWCTL)
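To show what the one-way metrics above measure, here is a minimal sketch that derives delay, delay variation, loss and reordering from a list of timestamped probe records. It does not use or imitate the OWAMP tools or their output format: the `Probe` record and `summarize` function are hypothetical, the sender and receiver clocks are assumed to be synchronized, and delay variation is taken as the population standard deviation of the delays, which is only one of several possible definitions.

```python
from dataclasses import dataclass
from statistics import median, pstdev
from typing import Optional, Sequence


@dataclass
class Probe:
    seq: int                   # sequence number assigned by the sender
    sent: float                # send timestamp in seconds
    received: Optional[float]  # receive timestamp in seconds, None if lost


def summarize(probes: Sequence[Probe]) -> dict:
    """Summarize one-way delay, loss and reordering for a probe stream."""
    arrived = [p for p in probes if p.received is not None]
    delays = [p.received - p.sent for p in arrived]

    # A probe counts as reordered if it arrives after a probe that
    # carries a higher sequence number.
    reordered = 0
    highest_seq = -1
    for p in sorted(arrived, key=lambda p: p.received):
        if p.seq < highest_seq:
            reordered += 1
        else:
            highest_seq = p.seq

    lost = len(probes) - len(arrived)
    return {
        "sent": len(probes),
        "loss_rate": lost / len(probes) if probes else 0.0,
        "min_delay": min(delays) if delays else None,
        "median_delay": median(delays) if delays else None,
        "delay_stdev": pstdev(delays) if delays else None,
        "reordered": reordered,
    }


if __name__ == "__main__":
    # Made-up sample: probe 2 is lost, probe 5 overtakes probe 4.
    sample = [
        Probe(0, 0.00, 0.010), Probe(1, 1.00, 1.030), Probe(2, 2.00, None),
        Probe(3, 3.00, 3.020), Probe(4, 4.00, 4.050), Probe(5, 4.01, 4.030),
    ]
    for name, value in summarize(sample).items():
        print(name, value)
```

Because loss and reordering are computed per direction, a summary like this for each direction of a path helps separate local from remote problems, which round trip time alone cannot do.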
Plan Forward
• Specific analysis methodology will be developed with the community of users (methods must match usage patterns)
• Specific metrics and tools will be recommended based on needs of methodology