320 likes | 553 Views
IEPM. Presented by Warren Matthews (SLAC) At the Internet2 Members Meeting October 2002. Overview. Overview at BOF in Boulder More details Technical aspects of IEPM-BW Some Results, Trouble Shooting procedure Application to PIPES Data Grids and Web Services MAGGIE and PIPES. IEPM-BW.
E N D
IEPM Presented by Warren Matthews (SLAC) At the Internet2 Members Meeting October 2002
Overview • Overview at BOF in Boulder • More details • Technical aspects of IEPM-BW • Some Results, Trouble Shooting procedure • Application to PIPES • Data Grids and Web Services • MAGGIE and PIPES
IEPM-BW • Continuously operating high performance throughput measurement project. • Active: iperf, bbcp, bbftp, udpmon … • Passive: Netflow • Web100 • Track, compare, contrast
Requirements for Hosts • ≥500MHz CPU • NIC card ≥ 100Mbps • full duplex • ≥ OC3, then a GE interface is preferred • Linux (Red Hat) or Solaris • Also several perl modules and other packages
Non-Requirements for Hosts • We don’t require a certain network card • See talk at Boulder, and others • We also don’t require specific disk • But known performance implications
Requirements for Firewalls • TCP ports • 22 (for ssh) • 5000-5011 (for iperf) • 5020-5022 (for bbftp) • 5031-5039 (for bbcp) • 2811 (for GridFTP) Iperf ports have varying maximum receive window
Additional Requirements for Monitoring Hosts • The host must be configured for large TCP buffers/windows • Monitoring Installation Kit • Fill out form, configuration is configured
IEPM-BW Internals • Detailed configuration file • Remote username, parameters • Cron runs each test every 90±7 minutes • Remote daemon brought up only to do test • Ssh Authentication • Data written to local disk for near-real time analysis (web accessible)
IEPM-BW Sites • 8 monitoring nodes • SLAC, FNAL • NIKHEF (Amsterdam) • APAN (Japan) • INFN-Milan (Italy) • U of Michigan, Internet2 • U of Manchester, UCL (UK) • ~30 target nodes
KEK LANL JLAB EDG CERN NIKHEF TRIUMF FNAL IN2P3 PPDG/GriPhyN NERSC ANL CHI CERN SNV RAL NY UManc SLAC UCL ESnet SLAC DL BNL JAnet NNW APAN Stanford RIKEN INFN-Roma APAN Geant INFN-Milan CalREN Abilene SEA NY CESnet ATL SNV HSTN CLV SOX IPLS CALTECH Monitoring Site SDSC UTDallas UMich I2 UFL Rice NCSA
Summary of IEPM-BW • IEPM-BW is a measurement tools and an analysis framework • Analysis • Resource-Brokering and Trouble-Shooting • Feedback results • Applications as well as users (Web Services) • Work with these and others to create co-operative framework (GGF)
Application to PIPES • IEPM-BW is not just for continuous measurements • PIPES challenge is to trouble shoot the network • And point the finger ! • Activity on the mailing list regarding methodology
Performance Between SLAC and IN2P3 ESnet SNV CHI IN2P3 SLAC Star Tap CERN
Rule out something specific to this tool
Recall IN2P3 is routed via CERN
ANL Recall route to IN2P3 and CERN is common with route to FNAL and ANL as far as Chicago FNAL
CERN (Similar to IN2P3) CERN2
Summary of Trouble Shooting • Send email to CERN • Most crimes are solved by informers • Not attempting to completely replace human experience with an algorithm • Even automating alerts is hard
Data Grids • Future of HENP is the Grid • Groups such as the particle physics data grid (PPDG) • Trouble-Shooting Production Grids • Publish Data • MDS • Web Services
Web Services • Remove the human sitting at a PC • Application-to-application (peer-to-peer) • XML over SOAP messages • Simple Object Access Protocol • WSDL, UDDI • Web Services Description Language • Universal Description, Discovery and Integration
IEPM Web Services • Provide network performance data for resource brokers (PingER, IEPM-BW) • Provide prediction of network performance for resource brokers (NWS) • Provide Node information (demo) • Set TCP/IP parameters for real transfers (demo)
Example: File Transfer • Web service client wrapped around bbcp • User does not know parameters, but monitoring does • Catch is node must be known to monitoring ./bbcp-wrapper -G node2.ccs.ornl.gov Expected Throughput = 93209.34728 kbps ./bbcp-wrapper -A /u7/iepm-bw/temp/dummy.1024000 node2.ccs.ornl.gov bbcp: ANTONIA.SLAC.Stanford.EDU running version 02.05.22.01.0 bbcp: firebird.ccs.ornl.gov running version 02.05.22.00.0 bbcp: Creating /home/cottrell/pinger/bbcpdat/dummy.1024000 File /home/cottrell/pinger/bbcpdat/dummy.1024000 created; 1024000 bytes at 1543.2 KB/s 1 file copied at effectively 406.3 KB/s
Further Work: MAGGIE • Measurement and Analysis for the Global Grid and Internet End-to-end Performance. • Big sister of Active Internet Monitoring for ESnet (AIME). • 10-20+ NIMI hosts running periodic and on-demand measurements. • Feedback results to instrument tools and troubleshoot issues.
IEPM-BW F.P.T. Tools Analysis MDS Web Services Publishing Local Data Cache IPERF BBCP BBFTP Test Engine NIMI CRON Scheduling SSH NIMI GSI Security
Summary • IEPM-BW is active and gathering data • More tools under development • Develop MAGGIE with PIPES in mind
More Information • See • http://www-iepm.slac.stanford.edu/bw • Or send email to • iepm-l@slac.stanford.edu • Also • Stanislav Shalunov’s talk at Boulder JT • LBL TCP Tuning Guide
The Team • Les Cottrell • Connie Logg • Jerrod Williams • Jiri Navratil • I Heng Mei • & me