Network monitoring with R-GMA in EDG
Paul Mealor, Researcher
JRA4 R-GMA meeting, 22 July 2004
www.eu-egee.org
EGEE is a project funded by the European Union under contract IST-2003-508833

Contents
• Disclaimer
• Monitoring tools and modifications
• Probes Coordination Protocol
• Network Cost Estimation System
• Schemas
• Related work

Disclaimer
• R-GMA components have changed a lot since EDG
  • Streaming producer → Primary producer (memory)
  • Consumer → Consumer
  • Archive, database → Well, it’s changed

Monitoring tools and modifications
• PingER
  • (which begat IperfER and UDPMon)
• Wrapper script run periodically
  • By cron, and latterly PCP
  • Runs ping, captures output
  • Output stored in flat text files
• Web user interface
  • Displays graphs and tables
  • Also, unformatted data available
[Diagram: Ping, measurement wrapper script, text files, Web UI]

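The "runs ping, captures output" step can be illustrated with a small parser for the summary line that ping prints. This is only a sketch in Python, not the PingER wrapper itself: the function name `parse_ping_summary` is hypothetical, and the regular expression assumes the common `min/avg/max/... = a/b/c/...` summary format printed by iputils and BSD ping.

```python
import re

# Matches e.g. "rtt min/avg/max/mdev = 0.042/0.050/0.061/0.008 ms"
# (assumed format; the real wrapper's parsing is not shown in the slides).
_SUMMARY = re.compile(r"min/avg/max(?:/[a-z-]+)? = ([\d.]+)/([\d.]+)/([\d.]+)")


def parse_ping_summary(output):
    """Return minimum/average/maximum RTT in ms, or None if no summary line."""
    match = _SUMMARY.search(output)
    if match is None:
        return None
    minimum, average, maximum = (float(g) for g in match.groups())
    return {"minimum": minimum, "average": average, "maximum": maximum}
```

A wrapper could write one such record per run to a flat text file.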
Monitoring tools and modifications
[Diagram: Ping, measurement wrapper script, text files, Web UI, netmon2rgmad, StreamProducer, servlets on dedicated machine, Archiver, Consumer, netmonarchiverd, DBProducer, MySQL archive]

Monitoring tools and modifications
• GridFTP daemon
  • Can be set to write logs
• ftlog2rgmad reads logs for updates
  • File gets longer: new entries at end
  • File gets shorter: everything’s a new entry
  • Publishes new entries via Stream Producer
[Diagram: GridFTP daemon (gsiwuftpd), log, ftlog2rgmad, StreamProducer, servlets on dedicated machine]

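The update rule for the GridFTP log (longer file: new entries at the end; shorter file: everything is new) can be sketched as below. This is a minimal Python illustration under stated assumptions, not the code of ftlog2rgmad: the function name `read_new_entries`, the byte-offset bookkeeping, and the one-entry-per-line log format are all hypothetical.

```python
import os


def read_new_entries(path, last_offset):
    """Return (new_lines, new_offset) for a log file that is polled periodically.

    Hypothetical helper: if the file grew, only the bytes after last_offset
    are new; if it shrank, every line in it is treated as a new entry.
    """
    size = os.path.getsize(path)
    if size < last_offset:
        # File got shorter (rotated or truncated): start again from the top.
        last_offset = 0
    with open(path, "rb") as log:
        log.seek(last_offset)
        data = log.read()
    # Consume complete lines only; a partial last line waits for the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, last_offset + end
```

The caller keeps the returned offset between polls and publishes each returned line. A file that is truncated and then grows past the old offset before the next poll would not be detected by this size check alone.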
Probes Coordination Protocol
• Full-mesh measurements impossible
• Hand-built, centralised schedules don’t work
• PCP automates scheduling
  • Token passed between nodes in a clique
  • Nodes with the token can make measurements
  • Many cliques can coexist
  • Allows hierarchies to be set up
• ExternalLock option: no other commands while this one is running
  • edg-pcp-extern-lock program to lock a pair of hosts
Example clique configuration:
    name:iperf
    member:ccwp7.in2p3.fr
    member:adc0003.cern.ch
    member:grid001f.cnaf.infn.it
    member:gppnm.gridpp.rl.ac.uk
    member:mon001.fzk.de
    period:1800
    timeout:600
    delay:60
    option:ExternalLock
    command:edg-iperf iperf

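To make the clique configuration concrete, the following Python sketch parses it into a dictionary. It is illustrative only: the function name `parse_clique` is hypothetical, and it assumes one `key:value` entry per line, with `member` and `option` allowed to repeat and `period`, `timeout`, and `delay` holding integers.

```python
EXAMPLE = """\
name:iperf
member:ccwp7.in2p3.fr
member:adc0003.cern.ch
member:grid001f.cnaf.infn.it
member:gppnm.gridpp.rl.ac.uk
member:mon001.fzk.de
period:1800
timeout:600
delay:60
option:ExternalLock
command:edg-iperf iperf
"""


def parse_clique(text):
    """Parse 'key:value' lines (assumed layout) into a dict."""
    clique = {"member": [], "option": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        if key in ("member", "option"):
            clique[key].append(value)       # repeatable keys become lists
        elif key in ("period", "timeout", "delay"):
            clique[key] = int(value)        # assumed to be integer values
        else:
            clique[key] = value
    return clique
```

Calling `parse_clique(EXAMPLE)` yields the five member hosts as a list and the timing values as integers, which is the form a scheduler would need to pass the token around the clique.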
Probes Coordination Protocol (2)
• Fault tolerance
  • The token should traverse the clique in period seconds
  • If the token left a node period + timeout seconds ago and has not returned, a new token is generated
  • Protects against lost tokens
  • Duplicate tokens are dropped
• Configuration updates
  • Are easy: the token contains the configuration
(Example shown: the same iperf clique configuration, with period:1800 and timeout:600.)

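The fault-tolerance rule can be sketched as a small per-node bookkeeping class. This Python example is an illustration under assumptions rather than the PCP implementation: the class name `TokenWatch` is hypothetical, and identifying duplicates by a token id is an assumption, since the slides do not say how duplicates are recognised.

```python
class TokenWatch:
    """Per-node view of the token; all times are in seconds."""

    def __init__(self, period, timeout):
        self.period = period
        self.timeout = timeout
        self.last_left = None   # when the token last left this node
        self.seen = set()       # token ids already accepted (assumed scheme)

    def token_left(self, now):
        """Record that the token was passed on at time `now`."""
        self.last_left = now

    def should_regenerate(self, now):
        """True once period + timeout seconds have passed since the token left."""
        if self.last_left is None:
            return False
        return now - self.last_left >= self.period + self.timeout

    def accept(self, token_id):
        """Return True for a new token id, False for a duplicate to be dropped."""
        if token_id in self.seen:
            return False
        self.seen.add(token_id)
        return True
```

With the example configuration (period 1800, timeout 600), a node that passed the token on at time t would consider it lost at t + 2400 seconds.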
Network Cost Estimation System
• Implemented for C++ and Java
• Multiple backends
  • MySQL
  • R-GMA (ish :-/ )
  • LDAP
• Extra information:
  • Mapping tables: NetworkCE and NetworkSE
  • Match all SE/CEs to a Network Monitor
• Interface:
    NetworkCost getNetworkCost(source, dest, filesize)
    NetworkCost[][] getNetworkCosts(sources[], dests[], filesize)
[Diagram: netmon-rgma-info, StreamProducer]

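The shape of the two calls, and the role of the mapping tables, can be sketched in Python as below. This is not the C++ or Java implementation. The slides do not define how a NetworkCost is computed, so the cost here (file size divided by measured throughput, giving an estimated transfer time) is purely an assumption, as are the function names, the dictionary-based backend, and the units.

```python
def get_network_cost(source, dest, filesize, site_to_nm, throughput):
    """Estimate a cost for moving `filesize` bytes from source to dest.

    site_to_nm: maps a CE or SE id to its Network Monitor id (the role
                played by the NetworkCE and NetworkSE mapping tables).
    throughput: maps (NMIdSource, NMIdDestination) to bytes per second.
    The formula is an illustrative assumption, not the real cost function.
    """
    nm_src = site_to_nm[source]
    nm_dst = site_to_nm[dest]
    rate = throughput[(nm_src, nm_dst)]
    return filesize / rate


def get_network_costs(sources, dests, filesize, site_to_nm, throughput):
    """Return a matrix of costs: one row per source, one column per dest."""
    return [
        [get_network_cost(s, d, filesize, site_to_nm, throughput) for d in dests]
        for s in sources
    ]
```

The matrix form mirrors `getNetworkCosts(sources[], dests[], filesize)`: a caller can compare all source and destination pairs in one request.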
Network Cost Estimation System
• To be done:
  • Make use of the GridFTP data
  • Actually use R-GMA to query the Archive

Schemas
• ComputingElement: CEId, GRAMVersion, Architecture, OpSys, &c…
• NetworkCE: CEId, NMId
• NetworkSE: SEId, NMId
• NetworkTCPThroughput: NMIdSource, NMIdDestination, tool, bufferSize, streams, duration, time, value
• NetworkRTT: NMIdSource, NMIdDestination, tool, packetSize, time, minimum, maximum, average
[Diagram: the tables are linked by 1-to-∞ relationships on CEId and NMId]

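To show how the mapping tables connect a CE and an SE to a measurement, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The table and column names come from the schema slide; the column types, the choice of SQLite, the CE-to-SE direction of the lookup, and the function name `latest_throughput` are assumptions made for illustration.

```python
import sqlite3

# Column types are assumed; the slide lists names only.
SCHEMA = """
CREATE TABLE NetworkCE (CEId TEXT, NMId TEXT);
CREATE TABLE NetworkSE (SEId TEXT, NMId TEXT);
CREATE TABLE NetworkTCPThroughput (
    NMIdSource TEXT, NMIdDestination TEXT, tool TEXT,
    bufferSize INTEGER, streams INTEGER, duration INTEGER,
    "time" INTEGER, "value" REAL
);
"""

# Map the CE and SE to their Network Monitors, then take the newest
# throughput measurement between those two monitors.
QUERY = """
SELECT t."value"
FROM NetworkCE AS ce
JOIN NetworkTCPThroughput AS t ON t.NMIdSource = ce.NMId
JOIN NetworkSE AS se ON t.NMIdDestination = se.NMId
WHERE ce.CEId = ? AND se.SEId = ?
ORDER BY t."time" DESC
LIMIT 1
"""


def latest_throughput(conn, ce_id, se_id):
    """Return the most recent throughput value for the pair, or None."""
    row = conn.execute(QUERY, (ce_id, se_id)).fetchone()
    return None if row is None else row[0]
```

The same join pattern would apply to NetworkRTT, with packetSize, minimum, maximum, and average in place of the throughput columns.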
Related work
• Glue Schema
  • Almost identical schemas (as they were based on ours)
  • Measurements associated with a “Network Element”
  • Network Element is a path between two nodes, with a particular QoS and so on
• NMWG
  • How do these XML schemas translate?
• In WP7
  • MapCenter (visualisation of host status)
  • TopoGrid (visualisation of paths)

END
• http://www.hep.ucl.ac.uk/~pdm/edg/docs/
• EDG WP7: http://ccwp7.in2p3.fr/
• PCP/GNMA: see WP7 final deliverables