Michigan Grid Testbed Report. Shawn McKee, University of Michigan. UTA US ATLAS Testbed Meeting, April 4, 2002.
Michigan Grid Testbed Report Shawn McKee University of Michigan UTA US ATLAS Testbed Meeting April 4, 2002
Michigan Grid Testbed Layout Shawn McKee - University of Michigan - UTA ATLAS Grid Mtg
Grid Machine Details
Grid Related Activities at UM
• Network monitoring and testing
• Security related tools and configuration
• Crash-dump testing for Linux
• Web100 testing
• MGRID initiative (sent to UM Administration)
• MJPEG video boxes for videoconferencing
• UM is now an “unsponsored” NSF SURA Network Middleware Initiative Testbed site
• Authenticated QoS signaling and testing
Web100 Experience
• We upgraded the kernels on many of our nodes to 2.4.16 and then applied the Web100 patches (alpha release)
• The goal is to provide network tuning and debugging info and tools by instrumenting low level code in the TCP stack and kernel
• Our experience has been mixed:
– Nodes with patches crash every ~24-36 hours
– Application monitoring tools don’t all work
– Difficult for a non-expert to get anything meaningful from the tools
• Recommendation is to wait for a real release!
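The network tuning mentioned above usually revolves around one quantity: the bandwidth-delay product, the amount of data that must be in flight to keep a path full. The Python sketch below shows that arithmetic only. The link speed and round-trip time are illustrative assumptions, not measurements from the testbed, and the function names are hypothetical.

```python
def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes: bits in flight on the path, divided by 8."""
    return bandwidth_bits_per_s * rtt_s / 8


def window_limited_throughput(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on TCP throughput (bits/s) when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_s


if __name__ == "__main__":
    # Illustrative values only (assumptions): a 1 Gbit/s path with a 50 ms round trip.
    link = 1e9
    rtt = 0.050
    print(f"window needed: {bdp_bytes(link, rtt) / 1e6:.2f} MB")
    # A 64 KiB window (the largest possible without TCP window scaling) on the same path.
    print(f"64 KiB window: {window_limited_throughput(65536, rtt) / 1e6:.1f} Mbit/s")
```

With these assumed numbers the path needs a window of roughly 6 MB, while a 64 KiB window caps throughput near 10 Mbit/s, which is why socket buffer sizes matter so much on fast, long paths.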
Iperf/Network Testing
http://atgrid.physics.lsa.umich.edu/~cricket/cricket/grapher.cgi
• We have been working on automated network testing and monitoring
• Perl scripts run Iperf tests, submitted through Globus, from LINAT01 (our gatekeeper) to the gatekeeper at each of the other testbed sites
• We track UDP/TCP bandwidth, packet loss, jitter and buffer sizes for each “direction” between each pair of sites
• Results are recorded by Cricket and are available as plots for various time-frames
• Problems remain with Globus job submissions at certain sites, with automating restart of the Perl scripts, and with “zombie” processes accumulating…this needs better exception handling
• We separately use Cricket to monitor:
– Round-trip times and packet losses using Ping
– Testbed node details (load avg, CPU usage, disk usage, processes) using SNMP
– Switch and router statistics using SNMP
• Long term goal is to deploy hardware monitors and beacons on the testbed
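One possible shape for the exception handling mentioned above is to run every test under a hard timeout and always reap the child process. The sketch below is in Python rather than the Perl used on the testbed, and it is not the testbed's actual script. The function name run_with_timeout, the status labels and the stand-in commands are all hypothetical.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd, returning (status, output); status is 'ok', 'failed', 'timeout' or 'missing'."""
    try:
        # subprocess.run kills the child and waits for it if the timeout expires,
        # so a hung test does not linger as a zombie process.
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    except FileNotFoundError:
        # The executable itself could not be found.
        return "missing", ""
    status = "ok" if result.returncode == 0 else "failed"
    return status, result.stdout


if __name__ == "__main__":
    # Stand-in commands (assumptions): a real driver would pass its own test command line.
    print(run_with_timeout([sys.executable, "-c", "print('hello')"], 5))
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 1))
```

Returning a status label instead of raising lets a driver loop record the failure for one site and move on to the next, rather than stopping and needing a manual restart.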
MGRID (Michigan Grid Research and Infrastructure Development)
• Various colleges and units at UM are very interested in grids and grid technology
• We have proposed formation of an MGRID center, funded by the University
• Size is to be 3 FTEs plus a director, with initial funding for three years
• The MGRID Center is a cooperative center of faculty and staff from participating units with a central core of technical staff, who together will carry out the grid development and deployment activities at the UM.
• US ATLAS grids would be a focus of such a center…we should find out about MGRID by July 2002
NMI Testbed
Michigan has been selected as an “unsponsored” NMI testbed member. Goals are to:
• Develop and release a first version of GRIDS and Middleware software
• Develop security and directory architectures, mechanisms and best practices for campus integration
• Put in place associated support and training mechanisms
• Develop partnership agreements with external groups focused on adoption of software
• Put in place a communication and outreach plan
• Develop a community repository of NMI software and best practices
NMI GRIDS
NMI-GRIDS components:
• Globus Toolkit 2.0 (resource discovery and management; authenticated access to and scheduling of distributed resources; coordinated performance of selected distributed resources to function as a dynamically configured "single" resource)
• GRAM 1.5
• MDS 2.2
• GPT v.?
• GridFTP
• Condor-G
• Network Weather Service
• All services should accept X.509 credentials for authentication and access control.
These are much the same type of tools we are already using on our testbed.
NMI EDIT (Enterprise and Desktop Integration Technologies)
NMI-EDIT components: the deliverables anticipated from NMI-EDIT for NMI Release 1 are of four types:
• Code - Code is being developed, adapted or identified for desktops (e.g. KX.509, openH.323, SIP clients) and for enterprise use (such as Metamerge connectors, Shibboleth modules for Apache, etc.). Code releases are generally clients, modules, plug-ins and connectors, rather than stand-alone executables.
• Objects - Objects include data and metadata standards for directories, certificates, and for use with applications such as video. Examples include eduPerson and eduOrg objectclasses, S/MIME certificate profiles, video objectclasses, etc.
• Documents - This includes white papers, conventions and best practices, and formal policies. There is an implied progression: the basic development of a new core middleware area results in a white paper (scenarios and alternatives) intended to promote an architectural consensus as well as to inform researchers and campuses. The white paper in turn leads to deployments, which require conventions, best practices and requisite policies. The core middleware areas being worked on within Release 1 include PKI, directories, account management, and video.
• Services - “Within the net” operations are needed to register unique names and keys for organizations, services, etc. Roots and bridges for security and directory activities must be provided.
Authenticated QoS Work
• We have been working with CITI (Andy Adamson) at UM on issues related to QoS (Quality of Service)
• This is a critical issue for grids and any applications which require certain levels of performance from the underlying network
• A secure signaling protocol has been developed and tested…it is being moved into the GSI (Globus Security Infrastructure)
• A “Grid Portal” application is planned to provide web-based secure access to grids.
• Our US ATLAS testbed could be a testing ground for such an application, if there is interest.
Network Connectivity Diagram
Future UM Network Layout
Future Additions to the UM Grid
• We have been working closely with others on campus in grid-related activities
• Tom Hacker (CAC/Visible Human) has asked us to install VDT 1.0 on two different installations on campus with significant compute resources.
• We hope to test how we can use and access shared resources as part of the US ATLAS grid testbed
• Primary issue is finding a block of time to complete the install and testing…
Hardware Resources – Arbor Lakes
• Linux Cluster
– 100 processor Beowulf cluster, equipment donated by Intel Corporation
– Dual 800 MHz Pentium III, 1 GB RAM per node (512 MB per processor)
– 30 GB hard drive per node
– Interconnect is Gigabit Ethernet
Cluster diagram labels:
• Computation Node
• 80 GB NFS Fileserver Node
• Node details: Intel Copper Gigabit Ethernet Adapter, 2 processors, 1 GB RAM, 30 GB hard drive
• Computation Nodes
• Gigabit Ethernet Interconnect
• “Master Node” for login, text and job submission
• 42 TB Tivoli Mass Storage System via NFS
Hardware Resources – Media Union
• AMD Linux Cluster
– 100 AMD 1800+ processors
– 2 per node
– 1 GB RAM per node (512 MB per processor)
– Interconnected with Mgnnect
– Red Hat Linux
– Distributed Architecture
To Do…
• Install VDT 1.0, first at Arbor Lakes and Media Union, then upgrade our own site
• Get network details at each site documented
• Start gigabit level testing to selected sites
• Get crash dumps to actually work
• Document and provide best practices on WWW site for networking (HENP+NSF?) and grid-related software…
• Determine how to leverage NMI testbed tools for USATLAS…