Experiences with NMI R2 Grids Software at Michigan • Shawn McKee • April 8, 2003 • Internet2 Spring Meeting
Outline • Apps tested at Michigan • A little about our environment and motivations • Experiences for each application
Grid Components Tested at Michigan • Globus • Condor-G • NWS • KX509 • GSI OpenSSH • GridConfig
MGRID – www.mgrid.umich.edu • A center to develop, deploy, and sustain an institutional grid at Michigan • Many groups across the University participate in compute/data/network-intensive research grants – increasingly Grid is the solution • ATLAS, NPACI, NEESGrid, Visible Human, NFSv4, NMI • MGRID allows work on common infrastructure instead of custom solutions
MGRID: Goals • Provide participating units knowledge, support and a framework to deploy Grid technologies • Provide test bench for existing and emerging Grid technologies • Coordinate activities within the national Grid community (GGF, GlobusWorld, …) • Provide a context for the University to invest in computational and other Grid resources
Data Grids for High Energy Physics • [Figure: tiered computing model for LHC data. Labels recovered from the diagram: Online System (~PByte/sec, ~100 MBytes/sec); Tier 0+1: Offline Farm, CERN Computer Ctr, ~25 TIPS, HPSS; Tier 1: France, Italy, UK, BNL centers with HPSS, ~2.5 Gbits/sec; Tier 2: Tier2 Centers, ~2.5 Gbps; Tier 3: Institutes, ~0.25 TIPS each, physics data cache, 100 - 1000 Mbits/sec; Tier 4: Workstations] • CERN/Outside Resource Ratio ~1:2 • Tier0/(Tier1)/(Tier2) ~1:1:1 • Physicists work on analysis “channels” • Each institute has ~10 physicists working on one or more channels
ATLAS Grid Testbed (US) • 10 sites • University groups: BU, IU, UM, NM, OU, SMU, UTA • Labs: ANL, BNL, LBNL • 15-20 users • All sites: • Globus & Condor • AFS, ATLAS software release • Dedicated resources • Accounts for most users on all machines • Applications: • Monte Carlo production w/ legacy code • Athena controlled Monte Carlo
Globus Experiences • We have been using Globus since V1.1.3 for our work on the US ATLAS testbed • The GPT packaging in the NMI release made installation trivial • There were some issues with configuration and coexistence: • We had to create a separate NMI gatekeeper to avoid impacting our production grid users • No major issues found… Globus just worked
Condor-G • Condor was already in use at our site and in our testbed • Installing Condor-G over existing Condor installations produced some problems: • Part of the difficulty was that we did not fully understand how Condor and Condor-G differ • A file ($LOG/.schedd_address) was owned by root rather than the condor user, and this “broke” Condor-G. Resolved via the testbed support list
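The ownership problem described above is the kind of thing a short pre-flight check can catch. The following is a minimal sketch, not part of Condor or the NMI release: the helper names `owner_name` and `wrong_owner` are hypothetical, and it assumes the Condor log directory is available in the `LOG` environment variable, as the slide's `$LOG/.schedd_address` suggests.

```python
import os
import pwd


def owner_name(path):
    """Return the login name owning `path` (or the numeric uid as a string)."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # The uid has no passwd entry; report the number instead.
        return str(uid)


def wrong_owner(path, expected_user):
    """Return True if `path` exists but is not owned by `expected_user`."""
    return os.path.exists(path) and owner_name(path) != expected_user


if __name__ == "__main__":
    # Assumption: $LOG points at the Condor log directory, as on the slide.
    log_dir = os.environ.get("LOG", ".")
    address_file = os.path.join(log_dir, ".schedd_address")
    if wrong_owner(address_file, "condor"):
        print(f"{address_file} is owned by {owner_name(address_file)}, "
              "expected condor")
```

Run on the submit host after installation, a check like this would flag the file before Condor-G fails at job submission time.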
Network Weather Service (NWS) • Installation was trivial via GPT (server/client bundles) • An interesting product for us, since we have done significant work with monitoring • NWS advantages: • Easy to automate network testing, once you understand the config details • Predicting future values of resources is unusual and potentially useful for grid scheduling • NWS disadvantages: • Difficult user interface (relatively obscure syntax to access measured/predicted data)
KX509 • This application was developed at Michigan and was used at a testbed level until recently • Michigan is a Kerberos site • MGRID wants to use KX509 for all “certificates” within campus • We were unable to get KX509 to work University-wide at Michigan… • The problem was a bad “CREN” root certificate, complicated by insufficient error checking/handling in the Globus code • We should have a “fixed” root certificate shortly…
GSI OpenSSH • Useful program that extends PKI functionality to OpenSSH • Allows “automatic” interactive login for proxy holders based upon Globus mapfile entries • Simple to install; in principle a superset of OpenSSH on the server end • We had a problem on a non-NMI host: a conflict involving the dynamic libraries it installs
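To illustrate the mapfile-based login mentioned above, here is a minimal sketch of how a certificate subject could be looked up in a Globus grid-mapfile. It assumes the usual entry layout of a double-quoted subject DN followed by one or more comma-separated local account names. The function names and the sample entries are hypothetical, and this is not how GSI OpenSSH itself implements the lookup.

```python
import shlex


def parse_gridmap(text):
    """Map each certificate subject (DN) to its list of local account names."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        try:
            # shlex honours the double quotes around the DN, which has spaces.
            fields = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes: skip the entry in this sketch
        if len(fields) != 2:
            continue  # skip malformed entries in this sketch
        dn, accounts = fields
        mapping[dn] = accounts.split(",")
    return mapping


def local_account(mapping, dn):
    """Return the first local account mapped to `dn`, or None if absent."""
    accounts = mapping.get(dn)
    return accounts[0] if accounts else None


# Hypothetical entries, for illustration only.
SAMPLE = '''
# subject DN                          local account(s)
"/O=Example/OU=Physics/CN=Jane Doe" jdoe
"/O=Example/OU=Physics/CN=John Roe" jroe,atlas
'''
```

With these sample entries, a proxy holder whose subject is `/O=Example/OU=Physics/CN=Jane Doe` would be mapped to the local account `jdoe`, while a subject with no entry would be refused.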
GridConfig • We tested GridConfig to determine how useful such a system would be for our needs • General impression was that this is a potentially useful tool. • Would like to see: • Improved config checking capability • More awareness of application interactions and config dependencies between applications
Conclusions • Applications were all easy to install via GPT • Configuration is still not easy, but tools like GridConfig should help in the long term • We hope to do much more detailed testing with future releases and are already planning to build our MGRID applications and environment on the NMI release