Assessing INFN's Globus activities, exploring basic GRID services, resource management, data access, fault monitoring, and more for future planning. Evaluation of GRAM, GIS, and security aspects included.
Status of Globus activities within INFN Massimo Sgaravatto INFN Padova for the INFN Globus group globus@infn.it
Globus @ INFN • WP “Installation and Evaluation of the Globus Toolkit” of the INFN-GRID Project (WP 1) • Goal: evaluate the Globus toolkit as a GRID framework providing basic services • Which services can be useful? • What needs to be integrated or modified? • What is missing? • Duration: 6 months • Results of this first evaluation used to plan future activities
Globus • Project led by Ian Foster and Carl Kesselman • Basic research on GRID (resource management, security, QoS, ...) • Development of Globus Toolkit • Core services for GRID tools and applications
Globus Architecture • Applications • High-level Services and Tools: GlobusView, Testbed Status, DUROC, MPI, MPI-IO, CC++, Nimrod/G, globusrun • Core Services: Nexus, GRAM, Metacomputing Directory Service, Globus Security Interface, Heartbeat Monitor, Gloperf, GASS • Local Services: Condor, MPI, TCP, UDP, LSF, Easy, NQE, AIX, Irix, Solaris
Tasks • Security • Mechanisms for user authentication are needed to access GRID resources • Evaluation of GSI service • Information Service • To discover the GRID resources (CPU, storage, network, …), mechanisms to “publish” them must be defined • Analysis of GIS service to “publish” information using a uniform and standard interface • Resource Management • A uniform interface is necessary to submit jobs on GRID resources • Uniform standard interface to different resource management systems • Uniform standard language for task management • Assessment of Globus GRAM service for resource allocation and process management
Tasks • Data Access and Migration • High-performance and reliable tools needed to “manage” data (data transfers, wide-area replication, …) • Assessment of Globus tools for data management (GASS, Globusftp, Replica Management tools) • Fault Monitoring • Faults in a GRID environment must be promptly detected and recovery mechanisms must be implemented • Evaluation of HBM service for fault detection • Execution Environment Management • Code migration (moving the application to where the job will actually be executed) as a possible implementation strategy • Evaluation of GEM service to support code migration • Globus deployment • Reduce complexity and manpower for Globus installation and maintenance
Status • Globus installed on ~30 machines in 11 sites • [Map of INFN sites: Trento, Udine, Milano, Torino, Padova, LNL, Trieste, Ferrara, Pavia, Genova, Parma, CNAF, Bologna, Pisa, Firenze, S. Piero, Perugia, LNGS, Roma, L’Aquila, Roma2, LNF, Sassari, Napoli, Bari, Lecce, Salerno, Cagliari, Cosenza, Palermo, Catania, LNS]
Security (GSI) • Already done: • Evaluation of the Globus security architecture • We like the “one time login” paradigm, but some improvements are needed • Globus certificates (for hosts and users) signed by the INFN certification authority • On-going activities: • Definition and implementation of the architecture of CAs • Responsibility of a task force of the European DataGrid project • Periodic update of CRL • “Management” of grid-mapfile updates (the grid-mapfile defines the mappings between local users and GRID users) • E.g.: a certain Globus resource must be available to all members of a specific physics group
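To make the grid-mapfile idea concrete, here is a minimal sketch of how such a file could be read, assuming the usual layout of one entry per line: a quoted certificate subject followed by one or more comma-separated local accounts. The subjects and account names in `SAMPLE` are hypothetical, and the helper functions are illustrative only, not part of Globus.

```python
import shlex


def parse_grid_mapfile(text):
    """Map each certificate subject (DN) to its list of local account names."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # keeps the quoted DN as a single token
        if len(parts) < 2:
            continue  # malformed entry: no local account given
        mapping[parts[0]] = parts[1].split(",")
    return mapping


def local_account(mapping, dn):
    """Return the first local account mapped to dn, or None if dn is unknown."""
    accounts = mapping.get(dn)
    return accounts[0] if accounts else None


# Hypothetical entries, for illustration only.
SAMPLE = '''
# subject                                    local account(s)
"/C=IT/O=INFN/L=Padova/CN=Example User"      atlas001
"/C=IT/O=INFN/L=Bologna/CN=Another User"     cms002,cms
'''

if __name__ == "__main__":
    users = parse_grid_mapfile(SAMPLE)
    print(local_account(users, "/C=IT/O=INFN/L=Padova/CN=Example User"))
```

Opening a resource to a whole physics group then amounts to keeping one such entry per group member up to date, which is why the management of updates is listed as an open activity.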
Information Service (GIS) • Already done: • INFN MDS server serving Globus 1.1.1 and 1.1.2 installations (single LDAP server) • Many problems using the “default” American MDS server • Definition and implementation of test architecture of GIS for Globus 1.1.3 installations (distributed model) • Web interface for browsing • On-going activities: • Improvement of performance (Netscape LDAP server as top level GIIS) • Tests on performance and scalability • Results used to define and implement the GIS architecture • Review the information gathered from the various machines and published in the GIS
GIS Architecture (test phase) • Top Level INFN GIIS (dc=infn, dc=it, o=grid) • INFN ATLAS GIIS (exp=atlas, o=grid) • Bologna GIIS (dc=bo, dc=infn, dc=it, o=grid) • Milano GIIS (dc=mi, dc=infn, dc=it, o=grid) • GRIS on the individual resources • Figure legend: Implemented; Implemented using INFNGRID distribution; To be implemented
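The distributed model can be illustrated with a small sketch that finds which GIIS sits directly above a given entry, using only the suffixes shown on the slide. The function names, the `hn=` host entry in the usage example, and the naive comma-based DN splitting (which ignores escaped commas) are assumptions made for illustration; this is not the MDS implementation.

```python
def normalize(dn):
    """Split a DN into lower-cased RDNs with whitespace removed (naive split)."""
    return [rdn.strip().lower().replace(" ", "") for rdn in dn.split(",") if rdn.strip()]


def is_under(child, parent):
    """True if child lies strictly below parent in the directory tree."""
    c, p = normalize(child), normalize(parent)
    return len(c) > len(p) and c[-len(p):] == p


def parent_giis(dn, suffixes):
    """Return the name of the closest GIIS above dn, or None if there is none."""
    best = None
    for name, suffix in suffixes.items():
        if is_under(dn, suffix):
            if best is None or len(normalize(suffix)) > len(normalize(suffixes[best])):
                best = name  # prefer the longest (most specific) matching suffix
    return best


# Suffixes taken from the test architecture on the slide.
GIIS_SUFFIXES = {
    "INFN top level": "dc=infn, dc=it, o=grid",
    "Bologna": "dc=bo, dc=infn, dc=it, o=grid",
    "Milano": "dc=mi, dc=infn, dc=it, o=grid",
}

if __name__ == "__main__":
    # Hypothetical host entry published by a GRIS in Bologna.
    print(parent_giis("hn=host1.bo.infn.it, dc=bo, dc=infn, dc=it, o=grid", GIIS_SUFFIXES))
    print(parent_giis("dc=bo, dc=infn, dc=it, o=grid", GIIS_SUFFIXES))
```

A query for a host entry resolves to its site GIIS, while the site suffix itself resolves to the top level INFN GIIS, mirroring the two-level hierarchy under test.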
Resource Management (GRAM) • Already done: • Job submission tests using Globus tools with real applications and in real production environments (GRAM as uniform interface to different underlying resource management systems [LSF, Condor, PBS]) • Some bugs found and fixed • Many, many memory leaks!!! • … • Some bugs can be solved without major re-design and/or re-implementation • Two major problems: • Scalability • Fault tolerance • Submission of Condor jobs to Globus resources (Condor-G and GlideIn) • Evaluation of RSL as uniform language to specify resources • More flexibility is required • Resource administrators should be allowed to define new attributes and users should be allowed to use them in resource specification expressions (Condor Class-Ads model) • “Cooperation” between GRAM and GIS • The information on characteristics and status of local resources and on jobs is not enough (farms must also be considered as local resources) • The default schema must be integrated with other info provided by the underlying resource management systems or by specific agents
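As an illustration of RSL as a uniform resource specification language, the sketch below assembles a simple conjunctive RSL string of the form `&(attribute=value)...` from a Python dictionary. The helper names are hypothetical, and the quoting rule (double quotes, with embedded quotes doubled) is an assumption based on RSL 1.0 conventions; the jobmanager decides which attributes it actually accepts.

```python
def rsl_value(value):
    """Render one RSL value, quoting it unless it is a plain token."""
    s = str(value)
    if s and all(ch.isalnum() or ch in "/._-" for ch in s):
        return s
    # Assumed RSL 1.0 quoting: wrap in double quotes, double any embedded quote.
    return '"' + s.replace('"', '""') + '"'


def build_rsl(attrs):
    """Build a conjunctive RSL request from a dict of attribute -> value(s)."""
    relations = []
    for name, value in attrs.items():
        if isinstance(value, (list, tuple)):
            rendered = " ".join(rsl_value(v) for v in value)
        else:
            rendered = rsl_value(value)
        relations.append(f"({name}={rendered})")
    return "&" + "".join(relations)


if __name__ == "__main__":
    print(build_rsl({"executable": "/bin/hostname", "count": 2}))
    # prints: &(executable=/bin/hostname)(count=2)
```

The builder places no restriction on attribute names, whereas the flexibility requested above concerns the resource side: letting administrators define and publish new attributes that users can then reference, as in the Condor Class-Ads model.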
Resource Management (GRAM) • On-going activities: • Tests with GRAM API • Identify a set of useful attributes of a Condor pool, LSF cluster, or PBS cluster that should be reported to the GIS, and integrate them into the default schema • Tests with MPICH-G2
Globus deployment • Already done: • INFN-GRID 1.0 • Non-precompiled Globus 1.1.3 + bug fixes • Installation instructions (in particular for INFN customizations) • INFN-GRID 1.1 • Precompiled Globus 1.1.3 for Linux Red Hat 6.x • gsiwuftpd • Support for LSF and Condor as underlying resource management systems • Possibility to implement INFN customizations • Certificates signed by INFN CA • Preliminary architecture for GIS • Installation instructions • INFN-GRID 1.2 • All INFN-GRID 1.1 functionality, plus: • Support for Solaris 2.6 • Support for PBS as resource management system • Support for GDMP (for Linux) • Tool to upgrade from INFN-GRID 1.1 to INFN-GRID 1.2 • Installation instructions
Globus deployment • On-going activities: • Web software repository • INFN-GRID 1.3 • Fixes for Globus jobmanager memory leaks • Support for Solaris 7 • Full support for GDMP • Distribution of various Globus compilations (Kerberos, MPICH-G2) • INFN-GRID toolkit available to DataGrid partners • Globus team interested in this toolkit
Data Management • Already done: • Preliminary tests with GASS and gsiftp • To do: • Tests with GlobusFTP and Replica Catalog Software (Globus Data Grid Alpha Release 2)
Other tasks • Fault Monitoring (HBM) • Evaluation of HBM for fault detection (for “system” and “user” processes) • Data collectors (implementing automatic recovery mechanisms) • … but the HBM package is not seeing active development • Execution Environment Management (GEM) • Evaluation of GEM as service for code migration • … but the GEM service now provides only limited capabilities (executable staging)
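The fault-detection idea behind a heartbeat service can be sketched generically: each monitored process reports periodically, and a data collector flags any process whose last report is older than a timeout. The class below is a hypothetical illustration of that pattern only; it does not use or reproduce the HBM API, and timestamps are passed in explicitly to keep the example deterministic.

```python
class HeartbeatMonitor:
    """Generic timeout-based fault detector (illustrative, not the HBM API)."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence tolerated before a fault is reported
        self.last_seen = {}      # process id -> time of the most recent heartbeat

    def beat(self, process_id, now):
        """Record a heartbeat received from process_id at time now."""
        self.last_seen[process_id] = now

    def failed(self, now):
        """Return the sorted ids of processes silent for longer than the timeout."""
        return sorted(
            pid for pid, seen in self.last_seen.items() if now - seen > self.timeout
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout=30)
    monitor.beat("system-daemon", now=0)   # hypothetical "system" process
    monitor.beat("user-job", now=20)       # hypothetical "user" process
    print(monitor.failed(now=40))
```

An automatic recovery mechanism, as mentioned for the data collectors, would hook into the result of `failed`, for example by restarting or resubmitting the flagged processes.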
Other info • http://www.infn.it/globus