INFN experience with Globus GIS A. Cavalli - F. Semeria INFN Grid Information Services workshop CERN, 28-29 March 2001
Introduction • In a distributed environment like a Grid, one of the primary needs is to collect and retrieve resource information. • Within the Globus model, the Grid Information Service (GIS) is the way of making information available to Grid applications.
GIS=GRIS+GIIS • Globus 1.1.3 implements the GIS by using two kinds of LDAP servers: • GRIS (Grid Resource Information Service) runs on each resource (machine). It uses an LDAP shell backend to gather the resource configuration and status. It registers itself with a GIIS, providing information about itself. • GIIS (Grid Index Information Service) is an LDAP server that runs on an organizational server and collects and caches the information provided by the GRISes registered under it.
INFN implementation • INFN implemented a hierarchical structure of GIISes based on INFN departments (about 25) • Each GRIS registers itself with the site GIIS, which in turn registers itself with the top level INFN GIIS
Top level GIIS: dc=infn,dc=it,o=grid
• GIIS Bologna: dc=bo,dc=infn,dc=it,o=grid (GRIS below it)
• GIIS Milano: dc=mi,dc=infn,dc=it,o=grid (GRIS below it)
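The hierarchy above can be sketched in a few lines of Python. This is only an illustration, not Globus code: the `REGISTERS_TO` table and both function names are hypothetical, and the DN comparison is a naive suffix test that ignores escaped commas.

```python
# Hypothetical registration table: suffix of a server -> suffix of the GIIS
# it registers with (None marks the top level INFN GIIS).
REGISTERS_TO = {
    "dc=bo,dc=infn,dc=it,o=grid": "dc=infn,dc=it,o=grid",
    "dc=mi,dc=infn,dc=it,o=grid": "dc=infn,dc=it,o=grid",
    "dc=infn,dc=it,o=grid": None,
}


def registration_chain(suffix):
    """Return the suffixes visited when walking from `suffix` up to the top level."""
    chain = []
    while suffix is not None:
        chain.append(suffix)
        suffix = REGISTERS_TO[suffix]
    return chain


def is_under(child_dn, parent_dn):
    """True if child_dn lies in the subtree rooted at parent_dn.

    Naive check: split on commas, ignore case and surrounding spaces,
    and compare the trailing components.
    """
    def normalize(dn):
        return [rdn.strip().lower() for rdn in dn.split(",")]

    child, parent = normalize(child_dn), normalize(parent_dn)
    return len(child) >= len(parent) and child[-len(parent):] == parent
```

For example, `registration_chain("dc=bo,dc=infn,dc=it,o=grid")` walks from the Bologna GIIS to the top level INFN GIIS, and `is_under` tells whether an entry belongs to a given GIIS subtree.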
Information flow • Information is not pushed periodically from a GRIS to a GIIS; instead, the GIIS queries the GRISes when an application needs information • Information is stored in cache for a period of time (TTL=Time To Live) • The higher the level of the GIIS, the higher the TTL and the lower the level of detail
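The pull-plus-cache behaviour described on this slide can be illustrated with a minimal Python sketch. It is not the Globus implementation: the class name `CachingIndex`, the `fetch` callable (standing in for a query to a GRIS) and the injectable `clock` are assumptions made for the example.

```python
import time


class CachingIndex:
    """Pull-based cache: a resource is queried only on a miss or after TTL expiry."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # callable(host) -> data; stands in for a GRIS query
        self._ttl = ttl_seconds    # how long a cached answer stays valid
        self._clock = clock        # injectable so that expiry can be tested
        self._cache = {}           # host -> (expiry_time, data)

    def lookup(self, host):
        now = self._clock()
        entry = self._cache.get(host)
        if entry is not None and now < entry[0]:
            return entry[1]        # cached answer is still fresh
        data = self._fetch(host)   # miss or expired: ask the resource
        self._cache[host] = (now + self._ttl, data)
        return data
```

A higher level index would simply be built with a larger `ttl_seconds`, trading freshness for fewer queries to the level below.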
INFN GIS implementation • 11 GIISes registered • More than 40 GRISes • The content of the INFN GIS is browsable via the URL http://bond.cnaf.infn.it/cgi-bin/mdsbrowse1.pl
GIS for DataGrid testbed • The proposed implementation for the DataGrid testbed is a hierarchical structure of GIISes with a root server at CERN. • Each organization has its top level GIIS registered on the root server, but can choose its own low level topology.
• CERN ROOT GIIS “O=Grid”
  • INFN GIIS “dc=infn, dc=it, o=Grid”
    • DEP. / LAB. GIIS “dc=xx, dc=infn…”
    • DEP. / LAB. GIIS “dc=yy, dc=infn…”
    • DEP. / LAB. GIIS “dc=zz, dc=infn…”
    • …
  • LIP GIIS “dc=lip, dc=pt, o=Grid”
  • FRENCH GIIS “dc=fr, o=Grid”
    • IN2P3 GIIS “dc=in2p3, dc=fr, o=Grid”
      • DEP. / LAB. GIIS “ou=xy, dc=in2p3…”
    • …
  • …
• GRISes (bottom level)
Experiments’ resources: topology
• …
• CERN CMS EXPERIMENT GIIS “ou=cms, o=Grid”
• INFN GIIS “dc=infn, dc=it, o=Grid”
• INFN CMS EXPERIMENT GIIS “dc=infn, dc=it, ou=cms, o=Grid”
• MILANO GIIS “dc=xx, dc=infn…”
• BOLOGNA GIIS “dc=yy, dc=infn…”
• GRISes
Experiments’ resources: “howto”
{deploy}/etc/grid-info-site.conf
--------------------------------
(…)
# this entry is for access-control only.
# it uses all the global configuration data set above.
dn: service=MDS Resource, hn=*, service=MDS Registration, dc=*, dc=infn, dc=it, o=Grid
--------------------------------
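To make the wildcards in the `dn:` entry above concrete, here is a small Python sketch of one possible reading: each `*` stands for any value of that DN component. This is an assumption for illustration only; the actual matching rules applied by the Globus server are not described in these slides, and the host name used in the usage note is made up.

```python
from fnmatch import fnmatchcase


def split_dn(dn):
    """Split a DN into (attribute, value) pairs, lower-cased and stripped.

    Naive: does not handle escaped commas or multi-valued components.
    """
    pairs = []
    for rdn in dn.split(","):
        attr, value = rdn.split("=", 1)
        pairs.append((attr.strip().lower(), value.strip().lower()))
    return pairs


def dn_matches(pattern, dn):
    """True if dn has the same components as pattern, with '*' as a value wildcard."""
    p, d = split_dn(pattern), split_dn(dn)
    if len(p) != len(d):
        return False
    return all(
        p_attr == d_attr and fnmatchcase(d_value, p_value)
        for (p_attr, p_value), (d_attr, d_value) in zip(p, d)
    )


PATTERN = ("service=MDS Resource, hn=*, service=MDS Registration, "
           "dc=*, dc=infn, dc=it, o=Grid")
```

Under this reading, a registration coming from a hypothetical host `hn=host1.bo.infn.it` under `dc=bo, dc=infn, dc=it, o=Grid` matches `PATTERN`, while one under `dc=lip, dc=pt, o=Grid` does not.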
Some tests • We have tested how performance depends on caching and CPU load. • Tests were run over a WAN. • The same queries on a GIIS take < 1 sec. when the cache is on and > 10 sec. when it is off
Some tests (cont.) • When a GRIS has a loaded CPU and the cache has expired, the response time from its GIIS is much longer (> 1 min. vs 1 sec.) • When a GIIS itself has a loaded CPU, the response time is longer (6-7 sec.) even if the cache has not expired: this happens when the GIIS machine is also used for computation…
Performance • In the worst case the whole set of machines must be queried. • Some indexing technique should be used to prune the search space (currently the GIIS backend always fetches data for every registered host). • A periodic information update mechanism could also be investigated.
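One way to picture search space pruning is an inverted index over attributes that rarely change, so that only hosts which can match a query are contacted. The Python sketch below is an illustration of that idea, not a description of the GIIS backend; the class `AttributeIndex` and the attribute names `os` and `arch` are hypothetical.

```python
from collections import defaultdict


class AttributeIndex:
    """Inverted index: (attribute, value) -> set of registered hosts."""

    def __init__(self):
        self._hosts = set()
        self._index = defaultdict(set)

    def register(self, host, static_attrs):
        """Record a host together with its slowly changing attributes."""
        self._hosts.add(host)
        for attr, value in static_attrs.items():
            self._index[(attr, value)].add(host)

    def candidates(self, **wanted):
        """Hosts whose indexed attributes match every requested value.

        With no constraints, every registered host is a candidate,
        which corresponds to the worst case of querying all machines.
        """
        result = set(self._hosts)
        for attr, value in wanted.items():
            result &= self._index.get((attr, value), set())
        return result
```

Only the hosts returned by `candidates(...)` would then need a live query for dynamic data such as load.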
Security and access policies • In the current implementation any machine can register itself with a GIIS • There is no access control when searching the GIIS. From any LDAP client I can run: ldapsearch -p 389 -h mds.infn.it -b "o=grid" -s sub "*=*" and get all the information from the GIIS
Conclusions • The Globus Information Service is based on a standard protocol (LDAP). • It provides flexibility and a potentially good distributed data model. • But...
Conclusions (cont.) • A good topology for the HEP experiments must still be implemented • The GRIS must be extended with new information providers • Lack of server redundancy/replication • Performance & security must be improved • Superior knowledge: referral to upper GIIS not implemented • All the information is represented in text format -> no numerical comparison allowed
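The last point can be illustrated with a short Python example: values stored as text compare character by character, so ordering and thresholds give wrong answers unless the client converts them to numbers first. The attribute values below are made up for the illustration.

```python
# Hypothetical attribute values as a client would receive them: plain text.
free_cpus = ["9", "10", "2"]

# Text comparison is lexicographic: "10" sorts before "2" and "9".
sorted_as_text = sorted(free_cpus)

# Converting on the client side restores the numerical order.
sorted_as_numbers = sorted(free_cpus, key=int)


def at_least(value_text, threshold):
    """Numerical threshold test done by the client after conversion."""
    return int(value_text) >= threshold


print(sorted_as_text)     # lexicographic order
print(sorted_as_numbers)  # numerical order
```

The conversion has to happen in the client, after all entries have been fetched, which is why this limitation also affects performance.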
Work in progress & todo • Generalizing GIS documentation for DataGrid (see also: INFN kit 1.3). • Preparing to test the alpha release of the new MDS infrastructure: • OpenLDAP 2.0 • GSI authentication • Improved backend performance • Investigating LDAP features: aliases, referrals, LDBM… • Data replication (with Netscape?)
Documentation • The documentation is currently on: www.infn.it/grid where pointers can be found for: • INFN Globus documentation (www.infn.it/globus) • INFN Globus toolkits distribution (www.pi.infn.it/grid/dist) • INFN testbed (www.infn.it/testbed-grid) • For testbed Information Service support: mailing list: is-datagrid@infn.it www: marianne.in2p3.fr