INFN Experience With Globus GIS
A. Cavalli - F. Semeria, INFN
First INFN Grid Workshop, Catania, 9-11 April 2001

Introduction
• In a distributed environment like a Grid, one of the primary needs is to collect and retrieve resource information.
• Within the Globus model, the Grid Information Service (GIS) is the way of making information available to Grid applications.

The Globus GIS
• Based on LDAP directory services.
• A directory is similar to a database, but specialized in hierarchical information storage/retrieval.

Why LDAP?
PROs:
• It is a standard way to describe and collect data.
• It provides a distributed topological model for the data.
CONs:
• Directories are designed more for reading than for writing. This is good for something like DNS, but not for storing dynamic data such as the CPU load of a machine.

Data & Schema Design
• Define the useful data to publish on the GIS.
• Manage dynamic and static data in different ways.
• Take advantage of the flexibility of the LDAP objectclass schema.
• Complete the description of resources in the Globus schema.

GIS = GRIS + GIIS
Globus 1.1.3 implements the GIS by using two kinds of LDAP servers:
• GRIS (Grid Resource Information Service): runs on each resource (machine). Its LDAP server uses a shell backend to gather the resource configuration and status. It registers itself with a GIIS, providing information about itself.
• GIIS (Grid Index Information Service): an LDAP server that runs on an organizational server and collects and caches the information provided by the GRISes registered under it.

Registration & Data Collecting
• GRIS and GIIS servers send a registration (via LDAP) to their upper Index Server every 5 minutes.
• When queried, a GIIS:
  • scans its registrations
  • kicks out the expired ones
  • collects data from the registered resources
  • returns the collected data and any still-valid cached data
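
The query-time behaviour listed above can be sketched in a few lines of Python. This is only an illustration, not the Globus code: the `Giis` class, its method names, and the 10-minute expiry window are assumptions (the slide gives only the 5-minute registration period), and caching is left out to keep the registration logic visible.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

REGISTRATION_PERIOD = 5 * 60  # seconds between registrations (from the slide)
REGISTRATION_LIFETIME = 2 * REGISTRATION_PERIOD  # assumed expiry window


@dataclass
class Giis:
    """Toy index server: remembers who registered and when."""
    # resource name -> time of its last registration
    registrations: Dict[str, float] = field(default_factory=dict)
    # resource name -> callable that returns the resource's entries
    fetchers: Dict[str, Callable[[], List[dict]]] = field(default_factory=dict)

    def register(self, name: str, fetch: Callable[[], List[dict]],
                 now: Optional[float] = None) -> None:
        """Record (or refresh) a registration from a lower-level server."""
        self.registrations[name] = time.time() if now is None else now
        self.fetchers[name] = fetch

    def query(self, now: Optional[float] = None) -> List[dict]:
        """Scan registrations, drop expired ones, pull data from the rest."""
        now = time.time() if now is None else now
        expired = [name for name, stamp in self.registrations.items()
                   if now - stamp > REGISTRATION_LIFETIME]
        for name in expired:
            del self.registrations[name]
            del self.fetchers[name]
        results: List[dict] = []
        for name in self.registrations:
            results.extend(self.fetchers[name]())
        return results
```

A resource that stops sending registrations simply disappears from the answers once its last registration is older than the assumed lifetime.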

Caching
• Information is pulled by higher-level GIISes from lower-level GIIS/GRIS resources upon request.
• Information is stored in a cache for a period of time (TTL = Time To Live).
• The higher the level of the GIIS, the higher the TTL and the lower the level of detail.
• Access control is needed so that only static data is stored at the higher levels.
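
A minimal sketch of pull-on-request caching with a TTL is shown below. The `TtlCache` class, the injectable `clock`, and the TTL values used in the usage note are all hypothetical; the sketch only illustrates the idea that a value is fetched from the lower level when it is missing or older than the TTL.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """Pull-through cache: fetch on demand, reuse until the TTL runs out."""

    def __init__(self, ttl_seconds: float, fetch: Callable[[str], Any],
                 clock: Callable[[], float] = time.time) -> None:
        self.ttl = ttl_seconds
        self.fetch = fetch      # how to ask the lower level for a key
        self.clock = clock      # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]     # still valid: answer from the cache
        value = self.fetch(key)  # missing or expired: pull from below
        self._entries[key] = (now, value)
        return value
```

Chaining two caches, for example `site = TtlCache(30, probe)` and `top = TtlCache(300, site.get)`, mimics a higher-level index that holds data longer than the site-level one, so the top level may return older values than the site level does.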

Extending the GRIS
• The GRIS uses programs called information providers to collect information from the machine.
• The requirements for an information provider are:
  • the program must emit LDIF objects to stdout
  • the objects generated must respect the Globus schema
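
As a rough illustration of the first requirement, the sketch below writes one LDIF entry to stdout. The DN layout, the objectclass `ExampleComputeResource`, and the attribute names are placeholders invented for this example and are not taken from the Globus schema; a real provider would have to use the objectclasses and attributes the schema defines. The helper also assumes plain values that need no base64 encoding or line folding.

```python
import os
import socket
import sys
from typing import Dict, List


def to_ldif(dn: str, attributes: Dict[str, List[str]]) -> str:
    """Format one entry as LDIF: a dn line, attribute lines, a blank line."""
    lines = [f"dn: {dn}"]
    for name, values in attributes.items():
        for value in values:
            lines.append(f"{name}: {value}")
    return "\n".join(lines) + "\n\n"


def main() -> None:
    host = socket.gethostname()
    entry = to_ldif(
        f"hn={host},dc=example,o=grid",  # hypothetical DN layout
        {
            "objectclass": ["ExampleComputeResource"],  # placeholder name
            "hostname": [host],                          # placeholder attribute
            "cpucount": [str(os.cpu_count() or 1)],      # placeholder attribute
        },
    )
    sys.stdout.write(entry)


if __name__ == "__main__":
    main()
```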

Resource Discovery (proposal)
• Top Level: get possible candidates using static data.
• Mid Level: narrow the search on local Index Servers.
• Resource Level: finalize the search using data available only at this level.
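
The three-step narrowing proposed above might look like the following sketch. Everything in it is hypothetical: the dictionaries stand in for the top-level and site-level indexes, `probe_load` stands in for a query to a single GRIS, and the attributes (`os`, `cpus`, load) are example criteria only.

```python
from typing import Callable, Dict, List


def discover(os_name: str, min_cpus: int, max_load: float,
             top_index: Dict[str, dict],
             site_indexes: Dict[str, Dict[str, dict]],
             probe_load: Callable[[str], float]) -> List[str]:
    """Narrow the search level by level and return the matching hosts."""
    # Top level: pick candidate sites from static data only.
    sites = [site for site, static in top_index.items()
             if static["os"] == os_name]
    # Mid level: narrow to hosts using each candidate site's own index.
    hosts = [host for site in sites
             for host, info in site_indexes[site].items()
             if info["cpus"] >= min_cpus]
    # Resource level: ask only the surviving hosts for dynamic data.
    return [host for host in hosts if probe_load(host) <= max_load]
```

The point of the ordering is that the expensive per-resource query runs only on the hosts that survived the two cheaper filters.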

Performance
• In the worst case the whole set of machines must be queried.
• Some indexing techniques should be used to implement search space pruning (currently the GIIS backend always fetches data for every registered host).
• A periodic information update mechanism could also be investigated.

Fault Tolerance
• Replication of Index Server data must be implemented (for now at the root level with Netscape LDAP server).
• Replica servers could be made available through a DNS-based mechanism.

Security and access policies
• In the current implementation any machine can register itself with a GIIS.
• There is no access control when searching the GIIS. From any LDAP client, anyone can run:
  ldapsearch -h mds.infn.it -p 389 -s sub -b "o=grid" "objectclass=*"
  and get all the information from the GIIS.

INFN implementation
• INFN has implemented a hierarchical structure of GIISes based on the INFN departments (about 25).
• Each GRIS registers itself with the site GIIS, which in turn registers itself with the top-level INFN GIIS.

Top level GIIS (dc=infn,dc=it,o=grid)
  GIIS Bologna (dc=bo,dc=infn,dc=it,o=grid)
    GRIS
  GIIS Milano (dc=mi,dc=infn,dc=it,o=grid)
    GRIS
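
The naming in this hierarchy encodes the registration tree: each site suffix is the top-level suffix with one more component in front. The helper functions below are a hypothetical sketch of that relationship; they assume simple DNs with no escaped commas and are not a full DN parser.

```python
from typing import List


def split_dn(dn: str) -> List[str]:
    """Split a simple DN into normalized components (no escaped commas)."""
    return [part.strip().lower() for part in dn.split(",")]


def parent(dn: str) -> str:
    """Drop the leftmost component to get the enclosing suffix."""
    return ",".join(split_dn(dn)[1:])


def is_under(dn: str, suffix: str) -> bool:
    """True if dn lies strictly below suffix in the naming tree."""
    d, s = split_dn(dn), split_dn(suffix)
    return len(d) > len(s) and d[-len(s):] == s
```

For example, `parent("dc=bo,dc=infn,dc=it,o=grid")` gives the top-level INFN suffix `dc=infn,dc=it,o=grid`.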

MDS Browser
http://bond.cnaf.infn.it/cgi-bin/mdsbrowse1.pl

Experiment resources
• Each GRIS can register itself with several different GIISes.
• This allows resources to be partitioned by experiment.

Experiment resources: topology
Nodes shown in the diagram:
• CERN
• CERN CMS EXPERIMENT GIIS: "ou=cms, o=Grid"
• INFN GIIS: "dc=infn, dc=it, o=Grid"
• INFN CMS EXPERIMENT GIIS: "dc=infn, dc=it, ou=cms, o=Grid"
• MILANO GIIS: "dc=mi, dc=infn…"
• BOLOGNA GIIS: "dc=bo, dc=infn…"
• GRIS

Some tests
• We tested how performance depends on caching and CPU load.
• The tests were made over the WAN.
• The same queries on a GIIS take:
  • ~1 sec. when the cache is valid
  • ~10 sec. or more when the cache has expired

Some tests (2)
• When a GRIS has a loaded CPU, the response time from its own GIIS is much longer when the cache has expired (>1 min. vs. 1 sec.).
• Also, when a GIIS has a loaded CPU and the cache has not expired, the response time is longer (6-7 sec.): this happens with GIISes that are also used for computation…

Some tests (3)
• We have also compared the Globus MDS with the OpenLDAP/LDBM server: with the same set of data, LDAP/LDBM response times are slower.

Conclusions
• The Globus Information Service is based on a standard protocol (LDAP).
• It provides flexibility and a potentially good distributed data model.
• But...

Improvements and further studies are needed, mainly:
• Performance:
  • more efficient backend
  • push/pull model
• Security: authentication & ACLs
• Data: more resource attributes & data typing
• Topology: more flexible than the current hierarchical-geographical one