Information System on gLite middleware. Vincent Bloch, CNRS-IN2P3. ACGRID School, Hanoi (Vietnam), November 5th, 2007. Credits: Valeria Ardizzone and other EGEE colleagues.
Information System
• What is it?
  • A system that collects information on the state of resources
• Why?
  • To discover the resources of the grid and their nature
  • To provide data that helps whoever manages the workload do so more efficiently
  • To check the health status of resources
• How?
  • By monitoring the state of resources locally and publishing the relevant information to the information system
  • By adopting a data model that MUST be well known to all components that access the monitored information
  • By using different approaches, examined in the following slides
Design of Information Systems
• About measures
  • Measures SHOULD be relevant to what users want to achieve
  • Measures SHOULD be accurate enough to be considered valid
  • The rate at which measures are taken MUST be adequate for their intended use
• About the gathering of information
  • How and when should collected info be published?
  • Where should collected info be stored?
  • How long should this info be kept in storage?
• Querying the Information System
  • Where should queries be sent to get a response?
  • What syntax and protocols must be used to make queries?
  • What data model is adopted to describe resources?
• Security
  • Who is allowed to query the IS, and which types of queries may they run?
  • Management of user rights and credentials
Adopted Information Systems
• The BDII (Berkeley Database Information Index)
  • Has been adopted in the LCG middleware as the Information System provider
  • It is an evolution of the Globus Monitoring and Discovery Service (MDS)
  • LCG-2 currently adopts BDII as its Information System
  • It is based on Lightweight Directory Access Protocol (LDAP) servers
• The Relational Grid Monitoring Architecture (R-GMA)
  • It is a relational implementation of the Grid Monitoring Architecture (GMA) standardized by the Global Grid Forum (GGF)
  • It is strongly Web Services oriented
  • It will be adopted by the next releases of the gLite middleware
LCG Information System
LCG Information System
• LCG adopts a combination of solutions
• Globus MDS
  • At the lowest level of the information system
  • To discover and monitor resources and publish information
  • Grid Security Infrastructure (GSI) credentials
  • Caching
• BDII
  • At the highest level of the system
  • Introduced because MDS had scalability problems
  • Used by the Resource Broker for the matchmaking process
  • Can be configured by each VO
  • Queries the underlying systems periodically (every 2 minutes)
• Hierarchical system
  • Information is collected at the leaves of a hierarchical tree and travels towards the root
  • Clients can query the tree at every level
  • The higher the level queried, the older the information obtained
Collecting Information
• Gathering of information at different levels
  • Lower level: Grid Resource Information Service (GRIS)
    • Collects information on the state of a given resource
    • One GRIS on top of each resource
    • A set of scripts and sensors that extract useful info about the resource
  • Medium level: Grid Index Information Service (GIIS)
    • Collects information on the resources of a given site
    • One GIIS for each site
  • Higher level: BDII
    • Collects information on the resources of a given VO
    • One BDII for each VO (suggested solution)
• Way of collecting info
  • Pull model (higher-level servers periodically query lower-level servers)
  • LDAP query model
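The pull model and its effect on data age can be illustrated with a small simulation. This is only a sketch: the class names `Gris` and `IndexServer`, the host names, and the integer timestamps are hypothetical and stand in for real LDAP servers.

```python
from dataclasses import dataclass, field


@dataclass
class Gris:
    """Leaf server reporting the state of one resource (hypothetical model)."""
    resource: str
    free_cpus: int

    def query(self, now: int) -> dict:
        # A GRIS measures its resource when queried, so the answer is fresh.
        return {self.resource: {"free_cpus": self.free_cpus, "measured_at": now}}


@dataclass
class IndexServer:
    """Stands in for a GIIS or a BDII: pulls from lower levels and caches."""
    name: str
    children: list
    cache: dict = field(default_factory=dict)

    def refresh(self, now: int) -> None:
        # Pull model: the higher level queries every lower-level server.
        merged = {}
        for child in self.children:
            merged.update(child.query(now))
        self.cache = merged

    def query(self, now: int) -> dict:
        # Answers come from the cache, not from a new measurement.
        return dict(self.cache)


gris = Gris("ce01.example.org", free_cpus=12)
giis = IndexServer("site-giis", [gris])
bdii = IndexServer("vo-bdii", [giis])

giis.refresh(now=0)     # site level pulls from its GRIS at t=0
gris.free_cpus = 3      # the resource state changes afterwards
bdii.refresh(now=120)   # VO level pulls from the GIIS cache at t=120

top_view = bdii.query(now=130)["ce01.example.org"]
leaf_view = gris.query(now=130)["ce01.example.org"]
```

Here `top_view` still carries the measurement taken at t=0, while `leaf_view` reflects the current state, which is why a client that needs fresh data may prefer to query a lower level.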
Globus MDS (the past)
• Globus Monitoring and Discovery Service (MDS)
  • It is a hierarchical system
  • Based on LDAP servers
  • GRISes are the leaves of the tree
  • GIISes are the intermediate nodes of the tree
  • The user can query the system at every level
  • The higher the information is in the tree, the older it is
• Grid Resource Information Service (GRIS)
  • One for each grid resource (CE or SE)
  • Collects info on that resource
  • Static or dynamic info
  • Uses measurement techniques (such as sensors)
• Grid Index Information Service (GIIS)
  • One for each site
  • Collects info from the GRISes below it
  • Caches info according to its validity time
  • Queries the underlying GRISes or GIISes when needed
[Figure: Globus Monitoring and Discovery Service hierarchy, with the root GIIS at CERN, national GIISes, site GIISes, and GRISes at the leaves]
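The GIIS behaviour of caching info according to its validity time and querying lower levels only when needed can be sketched as a small time-based cache. The class name `ValidityCache`, the `fetch` callback, and the 60-second validity are assumptions for illustration, not part of MDS itself.

```python
class ValidityCache:
    """Caches answers and re-queries the source only after they expire."""

    def __init__(self, fetch, validity_seconds):
        self._fetch = fetch                # callable that queries a lower level
        self._validity = validity_seconds  # how long a cached answer stays valid
        self._entries = {}                 # key -> (value, time it was fetched)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._validity:
            return entry[0]                # still valid: serve from the cache
        value = self._fetch(key)           # expired or missing: query below
        self._entries[key] = (value, now)
        return value


calls = []


def query_gris(resource):
    # Hypothetical stand-in for an LDAP query against a GRIS.
    calls.append(resource)
    return {"free_cpus": 12}


cache = ValidityCache(query_gris, validity_seconds=60)
cache.get("ce01.example.org", now=0)    # miss: the GRIS is queried
cache.get("ce01.example.org", now=30)   # hit: served from the cache
cache.get("ce01.example.org", now=60)   # expired: the GRIS is queried again
```

A longer validity time reduces load on the lower levels at the cost of older answers, which is the trade-off behind the remark that information higher in the tree is older.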
BDII (the present)
• The Berkeley Database Information Index (BDII)
  • Developed within the LCG project
  • Solves the instability problems that MDS shows when the number of sites grows too large
  • Sits on top of the site GIISes
  • One for each VO
  • Centralized system
  • Three levels of hierarchy
  • Accessed by the Workload Management System
• Way of working
  • One GRIS for each resource
  • One GIIS for each site, collecting info from the GRISes below it
  • One BDII for a given VO, collecting info from the GIISes below it
  • Two LDAP servers, one for write access and one for read access
  • Every two minutes a cron job runs a script that collects info from a list of site GIISes
  • The list of GIISes is kept in the BDII configuration file
[Figure: BDII hierarchy, with one BDII on top of the site GIISes and the GRISes below each GIIS]
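One way to picture the two-server arrangement is as a double buffer: the periodic refresh fills the write side while clients keep reading the other side. The sketch below assumes the two sides swap roles after each refresh, which the slide does not state; `DoubleBufferedIndex` and the sample data are hypothetical.

```python
class DoubleBufferedIndex:
    """Two stores: one serves reads while the other is being rewritten."""

    def __init__(self):
        self._slots = [{}, {}]
        self._read = 0  # index of the slot currently serving reads

    def lookup(self, key):
        return self._slots[self._read].get(key)

    def refresh(self, sources):
        # Build the new content on the side that is not serving reads.
        write = 1 - self._read
        fresh = {}
        for fetch in sources:      # one callable per configured site GIIS
            fresh.update(fetch())
        self._slots[write] = fresh
        self._read = write         # assumption: roles swap once the write is done


index = DoubleBufferedIndex()
index.refresh([lambda: {"ce01": 12}, lambda: {"se01": 0}])

seen_during_refresh = []


def slow_site():
    # A reader arriving in the middle of a refresh still sees the old data.
    seen_during_refresh.append(index.lookup("ce01"))
    return {"ce01": 3}


index.refresh([slow_site])
```

In a real deployment the refresh would be triggered by the cron job every two minutes, and the sources would be the GIISes listed in the BDII configuration file.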
R-GMA (the future)
• The Relational Grid Monitoring Architecture (R-GMA)
  • It is the relational implementation of the GMA defined by the GGF
  • Adopts a database model with tables and relations between tables
  • Implements a virtual database
  • The user queries R-GMA as if querying a conventional database (SQL string)
  • Implements different types of queries
• The information
  • Is produced and accessed locally at its site
  • Is always up to date
  • Can be collected by an entity (secondary producer) so that it can be accessed faster (cache)
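The idea of publishing tuples into tables and querying them with an SQL string can be illustrated with an in-memory SQLite database. This is only an analogy: R-GMA has its own producer and consumer interfaces, and the table `ce_status`, its columns, and the host names are hypothetical.

```python
import sqlite3

# In-memory database standing in for the R-GMA virtual database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ce_status (site TEXT, ce TEXT, free_cpus INTEGER)")

# Producer side: each site publishes tuples describing its own resources.
conn.executemany(
    "INSERT INTO ce_status VALUES (?, ?, ?)",
    [
        ("site-a", "ce01.example.org", 12),
        ("site-a", "ce02.example.org", 0),
        ("site-b", "ce03.example.org", 40),
    ],
)

# Consumer side: the query is an ordinary SQL string.
rows = conn.execute(
    "SELECT ce, free_cpus FROM ce_status "
    "WHERE free_cpus >= ? ORDER BY free_cpus DESC",
    (10,),
).fetchall()
conn.close()
```

The consumer does not need to know which site produced each row, which is the sense in which the database is virtual.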
GLUE Schema
• Grid Laboratory Uniform Environment (GLUE) Schema
  • It is a data model that describes information on grid resources (static and dynamic) in a meaningful way
  • It is the result of a collaboration between the EU-DataTAG and iVDGL projects
  • EGEE, NorduGrid, LCG and Grid3/OSG contributed to the definition of the schema
• XML Schema
  • The GLUE Schema is now being mapped to an XML representation
  • http://infnforge.cnaf.infn.it/glueinfomodel/Spec/V12/R1
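To make the data model concrete, the sketch below parses one LDIF-style entry for a Computing Element into a dictionary. The attribute names follow the GLUE 1.x LDAP rendering as commonly published by BDIIs, but they should be checked against the specification linked above; the entry itself, including the host name and the simplified DN, is invented for illustration.

```python
SAMPLE_ENTRY = """\
dn: GlueCEUniqueID=ce01.example.org:2119/jobmanager-pbs-short,mds-vo-name=local,o=grid
GlueCEUniqueID: ce01.example.org:2119/jobmanager-pbs-short
GlueCEStateFreeCPUs: 12
GlueCEStateWaitingJobs: 3
GlueCEAccessControlBaseRule: VO:atlas
GlueCEAccessControlBaseRule: VO:biomed
"""


def parse_entry(ldif: str) -> dict:
    """Parse a single LDIF-style entry into {attribute: [values]}.

    Simplification: folded lines and base64 values are not handled.
    """
    entry = {}
    for line in ldif.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition(": ")
        entry.setdefault(name, []).append(value)  # attributes may repeat
    return entry


def supports_vo(entry: dict, vo: str) -> bool:
    # A CE is usable by a VO if one of its access rules names that VO.
    return f"VO:{vo}" in entry.get("GlueCEAccessControlBaseRule", [])


ce = parse_entry(SAMPLE_ENTRY)
free_cpus = int(ce["GlueCEStateFreeCPUs"][0])
```

Static attributes such as the CE identifier and dynamic ones such as the number of free CPUs sit side by side in the same entry, which is what lets a broker match jobs against the current state.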
lcg-info
• -h/--help: shows the help
• --list-attrs: prints the list of possible attributes
• --list-ce: lists the CEs that satisfy a query, or all the CEs if no query is given
• --list-se: lists the SEs that satisfy a query, or all the SEs if no query is given
• --bdii: specifies a BDII in the form <hostname>:<port>. If not given, the value of the environment variable LCG_GFAL_INFOSYS is used. If that is not defined, the command returns an error.
• --vo: restricts the output to CEs or SEs where the given VO is authorized
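The fallback rule for the BDII endpoint can be written out as a short helper. This is a sketch of the rule as described on the slide, not the tool's actual implementation: the functions `resolve_bdii` and `list_ce_command` are hypothetical, the command is assumed to be installed as `lcg-info`, and the host and port are example values.

```python
import os


def resolve_bdii(explicit=None, environ=None):
    """Return (host, port), preferring --bdii over LCG_GFAL_INFOSYS."""
    env = os.environ if environ is None else environ
    endpoint = explicit or env.get("LCG_GFAL_INFOSYS")
    if not endpoint:
        # Mirrors the documented behaviour: no BDII known means an error.
        raise RuntimeError("no BDII given and LCG_GFAL_INFOSYS is not set")
    host, sep, port = endpoint.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected <hostname>:<port>, got {endpoint!r}")
    return host, int(port)


def list_ce_command(vo, bdii=None, environ=None):
    """Build an argument list that lists the CEs authorized for a VO."""
    host, port = resolve_bdii(bdii, environ)
    return ["lcg-info", "--list-ce", "--vo", vo, "--bdii", f"{host}:{port}"]


fake_env = {"LCG_GFAL_INFOSYS": "bdii.example.org:2170"}
argv = list_ce_command("biomed", environ=fake_env)
```

The resulting list can be passed to `subprocess.run` on a machine where the gLite user interface is installed.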
lcg-infosites
• -h/--help: help option
• --vo: VO name (mandatory)
• --is: specifies a non-default top-level BDII
• Some options:
  • se: the names of the SEs supported by the user's VO
  • ce: information on the number of CPUs, running jobs, etc.
  • rb: the names of the RBs available for each VO
  • sitenames: the names of the LCG sites
  • tag: the names of the tags for the software installed at each site, printed together with the corresponding CE
  • closeSE: the names of the CEs where the user's VO is allowed to run, together with their corresponding closest SEs
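A thin wrapper can enforce the rules listed above before the command is run: the VO is mandatory and the selector must be one of the listed options. The helper `infosites_command` is hypothetical, the command is assumed to be installed as `lcg-infosites`, and the placement of the selector after the flags is an assumption about the argument order.

```python
# Selectors taken from the list above.
SELECTORS = {"se", "ce", "rb", "sitenames", "tag", "closeSE"}


def infosites_command(vo, selector, top_bdii=None):
    """Build an argument list for lcg-infosites after validating the inputs."""
    if not vo:
        raise ValueError("--vo is mandatory")
    if selector not in SELECTORS:
        raise ValueError(f"unknown selector {selector!r}")
    argv = ["lcg-infosites", "--vo", vo]
    if top_bdii:
        # Optional: point the tool at a non-default top-level BDII.
        argv += ["--is", top_bdii]
    argv.append(selector)
    return argv


ce_query = infosites_command("biomed", "ce")
se_query = infosites_command("biomed", "se", top_bdii="bdii.example.org")
```

Validating the selector up front gives a clearer error than letting the tool fail on an unknown option.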
References
• gLite 3 User Guide
  • https://edms.cern.ch/file/722398/gLite-3-UserGuide.pdf
• GLUE Schema
  • http://infnforge.cnaf.infn.it/glueinfomodel/
• EGEE Library
  • http://egee.lib.ed.ac.uk/