The gLite Information System(s) Domenico Vicinanza, CERN EELA Tutorial, Santiago, September 2006
Information System
• What?
  • System to collect information on the state of resources
• Why?
  • To discover the resources of the grid and their nature
  • To give whoever manages the workload the data needed to do so more efficiently
  • To check the health status of resources
• How?
  • Monitoring the state of resources locally and publishing fresh data to the information system
  • Adopting a data model that MUST be well known to all components that want to access the monitored information
  • Using different approaches, which the next slides investigate
Santiago, Chile, EELA Tutorial, 06-07.09.2006
Uses of the IS in Grid
• If you are a middleware developer
  • Workload Management System: matching job requirements against Grid resources
  • Monitoring Services: retrieving information on the status and availability of Grid resources
• If you are a user
  • Retrieve information on Grid resources and their status
  • Get information on the status of your jobs
• If you are a site manager or a service
  • You “generate” the information, for example about your site or a given service
LCG Information System
• LCG adopts a combination of solutions
• Globus MDS
  • At the lowest level of the information system
  • To discover and monitor resources and publish information
  • Grid Security Infrastructure (GSI) credentials
  • Caching
• BDII
  • At the highest level of the system
  • Introduced because MDS had scalability problems
  • Used by the Resource Broker for the matchmaking process
  • Can be configured by each VO
  • Queries the underlying systems periodically (every 2 minutes)
• Hierarchical system
  • Information is collected at the leaves of a hierarchical tree and travels towards the root
  • Clients can query the tree at every level
  • The higher the level queried, the older the information obtained
Information Systems in gLite
• The BDII (Berkeley DB Information Index)
  • Has been adopted in the LCG middleware as the Information System provider
  • It is an evolution of the Globus Monitoring and Discovery Service (MDS)
  • LCG-2 currently adopts BDII as its Information System
  • It is based on a Lightweight Directory Access Protocol (LDAP) server
• The Relational Grid Monitoring Architecture (R-GMA)
  • Is a relational implementation of the Grid Monitoring Architecture (GMA) standardized by the Global Grid Forum (GGF)
  • It is strongly Web Services oriented
  • It uses standard SQL query syntax
Collecting Information
• Gathering of information at different levels
  • Lower level: Grid Resource Information Server (GRIS) - MDS
    • Collects information on the state of a given resource
    • One GRIS on top of each resource: CE, SE, RB, MyProxy
    • A set of scripts and sensors that try to extract useful information about the resource
  • Medium level: Grid Index Information Server (GIIS) - Local BDII
    • Collects information on the resources of a given site
    • One GIIS for each site
  • Higher level: Top-level BDII
    • Collects information on the resources of a given VO
    • One BDII for each VO (suggested solution)
• Way of collecting information
  • Pull model (higher-level servers periodically query lower-level servers)
  • LDAP query model
The hierarchy
• Way of working
  • One GRIS for each resource
  • One GIIS for each site, collecting information from the GRISes below it
  • One BDII for a given VO, collecting information from the GIISes below it
  • Two LDAP servers, one for write access and one for read access
  • Every two minutes a cron job runs a script that collects information from a list of site GIISes
  • The list of GIISes is placed in the configuration file of the BDII
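The pull model and its effect on data freshness can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the class names `Gris` and `Aggregator`, the `free_cpus` field and the host name are invented for illustration, and the cache is refreshed lazily on query, whereas the slides describe a cron job doing the refresh. No real LDAP traffic is involved.

```python
from dataclasses import dataclass, field
from typing import Optional

REFRESH_SECONDS = 120  # the slides quote a two-minute refresh period


@dataclass
class Gris:
    """Lowest level: reports the current state of one resource."""
    resource: str
    free_cpus: int  # hypothetical attribute, for illustration only

    def query(self, now: int) -> dict:
        # A GRIS always answers with the state measured at query time.
        return {self.resource: {"free_cpus": self.free_cpus, "measured_at": now}}


@dataclass
class Aggregator:
    """A GIIS (site level) or BDII (VO level): caches what it pulls from below."""
    name: str
    children: list
    cache: dict = field(default_factory=dict)
    last_pull: Optional[int] = None

    def query(self, now: int) -> dict:
        # Pull from the children only when the cache is at least one period old.
        if self.last_pull is None or now - self.last_pull >= REFRESH_SECONDS:
            self.cache = {}
            for child in self.children:
                self.cache.update(child.query(now))
            self.last_pull = now
        return dict(self.cache)


ce = Gris("ce01.example.org", free_cpus=8)
site = Aggregator("site-giis", [ce])
bdii = Aggregator("vo-bdii", [site])

bdii.query(now=0)             # first pull fills the cache at every level
ce.free_cpus = 2              # the resource changes state
stale = bdii.query(now=60)    # inside the refresh window: cached answer
fresh = bdii.query(now=120)   # window elapsed: data pulled again from below
```

Here `stale` still reports 8 free CPUs while `fresh` reports 2, which is the trade-off the slides describe: queries made higher in the tree are cheaper but may return older information.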
The LDAP Protocol
► LDAP structures data as a tree (the Directory Information Tree, DIT)
► Each entry is uniquely named
► Following the path from a node back to the root of the DIT, a unique name is built (the DN): “id=dv,ou=IT,or=CERN,st=Geneva,c=Switzerland,o=grid”
o=grid (root of the DIT)
  c=US
  c=Switzerland
    st=Geneva
      or=CERN
        ou=IT
          id=dv (objectClass: person; cn: Vicinanza D.; phone: 5555666; office: 28-r026)
          id=gv
          id=fd
        ou=EP
  c=Spain
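As a minimal sketch of how a DN is assembled from the path between an entry and the root, the Python function below joins the components in leaf-to-root order. The function name `build_dn` is hypothetical, and the sketch assumes values that need no escaping (real DNs follow the LDAP string-representation rules for special characters).

```python
def build_dn(path_from_root):
    """Join (attribute, value) pairs into a DN, most specific component first."""
    return ",".join(f"{attr}={value}" for attr, value in reversed(path_from_root))


# Path from the root of the DIT down to the entry, as in the slide's example.
path = [
    ("o", "grid"),
    ("c", "Switzerland"),
    ("st", "Geneva"),
    ("or", "CERN"),
    ("ou", "IT"),
    ("id", "dv"),
]

dn = build_dn(path)
parent_dn = build_dn(path[:-1])  # DN of the containing entry
```

With this input `dn` is the DN quoted on the slide, and dropping the last path element gives the DN of the parent node.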
R-GMA
• The Relational Grid Monitoring Architecture (R-GMA)
  • It is the relational implementation of the GMA defined by the GGF
  • Adopts a database model with tables and relations between tables
  • Implements a virtual database
  • The user queries R-GMA as if it were a conventional database (with an SQL string)
  • Implements different types of queries
• The information
  • Is produced and accessed locally at its site
  • Is always fresh
  • Can be collected by an entity (a secondary producer) so that it can be accessed faster
GMA Architecture and Relational Model
• The Producer stores its location (URL) in the Registry.
• The Consumer looks up producer URLs in the Registry.
• The Consumer contacts the Producer to get all the data.
• Or the Consumer can listen to the Producer for new data.
[Diagram: the Producer stores its location in the Registry; the Consumer looks up the location in the Registry, then executes a query on the Producer or streams data from it, e.g. SELECT * FROM people WHERE group=‘HR’]
Multiple Producers
• The Consumer will get all the URLs that could satisfy the query.
• The Consumer will connect to all the Producers.
• Producers that can satisfy the query will send the tuples to the Consumer.
• The Consumer will merge these tuples to form one result set.
[Diagram: Registry, Producer 1, Producer 2 and Consumer]
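The registry lookup and the merge across several producers can be sketched in Python as follows. This is an in-memory illustration only, not the R-GMA API: the classes `Registry`, `Producer` and `Consumer`, their methods, the URLs and the sample rows are all assumptions, and the registry holds producer objects where a real deployment would hold URLs to contact over the network.

```python
class Registry:
    """Maps a table name to the producers that publish rows for it."""

    def __init__(self):
        self._producers = {}

    def register(self, table, producer):
        self._producers.setdefault(table, []).append(producer)

    def lookup(self, table):
        return list(self._producers.get(table, []))


class Producer:
    def __init__(self, url, table, rows, registry):
        self.url = url
        self.table = table
        self.rows = rows
        registry.register(table, self)  # the "store location" step

    def select(self, predicate):
        # Return only the tuples that satisfy the consumer's condition.
        return [row for row in self.rows if predicate(row)]


class Consumer:
    def __init__(self, registry):
        self.registry = registry

    def query(self, table, predicate):
        merged = []
        for producer in self.registry.lookup(table):  # the "look up location" step
            merged.extend(producer.select(predicate))
        return merged  # one result set built from all producers


registry = Registry()
Producer("http://site-a.example.org/producer", "people",
         [{"name": "Ana", "group": "HR"}, {"name": "Luis", "group": "IT"}], registry)
Producer("http://site-b.example.org/producer", "people",
         [{"name": "Marta", "group": "HR"}], registry)

consumer = Consumer(registry)
hr = consumer.query("people", lambda row: row["group"] == "HR")
names = [row["name"] for row in hr]
```

The consumer sees a single result set (`names` holds the HR members from both sites), which is what lets R-GMA present distributed producers as one virtual database.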
SELECT * FROM CPULoad
Joins
SELECT S.URI, S.emailContact FROM Service S, ServiceStatus SS WHERE (S.URI = SS.URI AND SS.up = ‘n’)
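To see what such a join returns, the query can be tried against an in-memory SQLite database from Python. This is only a stand-in: R-GMA is not SQLite, and the table layouts, URIs and e-mail addresses below are invented sample data, with only the table and column names taken from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Service (URI TEXT PRIMARY KEY, emailContact TEXT);
    CREATE TABLE ServiceStatus (URI TEXT, up TEXT);
    INSERT INTO Service VALUES
        ('gsiftp://se01.example.org', 'se-admin@example.org'),
        ('https://rb01.example.org', 'rb-admin@example.org');
    INSERT INTO ServiceStatus VALUES
        ('gsiftp://se01.example.org', 'y'),
        ('https://rb01.example.org', 'n');
""")

# Contact addresses of every service whose status row says it is not up.
rows = conn.execute("""
    SELECT S.URI, S.emailContact
    FROM Service S, ServiceStatus SS
    WHERE S.URI = SS.URI AND SS.up = 'n'
""").fetchall()
conn.close()
```

Only the service marked as down is returned, together with the contact address stored in the other table.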
GLUE Schema
Definition and main goals
• Schema: a description of the objects and attributes needed to describe Grid resources, and of the relationships between the objects.
• Main goals:
  • Define a minimum common schema requirement for interoperability
    • Compute Elements, Network Elements, Storage Elements
  • Address the need for common schemas between projects
  • Be framework independent (LDAP, SQL, XML)
Glue Schema
• Grid Laboratory Uniform Environment (GLUE) Schema
  • It is a data model for describing information on grid resources (static and dynamic) in a meaningful way
  • It is the result of a collaboration between the EU-DataTAG and iVDGL projects
  • EGEE, NorduGrid, LCG and Grid3/OSG contributed to the definition of the schema
• XML Schema
  • The GLUE Schema is now being mapped to an XML representation
  • http://infnforge.cnaf.infn.it/glueinfomodel/Spec/V12/R1
Examples of attributes
• Operating System
  • OSName
  • OSRelease
  • OSVersion
• QueueState
  • RunningJobs
  • TotalJobs
  • QueueStatus
  • WaitQueueLength
  • WorstResponseTime
  • EstimatedResponseTime
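Attributes like these are what a matchmaking step can filter and rank on. The Python sketch below is purely illustrative: the `QueueInfo` class, the `ce_id` field, the `match` function, the status value "Production" and all sample values are assumptions, and only the attribute names come from the slide.

```python
from dataclasses import dataclass


@dataclass
class QueueInfo:
    ce_id: str                  # hypothetical identifier for the queue
    OSName: str
    QueueStatus: str
    RunningJobs: int
    TotalJobs: int
    WaitQueueLength: int
    EstimatedResponseTime: int  # assumed to be in seconds


def match(queues, os_name):
    """Keep open queues with the requested OS, shortest estimated wait first."""
    candidates = [
        q for q in queues
        if q.OSName == os_name and q.QueueStatus == "Production"
    ]
    return sorted(candidates, key=lambda q: q.EstimatedResponseTime)


queues = [
    QueueInfo("ce01.example.org/short", "Scientific Linux", "Production", 40, 55, 15, 900),
    QueueInfo("ce02.example.org/long", "Scientific Linux", "Production", 10, 12, 2, 120),
    QueueInfo("ce03.example.org/short", "Scientific Linux", "Closed", 0, 0, 0, 0),
]

ranked = [q.ce_id for q in match(queues, "Scientific Linux")]
```

The closed queue is discarded and the remaining two are ordered by estimated response time, so `ranked[0]` is the candidate a broker following this policy would try first.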
Site Element
Cluster Element
Computing Element
References
• gLite 3.0 User Guide
  • https://edms.cern.ch/file/722398/1.1/gLite-3-UserGuide.pdf
• R-GMA home page
  • http://www.r-gma.org/
• GLUE Schema
  • http://infnforge.cnaf.infn.it/glueinfomodel/
Questions…
Thanks to Roberto Barbera, who first developed these slides