Introduction on R-GMA. Shi Jingyan Computing Center IHEP. Content. R-GMA R-GMA concept R-GMA components Accounting system. Relational Grid Monitoring Architecture -- introduction.
Introduction on R-GMA Shi Jingyan Computing Center IHEP
Content
• R-GMA
• R-GMA concept
• R-GMA components
• Accounting system
Relational Grid Monitoring Architecture -- introduction
• Models the information infrastructure of a Grid as a set of Consumers (which request information), Producers (which provide information) and a single Registry (which mediates the communication between Producers and Consumers).
• Imposes a standard query language (a subset of SQL): a producer publishes tuples with an INSERT statement; a consumer queries tuples with a SELECT statement.
• All tuples carry a time-stamp to support monitoring.
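The publish/query split described above can be illustrated with a small sketch. It does not use any R-GMA interface: an in-memory SQLite database stands in for a producer's tuple store, and the table `cpuLoad`, its columns, and the helpers `publish` and `query` are names made up for the example.

```python
import sqlite3
import time

# Assumption: one in-memory database plays the role of a producer's tuple store.
store = sqlite3.connect(":memory:")
store.execute("CREATE TABLE cpuLoad (host TEXT, loadAvg REAL, ts REAL)")


def publish(host, load_avg, now=None):
    """Producer side: publish one tuple with INSERT and stamp it with a time."""
    ts = time.time() if now is None else now
    store.execute(
        "INSERT INTO cpuLoad (host, loadAvg, ts) VALUES (?, ?, ?)",
        (host, load_avg, ts),
    )


def query(sql, params=()):
    """Consumer side: retrieve tuples with SELECT."""
    return store.execute(sql, params).fetchall()


publish("node01", 0.42, now=100.0)
publish("node02", 1.75, now=101.0)
busy = query(
    "SELECT host, loadAvg, ts FROM cpuLoad WHERE loadAvg > ? ORDER BY ts",
    (1.0,),
)
```

Every row returned to the consumer carries the time-stamp that was attached when the tuple was published.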
R-GMA Introduction (cont.)
• Architecture: (diagram not reproduced in this text version)
R-GMA Introduction (cont.)
• The information resources of a VO are presented as a single virtual database containing a set of virtual tables.
• A single schema holds the name and structure of each virtual table in the system.
• A single registry holds, for each table, a list of the producers that have offered to publish rows for it.
• A consumer runs an SQL query against a table, and the registry selects the best producers to answer the query in a process called mediation. The consumer then contacts each producer directly, combines the information, and returns a set of tuples.
• The mediation process is hidden from the user. There is no central repository holding the contents of the virtual tables.
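A toy model may help to picture mediation. The classes `Producer` and `Registry` and the function `run_query` below are hypothetical names for this sketch only, and the mediation step is simplified: it returns every registered producer, whereas the real registry selects the best ones for the query.

```python
class Producer:
    """Holds some rows of a virtual table and answers queries from them."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = list(rows)

    def answer(self, predicate):
        return [row for row in self.rows if predicate(row)]


class Registry:
    """Remembers, per table name, which producers offered to publish to it."""

    def __init__(self):
        self._producers = {}

    def register(self, table, producer):
        self._producers.setdefault(table, []).append(producer)

    def mediate(self, table):
        # Assumption: every registered producer is relevant to the query.
        return list(self._producers.get(table, []))


def run_query(registry, table, predicate):
    """Consumer side: mediate, contact each producer directly, merge answers."""
    merged = []
    for producer in registry.mediate(table):
        merged.extend(producer.answer(predicate))
    return merged


registry = Registry()
registry.register("cpuLoad", Producer("siteA", [{"host": "a1", "load": 0.3}]))
registry.register("cpuLoad", Producer("siteB", [{"host": "b1", "load": 2.1}]))
rows = run_query(registry, "cpuLoad", lambda row: row["load"] > 1.0)
```

The registry stores only who publishes to which table. The rows themselves stay with the producers, which matches the point that no central repository holds the table contents.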
R-GMA Introduction (cont.)
• Producers:
  • Primary producer: the user's code periodically inserts tuples, which are then stored internally by the producer. The producer answers consumer queries from its own storage.
  • Secondary producer: populates its own storage by running its own query against the virtual table. The user code only sets the process running; the tuples come from other producers.
  • On-demand producer: has no internal storage; data is provided by the user code in direct response to a query forwarded to it by the producer service.
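The difference between the three producer types is mainly where the tuples live. The following sketch uses invented class and method names (`insert`, `refresh`, `answer`) and plain Python lists as storage; it only illustrates the data flow and is not how the R-GMA services are implemented.

```python
class PrimaryProducer:
    """User code inserts tuples; queries are answered from internal storage."""

    def __init__(self):
        self._store = []

    def insert(self, row):
        self._store.append(row)

    def answer(self, predicate):
        return [row for row in self._store if predicate(row)]


class SecondaryProducer:
    """Fills its own storage by querying other producers."""

    def __init__(self, sources, predicate):
        self._sources = sources
        self._predicate = predicate
        self._store = []

    def refresh(self):
        # The user code only starts this; the tuples come from the sources.
        self._store = [
            row
            for source in self._sources
            for row in source.answer(self._predicate)
        ]

    def answer(self, predicate):
        return [row for row in self._store if predicate(row)]


class OnDemandProducer:
    """No storage: user code supplies the data when a query arrives."""

    def __init__(self, fetch):
        self._fetch = fetch

    def answer(self, predicate):
        return [row for row in self._fetch() if predicate(row)]


site_a, site_b = PrimaryProducer(), PrimaryProducer()
site_a.insert({"site": "A", "jobs": 12})
site_b.insert({"site": "B", "jobs": 7})

archive = SecondaryProducer([site_a, site_b], lambda row: True)
archive.refresh()

live = OnDemandProducer(lambda: [{"site": "C", "jobs": 3}])
```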
R-GMA Introduction (cont.)
• Consumer: each consumer represents a single SQL SELECT query on the virtual database and obtains the answer tuples from the producers after mediation.
• Mediation: the query is first passed to the Registry, which identifies the producers that must be contacted, for each virtual table in the query, to answer it.
R-GMA Introduction (cont.)
• Types of query:
  • Continuous query: all new tuples matching the query are streamed into the consumer's tuple storage as soon as they are inserted into the virtual table by the producers.
  • One-time queries:
    • History-query: all versions of any matching tuples are returned.
    • Latest-query: only the tuples representing the "current state" are returned.
    • Static query: a database-like query whose results do not contain R-GMA time-stamps.
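The contrast between a continuous query and a one-time query can be sketched with a small in-memory table. `VirtualTable` and its methods are hypothetical names; a real consumer would receive streamed tuples from the producers rather than from a local object.

```python
class VirtualTable:
    def __init__(self):
        self._rows = []
        self._subscriptions = []  # (predicate, consumer storage) pairs

    def insert(self, row):
        self._rows.append(row)
        # Continuous queries see each matching tuple as soon as it is inserted.
        for predicate, storage in self._subscriptions:
            if predicate(row):
                storage.append(row)

    def continuous_query(self, predicate):
        """Start streaming future matching tuples into a new storage list."""
        storage = []
        self._subscriptions.append((predicate, storage))
        return storage

    def one_time_query(self, predicate):
        """Return the matching tuples that exist right now."""
        return [row for row in self._rows if predicate(row)]


table = VirtualTable()
table.insert({"host": "wn01", "state": "up"})
stream = table.continuous_query(lambda row: row["state"] == "down")
table.insert({"host": "wn02", "state": "down"})
table.insert({"host": "wn03", "state": "up"})
snapshot = table.one_time_query(lambda row: row["state"] == "up")
```

In this sketch `stream` only ever receives tuples inserted after the continuous query was started, while `snapshot` is computed once from what is already stored.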
R-GMA Introduction (cont.)
• Retention periods:
  • LatestRetentionPeriod: inserted into each tuple published by a Primary Producer; it remains there when the tuple is re-published by a Secondary Producer.
  • HistoryRetentionPeriod: each producer declares a HistoryRetentionPeriod for each table to which it publishes tuples.
• A latest-query returns only those tuples that have not exceeded their LatestRetentionPeriod for the table. A history-query returns all versions of tuples that have not exceeded the producer's HistoryRetentionPeriod for the table.
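The two retention rules can be written as simple filters. This is a sketch under assumptions not stated on the slide: each tuple is a dictionary with a `key` identifying what it describes, a time-stamp `ts` in seconds, and its own `latest_retention`; the "current state" is taken to be the newest tuple per key.

```python
def history_query(rows, now, history_retention):
    """All versions still inside the producer's history retention period."""
    return [row for row in rows if now - row["ts"] <= history_retention]


def latest_query(rows, now):
    """Newest tuple per key, kept only if its latest retention has not expired."""
    newest = {}
    for row in rows:
        current = newest.get(row["key"])
        if current is None or row["ts"] > current["ts"]:
            newest[row["key"]] = row
    return [
        row
        for _, row in sorted(newest.items())
        if now - row["ts"] <= row["latest_retention"]
    ]


rows = [
    {"key": "wn01", "state": "up", "ts": 100, "latest_retention": 60},
    {"key": "wn01", "state": "down", "ts": 150, "latest_retention": 60},
    {"key": "wn02", "state": "up", "ts": 90, "latest_retention": 60},
]

latest = latest_query(rows, now=180)
history = history_query(rows, now=180, history_retention=85)
```

With these sample values the latest-query keeps only the newer `wn01` tuple, because the single `wn02` tuple is 90 seconds old and has outlived its 60-second period. The history-query keeps both `wn01` versions.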
R-GMA Introduction (cont.)
• Web Service Architecture:
  • R-GMA conforms to the Web Services Architecture.
  • Six principal services: Primary Producer, Secondary Producer, On-demand Producer, Consumer, Registry and Schema.
  • Each service has one WSDL document.
  • Clients communicate with the services by exchanging messages.
  • Message sequences and formats are also specified in the WSDL.
R-GMA Introduction (cont.)
• R-GMA uses "SOAP messaging over http/s" in a request/response pattern.
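To show what a SOAP request body looks like in general, the sketch below builds a SOAP 1.1 envelope with the Python standard library. Only the envelope namespace is standard; the operation name `startQuery`, the element `select` and the namespace `urn:example:consumer` are placeholders invented here, since the real names are defined by each service's WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
EXAMPLE_NS = "urn:example:consumer"  # hypothetical service namespace


def build_request(select_statement):
    """Wrap a SELECT statement in a SOAP envelope (operation name is made up)."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{EXAMPLE_NS}}}startQuery")
    ET.SubElement(operation, f"{{{EXAMPLE_NS}}}select").text = select_statement
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request("SELECT * FROM cpuLoad")
```

In a request/response exchange this string would be sent as the body of an HTTP POST, and the reply would be another envelope carrying the result or a fault.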
Apel: accounting in LCG-2
• The Apel software is composed of the Apel Log Processor and the Flexible Archiver.
• Apel Log Processor: parses log files to extract job information and publishes it using R-GMA.
• Flexible Archiver: located at the Grid Operation Center (GOC). It receives the data for the accounting table from all sites participating in the R-GMA configuration, so it contains an amalgamation of the accounting data from every site.
Apel Log Processor
• Parses the GateKeeper and PBS event logs generated at a site. The extracted data is pieced together to form an accounting record that identifies the owner of a submitted job and the resources used to execute it.
• Accounting records are then published using R-GMA.
• The published records are collated into a centralised repository at the GOC using an R-GMA Secondary Producer.
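The PBS half of this parsing step might look like the sketch below. It assumes the semicolon-separated record layout commonly written by PBS servers (time, record type, job id, then space-separated `key=value` pairs); the sample line is fabricated, the function names are hypothetical, and this is not the Apel code. Joining with the GateKeeper log to find the grid owner of the job is not shown.

```python
def hms_to_seconds(value):
    """Convert an 'HH:MM:SS' duration to seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_pbs_record(line):
    """Split one accounting line into its header fields and attributes."""
    logged, record_type, job_id, message = line.rstrip("\n").split(";", 3)
    attributes = {}
    for token in message.split():
        key, _, value = token.partition("=")
        attributes[key] = value
    return {"logged": logged, "type": record_type, "job_id": job_id, **attributes}


# Fabricated sample of a job-end ("E") record, for illustration only.
SAMPLE = (
    "04/12/2005 10:15:32;E;1234.ce.example.org;user=alice group=atlas "
    "queue=short start=1113300010 end=1113300932 Exit_status=0 "
    "resources_used.cput=00:12:05 resources_used.walltime=00:15:22"
)

record = parse_pbs_record(SAMPLE)
cpu_seconds = hms_to_seconds(record["resources_used.cput"])
wall_seconds = hms_to_seconds(record["resources_used.walltime"])
```

The sketch splits attributes on whitespace, so values that themselves contain spaces would need more careful handling.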
Apel Log Processor (cont.)
• Parsed log files:
  • /var/log/globus-gatekeeper.log
  • /var/log/message
  • /var/spool/pbs/server_priv/accounting
• Tables used in Apel:
  • EventRecords
  • GkRecords
  • MessageRecords
  • SpecRecords
  • LcgRecords (published)
Examples – Two Servlets
• The first example:
  • Provides a web page as the user interface. It creates a consumer to show statistics from the accounting data for the date the user provides.
Examples – Two Servlets (cont.)
• The second example:
  • Creates a primary producer to publish statistics from the accounting data, which can then be queried from the browser servlet provided by the R-GMA software package.
IHEP Accounting plan
• PBS log file: /var/spool/pbs/server_priv/accounting
• A Perl program analyses the log file to generate the DB data.
• A Java program uses a producer to publish the necessary accounting information by joining the DB data.
• The R-GMA server provides the registry function to maintain the virtual table.
• Accounting information is summarised per user.
+-------------------------+-------------+------+-----+---------+-------+
| Field                   | Type        | Null | Key | Default | Extra |
+-------------------------+-------------+------+-----+---------+-------+
| theDate                 | date        | YES  |     | NULL    |       |
| eventID                 | varchar(60) | YES  |     | NULL    |       |
| siteName                | varchar(30) | YES  |     | NULL    |       |
| localUser               | varchar(20) | YES  |     | NULL    |       |
| localGroup              | varchar(20) | YES  |     | NULL    |       |
| jobName                 | varchar(30) | YES  |     | NULL    |       |
| queueName               | varchar(20) | YES  |     | NULL    |       |
| jobCreateTime           | varchar(10) | YES  |     | NULL    |       |
| jobQueuedTime           | varchar(10) | YES  |     | NULL    |       |
| jobEligibleTime         | varchar(10) | YES  |     | NULL    |       |
| startTime               | varchar(10) | YES  |     | NULL    |       |
| endTime                 | varchar(10) | YES  |     | NULL    |       |
| execHOST                | varchar(30) | YES  |     | NULL    |       |
| resource_List_cput      | time        | YES  |     | NULL    |       |
| resource_List_neednodes | varchar(30) | YES  |     | NULL    |       |
| sessionID               | int(10)     | YES  |     | NULL    |       |
| exitStatus              | int(2)      | YES  |     | NULL    |       |
| resources_Used_cput     | time        | YES  |     | NULL    |       |
| resources_Used_mem      | int(16)     | YES  |     | NULL    |       |
| resources_Used_vmem     | int(16)     | YES  |     | NULL    |       |
| resources_Used_walltime | time        | YES  |     | NULL    |       |
+-------------------------+-------------+------+-----+---------+-------+
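A per-user summary over a table with these columns could be computed as sketched below. The example uses an in-memory SQLite database instead of the site's actual database; the table name `accounting`, the helper `to_seconds` and all sample rows are invented, and only four of the listed columns are reproduced.

```python
import sqlite3


def to_seconds(value):
    """Convert an 'HH:MM:SS' duration to seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


db = sqlite3.connect(":memory:")
db.create_function("to_seconds", 1, to_seconds)
db.execute(
    "CREATE TABLE accounting ("
    "theDate TEXT, localUser TEXT, "
    "resources_Used_walltime TEXT, resources_Used_cput TEXT)"
)
# Fabricated rows, for illustration only.
db.executemany(
    "INSERT INTO accounting VALUES (?, ?, ?, ?)",
    [
        ("2005-04-12", "alice", "00:15:22", "00:12:05"),
        ("2005-04-12", "alice", "01:00:00", "00:50:00"),
        ("2005-04-12", "bob", "00:30:00", "00:10:00"),
        ("2005-04-13", "bob", "02:00:00", "01:00:00"),
    ],
)

# Jobs, wall-clock seconds and CPU seconds per local user for one date.
summary = db.execute(
    "SELECT localUser, COUNT(*), "
    "SUM(to_seconds(resources_Used_walltime)), "
    "SUM(to_seconds(resources_Used_cput)) "
    "FROM accounting WHERE theDate = ? "
    "GROUP BY localUser ORDER BY localUser",
    ("2005-04-12",),
).fetchall()
```

Each row of `summary` is the kind of per-user statistic that a primary producer could then publish, or that a consumer could display for a date chosen by the user.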