Learn about R-GMA, a relational implementation for information and monitoring, with a mediator for query optimization and archiving capabilities.
R-GMA for real
12/6/2002, Steve Fisher / RAL <s.m.fisher@rl.ac.uk>
Summary
• Brief summary of R-GMA
• R-GMA in release 1.3
• R-GMA in L&B
• R-GMA in WP7 – wait for next Tuesday
• OGSIfication of R-GMA
• Details of release 1.3
Please feel free to interrupt
R-GMA
• Use the GMA from GGF
• A relational implementation
• Applied to both information and monitoring
• Creates the impression that you have one RDBMS per VO
[Diagram labels: Producer, Registry, Consumer, subscribe, lookup]
Relational Approach
• Not a general distributed RDBMS, but a way to use the relational model in a distributed environment where ACID properties are not generally important.
• Producers
  • announce: SQL “CREATE TABLE”
  • publish: SQL “INSERT”
• Consumers
  • collect: SQL “SELECT”
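The announce / publish / collect mapping above can be sketched with a single in-memory SQLite database standing in for the virtual database. This is only an illustration of the three SQL statements, not the R-GMA API: the `cpuLoad` table name comes from a later slide, while the `host` and `loadAvg` columns and all values are assumptions.

```python
import sqlite3

# One in-memory SQLite database stands in for the "virtual database".
# In R-GMA the data is spread across producers, not held in one RDBMS.
db = sqlite3.connect(":memory:")

# Producer announces: CREATE TABLE (host and loadAvg are assumed columns).
db.execute("CREATE TABLE cpuLoad (country TEXT, site TEXT, host TEXT, loadAvg REAL)")

# Producer publishes: INSERT (illustrative values).
db.execute("INSERT INTO cpuLoad VALUES ('UK', 'RAL', 'host1', 0.7)")
db.execute("INSERT INTO cpuLoad VALUES ('UK', 'GLA', 'host2', 0.2)")

# Consumer collects: SELECT.
rows = db.execute(
    "SELECT site, loadAvg FROM cpuLoad WHERE country = 'UK' ORDER BY site"
).fetchall()
print(rows)  # [('GLA', 0.2), ('RAL', 0.7)]
```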
R-GMA
• API – Servlet communication
  • http(s) in
  • XML back
[Diagram labels: Application Code, Consumer API, Registry API, Schema API, Consumer Servlet, Registry Servlet, Schema Servlet, Sensor Code, Producer API, Registry API, ProducerServlet]
Schema & Contributions
Contributions are Views
SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'
SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'
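To make the "contribution = view" idea concrete, here is a small sketch in which each producer's contribution is literally a SQL view over a stand-in `cpuLoad` table. SQLite is used only for illustration; the `loadAvg` column, the third row and the view names `ral` and `gla` are assumptions.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Stand-in for the virtual cpuLoad table (loadAvg is an assumed column).
db.execute("CREATE TABLE cpuLoad (country TEXT, site TEXT, loadAvg REAL)")
db.executemany(
    "INSERT INTO cpuLoad VALUES (?, ?, ?)",
    [("UK", "RAL", 0.7), ("UK", "GLA", 0.2), ("IT", "CNAF", 0.5)],  # made-up rows
)

# Each producer's contribution is a view defined by its predicate.
db.execute("CREATE VIEW ral AS SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'")
db.execute("CREATE VIEW gla AS SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'")

ral_rows = db.execute("SELECT * FROM ral").fetchall()
gla_rows = db.execute("SELECT * FROM gla").fetchall()
print(ral_rows)  # [('UK', 'RAL', 0.7)]
print(gla_rows)  # [('UK', 'GLA', 0.2)]
```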
The mediator
S: a relational schema (for a virtual database)
q: queries posed against S
p: Producers, associated with views on S. Currently views have the form: SELECT * FROM r WHERE < ??? >
• The Mediator
  • how to match q with the p’s?
  • It is the mediator which makes R-GMA work
  • It is hidden inside the ConsumerServlet
Mediator in 1.3
• Recall that you can view R-GMA as a huge relational database of all the information produced.
• Producers register which partition they have by means of a partitioning predicate
• Some data:
  • may not be accessible
  • may not be kept
  • may be duplicated
• The mediator is hidden inside the Consumer
  • but is an essential part of R-GMA
• The final mediator will take “any” SQL statement and from the information in the registry find the right producers
• We can now merge information from several producers
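One way to picture the matching step is the toy mediator below. It assumes, purely for illustration, that partitioning predicates and queries are both conjunctions of `column = value` tests held as dictionaries. The `Producer` class and the `relevant` and `mediate` functions are hypothetical names, not part of R-GMA, and the real mediator works on SQL rather than dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Producer:
    name: str
    predicate: dict  # partitioning predicate: column -> required value
    rows: list = field(default_factory=list)


def relevant(producer, query):
    """A producer can contribute unless its predicate contradicts the query."""
    return all(query.get(col, val) == val for col, val in producer.predicate.items())


def mediate(registry, query):
    """Pick the matching producers, then merge and filter their rows."""
    merged = []
    for p in registry:
        if relevant(p, query):
            merged.extend(
                r for r in p.rows if all(r.get(c) == v for c, v in query.items())
            )
    return merged


# Registry contents are made up for the example.
registry = [
    Producer("ral", {"country": "UK", "site": "RAL"},
             [{"country": "UK", "site": "RAL", "loadAvg": 0.7}]),
    Producer("gla", {"country": "UK", "site": "GLA"},
             [{"country": "UK", "site": "GLA", "loadAvg": 0.2}]),
]

uk = mediate(registry, {"country": "UK"})    # merged from both producers
ral_only = mediate(registry, {"site": "RAL"})  # only the RAL producer is contacted
print(len(uk), [r["site"] for r in ral_only])  # 2 ['RAL']
```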
Not just one Producer
• CircularBufferProducer
  • Fast
  • Uses an SQL parser – no RDBMS involved
  • Information will be lost if not consumed in time
  • Streaming well defined
    • 1 write pointer for buffer and 1 read pointer for each Consumer
• DataBaseProducer
  • Slower
  • Information not lost
  • Clean up strategy needed
  • Streaming needs to be defined
• Archiver
  • Just Consumers and a DataBaseProducer
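The "1 write pointer, 1 read pointer per Consumer" scheme, and the way data is lost when a consumer falls behind, can be sketched as follows. This is a minimal illustration under assumed names (`CircularBuffer`, `attach`, `insert`, `poll`), not the CircularBufferProducer implementation.

```python
class CircularBuffer:
    """Fixed-size buffer with one write pointer and one read pointer per consumer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.written = 0  # write pointer: total number of tuples inserted
        self.read = {}    # consumer id -> read pointer (tuples already passed)

    def attach(self, consumer):
        # A new consumer starts streaming from the current write position.
        self.read[consumer] = self.written

    def insert(self, tup):
        # Overwrites the oldest slot once the buffer is full.
        self.slots[self.written % self.capacity] = tup
        self.written += 1

    def poll(self, consumer):
        """Return (unread tuples still in the buffer, number of tuples lost)."""
        start = self.read[consumer]
        oldest = max(0, self.written - self.capacity)
        lost = max(0, oldest - start)
        start = max(start, oldest)
        out = [self.slots[i % self.capacity] for i in range(start, self.written)]
        self.read[consumer] = self.written
        return out, lost


buf = CircularBuffer(capacity=3)
buf.attach("c1")
for value in [1, 2, 3, 4, 5]:
    buf.insert(value)
first_poll = buf.poll("c1")
print(first_poll)  # ([3, 4, 5], 2): two tuples were overwritten before being read
```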
Release 1.2
[Diagram labels: RB, GIIS, GRIS (×3), InfoProvider (×3)]
Release 1.3
• Multi-valued attributes make it not totally trivial
• Archiver’s DB in GOUT cleaned by ad-hoc code
[Diagram labels: RB, GOUT, Archivers and other R-GMA components, GIN (×3), InfoProvider (×3)]
GOUT
[Diagram labels: Archiver, Consumer (CE), Consumer (SE), Consumer, LDAP, DataBaseProducer, RDBMS, Consumer (..), Clean up]
L&B – first go
• Each Bookkeeping Server (BS) publishes to a CircularBufferProducer
• Archiver has a where clause to collect jobs belonging to a subset of users
• Most queries will be satisfied by one archiver
BUT…
• CircularBufferProducer may lose information
  • Need something fast, capable of streaming, but which keeps data safe
• The consumer part of the Archiver will not notice that a new BS has appeared
  • Need to also register Consumers
• Need to deal with “active” jobs
  • Implies a triggering mechanism
• Only want latest info from a job
  • Will provide an overwrite mode
[Diagram labels: BS, “CircularBufferProducer”, Archiver of A-H, Archiver of I-N, Archiver of O-Z]
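A rough sketch of the two Archiver ideas on this slide: a where clause that selects a subset of users (here by first letter, following the A-H / I-N / O-Z split in the diagram), and an overwrite mode that keeps only the latest record per job. The `Archiver` class, the record fields `job`, `user` and `status`, and the event data are all assumptions made for the example.

```python
class Archiver:
    """Keeps only the latest record per job, for users in one letter range."""

    def __init__(self, first, last):
        self.first, self.last = first, last
        self.latest = {}  # job id -> most recent record ("overwrite mode")

    def accepts(self, record):
        # Plays the role of the Archiver's where clause.
        return self.first <= record["user"][0].upper() <= self.last

    def store(self, record):
        if self.accepts(record):
            self.latest[record["job"]] = record  # newer info replaces older


archivers = [Archiver("A", "H"), Archiver("I", "N"), Archiver("O", "Z")]

# Made-up stream of job events as a Bookkeeping Server might publish them.
events = [
    {"job": "job1", "user": "alice", "status": "SUBMITTED"},
    {"job": "job2", "user": "olga", "status": "SUBMITTED"},
    {"job": "job1", "user": "alice", "status": "RUNNING"},
]
for event in events:
    for archiver in archivers:
        archiver.store(event)

a_h, i_n, o_z = archivers
print(a_h.latest["job1"]["status"])  # RUNNING
print(sorted(o_z.latest))            # ['job2']
```

A query about one user's jobs then needs only the archiver whose range covers that user, which matches the slide's point that most queries are satisfied by one archiver.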
R-GMA - OGSIfication
• API – Servlet communication
  • http(s) in
  • XML back
[Diagram labels: Application Code, Consumer API, Registry API, Schema API, Consumer Servlet, Registry Servlet, Schema Servlet, Sensor Code, Producer API, Registry API, ProducerServlet]
Step 1 - Isolate Servlets
• API – Servlet communication
  • http(s) in
  • XML back
[Diagram labels: Application Code, Consumer API, Registry API, Schema API, Consumer Instance, Registry, Schema, Sensor Code, Producer API, Registry API, Producer Instance]
Step 2 - Web Services
• API – derived from WSDL
• Web Services
  • WSDL, SOAP
• Issues
  • context to access instance
  • HTTP Streaming
[Diagram labels: Application, Consumer API, Consumer “Factory”, Consumer Instance, Registry, Schema, Sensor, Producer API, Producer “Factory”, Producer Instance, PortTypes]
Step 3 - OGSA
• All Grid Services
• OGSA Factories, GSH, GSR
• Registry includes HandleMapper
• SQL as Service Data Element Query Language
• lightweight API causes issues with lifetime management
• TerminationInterval then instance loopback to setTerminationTime.
[Diagram labels: Application, Consumer API, Consumer Factory, Consumer Instance, Registry, Schema, Sensor, Producer API, Producer Factory, Producer Instance]
Other OGSIfication issues
• Consider XML as internal representation of service data elements
  • Depends on other developments
• Consider XQuery as service data elements query language
  • Depends on how XQuery develops
• Security
  • Authorisation looks hard
• Registry discovery
  • same issues as OGSA in general
1.3 from User Perspective
• Existing
  • Java API and Partial C++ API
• New
  • Full(?) C++ API
  • Partial C, Python and Perl APIs
    • (GIN is currently in Perl)
    • C based on libwww style of OO C
    • Perl and Python derived via SWIG
      • Probably not a good idea
      • Not like hand generated code – so may do it again
  • Easy installation and configuration
    • For developers
    • Installers
    • Users
  • Pulse
    • GUI to browse (and manipulate) R-GMA data
  • Improved mediator
Release 1.3 Internals
• GRRP-like soft state registration
• Uniform exception handling
  • To ensure that useful messages and stack traces are preserved.
  • This includes communication between Servlet and API
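Soft state registration means that a registry entry survives only while it keeps being refreshed; an entry whose owner goes silent simply expires. The sketch below illustrates that general idea with an explicit clock value passed in. The `SoftStateRegistry` class, its methods and the 30-second lifetime are assumptions, not the release 1.3 code or the GRRP protocol itself.

```python
class SoftStateRegistry:
    """Registrations expire unless the producer renews them in time."""

    def __init__(self, ttl):
        self.ttl = ttl     # lifetime of one registration, in seconds (assumed)
        self.expires = {}  # producer name -> expiry time

    def register(self, producer, now):
        # The same call serves as first registration and as renewal.
        self.expires[producer] = now + self.ttl

    def live(self, now):
        # Drop expired entries, then list the producers still registered.
        self.expires = {p: t for p, t in self.expires.items() if t > now}
        return sorted(self.expires)


reg = SoftStateRegistry(ttl=30)
reg.register("ral", now=0)
reg.register("gla", now=0)
reg.register("ral", now=20)  # only "ral" renews its registration

at_25 = reg.live(now=25)
at_40 = reg.live(now=40)
print(at_25)  # ['gla', 'ral']
print(at_40)  # ['ral']
```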
Future Work
• OGSIfication
• Consider how to handle time better in queries
• GRM/PROVE integration
• Security
  • Will be based on WP2 (Spitfire)
  • Will need to do our own authorisation work
• Replication
  • We need to be able to distribute the schema and the registry
    • For performance
    • For reliability