
WP2: Data Management

WP2: Data Management. Gavin McCance University of Glasgow November 5, 2001. Overview. Deliverables Replication: GDMP Meta-data: Spitfire GridPP effort Future work Query Optimisation. Deliverables. EU DataGrid WP2: Major M9 deliverables met GDMP delivered Spitfire delivered



  1. WP2: Data Management Gavin McCance University of Glasgow November 5, 2001

  2. Overview • Deliverables • Replication: GDMP • Meta-data: Spitfire • GridPP effort • Future work • Query Optimisation

  3. Deliverables • EU DataGrid WP2: Major M9 deliverables met • GDMP delivered • Spitfire delivered • Architecture Document • http://www.cern.ch/grid-data-management

  4. GDMP http://cmsdoc.cern.ch/cms/grid/ • Generic mirroring tool for any file type (read only replica) • Particular plug-ins for Objectivity database files • Subscription model for automatic synchronisation of files • Automatic update of replica catalogue • Currently uses Globus Replica Catalogue

  5. …GDMP • BrokerInfo API from WP1 • Allows users of GDMP to obtain information from the job scheduler • Mass Storage Interface from WP5 • e.g. Support for file staging • Security is provided via standard GSI (single sign-on) • Authorisation via grid mapfile • File transfer made using GridFTP • Installation: RPM and tarball

  6. …GDMP usage (Site A, Site B) • A, B) Start GDMP services (inetd) • B) Register with site A • gdmp_host_subscribe • A) Register new files • gdmp_register_local_file <path-to-file> • This updates the local catalogue (on A) • A) Tell the world (well, all subscribed sites) • gdmp_publish_catalogue • This updates the import catalogue on all subscribed sites

  7. …GDMP usage (Site A, Site B) • B) Get the new files from site A • gdmp_replicate_get • The new files are transferred from site A → site B • Globus replica catalogue is updated • Filters let you fetch only the files you want • CRC checking of file transfers
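The subscribe, register, publish and replicate steps on the two usage slides can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the `Site` class, its method names and the file paths are hypothetical and are not the GDMP API; each method merely mirrors the role of the command-line tool named in its comment, and `zlib.crc32` stands in for whatever CRC check GDMP performs.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Site:
    """Toy stand-in for one GDMP site (hypothetical model, not the GDMP API)."""
    name: str
    files: dict = field(default_factory=dict)             # path -> file bytes
    local_catalogue: dict = field(default_factory=dict)   # path -> CRC32
    import_catalogue: dict = field(default_factory=dict)  # path -> (source site, CRC32)
    subscribers: list = field(default_factory=list)

    def host_subscribe(self, producer):
        # Role of gdmp_host_subscribe: this site registers itself with the producer.
        producer.subscribers.append(self)

    def register_local_file(self, path, data):
        # Role of gdmp_register_local_file: only the local catalogue changes.
        self.files[path] = data
        self.local_catalogue[path] = zlib.crc32(data)

    def publish_catalogue(self):
        # Role of gdmp_publish_catalogue: update the import catalogue
        # of every subscribed site with files it does not hold yet.
        for site in self.subscribers:
            for path, crc in self.local_catalogue.items():
                if path not in site.files:
                    site.import_catalogue[path] = (self, crc)

    def replicate_get(self, wanted=lambda path: True):
        # Role of gdmp_replicate_get: fetch pending files that pass the
        # filter and verify each transfer against the published CRC.
        fetched = []
        for path, (source, crc) in list(self.import_catalogue.items()):
            if not wanted(path):
                continue
            data = source.files[path]
            if zlib.crc32(data) != crc:
                raise IOError(f"CRC mismatch for {path}")
            self.files[path] = data
            self.local_catalogue[path] = crc
            del self.import_catalogue[path]
            fetched.append(path)
        return fetched


site_a, site_b = Site("A"), Site("B")
site_b.host_subscribe(site_a)                            # B registers with A
site_a.register_local_file("/data/run1.db", b"events")   # A registers new files
site_a.register_local_file("/data/run1.log", b"log")
site_a.publish_catalogue()                               # A tells subscribed sites
fetched = site_b.replicate_get(wanted=lambda p: p.endswith(".db"))
```

In this sketch the filter keeps the `.log` file in B's import catalogue, so a later unfiltered `replicate_get()` would still pick it up.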

  8. Spitfire http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire/ • Provides grid enabled access to any relational database • SQL Database Service • Storage of general meta-data • Service Index soon… • Secure access via GSI (single sign-on) • Installation: RPM and tarball

  9. …Spitfire • Java Servlet based • Allows any HTTP-compliant system (e.g. web browsers, standard C++ HTTP libraries) to access any relational database across the grid… • [Diagram: relational database (Oracle, PostgreSQL) + Grid Security + standard communication protocols (XML over HTTPS) = SQL Database Service (Spitfire)]

  10. …Spitfire security • Authentication is currently provided • Standard user & server grid certificates • For both application programs and web browsers • Authorisation matrix coming soon… • Will map grid identity to ‘role(s)’ • Reader, info-update, manager • ‘Roles’ will then map to a given database connection with given permissions on a database • e.g. query-only, insert, update, create new tables
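The planned two-step mapping (grid identity to roles, roles to database permissions) could look roughly like the sketch below. Everything here is an assumption for illustration: the certificate subject names, the table contents and the function names are hypothetical, and which permissions each role carries is a guess, since the slide only lists the role names and example permissions.

```python
# Hypothetical role -> permission table; the slide does not say which
# permissions belong to which role, so this assignment is an assumption.
ROLE_PERMISSIONS = {
    "reader": {"query"},
    "info-update": {"query", "insert", "update"},
    "manager": {"query", "insert", "update", "create-table"},
}

# Hypothetical grid identities (certificate subject names) -> roles.
IDENTITY_ROLES = {
    "/O=Grid/O=Example/CN=Alice": ["reader"],
    "/O=Grid/O=Example/CN=Bob": ["reader", "info-update"],
}


def permissions_for(identity):
    """Union of the permissions of every role held by this identity."""
    permissions = set()
    for role in IDENTITY_ROLES.get(identity, []):
        permissions |= ROLE_PERMISSIONS[role]
    return permissions


def is_allowed(identity, operation):
    """True if any role of the identity grants the requested operation."""
    return operation in permissions_for(identity)
```

An identity missing from the table gets no roles and therefore no permissions, which keeps the default closed.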

  11. …Spitfire • Easy to install • Good documentation • Ready to run examples • For grid-based meta-data catalogue needs.. • … we need feedback!

  12. WP2 GridPP Effort • Based at Glasgow • Effort will focus primarily on the query optimisation task of WP2 • 1 PhD student, 1.5 RA • Continuing effort in development of Spitfire and related applications • 0.7 RA

  13. Future Spitfire work • Look at common ground between WP2 and WP3 • Spitfire and R-GMA? • Security • Authorisation mechanisms • Other Spitfire applications • Service Index, Replica Catalogue • Work on scalable architectures • Common with e.g. replica catalogue work

  14. Query Optimisation work • Categorise possible areas for optimisation: • User oriented: high performance • Minimising cost for specific job • Grid oriented: high throughput • Maximise efficient usage of resources • Site oriented: local policy • Respond to specific site policies / requirements • Much preliminary work done! • Workshop in December 2001…

  15. …Query Optimisation • Short term: • Data Access optimisation • Replica Optimiser component • How long will it take to get the data here? • Developing and evaluating appropriate algorithms for working this out and choosing best replica…
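As a minimal illustration of the "how long will it take to get the data here?" question, the sketch below ranks replicas by a simple cost estimate. The cost model (fixed latency plus size divided by bandwidth), the `Replica` fields and the site figures are all assumptions made for this example; the slides do not say which algorithms were being evaluated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical description of one replica location."""
    site: str
    bandwidth_mb_s: float  # assumed sustained throughput to the requesting site
    latency_s: float       # assumed fixed start-up cost, e.g. staging from tape


def estimated_transfer_time(replica, file_size_mb):
    # Assumed cost model: fixed latency plus size divided by bandwidth.
    return replica.latency_s + file_size_mb / replica.bandwidth_mb_s


def choose_best_replica(replicas, file_size_mb):
    # Pick the replica with the smallest estimated transfer time.
    return min(replicas, key=lambda r: estimated_transfer_time(r, file_size_mb))


replicas = [
    Replica("site-x", bandwidth_mb_s=10.0, latency_s=0.5),
    Replica("site-y", bandwidth_mb_s=50.0, latency_s=2.0),
]
best_large = choose_best_replica(replicas, file_size_mb=1000.0)
best_small = choose_best_replica(replicas, file_size_mb=10.0)
```

With these made-up numbers the high-bandwidth site wins for the large file and the low-latency site wins for the small one, which shows why the best replica depends on the request and not only on the site.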

  16. …Query Optimisation • Modelling and Simulation • Best not to test out the more crazy algorithms on the experiment testbed • Work underway with MONARC tool • Evaluating suitability as simulation tool for this particular work • Integrate into the QO work

  17. Summary • Major deliverables for M9 met • GDMP and Spitfire • GridPP will concentrate effort on Query Optimisation task of WP2 • + continued Spitfire development • Work already underway
