WP2: Data Management Gavin McCance University of Glasgow November 5, 2001
Overview • Deliverables • Replication: GDMP • Meta-data: Spitfire • GridPP effort • Future work • Query Optimisation
Deliverables • EU DataGrid WP2: Major M9 deliverables met • GDMP delivered • Spitfire delivered • Architecture Document • http://www.cern.ch/grid-data-management
GDMP http://cmsdoc.cern.ch/cms/grid/ • Generic mirroring tool for any file type (read only replica) • Particular plug-ins for Objectivity database files • Subscription model for automatic synchronisation of files • Automatic update of replica catalogue • Currently uses Globus Replica Catalogue
…GDMP • BrokerInfo API from WP1 • Allows users of GDMP to obtain information from the job scheduler • Mass Storage Interface from WP5 • e.g. Support for file staging • Security is provided via standard GSI (single sign-on) • Authorisation via grid mapfile • File transfer made using GridFTP • Installation: RPM and tarball
…GDMP usage (sites A and B) • A, B) Start the GDMP services (inetd) • B) Register with site A: gdmp_host_subscribe • A) Register new files: gdmp_register_local_file <path-to-file> (this updates the local catalogue on A) • A) Tell the world (well, all subscribed sites): gdmp_publish_catalogue (this updates the import catalogue on all subscribed sites)
…GDMP usage (sites A and B) • B) Get the new files from site A: gdmp_replicate_get • The new files are transferred from site A to site B • The Globus replica catalogue is updated • Filters so you only get the files you want • CRC checking of each file transfer
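The subscribe, register, publish and get steps above can be sketched as a small in-memory model. This is an illustrative toy only, not GDMP's API: the `Site` class, its method names and the filter callback are assumptions made for the example, and a dictionary lookup stands in for the GridFTP transfer.

```python
import zlib


class Site:
    """Toy in-memory stand-in for a GDMP site (hypothetical, not the real API)."""

    def __init__(self, name):
        self.name = name
        self.files = {}             # file name -> bytes held at this site
        self.local_catalogue = []   # files registered here, not yet published
        self.import_catalogue = {}  # file name -> (source site, expected CRC)
        self.subscribers = []       # sites that have subscribed to this one

    def host_subscribe(self, producer):
        """Register this site with a producer (cf. gdmp_host_subscribe)."""
        producer.subscribers.append(self)

    def register_local_file(self, name, data):
        """Record a new local file (cf. gdmp_register_local_file)."""
        self.files[name] = data
        self.local_catalogue.append(name)

    def publish_catalogue(self):
        """Add newly registered files to every subscriber's import catalogue."""
        for name in self.local_catalogue:
            crc = zlib.crc32(self.files[name])
            for site in self.subscribers:
                site.import_catalogue[name] = (self, crc)
        self.local_catalogue = []

    def replicate_get(self, wanted=lambda name: True):
        """Fetch advertised files that pass the filter, checking each CRC."""
        fetched = []
        for name, (source, crc) in list(self.import_catalogue.items()):
            if not wanted(name):
                continue  # filtered out: left in the import catalogue
            data = source.files[name]  # stands in for a GridFTP transfer
            if zlib.crc32(data) != crc:
                raise IOError(f"CRC mismatch for {name}")
            self.files[name] = data
            del self.import_catalogue[name]
            fetched.append(name)
        return fetched


site_a, site_b = Site("A"), Site("B")
site_b.host_subscribe(site_a)
site_a.register_local_file("run1.db", b"event data")
site_a.register_local_file("notes.txt", b"scratch")
site_a.publish_catalogue()
fetched = site_b.replicate_get(wanted=lambda n: n.endswith(".db"))
```

In this sketch site B ends up holding only `run1.db`, while `notes.txt` remains advertised in its import catalogue. How the real tool treats filtered files is not stated in the slides.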
Spitfire http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire/ • Provides grid enabled access to any relational database • SQL Database Service • Storage of general meta-data • Service Index soon… • Secure access via GSI (single sign-on) • Installation: RPM and tarball
…Spitfire • Java Servlet based • Allows any HTTP-compliant system (e.g. web browsers, standard C++ HTTP libraries) to access any relational database across the grid • Oracle / PostgreSQL + Grid Security + standard communication protocols (XML over HTTPS) = SQL Database Service (Spitfire)
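To make the idea of a database service that answers in XML concrete, here is a minimal sketch of the server-side step: run an SQL query and serialise the rows as XML. It is not Spitfire's actual message format. The element names (`result`, `row`, `field`), the `file_metadata` table and the use of SQLite in place of Oracle or PostgreSQL are all assumptions, and the HTTPS and GSI layers are left out.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_to_xml(connection, sql, params=()):
    """Run a query and return the result set as an XML string."""
    cursor = connection.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    root = ET.Element("result")  # hypothetical element names throughout
    for row in cursor:
        row_el = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            field = ET.SubElement(row_el, "field", name=column)
            field.text = str(value)
    return ET.tostring(root, encoding="unicode")


# SQLite stands in for the relational back end in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_metadata (lfn TEXT, size_bytes INTEGER)")
conn.execute("INSERT INTO file_metadata VALUES (?, ?)", ("run1.db", 1024))
xml_reply = query_to_xml(conn, "SELECT lfn, size_bytes FROM file_metadata")
```

Any client that can issue an HTTP request and parse XML could consume a reply of this kind, which is the portability argument the slide makes.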
…Spitfire security • Authentication is currently provided • Standard user & server grid certificates • For both application programs and web browsers • Authorisation matrix coming soon… • Will map grid identity to ‘role(s)’ • Reader, info-update, manager • ‘Roles’ will then map to a given database connection with given permissions on a database • e.g. query-only, insert, update, create new tables
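The planned two-step mapping (grid identity to roles, roles to database permissions) might look roughly like the sketch below. The role names come from the slide. The assignment of permissions to each role, the certificate subject names and the function names are assumptions for illustration, not the Spitfire design.

```python
# Assumed permission sets per role; the slide names the roles and the kinds
# of permission but does not say which role gets which.
ROLE_PERMISSIONS = {
    "reader": {"query"},
    "info-update": {"query", "insert", "update"},
    "manager": {"query", "insert", "update", "create-table"},
}

# Hypothetical certificate subject names mapped to one or more roles.
IDENTITY_ROLES = {
    "/O=Grid/OU=example/CN=Alice": ["reader"],
    "/O=Grid/OU=example/CN=Bob": ["reader", "manager"],
}


def permissions_for(identity):
    """Union of the permissions of every role held by this grid identity."""
    allowed = set()
    for role in IDENTITY_ROLES.get(identity, []):
        allowed |= ROLE_PERMISSIONS[role]
    return allowed


def is_authorised(identity, operation):
    """True if any of the identity's roles permits the operation."""
    return operation in permissions_for(identity)
```

An identity with no entry gets an empty permission set, so unknown users are denied by default in this sketch.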
…Spitfire • Easy to install • Good documentation • Ready to run examples • For grid-based meta-data catalogue needs.. • … we need feedback!
WP2 GridPP Effort • Based at Glasgow • Effort will focus primarily on the query optimisation task of WP2 • 1 PhD student, 1.5 RA • Continuing effort in development of Spitfire and related applications • 0.7 RA
Future Spitfire work • Look at common ground between WP2 and WP3 • Spitfire and R-GMA? • Security • Authorisation mechanisms • Other Spitfire applications • Service Index, Replica Catalogue • Work on scalable architectures • Common with e.g. replica catalogue work
Query Optimisation work • Categorise possible areas for optimisation: • User oriented: high performance • Minimising cost for specific job • Grid oriented: high throughput • Maximise efficient usage of resources • Site oriented: local policy • Respond to specific site policies / requirements • Much preliminary work done! • Workshop in December 2001…
…Query Optimisation • Short term: • Data Access optimisation • Replica Optimiser component • How long will it take to get the data here? • Developing and evaluating appropriate algorithms for working this out and choosing best replica…
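One simple way to answer "how long will it take to get the data here?" is a cost model of fixed start-up latency plus file size divided by bandwidth, choosing the replica with the lowest estimate. The sketch below assumes that model. The slides do not say which algorithms are under evaluation, and the `Replica` fields and the numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    site: str
    bandwidth_mb_per_s: float  # assumed sustained throughput to the job's site
    latency_s: float           # assumed fixed start-up cost, e.g. file staging


def estimated_transfer_time(replica, file_size_mb):
    """Estimated seconds to fetch the file: start-up cost plus transfer time."""
    return replica.latency_s + file_size_mb / replica.bandwidth_mb_per_s


def choose_best_replica(replicas, file_size_mb):
    """Pick the replica with the smallest estimated transfer time."""
    return min(replicas, key=lambda r: estimated_transfer_time(r, file_size_mb))


replicas = [
    Replica("site-a", bandwidth_mb_per_s=10.0, latency_s=5.0),
    Replica("site-b", bandwidth_mb_per_s=50.0, latency_s=120.0),
]
small = choose_best_replica(replicas, file_size_mb=100)
large = choose_best_replica(replicas, file_size_mb=10000)
```

With these made-up numbers the low-latency site wins for the 100 MB file and the high-bandwidth site wins for the 10 000 MB file, so the best replica depends on the file as well as on the network.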
…Query Optimisation • Modelling and Simulation • Best not to test out the more crazy algorithms on the experiment testbed • Work underway with MONARC tool • Evaluating suitability as simulation tool for this particular work • Integrate into the QO work
Summary • Major deliverables for M9 met • GDMP and Spitfire • GridPP will concentrate effort on Query Optimisation task of WP2 • + continued Spitfire development • Work already underway