Data Catalogue Service Work Package 4
Main Objective:
• Deployment, operation and evaluation of a cataloguing service for scientific data.
Why:
• Potential benefits beyond the convenience of powerful data searching/retrieving.
WP4
Outcomes
• develop the generic software infrastructure to support the interoperation of facility data catalogues,
• deploy this software to establish a federated catalogue of data across the partners,
• provide data services based upon this generic framework which will enable users to deposit, search, visualise, and analyse data across the partners’ data repositories,
• evaluate the service (also from the perspective of facility users),
• manage jointly the evolution of this software and the services based upon it,
• promote the take-up of this technology and the services based upon it beyond the project.
Relations and dependencies
• user AAA services (WP3)
• Virtual Laboratories (WP5)
• Requires an established shared user AAA service to underpin the integrated data catalogue
• Both of these are required to enable seamless access to the content through the virtual laboratories.
Methodology
Builds on:
• PaNdata Support Action
• user AAA services
in order to provide a service to the virtual labs.
No intention for a new metadata catalogue.
STFC’s ICAT is an advanced implementation
• Deployed in various facilities including Elettra/NFFA (+VCR)
Comparison with other systems will be necessary
• MCA, MCAT, Artemis and Fireman (outdated candidates?)
• Check: AMGA (Fireman replacement in gLite)
The current system will need further development. Issues that have to be addressed:
• how to link logical files (indexed by metadata) to physical files
• how to query metadata
• how to authorize user access to metadata (WP3 feedback?)
• what API to propose to programs to access metadata and data
• (ICAT API at the catalogue level; pHDF5/NeXus, Common Data Model? for the actual data, in line with PaNdata)
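The four issues above (linking, querying, authorising, API) can be illustrated with a toy in-memory catalogue. This is only a sketch under stated assumptions: `LogicalFile`, `Catalogue`, the metadata keys and the file URL are all hypothetical names invented for illustration, and none of this is the ICAT API.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalFile:
    """A catalogue entry: metadata plus the physical replicas it maps to."""
    name: str                    # logical name, independent of storage location
    metadata: dict               # searchable keywords (hypothetical keys)
    readers: set                 # users allowed to see this entry
    locations: list = field(default_factory=list)  # physical file URLs


class Catalogue:
    """Hypothetical minimal catalogue API: register, query, resolve."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def query(self, user, **criteria):
        """Return entries matching all criteria that `user` may read."""
        return [
            e for e in self._entries.values()
            if user in e.readers
            and all(e.metadata.get(k) == v for k, v in criteria.items())
        ]

    def resolve(self, user, name):
        """Map a logical name to its physical locations, if authorised."""
        entry = self._entries[name]
        if user not in entry.readers:
            raise PermissionError(f"{user} may not access {name}")
        return list(entry.locations)


catalogue = Catalogue()
catalogue.register(LogicalFile(
    name="scan-0001",
    metadata={"instrument": "beamline-A", "technique": "SAXS"},
    readers={"alice"},
    locations=["file:///archive/site1/scan-0001.nxs"],
))
```

A program would call `catalogue.query("alice", technique="SAXS")` to search by metadata and `catalogue.resolve("alice", "scan-0001")` to obtain the physical files. In the real service the `readers` check would presumably be delegated to the shared AAA service from WP3.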
Additional
Should we “migrate” old files / archived datasets too? (converters?)
Initial requirement
Set of keywords for the metadata catalogue
Expansion based on existing implementations + PaNdata SA Integration WP outcome + Dublin Core?
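If Dublin Core were adopted as the common baseline, a keyword set could be split into a shared part and facility-specific extras. The sketch below assumes exactly that. The 15 element names are those of the Dublin Core Metadata Element Set, but every facility keyword in `KEYWORD_TO_DC` is a hypothetical example, not an agreed PaNdata keyword.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical facility keywords mapped onto Dublin Core elements.
KEYWORD_TO_DC = {
    "dataset_name": "title",
    "principal_investigator": "creator",
    "facility": "publisher",
    "collection_date": "date",
    "file_format": "format",
    "dataset_id": "identifier",
}


def to_dublin_core(record):
    """Split a record into (Dublin Core fields, facility-specific extras)."""
    dc, extra = {}, {}
    for key, value in record.items():
        element = KEYWORD_TO_DC.get(key)
        if element is None:
            extra[key] = value      # no common equivalent: keep as an extension
        else:
            dc[element] = value
    return dc, extra
```

For example, `to_dublin_core({"dataset_name": "scan-0001", "beam_energy_keV": 12.4})` puts the name under `title` and leaves the beam energy among the extras, which is where the "expansion based on existing implementations" would live.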
Populating the catalogue
• virtual laboratories (WP5) – demonstration & test
• Existing data archives of other partners
  • May require converters + metadata generation
• Distributed access
  • accessing data distributed over multiple sites via their metadata
  • performance and scalability will be evaluated (as elaborated in WP5)
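Distributed access via metadata amounts to sending one query to every site catalogue and merging the answers. A minimal sketch of that fan-out follows, assuming each site exposes some callable that takes search criteria and returns a list of dict records. The site names, `make_site` and `offline_site` are hypothetical stand-ins for real remote catalogue clients.

```python
from concurrent.futures import ThreadPoolExecutor


def federated_search(catalogues, criteria):
    """Query every site catalogue and merge hits, tagging each with its site."""
    def search_site(item):
        site, search = item
        try:
            return [dict(hit, site=site) for hit in search(criteria)]
        except Exception:
            return []  # an unreachable site must not break the whole search

    with ThreadPoolExecutor(max_workers=max(1, len(catalogues))) as pool:
        per_site = list(pool.map(search_site, catalogues.items()))
    return [hit for hits in per_site for hit in hits]


def make_site(records):
    """Hypothetical stand-in for a remote catalogue client."""
    def search(criteria):
        return [r for r in records
                if all(r.get(k) == v for k, v in criteria.items())]
    return search


def offline_site(criteria):
    """Stand-in for a site that cannot be reached."""
    raise ConnectionError("site unreachable")


catalogues = {
    "site1": make_site([{"name": "scan-0001", "technique": "SAXS"}]),
    "site2": make_site([{"name": "scan-0002", "technique": "SAXS"},
                        {"name": "scan-0003", "technique": "XRD"}]),
    "site3": offline_site,
}
hits = federated_search(catalogues, {"technique": "SAXS"})
```

Here `hits` holds the two SAXS records, each labelled with the site it came from, while the unreachable site is silently skipped. Whether a failed site should be skipped, retried or reported to the user is a design decision the sketch does not settle.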
Task 4.1
• Survey existing systems
• ICAT and others
• Examine them against the metadata, authorisation, performance, and ontological requirements of vLab (WP5) and uCAT AAA (WP3)
Task 4.2
• Deployment of the chosen metadata catalogue solution (=ICAT)
Task 4.3
• Remote API access to the individual catalogues
• Single search capability across the collaborating facilities.
Task 4.4
• Benchmarking: evaluation of the performance.
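The benchmarking in Task 4.4 could start from a simple timing harness like the one below. It is a sketch only: `benchmark`, `linear_search` and the synthetic records are hypothetical, and a real evaluation would run against the deployed catalogue with realistic data volumes and concurrent users.

```python
import statistics
import time


def benchmark(search, criteria, repeats=20):
    """Time repeated calls to `search` and summarise the latencies in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        search(criteria)
        samples.append(time.perf_counter() - start)
    return {
        "runs": len(samples),
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }


# Synthetic stand-in data: 10,000 records scanned linearly on each query.
records = [{"name": f"scan-{i:05d}", "technique": "SAXS" if i % 2 else "XRD"}
           for i in range(10_000)]


def linear_search(criteria):
    return [r for r in records
            if all(r.get(k) == v for k, v in criteria.items())]


result = benchmark(linear_search, {"technique": "SAXS"})
```

Reporting the median alongside the maximum keeps a single slow outlier from hiding the typical response time.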
Indicators of success
• Searchable data catalogue established in participating facilities (more than 50% uptake)
• Cross-facility searching in place for data from different facilities.
Deliverables
• D4.1. Requirements analysis for common data catalogue (M9: June 2012)
• D4.2. Populated metadata catalogue with data from the virtual laboratories (M15: Dec. 2012)
• D4.3. Deployment of cross-facility metadata searching (M21: June 2013)
• D4.4. Benchmark of performance of the metadata catalogue (M27: Dec. 2013)