A CHAIN-REDS Perspective about Data Access and Metadata Management. Rafael Mayo-García, CIEMAT. Tunis / 12-13 Dec 2013.
Authors • Roberto Barbera (a, b), Carla Carrubba (b), Giuseppina Inserra (b), Christos Kanellopoulos (c), Kostas Koumantaros (c), Rafael Mayo-García (d), Ognjen Prnjat (c), Rita Ricceri (b), Manuel Rodriguez Pascual (d), Antonio Rubio-Montero (d), Federico Ruggieri (e) • Affiliations: (a) University of Catania, (b) INFN-Catania, (c) GRNET, (d) CIEMAT, (e) GARR & INFN-Roma Tre
CHAIN-REDS: A legacy from CHAIN • CHAIN: Coordination & Harmonisation of Advanced e-INfrastructures
WP4 in CHAIN-REDS • CHAIN-REDS is an EC (306819) funded project • ~ 2.1 M€ • 1 December 2012 – 30 months • Structured in • WP 1 Project Management • WP 2 Dissemination, Training and Outreach • WP 3 Interoperation and coordination of e-Infrastructures • WP 4 Data Infrastructures • WP 5 Support to small groups and emerging communities
WP4 ‘Data infrastructures’ • Partners, by region • Europe: INFN, CIEMAT, GRNET, CESNET, SIGMA ORIONIS • Africa: UBUNTUNET • Latin America: CLARA • Asia: IHEP, C-DAC • Middle East: ASREN
WP4 ‘Data infrastructures’ • Public outreach and dissemination focus on reporting on trans-continental Data Infrastructures and Data repositories and on several Use Cases • D4.1 Trans-continental Data Infrastructures and Data repositories • D4.2 Analysis of Data Infrastructures and Data repositories (coming soon) • Available at http://www.chain-project.eu/deliverables
WP4 ‘Data infrastructures’ • CHAIN-REDS has established official collaborations (MoUs) with other VRC-related communities • AgINFRA • DCH-RP • EarthServer • EIFL • ENGAGE
WP4 ‘Data infrastructures’ • Conversations are being held with EUDAT, H3Africa, iMENTORS, IVOA, SAEON, SKA Africa, Univ. Cape Town
Knowledge Base: Infrastructure • Extend the CHAIN-REDS Knowledge Base (KB) with Data capabilities http://www.chain-project.eu/knowledge-base • RREN(s) • NREN • NGI • CA(s) • Ident. Fed(s) • ROC(s) • Grid site(s) • Application(s)
Knowledge Base: Document & Data repositories • A survey of the available (Open Access) Data and Document Repositories has been carried out • Information has been collected in Africa, Asia, Europe, Latin America and the Middle East • The newly identified repositories have been incorporated into the Knowledge Base • They range from databases owned by a single group to huge continental collaborations
Knowledge Base: Document & Data repositories • 3,200 repos • >33 M docs
Standards • Standards are being promoted for Open Access Data Repositories • OAI-PMH for metadata retrieval • Dublin Core as metadata schema • SPARQL for semantic web search • VOTable (XML) as a potential standard for interchanging data represented as a set of tables • Persistent Identifiers (PIDs)
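As a rough illustration of the first two standards, the sketch below builds an OAI-PMH `ListRecords` request for Dublin Core metadata (`oai_dc`) and parses the records out of a response. The repository address `repo.example.org` and the sample response are hypothetical and hand-written; only the verb, the `metadataPrefix` argument and the XML namespaces come from the OAI-PMH and Dublin Core specifications. It is not the harvester used by CHAIN-REDS.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for the given metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_dublin_core(xml_text):
    """Return one dict per record, mapping Dublin Core element names to value lists."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        fields = {}
        for element in record.iterfind(".//oai_dc:dc/*", NS):
            name = element.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
            fields.setdefault(name, []).append((element.text or "").strip())
        records.append(fields)
    return records


# Hypothetical, hand-written response used here instead of a live repository.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample dataset</dc:title>
          <dc:creator>Doe, J.</dc:creator>
          <dc:subject>agriculture</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    print(list_records_url("http://repo.example.org/oai"))
    print(parse_dublin_core(SAMPLE_RESPONSE))
```

A real harvester would also have to follow `resumptionToken` elements to page through large result sets, which this sketch leaves out.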
OADRs and DRs • The adopted standards have been implemented in the CHAIN-REDS KB • Developments on (Open Access) Document and Data Repositories • A semantic web enrichment • A semantic search engine
Semantic search engine architecture • [Figure: OADRs and Data Repos. expose OAI-PMH end-points; harvesters (running on grid/cloud) collect their metadata via OAI-PMH; the harvested metadata go through semantic-web enrichment and feed the linked-data search engine]
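The flow in this architecture can be sketched in a few lines, under strong simplifying assumptions: harvested Dublin Core records are turned into subject/predicate/object triples, and a search then runs over those triples. The resource URIs and records are hypothetical, and the plain keyword match stands in for the actual enrichment and SPARQL-based engine, whose internals the slides do not describe.

```python
# Dublin Core element namespace, used here to name the predicates.
DC = "http://purl.org/dc/elements/1.1/"


def record_to_triples(resource_uri, record):
    """Turn one harvested Dublin Core record into (subject, predicate, object) triples."""
    return [
        (resource_uri, DC + field, value)
        for field, values in record.items()
        for value in values
    ]


def search(triples, keyword):
    """Return the sorted URIs of resources having a literal that contains the keyword."""
    needle = keyword.lower()
    hits = {subject for subject, _, obj in triples if needle in obj.lower()}
    return sorted(hits)


# Hypothetical harvested records, keyed by resource URI.
harvested = {
    "http://repo.example.org/item/1": {
        "title": ["Maize yield data"],
        "subject": ["agriculture"],
    },
    "http://repo.example.org/item/2": {
        "title": ["Stellar spectra"],
        "subject": ["astronomy"],
    },
}

# The "linked data" store: a flat list of triples from all harvested records.
store = [
    triple
    for uri, record in harvested.items()
    for triple in record_to_triples(uri, record)
]

if __name__ == "__main__":
    print(search(store, "agri"))
```

Storing metadata as triples is what lets records from different repositories be linked and queried together; a production setup would use an RDF store and SPARQL rather than a Python list.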
OADRs and DRs • The semantic search engine on CHAIN-REDS linked data is available • Allows searching among the semantically-enriched metadata coming from the OADRs and DRs included in the KB cell
OADRs and DRs New knowledge discovery!
Semantic Search Engine • Single and parallel semantic searches are available • Single: the usual semantic search service described before • Parallel: the new parallel semantic search service, which allows users to search in parallel across the millions of resources contained in the CHAIN-REDS Knowledge Base and in the ENGAGE Platform • Parallel semantic search engines have also been made available in other Science Gateways • agINFRA (CHAIN-REDS Knowledge Base & OpenAgris repository) • DCH-RP (CHAIN-REDS Knowledge Base & Europeana, Cultura Italia and Isidore repositories)
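The idea behind the parallel search can be illustrated with a small fan-out sketch: the same keyword is sent to several back-ends at once and the results are grouped by source. The two back-end functions below are hypothetical stand-ins that return canned identifiers; they do not call the CHAIN-REDS Knowledge Base or the ENGAGE Platform, and the slides do not say how the real service is implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def search_knowledge_base(keyword):
    # Hypothetical stand-in for a query against the CHAIN-REDS Knowledge Base.
    return [f"kb:{keyword}:1", f"kb:{keyword}:2"]


def search_engage(keyword):
    # Hypothetical stand-in for a query against the ENGAGE Platform.
    return [f"engage:{keyword}:1"]


def parallel_search(keyword, backends):
    """Query every back-end concurrently and return results grouped by back-end name."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        futures = {name: pool.submit(fn, keyword) for name, fn in backends.items()}
        return {name: future.result() for name, future in futures.items()}


BACKENDS = {
    "CHAIN-REDS KB": search_knowledge_base,
    "ENGAGE": search_engage,
}

if __name__ == "__main__":
    print(parallel_search("maize", BACKENDS))
```

Threads suit this pattern because each query would spend most of its time waiting on the network, so the total latency approaches that of the slowest back-end rather than the sum of all of them.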
Semantic Search Engine • Performs sequential and parallel searches • ENGAGE • agINFRA • DCH-RP
Semantic Search Engine • Programmatic use of the CHAIN-REDS Semantic Search Engine is also possible by means of a RESTful API • http://www.chain-project.eu/semantic-search-api • CHAIN-REDS webpage Semantic Search Web • Example • http://www.chain-project.eu/virtuoso/api/resources?keyword=<KEYWORD>&limit=<NUMBER_OF_RESOURCES>
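A minimal sketch of how a client might fill in the example URL above. The endpoint and the `keyword` and `limit` parameters are taken from the slide; the function name and the validation are illustrative assumptions. The response format is not described here, so the sketch stops at building the request URL.

```python
from urllib.parse import urlencode

# Endpoint as given on the slide.
API_ENDPOINT = "http://www.chain-project.eu/virtuoso/api/resources"


def build_search_url(keyword, limit):
    """Fill the keyword and limit placeholders of the example URL, with URL encoding."""
    if limit < 1:
        raise ValueError("limit must be a positive number of resources")
    query = urlencode({"keyword": keyword, "limit": limit})
    return f"{API_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(build_search_url("grid computing", 10))
```

Encoding the keyword matters as soon as it contains spaces or non-ASCII characters; the resulting URL can then be fetched with any HTTP client.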
Coming actions • Future developments on • A tool for extracting the data associated with OADRs • The execution of distributed jobs in the Science Gateway • Data Accessibility, Reproducibility and Trustworthiness (DART) • Based on the interoperability demo performed by CHAIN-REDS at EGI TF 2013 • Aiming to perform the following cycle seamlessly • Access to a document → Extraction of associated raw data → Execution of a code taking those data as input → Generation of new results → Upload of the new results and article
Conclusions • In a first phase, CHAIN-REDS has identified several fields of interest across the different regions • Agriculture • Cultural Heritage • e-Government • Earth Science • Astronomy and Astrophysics • Potential collaborations with initiatives and projects working in these areas are being pursued
Conclusions • Other fields and groups are also of interest • Managers and owners of OADRs and DRs are welcome to contact the project to share their data within the CHAIN Knowledge Base (this is already happening in both Africa and Latin America) • CHAIN-REDS also looks forward to receiving feedback from all interested organizations on the Knowledge Base and the semantic search service
Conclusions • Data developments have been carried out in the Regions of interest to CHAIN-REDS • A special action in the Middle East is now a priority for CHAIN-REDS • Semantic engine and web-enrichment are powerful tools to link data and retrieve information • DART
Thank you! • www.chain-project.eu • proj-office@chain-project.eu • rafael.mayo@ciemat.es