NERC DataGrid Status: ESP June 2004. Bryan Lawrence on behalf of the NDG, BADC and BODC. Ray Cramer, Marta Gutierrez, Kerstin Kleese, Siva Kondapalli, Sue Latham, Roy Lowry, Kevin O’Neill, Ag Stephens, Andrew Woolf. British Atmospheric Data Centre http://badc.nerc.ac.uk.
Outline
• NDG Aims and Metadata Taxonomy (Review)
• Demonstration of NDG in action (no grid services yet, but shape of things to come should be clear)
• “Stub-B”
• New Tool: DataExtractor
• Status
• Issues with metadata
• Chemistry data at BADC
• Numerical Simulation Discovery
• & back to Status
Complexity + Volume + Remote Access = Grid Challenge
[Figure labels: Simulations; Assimilation; British Atmospheric Data Centre; British Oceanographic Data Centre; http://ndg.nerc.ac.uk]
NDG Metadata Taxonomy
NDG Metadata Architecture
• Service based model:
  • clear separation between discovery and use
  • discovery service standards compliant and interoperable
NDG Supports Multiple Discovery Services – “build your own”
• Multiple protocol support will be built into the “NDG Vanilla Discovery Service”
• (D) - Discovery
• OAI: Open Archives Initiative – Digital Library Protocol for harvesting metadata
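As an illustration of OAI-style harvesting, here is a minimal sketch using the OAI-PMH `ListRecords` verb and the `oai_dc` metadata prefix. The base URL and record identifiers are hypothetical placeholders, not NDG endpoints, and the slides do not say how the NDG discovery service itself will implement harvesting.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def record_identifiers(response_xml):
    """Return the identifier found in each record header of a response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iterfind(".//oai:header/oai:identifier", OAI_NS)]


# Hypothetical response body, trimmed to the parts this sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:dataset-1</identifier></header></record>
    <record><header><identifier>oai:example.org:dataset-2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""
```

A harvester would fetch the URL from `list_records_url`, pass the body to `record_identifiers`, and then pull each record's metadata into its own discovery index.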
NDG Structure
Flexible Information Return
• Can order responses by title or data centre (or default random)
• Choose to return either data or “B-”Metadata
• Look at DIFs in either HTML or XML
Role of B metadata: domain ontology
B metadata is a store of metadata intended to:
• Allow the production of the various “industry standard” discovery formats (DIF, DC, FGDC/GEO, 19115)
• Provide a more complete metadata store than that demanded by the usual discovery formats, leveraging the metadata holdings of the data centres
• Allow a smooth link across to the data browse and use elements of the NDG
Expected to expand in importance as we can add more semantic detail to the schema.
B metadata – a simplified view
How is the B metadata implemented?
• The core linking concept is the deployment of a Data Production Tool at an Observation Station on behalf of an Activity that produces a Data Entity.
• The deployment links the metadata records into a structure that can be turned into navigable XML using XQuery or XSLT, with any of the record types as the root element.
• Each of the main metadata objects has security data attached to it, so security can be applied to queries on the metadata.
[Diagram labels: Activity; DataProductionTool; ObservationStation; Deployment; Data Entity]
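The deployment idea can be sketched as a small data structure. This is an illustration only: the class, field and record identifiers below are hypothetical and are not taken from the NDG schema, and the real implementation uses XQuery or XSLT over database records rather than Python.

```python
from dataclasses import dataclass, fields
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Deployment:
    """One link between the four main B record types (ids are hypothetical)."""
    activity: str
    data_production_tool: str
    observation_station: str
    data_entity: str


RECORD_TYPES = tuple(f.name for f in fields(Deployment))


def related_records(deployments, record_type, record_id):
    """Collect the ids linked to one record through its deployments."""
    if record_type not in RECORD_TYPES:
        raise ValueError(f"unknown record type: {record_type}")
    linked = {t: set() for t in RECORD_TYPES if t != record_type}
    for dep in deployments:
        if getattr(dep, record_type) == record_id:
            for t in linked:
                linked[t].add(getattr(dep, t))
    return {t: sorted(ids) for t, ids in linked.items()}


def to_xml(deployments, record_type, record_id):
    """Render a view with the chosen record as the root element."""
    root = ET.Element(record_type, id=record_id)
    for t, ids in related_records(deployments, record_type, record_id).items():
        for linked_id in ids:
            ET.SubElement(root, t, id=linked_id)
    return ET.tostring(root, encoding="unicode")


DEPLOYMENTS = [
    Deployment("act-1", "tool-1", "station-1", "entity-1"),
    Deployment("act-1", "tool-2", "station-1", "entity-2"),
]
```

Calling `to_xml(DEPLOYMENTS, "observation_station", "station-1")` gives the same links rooted at the station instead of the activity, which is the "any record type as the root" property described on the slide.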
“Stub B” – what is it?
“B” metadata works well in databases, but what about:
• presentation
• standalone generation of “D”
• storing metadata locally as files
A raw B record for a Data Entity contains just:
• the basic data entity details
• a series of references to related records
• no details such as activity name, instrument name or station
“Stub B” is the base entity expanded through its own related deployments and internal references.
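The expansion from a raw B record to a “stub B” record can be sketched as reference resolution. The dictionary layout, keys and identifiers here are assumptions made for illustration; the actual B schema is not shown in these slides.

```python
def expand_to_stub(raw_entity, deployments, records):
    """Replace deployment references with the names of the linked records.

    raw_entity  : basic details plus a list of deployment ids (hypothetical layout)
    deployments : deployment id -> {role: record id}
    records     : record id -> {"name": ...}
    """
    stub = {k: v for k, v in raw_entity.items() if k != "deployment_refs"}
    stub["deployments"] = []
    for ref in raw_entity["deployment_refs"]:
        links = deployments[ref]
        # Inline the human-readable name of each linked record.
        stub["deployments"].append(
            {role: records[record_id]["name"] for role, record_id in links.items()}
        )
    return stub


# Hypothetical records: the raw entity only knows ids, not names.
RAW_ENTITY = {"id": "entity-1", "title": "Example dataset", "deployment_refs": ["dep-1"]}
DEPLOYMENTS = {
    "dep-1": {"activity": "act-1", "instrument": "tool-1", "station": "station-1"}
}
RECORDS = {
    "act-1": {"name": "Example campaign"},
    "tool-1": {"name": "Example instrument"},
    "station-1": {"name": "Example station"},
}
```

The resulting record is self-contained, so it can be stored as a local file or handed to a presentation layer without further database lookups.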
Role of Stub B
• Makes application developers’ lives easier, especially in the presentation of search results
• Allows off-line storage of metadata by users
• Basis of D production via XSLT
• Hook into main B repositories
• Potential discovery format (while there are lots around already… this could allow more “discipline dependent” discovery)
Discovery Metadata Usage
• Background activity being parallelised with GODIVA/CCLRC e-science collaboration (spectral -> gridpoint + CDMS + visualisation tools)
• Download either the plot or the data that went into the plot.
Where are we?
• Major effort on defining feature types for observation types so we can build an OGC/ISO compatible data extractor for observations and numerical data.
  • Main thrust for Andrew Woolf and 0.5 new FTE
  • Ag Stephens contributing when time available
• Security Infrastructure Development
  • Collaboration with CCLRC e-science, ECOGrid and 0.5 FTE
• Ongoing work on metadata definition and population:
  • Oceanographic data: Siva Kondapalli
  • Chemistry data: main thrust for Sue Latham
  • Numerical Modelling data: DIF numerical definition (moving to ISO), BADC and UK Community; Katherine Bouton’s work at NCAS/CGAM
  • Remote Sensing Data: collaboration with NEODC and PML
• Ongoing work on databases and interfaces, DIF to ISO and “B”
  • Kevin O’Neill and Marta Gutierrez
Authorisation
Role-based access:
  <dataset>
    <host>badc.nerc.ac.uk</host>
    <name>ukmo-obs</name>
    <access-requires>researcher</access-requires>
    <access-requires>ukmo-obs</access-requires>
    <processing-requires>nerc</processing-requires>
  </dataset>
Signed “conditions of use” form exists for this dataset.
Key concept: only hosts that trust each other share data, even within a larger virtual organisation, e.g. at BADC:
  <trusted>
    <bodc>
      <host>ndg.bodc.nerc.ac.uk</host>
      <attribute remotename="nerc">nerc</attribute>
      <attribute remotename="ashoe">ashoe</attribute>
      <attribute remotename="staff">nerc</attribute>
      <other>bodc</other>
    </bodc>
  </trusted>
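One way to read the two XML fragments is as a role mapping followed by a set comparison. The sketch below makes two assumptions that the slide does not confirm: that every listed requirement must be held, and that `remotename` maps a role at the remote host to the local role in the element text. The `<other>` element is not interpreted here, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

DATASET_XML = """<dataset>
  <host>badc.nerc.ac.uk</host>
  <name>ukmo-obs</name>
  <access-requires>researcher</access-requires>
  <access-requires>ukmo-obs</access-requires>
  <processing-requires>nerc</processing-requires>
</dataset>"""

TRUSTED_XML = """<trusted>
  <bodc>
    <host>ndg.bodc.nerc.ac.uk</host>
    <attribute remotename="nerc">nerc</attribute>
    <attribute remotename="ashoe">ashoe</attribute>
    <attribute remotename="staff">nerc</attribute>
    <other>bodc</other>
  </bodc>
</trusted>"""


def load_role_maps(trusted_xml):
    """Return {trusted host: {remote role: local role}}."""
    maps = {}
    for site in ET.fromstring(trusted_xml):
        host = site.findtext("host").strip()
        maps[host] = {
            attr.get("remotename"): attr.text.strip()
            for attr in site.iterfind("attribute")
        }
    return maps


def local_roles(role_maps, remote_host, remote_roles):
    """Translate remote roles to local ones; untrusted hosts get none."""
    mapping = role_maps.get(remote_host)
    if mapping is None:
        return set()
    return {mapping[role] for role in remote_roles if role in mapping}


def required_roles(dataset_xml, tag):
    """Read the roles listed under one requirement tag of a dataset."""
    return {el.text.strip() for el in ET.fromstring(dataset_xml).iterfind(tag)}


def is_permitted(required, held):
    """Assumption: all required roles must be held."""
    return required <= held
```

Under these assumptions a BODC user holding the remote role `staff` is mapped to the local role `nerc`, which satisfies `processing-requires` but not the two `access-requires` entries.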
NDG Security
Certificate based: encrypted credentials are passed between user and gatekeeper.
Extending the CF convention for chemistry …
grep -i sulphate vars2.csv
  "Allen, Andrew and Grenfell, Lee ", Sulphate / coarse (ug/m3)
  "Allen, Andrew and Grenfell, Lee ", Sulphate / fine (ug/m3)
  "Bradbury, Carl ", SULPHATE LOADING (ug/m3)
  "James, Jonathan And Allen, Andrew ", Sulphate / coarse (ug/m3)
  "James, Jonathan And Allen, Andrew ", Sulphate / fine (ug/m3)
  "James, Jonathan And Allen, Andrew ", Sulphate / fine+coarse (ug/m3)
  "James, Jonathan And Allen, Andrew ", Sulphate / fine+coarse (ug/m3)
  "McArdle, Nicola and Thompson, Adrian ", sulphate (nmol m-3)
  "McArdle, Nicola and Thompson, Adrian ", sulphate (µM)
  "McArdle, Nicola and Thompson, Adrian ", sulphate <1.1 µm diameter (nmol m-3)
  "McArdle, Nicola and Thompson, Adrian ", sulphate <1.2 µm diameter (nmol m-3)
  "McArdle, Nicola and Thompson, Adrian ", sulphate <1µm diameter (nmol m-3)
  "McArdle, Nicola and Thompson, Adrian ", sulphate > 1µm diameter (nmol-3)
  "McArdle, Nicola and Thompson, Adrian ", sulphate >1.1 µm diameter (nmol m-3)
  "McArdle, Nicola and Thompson, Adrian ", sulphate >1.2 µm diameter (nmol m-3)
  "McArdle, Nicola and Thompson, Adrian ", sulphate bulk (nmol m-3)
  "McArdle, Nicola and Thompson, Adrian ", sulphate bulk (nmol m-3)
  "McFadyen, Gordon ", Sulphate
  "Robertson, Leonie and Davison, Brian ", Coarse sulphate concentration (ug m-3)
  "Robertson, Leonie and Davison, Brian ", Fine sulphate concentration (ug m-3)
grep -i butane vars2.csv
  … i-Butane (ppt)
  … iso-Butane (ppt)
  … n-Butane (ppt)
  … iso-Butane pptv
  … ISO-BUTANE (pptv)
  … i-,n-butane
• Currently 35,000 Ames format files, mostly Atmospheric Chemistry …
• Real problems with vocabulary, and units …
• Spinning up a new project …
• … need community help!
ACSOE just one of many datasets with this problem …
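To show the scale of the vocabulary and units problem, here is a minimal sketch that splits a variable string into name and unit and applies one normalisation rule. The rule ("/m3" rewritten as " m-3") and the function names are assumptions for illustration, not a proposed CF extension. Case is deliberately left alone, because lower-casing would confuse units such as µM and µm.

```python
import re
from collections import defaultdict


def split_name_and_unit(variable):
    """Split 'name (unit)' into its parts; the unit is None if absent."""
    match = re.match(r"^(.*?)\s*\(([^()]*)\)\s*$", variable)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return variable.strip(), None


def normalise_unit(unit):
    """Collapse whitespace and rewrite a trailing '/m3' as ' m-3'."""
    if unit is None:
        return None
    unit = re.sub(r"\s+", " ", unit).strip()
    return re.sub(r"\s*/\s*m3$", " m-3", unit)


def group_by_unit(variables):
    """Map each normalised unit to the variable names that use it."""
    groups = defaultdict(list)
    for variable in variables:
        name, unit = split_name_and_unit(variable)
        groups[normalise_unit(unit)].append(name)
    return dict(groups)


# A few of the variable strings from the grep output above.
SAMPLE = [
    "Sulphate / coarse (ug/m3)",
    "Coarse sulphate concentration (ug m-3)",
    "SULPHATE LOADING (ug/m3)",
    "Sulphate",
    "iso-Butane pptv",
]
```

Even this one rule merges "ug/m3" with "ug m-3", but entries such as "Sulphate" (no unit) and "iso-Butane pptv" (unit outside parentheses) fall through, which is why a controlled vocabulary is needed rather than ad hoc string handling.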
DRAFT DIF Component (1)
• Key New Groups:
  • Numerical Model
    • ID Information
    • Numerical Model Components
      • (from) Atmosphere, Ocean-Dynamic, Ocean-Thermodynamic, Cryosphere, Land-Surface, with possible appends: Chemistry, 4D-VAR, 3D-VAR, QG
      • … details for each
  • Numerical Simulation
    • ID Information
    • Initial Condition Information
      • … details
    • Forcing Information
      • … details
DRAFT DIF Component (2)
*=required +=repeatable
  Group: Numerical_Model
    Model_Name:
    Model_Version:
  * Model_Calendar: [model calendar valid] - eg CF calendar or ISO
  *+ Group: Model_Component
    * Model_Component_type: [Model Component Valid]
      Model_Component_Resolution:
      Group: Model_Component_VerticalDomain
        VerticalDomain_Top:
        VerticalDomain_Bottom:
      End_Group
      Model_Component_Timestep:
      Group: Model_Component_Summary
        [Multiple text lines allowed]
      End_Group
    End_Group
    URL:
  End_Group
Example
  Group: Numerical_Model
    Model_Name: HadCM3
    Model_Calendar: 360 day
    Group: Model_Component
      Model_Component_type: Atmosphere
      Model_Component_Resolution: 2.5 degrees latitude, 3.75 degrees longitude
      Group: Model_Component_VerticalDomain
        VerticalDomain_Top: 1000 hPa
        VerticalDomain_Bottom: 4 hPa
      End_Group
      Model_Component_Timestep: 0.5 hours
      Group: Model_Component_Summary
        Cullen et al Atmospheric Model adapted for climate use.
      End_Group
    End_Group
    Group: Model_Component
      Model_Component_type: Ocean
      Model_Component_Resolution: 1 degrees latitude, 1 degrees longitude
      Group: Model_Component_VerticalDomain
        VerticalDomain_Top: 0m
        VerticalDomain_Bottom: 6000m
      End_Group
      Model_Component_Timestep: 0.5 hours
      Group: Model_Component_Summary
        Bryan and Cox Ocean model
      End_Group
    End_Group
    URL: http://www.metoffice.com/hadcm3 (e.g.)
  End_Group
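The nested Group/End_Group layout can be read with a small stack-based parser. This is a sketch only, assuming one statement per line; it accepts both `Group: Name` and `Group Name:` spellings because both appear in the draft. The output structure (dictionaries with `name`, `fields` and `groups`) is a hypothetical choice, not part of the DIF definition.

```python
import re

GROUP_RE = re.compile(r"Group:?\s+([\w ]+?):?$")


def parse_groups(text):
    """Parse Group/End_Group text into nested dictionaries."""
    root = {"name": None, "fields": [], "groups": []}
    stack = [root]
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line == "End_Group":
            if len(stack) == 1:
                raise ValueError("End_Group without a matching Group")
            stack.pop()
            continue
        match = GROUP_RE.match(line)
        if match:
            node = {"name": match.group(1), "fields": [], "groups": []}
            stack[-1]["groups"].append(node)
            stack.append(node)
            continue
        key, sep, value = line.partition(":")
        if sep:
            stack[-1]["fields"].append((key.strip(), value.strip()))
        else:
            # Free text, e.g. the lines inside a summary group.
            stack[-1]["fields"].append(("text", line))
    if len(stack) != 1:
        raise ValueError("Group without a matching End_Group")
    return root


# Trimmed version of the HadCM3 example from the slide.
SAMPLE = """
Group: Numerical_Model
  Model_Name: HadCM3
  Model_Calendar: 360 day
  Group: Model_Component
    Model_Component_type: Atmosphere
    Group: Model_Component_Summary
      Cullen et al Atmospheric Model adapted for climate use.
    End_Group
  End_Group
  URL: http://www.metoffice.com/hadcm3
End_Group
"""
```

A validator for the required (*) and repeatable (+) markers could then walk this structure and check field presence and group counts.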
Draft DIF Components (3)
*=required +=repeatable
  Group: Numerical_Simulation
    Numerical_Simulation_Name:
  * Numerical_Simulation_ID: [recorded using PURI]
  * Group: run_period
    * start_date: [yyyy-mm-dd-hh]
    * end_date: [yyyy-mm-dd-hh]
    * real_date: [yes,no]
    End_Group
  * Group Initial_Condition:
      Ensemble: [Numeric Value]
      Summary:
    End_Group
  *+ Group Forcing:
      Ensemble: [Numeric Value]
      Ensemble Parent : [uri or 0]
      Summary:
    End_Group
  End_Group
Example
  Group: Numerical_Simulation
    Numerical_Simulation_Name: All Forcings
    Numerical_Simulation_ID: format_tbd_but_using_purl
    Group: run_period
      start_date: 1859-12-01-00
      end_date: 1999-11-30-00
      real_date: yes
    End_Group
    Group Initial_Condition:
      Ensemble: 4
      Ensemble Parent: 0
      Summary: initial conditions taken from the HadCM3 control integration
    End_Group
    Group Forcing:
      Summary: Volcanic forcing from Sato et al
    End_Group
    Group Forcing:
      Summary: Solar Forcing from Lean et al
    End_Group
    Group Forcing:
      Summary: CO2 from...
    End_Group
    Group Forcing:
      Summary: Anthropogenic SO2 from...
    End_Group
  End_Group
Example Ensemble Member
  Group: Numerical_Simulation
    Numerical_Simulation_Name: Ensemble Member of blah
    Group: run_period
      start_date: 1859-12-01-00
      end_date: 1999-11-30-00
      real_date: yes
    End_Group
    Group Initial_Condition:
      Ensemble: 1
      Ensemble Parent: format_tbd_but_using_purl
      Summary: initial conditions taken from the HadCM3 control integration
    End_Group
    Group Forcing:
      Summary: Volcanic forcing from Sato et al
    End_Group
    Group Forcing:
      Summary: Solar Forcing from Lean et al
    End_Group
    Group Forcing:
      Summary: CO2 from...
    End_Group
    Group Forcing:
      Summary: Anthropogenic SO2 from...
    End_Group
  End_Group
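The `run_period` dates use a `yyyy-mm-dd-hh` layout. The sketch below checks that layout and computes a run length, assuming the 360-day calendar (twelve 30-day months) given for HadCM3 in the numerical model example. Whether a given simulation uses that calendar has to come from its model record; the function names are hypothetical.

```python
import re

STAMP_RE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-(\d{2})$")


def parse_stamp_360(stamp):
    """Parse 'yyyy-mm-dd-hh' and validate it against a 360-day calendar."""
    match = STAMP_RE.match(stamp)
    if not match:
        raise ValueError(f"not in yyyy-mm-dd-hh form: {stamp!r}")
    year, month, day, hour = (int(part) for part in match.groups())
    # Every month has exactly 30 days in this calendar.
    if not (1 <= month <= 12 and 1 <= day <= 30 and 0 <= hour <= 23):
        raise ValueError(f"out of range for a 360-day calendar: {stamp!r}")
    return year, month, day, hour


def hours_since_year_zero(stamp):
    """Count hours from 0000-01-01-00 in the 360-day calendar."""
    year, month, day, hour = parse_stamp_360(stamp)
    days = (year * 12 + (month - 1)) * 30 + (day - 1)
    return days * 24 + hour


def run_length_days(start_date, end_date):
    """Length of a run period in 360-day-calendar days."""
    return (hours_since_year_zero(end_date) - hours_since_year_zero(start_date)) / 24
```

For the example run period, `run_length_days("1859-12-01-00", "1999-11-30-00")` gives 50399 days, one day short of 140 model years of 360 days each.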
(B) Metadata Model
(B) Metadata Model Overview
[Diagram label: GIS/ISO Feature Types]
(A) NDG Semantic Data Model
[Diagram labels: Dataset; Rich spatiotemporal referencing (standards-compliant: ISO19108, ISO19111); Variables; Multidimensional array ... or from aggregated storage ... of other arrays]