WDCC Data Infrastructure

WDCC Data Infrastructure: The CERA database system
Hannes Thiemann

Contents
• What is CERA
• CERA Hierarchy
• Interfaces
• Find data
• Download data
• Import data

What is CERA?
• CERA = Climate and Environmental Retrieval and Archive
• CERA is a database-based catalogue and archive system for data related to climate research.

CERA and WDCC
• CERA is the technical backbone of WDCC
• Is part of the HLRE infrastructure at DKRZ
• Linked to the DKRZ file archive
• Uses DKRZ infrastructure

CERA Hierarchy
• Data catalogue

[Diagram: CERA metadata blocks: Entry, Contact, Coverage, Reference, Status, Parameter, Spatial Reference, Distribution, Local Adm., Data Org, Data Access]

CERA Hierarchy
• Data catalogue – “Yellow pages”
• Pointer to primary data

CERA Pointer
• Document the location of model raw data
• Allow users to locate model raw data quickly
• Provide consistency between CERA and the model raw data

CERA Hierarchy
• Data catalogue – “Yellow pages”
• Pointer to primary data
• Storage of application-oriented, pre-processed data

CERA Data Storage
• CERA offers storage of application-oriented, pre-processed data
• This can be either:
  • aggregated data such as monthly means
  • a range of variables in full resolution (1h/6h values)
• Can only be done for the most relevant production runs
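To illustrate what "aggregated data like monthly means" amounts to, here is a minimal Python sketch that reduces a time series (for example 6h values) to one mean per calendar month. The function name `monthly_means` and the input layout (pairs of timestamp and value) are assumptions made for this example; the slides do not describe how CERA or its pre-processing tools actually compute aggregates.

```python
from collections import defaultdict
from datetime import datetime


def monthly_means(samples):
    """Average (timestamp, value) pairs per calendar month.

    `samples` is any iterable of (datetime, float) pairs, e.g. 6-hourly
    model output. Returns a dict mapping (year, month) to the mean value.
    Hypothetical helper, not part of CERA.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in samples:
        key = (timestamp.year, timestamp.month)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


if __name__ == "__main__":
    # Three made-up 6-hourly values straddling a month boundary.
    series = [
        (datetime(2000, 1, 31, 0), 1.0),
        (datetime(2000, 1, 31, 6), 3.0),
        (datetime(2000, 2, 1, 0), 5.0),
    ]
    print(monthly_means(series))
```

The trade-off named on the slide follows directly: a monthly mean keeps one value per month and variable, whereas full-resolution storage keeps every 1h or 6h value.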

Data Content
Climate model data
• IPCC (Hamburg model runs)
• IPCC DDC
• CEOP
Observational Data
• ERA15/40 (ECMWF), NCEP 40Y
• WOCE
Project support
• HOAPS
• CARIBIC
• BALTEX
• SFB512
• Different model applications
Size of CERA: 140 Tbyte
Number of experiments: 382
Number of datasets: 51,000
Number of blobs: 4 billion
Downloads in the last 12 months: > 330,000

User Interface - extract data -
• Graphical User Interface: http://cera-www.dkrz.de
• Batch Interface: http://cera-www.dkrz.de/CERA/jblob

Selection via CERA meta data:
• selection of the project
• selection of the experiment (= model run)
• display of meta data: experiment, quality, datasets
• selection of the dataset
• display of dataset information
• add datasets to “process list”
• download to client

Batch Interface - jblob -
Java application for batch download of datasets
    jblob -datasetname name [ -options ]
    jblob -showdatasets "search string"    (use '%' as wildcard)
    jblob -help
Example:
    jblob -datasetname EH4OPYC_SRES_A2_WIND10 -dir /tmp
See hands-on session tomorrow
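For scripted downloads of several datasets, the jblob call can be wrapped in a few lines of Python. The sketch below only assembles the command lines; it uses just the options shown on the slide (`-datasetname`, `-dir`, `-showdatasets`). The helper names, the assumption that an executable called `jblob` is on the PATH, and the search pattern `EH4OPYC%` are illustrative and not taken from the jblob documentation.

```python
import shlex


def jblob_download_cmd(dataset_name, target_dir=None, executable="jblob"):
    """Build the argument list for downloading one dataset.

    Only options shown on the slide are used. `executable` is an assumed
    launcher name; adjust it to however jblob is installed locally.
    """
    cmd = [executable, "-datasetname", dataset_name]
    if target_dir is not None:
        cmd += ["-dir", target_dir]
    return cmd


def jblob_search_cmd(pattern, executable="jblob"):
    """Build the argument list for listing datasets ('%' is the wildcard)."""
    return [executable, "-showdatasets", pattern]


if __name__ == "__main__":
    # Print the commands instead of running them; pass each list to
    # subprocess.run(cmd, check=True) to execute it for real.
    print(shlex.join(jblob_search_cmd("EH4OPYC%")))  # hypothetical pattern
    for name in ["EH4OPYC_SRES_A2_WIND10"]:
        print(shlex.join(jblob_download_cmd(name, "/tmp")))
```

Keeping each command as a list rather than a single string avoids shell-quoting problems with the `%` wildcard and with dataset names containing unusual characters.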

Technical Implementation
[Diagram: Metadata and Blobdata components with numbered steps 1, 2, 3, 4, 4a, 4b, 5, 6]

User Interface - import data -
Metadata
• Makes use of XML input
• http://cera-www.dkrz.de/Meta_Fill/

XML Example
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE EXPERIMENT_METADATA (View Source for full doctype...)>
<EXPERIMENT_METADATA>
  <ENTRY>
    <ENTRY_ID>FETCH NEW ENTRY_ID or ask for it (data@dkrz.de)</ENTRY_ID>
    <ENTRY_NAME>max.160 characters</ENTRY_NAME>
    <ENTRY_ACRONYM>max_30_characters</ENTRY_ACRONYM>
    <ENTRY_TYPE>
      <ENTRY_TYPE>experiment(select from LOV)</ENTRY_TYPE>
    </ENTRY_TYPE>
    <SUMMARY>
      <SUMMARY>Please insert the summary of your Experiment.(free text)</SUMMARY>
    </SUMMARY>
    <QUALITY>
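A metadata file of this shape can also be generated from a script. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. It covers only the elements visible in the excerpt above and enforces the two length limits stated in the template (160 characters for the name, 30 for the acronym). The function name `build_entry` and the sample values are made up for illustration, and the full DTD, which is not shown on the slide, requires further blocks such as QUALITY, so the output is not a complete import file.

```python
import xml.etree.ElementTree as ET

MAX_NAME_LEN = 160     # "max.160 characters" in the template
MAX_ACRONYM_LEN = 30   # "max_30_characters" in the template


def build_entry(entry_id, name, acronym, entry_type, summary):
    """Build the visible part of an EXPERIMENT_METADATA document.

    Hypothetical helper: element names and nesting follow the slide
    excerpt; everything beyond the excerpt is deliberately left out.
    """
    if len(name) > MAX_NAME_LEN:
        raise ValueError("ENTRY_NAME exceeds 160 characters")
    if len(acronym) > MAX_ACRONYM_LEN:
        raise ValueError("ENTRY_ACRONYM exceeds 30 characters")

    root = ET.Element("EXPERIMENT_METADATA")
    entry = ET.SubElement(root, "ENTRY")
    ET.SubElement(entry, "ENTRY_ID").text = entry_id
    ET.SubElement(entry, "ENTRY_NAME").text = name
    ET.SubElement(entry, "ENTRY_ACRONYM").text = acronym
    # The template wraps the type and the summary in same-named outer blocks.
    type_block = ET.SubElement(entry, "ENTRY_TYPE")
    ET.SubElement(type_block, "ENTRY_TYPE").text = entry_type
    summary_block = ET.SubElement(entry, "SUMMARY")
    ET.SubElement(summary_block, "SUMMARY").text = summary
    return root


if __name__ == "__main__":
    # All values below are placeholders, not real CERA entries.
    root = build_entry(
        entry_id="0000",
        name="Example experiment",
        acronym="EXAMPLE_EXP",
        entry_type="experiment",
        summary="Free-text summary of the experiment.",
    )
    print(ET.tostring(root, encoding="unicode"))
```

Checking the length limits before building the tree reports problems at the source rather than at import time.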

User Interface - import data -
Data
• Use of specific tools maintained by M&D
• Data has to be prepared by the data producer

Contact
data@dkrz.de
