
GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

World Data Center Climate: Status and Portal Integration. Michael Lautenschlager, Hannes Thiemann and Frank Toussaint ICSU World Data Center Climate Model and Data / Max-Planck-Institute for Meteorology Hamburg, Germany. GO-ESSP at LLNL Livermore, June 19th – 21st, 2006.


Presentation Transcript


  1. World Data Center Climate: Status and Portal Integration. Michael Lautenschlager, Hannes Thiemann and Frank Toussaint, ICSU World Data Center Climate, Model and Data / Max-Planck-Institute for Meteorology, Hamburg, Germany. GO-ESSP at LLNL Livermore, June 19th – 21st, 2006. WDCC Home: www.wdcc-climate.de / WDCC Contact: data@dkrz.de

  2. Content: WDCC Status; CERA Concept; Portal Integration

  3. WDCC Content, June 2006: 590 experiments / 79,000 data sets. Data from Earth system modelling and related observations. Labels on the slide: IPCC, WOCE, GEBCO, BALTEX, HOAPS, CEOP, COSMOS, CARIBIC, EH5/MPI-OM, IPCC-AR4, ERA15/40, ERA40, NCEP, simulations @ MPI, GKSS, … Start: approved in January 2003. Maintenance: Model and Data (M&D/MPI-M) and German Climate Computing Centre (DKRZ).

  4. Data Export from WDC Climate: corresponds to 2 – 10 TB/month

  5. Geographical Distribution of WDCC Users. Total number of registered users: 750 (May 2006)

  6. Data Import into WDC Climate. ECHAM5/MPI-OM IPCC AR4 Scenarios (ca. 110 TB). 6 * 10**9 BLOBs.

  7. CERA Concept: Semantic Data Management (CERA: Climate and Environmental data Retrieval and Archiving)
      (I) Data catalogue and pointer to Unix files
        • Enable search and identification of data
        • Allow for data access as they are (coarse granularity raw data files)
      (II) Application-oriented data storage in BLOB tables
        • Time series of individual variables are stored as BLOB entries in DB tables (fine granularity data products)
        • Allow for fast and selective data access
        • Storage in standard data format (GRIB, NetCDF/CF)
        • Allow for application of standard data processing routines (PINGOs, CDOs)

  8. WDCC Data Topology. Diagram labels: Experiment Description; Pointer to Unix-Files; Dataset 1 Description, Dataset n Description; BLOB Data Table. Level 1 interface: metadata entries (XML, ASCII) + data files. Level 2 interface: separate files containing BLOB table data in an application-adapted structure (time series of single variables). A BLOB DB table corresponds to a scalable, virtual file at the operating system level.

  9. CERA Data Model. Blocks: Contact, Coverage, Reference, Entry, Status, Parameter, Spatial Reference, Distribution, Local Adm., Data Org, Data Access.

  10. Data matrix of model experiment. Axes: model variables, model run time. Raw data file in DKRZ archive: direct model output (1.3 – 16.2 GB). 2D: small BLOBs (180 KB); 3D: large BLOBs (3 MB). Each column is one BLOB table in CERA-DB.

  11. Climate Model Data Structures. Original data structure (4-D); application-related data structure (2-D). Preferred DB storage structure for web-based access: • single variable • single level • time series of 2D gridded data records • Formats: GRIB-1 – NetCDF/CF (- GRIB-2)
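The regrouping from the original 4-D model output into per-variable, per-level time series can be sketched as follows. This is a minimal illustration under stated assumptions, not the WDCC implementation: the function name `regroup_to_time_series`, the tuple layout of a raw record, and the toy variable names and values are all invented for the example.

```python
from collections import defaultdict


def regroup_to_time_series(raw_records):
    """Regroup raw model output into one time series per (variable, level).

    raw_records: iterable of (time_step, variable, level, field_2d) tuples,
    a hypothetical stand-in for records read from a raw model output file.
    Returns a dict mapping (variable, level) to a time-ordered list of
    (time_step, field_2d) pairs, i.e. one "table" per variable and level.
    """
    series = defaultdict(list)
    # Sort by time step so every series comes out in chronological order.
    for time_step, variable, level, field_2d in sorted(raw_records, key=lambda r: r[0]):
        series[(variable, level)].append((time_step, field_2d))
    return dict(series)


# Toy raw output: records mixed across variables, levels and time steps.
raw_output = [
    (1, "temperature", 850, [[2.0, 2.1], [2.2, 2.3]]),
    (0, "temperature", 850, [[1.0, 1.1], [1.2, 1.3]]),
    (0, "temperature", 500, [[0.5, 0.6], [0.7, 0.8]]),
    (0, "u_wind", 850, [[5.0, 5.1], [5.2, 5.3]]),
    (1, "u_wind", 850, [[6.0, 6.1], [6.2, 6.3]]),
]

tables = regroup_to_time_series(raw_output)
for key in sorted(tables):
    print(key, [time_step for time_step, _ in tables[key]])
```

Each key of `tables` plays the role of one column in the data matrix: a single variable on a single level, stored as a time series of 2D fields.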

  12. DKRZ Architecture TX7: Intel Itanium-2 with Linux

  13. Portal Integration. Two strategies:
      • One-way integration: discovery and use metadata are integrated in a central data portal in one step. Example: C3Grid data catalogue (refer to the presentation from Heinrich Widmann).
      • Two-way integration: discovery metadata are integrated in a central data portal; use metadata are extracted from the remote archive when they are needed for data download and processing. Examples: primary data publication in the TIB library catalogue (STD-DOI); WDCC integration in NDG (NERC Data Grid).
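The difference between the two strategies can be sketched in a few lines. This is a hedged toy model only: the `Portal` class, its method names, the dataset identifiers, and the metadata fields are hypothetical and do not describe the C3Grid, TIB, or NDG software.

```python
class Portal:
    """Toy central data portal (hypothetical, for illustration only)."""

    def __init__(self, fetch_use_metadata):
        self._discovery = {}   # dataset_id -> discovery metadata
        self._use = {}         # dataset_id -> use metadata (one-way only)
        # Callable that asks the remote archive for use metadata on demand.
        self._fetch_use_metadata = fetch_use_metadata

    def integrate_one_way(self, dataset_id, discovery, use):
        # One-way: discovery and use metadata are stored in one step.
        self._discovery[dataset_id] = discovery
        self._use[dataset_id] = use

    def integrate_two_way(self, dataset_id, discovery):
        # Two-way: only discovery metadata are stored in the portal.
        self._discovery[dataset_id] = discovery

    def search(self, keyword):
        # Discovery works the same way for both strategies.
        return sorted(
            dataset_id
            for dataset_id, meta in self._discovery.items()
            if keyword.lower() in meta["title"].lower()
        )

    def use_metadata(self, dataset_id):
        # Use metadata come from the portal if present, otherwise they are
        # extracted from the remote archive at the time they are needed.
        if dataset_id in self._use:
            return self._use[dataset_id]
        return self._fetch_use_metadata(dataset_id)


remote_calls = []


def fake_remote_archive(dataset_id):
    """Stand-in for a remote archive; records each request it receives."""
    remote_calls.append(dataset_id)
    return {"format": "GRIB-1", "location": "remote:" + dataset_id}


portal = Portal(fake_remote_archive)
portal.integrate_one_way(
    "exp-a", {"title": "Toy experiment A"}, {"format": "NetCDF/CF", "location": "local:exp-a"}
)
portal.integrate_two_way("exp-b", {"title": "Toy experiment B"})
print(portal.search("toy"))
```

In this sketch, asking for the use metadata of `exp-b` triggers a call to the remote archive, while `exp-a` is answered from the portal alone.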

  14. Primary data publication (STD-DOI), URL: http://www.std-doi.de/ Primary data publication process; data review. ISO 690-2: metadata for citation of electronic media.

  15. Example: Publ.-DOI from WDCC

  16. DOI URN

  17. Publ.-DOI

  18. 830 GB

  19. Ident.-DOI. The data retrieval procedure is given at the end (user identification is required).

  20. WDCC Metadata and OAI-PMH: Open Archives Initiative Protocol for Metadata Harvesting

  21. WDCC OAI server at: http://uranus.dkrz.de:8080/oai/provider (Software: dlese (www.dlese.org) + apache-tomcat 5.5.12 + Java 1.5)
      • 35 IPCC experiments with more than 11000 datasets. Metadata format: ISO 19115. C3Grid (http://gsphere.awi.de:8080/gridsphere/gridsphere)
      • 40 STD-DOI experiments with more than 1700 datasets. Metadata format: DIF. GO-ESSP (NDG, http://ndg.badc.rl.ac.uk/)

  22. NDG OAI Harvesting (Pull or Notification). Diagram labels: WDCC OAI Server (software: dlese); Provider 2 / OAI Server 2; OAI Server n; DIF XMLs; OAI Client NDG (dlese); Process; Catalog NDG, record 1...n; Delivery; Discovery Portal NDG.
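A pull-style harvest over OAI-PMH can be sketched as below. This is a minimal sketch under stated assumptions, not the dlese client used by NDG: the helper names are invented, the `metadataPrefix` value `dif` is a guess at what the server might advertise, and the sample response with its record identifiers is made up so that the example runs without network access.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build a ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: a follow-up
    request carries only the verb and the token.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    has_token = token_element is not None and token_element.text
    return identifiers, (token_element.text if has_token else None)


# Invented sample response; identifiers and token are placeholders.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:record-1</identifier></header></record>
    <record><header><identifier>oai:example.org:record-2</identifier></header></record>
    <resumptionToken>token-123</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()

BASE_URL = "http://uranus.dkrz.de:8080/oai/provider"  # URL as given on the slides
first_url = list_records_url(BASE_URL, metadata_prefix="dif")  # prefix is assumed
identifiers, token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url(BASE_URL, resumption_token=token)
print(first_url)
print(identifiers, token)
print(next_url)
```

A real harvester would fetch each URL, store the returned records in its catalogue, and repeat with the resumption token until the server stops returning one.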

  23. URL: http://glue.badc.rl.ac.uk/discovery/ Keyword: ECHAM4
