
Heinrich Widmann and Stephan Kindermann

Data Discovery and Basic Processing within the German Collaborative Climate Community Data and Processing Grid (C3Grid) Project. Heinrich Widmann and Stephan Kindermann Model and Data / DKRZ / Max-Planck-Institute for Meteorology Hamburg, Germany. GO-ESSP at LLNL


Presentation Transcript


  1. Data Discovery and Basic Processing within the German Collaborative Climate Community Data and Processing Grid (C3Grid) Project. Heinrich Widmann and Stephan Kindermann, Model and Data / DKRZ / Max-Planck-Institute for Meteorology, Hamburg, Germany. GO-ESSP at LLNL, Livermore, June 19th – 21st, 2006. C3Grid Home: www.c3grid.de

  2. Overview
     • C3Grid Background
     • Data Analysis Workflows
     • C3Grid Architecture and Interfaces
     • Data Discovery and Metadata in C3Grid
     • Data Information Service with Lucene
     • Data Access and Preprocessing
     • Summary

  3. C3Grid Background
     • C3Grid
       • Status: month 10 of 36 (phase 1)
       • Is the earth system science community grid within the German D-Grid initiative
       • D-Grid includes five further community grid projects (AstroGrid, HEP-Grid, InGrid, MediGrid, TextGrid)
       • Is a community-driven grid
     • Goal is to develop a grid infrastructure appropriate for typical climate analysis workflows
     • Stepwise introduction and integration

  4. C3Grid Data Analysis Workflow Requirements (two columns, listed in slide order)
     • Requirements: Metadata; Discovery; Data access (+ preprocessing); Security; Scheduling; Complex processing
     • Grid technologies: ISO 19115 / ISO 19139; OAI-PMH + Lucene; community webservice; Shibboleth; Globus Toolkit 4; WS-GRAM

  5. C3Grid Architecture and Interfaces (diagram; highlighted areas: Data Discovery; Data Access and Basic Processing)

  6. C3Grid Data Discovery and Data Access (architecture diagram; labels as they appear on the slide)
     • Portal: discovery, workflow composition
     • C3 metadata catalog: ISO 19115 / 19139, OAI harvester, OAI-PMH
     • Flows: discovery, use data request, scheduling, job submission
     • Job submission carries: oids, time/space constraints, processing constraints
     • Grid infrastructure: Data Management Service, WS-GRAM, analysis job, workspaces
     • Resource provider: web server / OAI provider (metadata), Data Access Web Service (preprocessing), DB and files (Prop. Xml, Prop. Rel.)
     • Providers: World Data Centers (Climate, Mare, RSAT), DWD, PIK, IFM-Geomar, ..
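The harvesting path on this slide uses OAI-PMH, in which a harvester issues plain HTTP requests such as `ListRecords` against a provider's base URL. The following is a minimal sketch of how such request URLs are formed, not the C3Grid harvester itself. The base URL and the `iso19139` metadata prefix are assumptions for illustration; the actual prefix is whatever a given provider advertises.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None, until_date=None, set_spec=None):
    """Build the first OAI-PMH ListRecords request URL for a harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date      # selective harvesting: lower datestamp bound
    if until_date:
        params["until"] = until_date    # selective harvesting: upper datestamp bound
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def resume_url(base_url, token):
    """Build the follow-up request; resumptionToken excludes all other arguments except verb."""
    return f"{base_url}?{urlencode({'verb': 'ListRecords', 'resumptionToken': token})}"


# Hypothetical provider endpoint and metadata prefix (assumptions, not from the slides).
BASE = "https://provider.example.org/oai"
first = list_records_url(BASE, "iso19139", from_date="2006-01-01")
follow_up = resume_url(BASE, "abc123")
print(first)
print(follow_up)
```

A harvester would fetch the first URL, store the returned records, and keep requesting `resume_url(...)` while the response contains a non-empty resumption token.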

  7. Gridded data: C3 ISO 19139 metadata "profile" (diagram; labels as they appear on the slide)
     • Raw experiment data: 3D, multi-variable files
     • Post-processed experiment data: 2D, single-variable time series
     • Post-processing leads from the raw to the post-processed data; each level is marked with its own <MD_Metadata …> record
     • Other labels: archive, database, metadata database, "implicit" metadata
     • Data items, element outline as shown:
       <MD_Metadata … "http://www.isotc211.org/xxx">
         <fileIdentifier ../>
         <resourceConstraints ../>
         <extent … spatial+temporal bounding box .. />
         <contentInfo ..> <attributeDescription ../>
         <distributionInfo ..>
         <DS_Series> <composed_of> <composed_of>
       </MD_Metadata>
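The two-level structure on this slide (a series composed of data items, each with a spatial and temporal bounding box) can be sketched as a small data model. This is an illustration only: the class and field names below are hypothetical and do not reproduce the C3Grid profile, and the overlap test ignores bounding boxes that cross the date line.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Extent:
    """Spatial and temporal bounding box of one data item."""
    west: float
    east: float
    south: float
    north: float
    start: date
    end: date

    def overlaps(self, other: "Extent") -> bool:
        # Two closed boxes overlap when they overlap on every axis.
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north
                and self.start <= other.end and other.start <= self.end)


@dataclass
class DataItem:
    file_identifier: str
    extent: Extent
    variables: Tuple[str, ...]


@dataclass
class Series:
    """Upper level: a series record that is composed of data items."""
    file_identifier: str
    composed_of: List[DataItem] = field(default_factory=list)

    def select(self, wanted: Extent) -> List[str]:
        """Return identifiers of the items whose extent overlaps the request."""
        return [i.file_identifier for i in self.composed_of if i.extent.overlaps(wanted)]


# Hypothetical identifiers and extents, for illustration only.
global_box = dict(west=-180.0, east=180.0, south=-90.0, north=90.0)
series = Series("experiment-A", [
    DataItem("experiment-A/tas/1990s",
             Extent(**global_box, start=date(1990, 1, 1), end=date(1999, 12, 31)),
             ("air_temperature",)),
    DataItem("experiment-A/tas/2000s",
             Extent(**global_box, start=date(2000, 1, 1), end=date(2009, 12, 31)),
             ("air_temperature",)),
])
europe_1995 = Extent(west=-10.0, east=40.0, south=35.0, north=70.0,
                     start=date(1995, 1, 1), end=date(1995, 12, 31))
selected = series.select(europe_1995)
print(selected)
```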

  8. C3Grid Data Information Service (DIS) with Lucene (diagram; labels as they appear on the slide)
     • Portal
     • Web service frontend: web server, Apache Axis + servlet container
     • DIS: Apache Lucene inverted index, with indexing of selected fields and a full-text index
     • Cache for ISO 19139 documents (<MD_Metadata>...</MD_Metadata> records)
     • Harvesting backend: OAI-PMH from the archives (Pangaea, CERA)
     • [T. Langhammber, ZIB, Berlin]
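To illustrate what "inverted index" with "indexing of selected fields" plus a "full-text index" means, here is a toy index in plain Python. It is a sketch of the general technique, not of Lucene's API or of the DIS implementation; the field names, record contents, and the `_all` pseudo-field are assumptions made for the example.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return TOKEN.findall(text.lower())


class InvertedIndex:
    """Maps (field, term) to the set of document ids containing that term."""

    def __init__(self, fields):
        self.fields = tuple(fields)
        self.postings = defaultdict(set)

    def add(self, doc_id, record):
        for name in self.fields:
            for term in tokenize(record.get(name, "")):
                self.postings[(name, term)].add(doc_id)    # per-field index
                self.postings[("_all", term)].add(doc_id)  # full-text index

    def search(self, query, field="_all"):
        """Return ids of documents that contain every query term in the given field."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            ids = self.postings.get((field, term), set())
            result = set(ids) if result is None else result & ids
        return result


# Hypothetical records standing in for harvested metadata documents.
index = InvertedIndex(fields=("title", "abstract"))
index.add("oid-1", {"title": "Sea surface temperature monthly means",
                    "abstract": "Coupled model run, 2D time series"})
index.add("oid-2", {"title": "Precipitation daily totals",
                    "abstract": "Surface temperature is not included"})

print(sorted(index.search("surface temperature")))
print(sorted(index.search("surface temperature", field="title")))
```

Restricting a query to one field is what makes an advanced search form possible, while the `_all` postings serve a simple search box.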

  9. C3Grid Portal – Simple search

  10. C3Grid Portal – Advanced search

  11. C3Grid Data Access and Preprocessing
     • Data access interface
       • Community-specific webservice (WSDL)
       • The individual institutes' existing solutions will be adapted to support the webservice, e.g. by triggering local data processing tools
       • Supports database and file-based storage types
     • More detailed use metadata will be provided together with the data during the extraction process

  12. C3Grid Data Access/Preprocessing Interface
     • Stage file webservice request (SOAP-XML StageFile request) contains:
       • ObjectList of OIDs requested
       • CFList of standard names
       • Space constraints
       • Time constraints
       • Target directory
       • File format, e.g. netCDF or grib
       • …
     • CF standard names → local variable names
     • Constraints → necessary processing
     • Diagram labels: Data Access Web service, DB access, files, CDO processing, data
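The request fields listed on this slide can be sketched as a small data structure that serializes to XML, together with the CF-name-to-local-name lookup a provider would need. This is a hedged illustration: the element names, attribute names, OIDs, and local variable names are hypothetical, the real interface is defined by the project's WSDL, and the SOAP envelope is omitted.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StageFileRequest:
    oids: List[str]                           # ObjectList of OIDs requested
    cf_names: List[str]                       # CFList of standard names
    bbox: Tuple[float, float, float, float]   # (west, east, south, north)
    time_range: Tuple[str, str]               # (start, end) as ISO 8601 strings
    target_dir: str
    file_format: str = "netCDF"

    def to_xml(self) -> str:
        """Serialize the request body; element names are illustrative only."""
        root = ET.Element("StageFileRequest")
        objects = ET.SubElement(root, "ObjectList")
        for oid in self.oids:
            ET.SubElement(objects, "OID").text = oid
        cf_list = ET.SubElement(root, "CFList")
        for name in self.cf_names:
            ET.SubElement(cf_list, "StandardName").text = name
        west, east, south, north = self.bbox
        ET.SubElement(root, "SpaceConstraint", west=str(west), east=str(east),
                      south=str(south), north=str(north))
        start, end = self.time_range
        ET.SubElement(root, "TimeConstraint", start=start, end=end)
        ET.SubElement(root, "TargetDirectory").text = self.target_dir
        ET.SubElement(root, "FileFormat").text = self.file_format
        return ET.tostring(root, encoding="unicode")


def to_local_names(cf_names, table):
    """Translate CF standard names to a provider's local variable names."""
    missing = [n for n in cf_names if n not in table]
    if missing:
        raise KeyError(f"no local variable for: {', '.join(missing)}")
    return [table[n] for n in cf_names]


# Hypothetical provider-side lookup table; the local names are made up.
CF_TO_LOCAL = {"air_temperature": "temp2", "precipitation_flux": "precip"}

request = StageFileRequest(
    oids=["oid-1", "oid-2"],
    cf_names=["air_temperature"],
    bbox=(-10.0, 40.0, 35.0, 70.0),
    time_range=("1995-01-01", "1995-12-31"),
    target_dir="/workspace/job-42",
)
xml_body = request.to_xml()
local_vars = to_local_names(request.cf_names, CF_TO_LOCAL)
print(xml_body)
print(local_vars)
```

On the provider side, the space and time constraints would then decide which processing step (for example a subsetting call to a local tool) is needed before the file is staged to the target directory.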

  13. Summary
     • Grid development is application driven
     • Discovery is based on
       • ISO 19115/19139 based metadata catalog
       • Hierarchical, two-level metadata scheme
       • Text-based search in the catalog
     • Data access is implemented by
       • Proprietary C3Grid data access interface (webservice)
       • Part of the use metadata is provided along with the data extraction

  14. The end

  15. C3Grid Architecture (diagram; labels as they appear on the slide)
     • User; User Interface: API (Web Services), GUI; Monitoring; Job Submission; Search
     • Distributed Grid Infrastructure: GT4 based, new Metadata-Service
     • C3Grid Components: Workflow Scheduler, DMS (global), Matchmaking, DIS, Resource Information Service, Staging, Data Transfer Service, Harvesting
     • Task Execution Site: OAI / WS, File Management, DMS (local), Resource Scheduler, Archive Interface, Grid Workspace, Available Resources, DBMS/File
     • Data: Base Data & Meta Data, Pre-Proc Data, Job Meta Data
     • Distributed Data Archives; Distributed Processing Resources
