
Intelligent Distributed Data Management in Earth System Science

S. Kindermann (DKRZ, Germany), K. Ronneberger (DKRZ, Germany), T. Brücher (University of Cologne, Germany), H. Ramthun (M&D, Germany), M. Stockhause (MPI-Met, IFM-Geomar, Germany).


Presentation Transcript


  1. Intelligent Distributed Data Management in Earth System Science
     S. Kindermann, DKRZ, Germany
     K. Ronneberger, DKRZ, Germany
     T. Brücher, University of Cologne, Germany
     H. Ramthun, M&D, Germany
     M. Stockhause, MPI-Met, IFM-Geomar, Germany

  2. Structure
     • What is Earth System Science about?
     • Typical workflows
     • Traditional infrastructure
     • Why can grid technology help?
     • Limits of the current practice
     • How do we use this technology?
     • Conceptual outline of the developing infrastructure
     • Outline of the developed prototype
     • Potential impact and vision
     • Next steps and challenges
     EGEE User Forum '07, Manchester

  3. Motivation: Data in ESS
     • Observation data
     • Model output data
     • Analysis data
     • Scenario data
     All of these are data related to geo-referenced physical variables.

  4. ESS Data Management Nowadays
     User goal: "I want to correlate model data from DKRZ with observation data from DWD and satellite data from DLR."
     Data sources: observation data, model data, scenario data.
     1 Find & Select
       • Contact each data provider
       • Learn their data search utilities
       • Find and select data
     2 Collect & Prepare (produces the analysis dataset)
       • Get access rights for datasets at each data provider
       • Learn their data access / preprocessing services
       • Get access to sufficient storage facilities
       • Trigger preprocessing and download data
     3 Analyse (produces the result dataset)
       • At central service provider: start analysis tools, produce undocumented data
       • Open questions: "Has somebody already done something similar to what I want? Can I reuse data for …?"
     4 Visualize
       • Copy to local resources
       • Create visualization

  5. Bridging C3Grid and EGEE
     C3Grid:
       • Standardized metadata description
       • Uniform discovery of German data providers
       • Uniform data access
       • Grid-based data delivery
     Data providers: World Data Centers, DKRZ, DWD, AWI, GKSS, …
     Workflow (via C3Grid middleware): 1 Find & Select, 2 Collect & Prepare (analysis dataset), 3 Analyse (result dataset), 4 Visualize
     EGEE:
       • Established international collaboration platform
       • Secure data management
       • Hence a data analysis and data sharing platform
     Key component: (ISO) metadata catalog for ESS data in EGEE

  6. C3Grid and EGEE: the components
     • Central web portal: unique entry point to the common central metadata catalogue (Lucene index) and access facility
     • Standardized metadata: hierarchical description of discovery aspects and some use aspects of the data (ISO 19115 / ISO 19139)
     • Standardized data request interface: hides the complexity of specific data access mechanisms and pre-processing functionality (web service technology)
     • Automatic update and republishing of metadata: metadata of data processing is logged, managed and can be harvested (AMGA + Java extension, OAI-PMH server)
     Workflow phases: find & select, collect & prepare, analyse, visualize
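The harvesting component named on this slide can be illustrated with a minimal sketch of an OAI-PMH `ListRecords` exchange. The verb, the `metadataPrefix` and `from` arguments, and the response namespace come from the OAI-PMH 2.0 protocol; the base URL, the prefix name `iso19139`, and the sample identifiers are assumptions made for illustration and are not taken from the presentation. No network access is performed: the response is a hard-coded sample.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix, from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # selective harvesting by datestamp
    return base_url + "?" + urlencode(params)


def parse_record_headers(response_xml):
    """Return (identifier, datestamp) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    headers = root.findall("./oai:ListRecords/oai:record/oai:header", OAI_NS)
    return [
        (h.findtext("oai:identifier", namespaces=OAI_NS),
         h.findtext("oai:datestamp", namespaces=OAI_NS))
        for h in headers
    ]


# Hypothetical response: identifiers and dates are invented for this sketch.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:dataset-1</identifier>
        <datestamp>2007-05-01</datestamp>
      </header>
    </record>
    <record>
      <header>
        <identifier>oai:example.org:dataset-2</identifier>
        <datestamp>2007-05-02</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("http://example.org/oai", "iso19139")
harvested = parse_record_headers(SAMPLE_RESPONSE)
```

A harvester built along these lines would pass the datestamp of its last run as `from_date` so that only new or changed records are fetched.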

  7. (1) EGEE and C3Grid: Discovery
     Data providers: WDC Climate, WDC RSAT, WDC Mare, DWD, AWI, PIK, IFM-Geomar, MPI-Met, GKSS
     Diagram components: data resource, metadata, C3Grid data interface, climate data workspace, OAI-PMH servers, web service interfaces, Lucene index, C3 web portal, AMGA metadata catalog, LFC catalog, UI (user interface), SE (storage element), CE (computing element), WN (worker nodes)
     Labelled steps:
       • Publish (ISO 19115/19139)
       • (b) Harvest (OAI-PMH)
       • (f) Publish (ISO 19115/19139)
       • (g) Harvest (OAI-PMH)
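The discovery step on this slide amounts to searching one central index over metadata harvested from both infrastructures. The sketch below shows the idea with a plain Python inverted index; it is a stand-in and does not use Lucene, and the record ids, titles and keywords are hypothetical.

```python
from collections import defaultdict


def build_index(records):
    """Map each lower-cased word in title and keywords to the set of record ids."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        text = rec["title"] + " " + " ".join(rec["keywords"])
        for token in text.lower().split():
            index[token].add(rec_id)
    return index


def search(index, query):
    """Return sorted ids of records that contain every query word (AND search)."""
    tokens = query.lower().split()
    if not tokens:
        return []
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return sorted(hits)


# Hypothetical harvested records from both sides of the bridge.
RECORDS = {
    "c3grid-1": {"title": "Model output specific humidity", "keywords": ["climate", "model"]},
    "c3grid-2": {"title": "Observed precipitation", "keywords": ["observation", "climate"]},
    "egee-1": {"title": "Humidity flux analysis result", "keywords": ["analysis"]},
}

index = build_index(RECORDS)
print(search(index, "humidity"))          # records from both infrastructures
print(search(index, "humidity climate"))  # narrower AND query
```

Because analysis results published from EGEE are harvested into the same index, a single query can return provider datasets and derived results together.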

  8. (1) EGEE and C3Grid: Data Discovery

  9. (2) EGEE and C3Grid: Data Upload

  10. (2) EGEE and C3Grid: Data Upload
     Workflow phases covered: (1) Find & Select, (2) Collect & Prepare
     Diagram components: data resource, metadata, C3Grid data interface, climate data workspace, OAI-PMH servers, web service interfaces, Lucene index, C3 web portal, AMGA metadata catalog, LFC catalog, UI, SE, CE, WN
     Labelled steps:
       • (a) Request (web service)
       • (b) Retrieve (JDBC or archive)
       • (c) Stage & provide
       • (d) Notify
       • (e) Request (web service)
       • (f) Transfer & register (lcg-tools)
       • (f) Publish (ISO 19115/19139)
       • (g) Register (Java API)
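The upload steps above end with two registrations: the file is registered in the LFC file catalog and its metadata in the AMGA catalog. The sketch below mimics only that bookkeeping with in-memory dictionaries. It does not call web services, lcg-tools or any Java API; the function name, the `lfn:/grid/esr/` prefix, the storage host and the dataset are all hypothetical.

```python
def upload_dataset(provider_archive, dataset_id, lfc_catalog, amga_catalog, storage_element):
    """Simulate retrieve, transfer and register for one dataset."""
    # (b) Retrieve the dataset from the provider's database or archive.
    entry = provider_archive[dataset_id]

    # (f) Transfer to a storage element and register the replica under a
    # logical file name (hypothetical naming scheme).
    lfn = "lfn:/grid/esr/" + dataset_id
    replica = "srm://" + storage_element + "/" + dataset_id
    lfc_catalog.setdefault(lfn, []).append(replica)

    # (g) Register a copy of the descriptive metadata under the same name.
    amga_catalog[lfn] = dict(entry["metadata"])
    return lfn


# Hypothetical provider archive with a single dataset.
provider_archive = {
    "ds-001": {"payload": b"binary data", "metadata": {"title": "Example model output"}},
}
lfc_catalog, amga_catalog = {}, {}
lfn = upload_dataset(provider_archive, "ds-001", lfc_catalog, amga_catalog, "se.example.org")
```

Keying both catalogs by the same logical file name is the design point: a metadata hit can then be resolved to a physical replica without contacting the original data provider again.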

  11. (3) EGEE and C3Grid: Data Analysis

  12. (3) EGEE and C3Grid: Data Analysis
     Workflow phases covered: (3) Analyse, (4) Visualize
     Diagram components: data resource, metadata, C3Grid data interface, climate data workspace, OAI-PMH servers, web service interfaces, Lucene index, C3 web portal, AMGA metadata catalog, LFC catalog, UI, SE, CE, WN (running qflux)
     Labelled steps:
       • (a) Request (web service)
       • (b) Submit (gLite)
       • (c) Retrieve (lcg-tools)
       • (d) Update (Java API)
       • (e) Return graphic
       • (f) Publish (ISO 19115/19139)
       • (g) Harvest (OAI-PMH)
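Step (b) submits the analysis as a gLite job, which is described in a JDL (Job Description Language) file. The sketch below generates such a description as text. The attribute names (`Executable`, `Arguments`, `StdOutput`, `StdError`, `InputSandbox`, `OutputSandbox`) are standard JDL attributes; the script name `qflux.sh` and the input file name are assumptions, since the slides do not show the actual job description.

```python
def make_jdl(executable, arguments, sandbox_files):
    """Return a minimal JDL job description as a string."""
    def quoted_list(items):
        return "{" + ", ".join('"%s"' % item for item in items) + "}"

    lines = [
        'Executable = "%s";' % executable,
        'Arguments = "%s";' % arguments,
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        "InputSandbox = %s;" % quoted_list(sandbox_files),
        'OutputSandbox = {"std.out", "std.err"};',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical wrapper script and logical file name.
jdl = make_jdl("qflux.sh", "lfn:/grid/esr/example-input", ["qflux.sh"])
print(jdl)
```

Passing the logical file name as an argument matches step (c), where the job itself retrieves its input from the storage element.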

  13. (3) Example Workflow
     • Example: humidity flux (QFLUX)
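The slide names the humidity flux workflow without defining the computation. As a rough orientation only, a common definition of the vertically integrated moisture flux is (1/g) times the integral of q·u over pressure, with specific humidity q, a horizontal wind component u and gravitational acceleration g. The sketch below evaluates that integral for one column with the trapezoidal rule; it is an assumption about what such a workflow computes, not the QFLUX implementation used in the prototype, and the profile values are invented.

```python
G = 9.81  # gravitational acceleration in m s^-2


def integrated_moisture_flux(q, wind, pressure):
    """Vertically integrated moisture flux for one column, in kg m^-1 s^-1.

    q: specific humidity per level (kg/kg)
    wind: one horizontal wind component per level (m/s)
    pressure: pressure per level (Pa), monotonic
    """
    if not (len(q) == len(wind) == len(pressure)):
        raise ValueError("q, wind and pressure must have the same length")
    total = 0.0
    for k in range(len(pressure) - 1):
        lower = q[k] * wind[k]
        upper = q[k + 1] * wind[k + 1]
        dp = abs(pressure[k + 1] - pressure[k])
        total += 0.5 * (lower + upper) * dp  # trapezoidal rule per layer
    return total / G


# Invented three-level profile: constant humidity and wind.
flux = integrated_moisture_flux(
    q=[0.01, 0.01, 0.01],
    wind=[10.0, 10.0, 10.0],
    pressure=[100000.0, 50000.0, 0.0],
)
```

Applied to every grid column and both wind components, a calculation of this kind yields a two-dimensional flux field that can be plotted in the visualization step.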

  14. Approach in international context

  15. Potential Impact
     • Potential impact on the EGEE ESR community: provide a framework to exchange and manage ESR data and tools easily and consistently between EGEE and traditional earth science data storage systems.
     • Potential impact on the international ESR community: the approach is based on international standards (ISO 19139, OAI-PMH) and uniform interfaces (web services), so other data centers and infrastructures can be integrated in the same way.
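To make the reliance on ISO 19139 concrete, the sketch below builds a very small metadata fragment in the ISO 19139 XML encoding. The `gmd` and `gco` namespaces and the element names are those of the standard; the fragment is deliberately incomplete (a schema-valid record needs further mandatory elements such as contact and date stamp), and the identifier and title are hypothetical.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def _char_string(parent, tag, text):
    """Append <gmd:tag><gco:CharacterString>text</gco:CharacterString></gmd:tag>."""
    elem = ET.SubElement(parent, "{%s}%s" % (GMD, tag))
    value = ET.SubElement(elem, "{%s}CharacterString" % GCO)
    value.text = text
    return elem


def minimal_record(file_identifier, title):
    """Return an incomplete ISO 19139-style record as an XML string."""
    root = ET.Element("{%s}MD_Metadata" % GMD)
    _char_string(root, "fileIdentifier", file_identifier)
    ident = ET.SubElement(root, "{%s}identificationInfo" % GMD)
    data_id = ET.SubElement(ident, "{%s}MD_DataIdentification" % GMD)
    citation = ET.SubElement(data_id, "{%s}citation" % GMD)
    ci_citation = ET.SubElement(citation, "{%s}CI_Citation" % GMD)
    _char_string(ci_citation, "title", title)
    return ET.tostring(root, encoding="unicode")


# Hypothetical identifier and title.
record_xml = minimal_record("example-id-001", "Example humidity flux result")
print(record_xml)
```

A record in this encoding can be carried as the metadata payload of an OAI-PMH response, which is how the two standards named on the slide fit together.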

  16. Next steps
     • Expand the demonstrated prototype into a reliable and stable system
     • Port further workflows and some pre-processing functionality to EGEE
     • Enlarge the user community

  17. Future challenges or missing bricks
     • Comprehensive and consistent security context to control access to (restricted) data with a single sign-on
       - Approach: federated AA infrastructure based on Shibboleth
     • Analysis-service descriptions to improve the possibilities for discovery, use and sharing
       - Approach: adapt ISO 19119 / ISO 19139 as a common metadata format for analysis-tool description
     • Modularized workflows to increase flexibility and enable intelligent scheduling
       - Approach: implement a workflow information service

  18. Thank You
     kindermann @ dkrz.de
