
A PROPOSED EARTH SCIENCE COLLABORATORY

K-S Kuo (1,2), Chris Lynnes (1), Rahul Ramachandran (3). (1) NASA Goddard Space Flight Center, USA; (2) Caelum Research Corporation, USA; (3) University of Alabama-Huntsville, USA.


Presentation Transcript


  1. A PROPOSED EARTH SCIENCE COLLABORATORY
     K-S Kuo (1,2), Chris Lynnes (1), Rahul Ramachandran (3)
     (1) NASA Goddard Space Flight Center, USA
     (2) Caelum Research Corporation, USA
     (3) University of Alabama-Huntsville, USA
     IGARSS 2011, Vancouver, Canada

  2. Why ESC?
     • Data-Intensive Science
       • Many forms and sources of data
         • In situ measurements
         • Remote sensing observations
         • Model simulations
       • Large volumes of data
     • Effectiveness as a scientist
       • Increasing proportion of effort in data management
       • Threatening:
         • Reproducibility
         • Correctness
         • Productivity

  3. What is an ESC?
     Vision of a rich model development/simulation and data analysis environment that:
     • Provides access to various Earth Science models
     • Facilitates model and analysis software development
     • Provides access across a wide spectrum of Earth Science data
     • Provides a diverse set of science analysis services and tools
     • Supports the application of services and tools to data
     • Supports collaboration, i.e., sharing of data, tools and results
     • Supports discovery and publication of all science artifacts
     Basically, a new and natural place for Earth scientists to conduct their work and collaborate with others!

  4. The Situation Today: Islands of data and services with selective connectivity
     [Diagram: Data Center A, Data Center B, Data Center C]

  5. High-Level View
     [Architecture diagram; component labels: Cyberinfrastructure, Laboratory Notebook, Workflow, Mediator, Tool Library, Data Library, Data Centers]

  6. Tool Library
     • Discovery
     • Social: Sharing, Tagging, Discussion
     • Configuration Management: Testing, Versioning
     • Packager: autoconf, RPM, Web wrapper
     • Provisioned: GrADS, IDL, MATLAB, ncl, nco, cdat
     • Contributed: [Tool 1], [Tool 2], [Tool 3], [Tool 4], [Tool 5], …
     • Community: Quality filter, Coincidence, Feature detection, Event service, Visualization
     • Personal: [Tool 1], [Tool 2], [Tool 3], [Tool 4], [Tool 5], …

  7. Data Library
     • Cache
     • Discovery
     • Social: Sharing, Tagging, Discussion
     • Configuration Management: Testing, Versioning
     • Packager: data probe, format check, metadata wizard
     • Provisioned: EOSDIS
     • Contributed: [Dataset 1], [Dataset 2], [Dataset 3], [Dataset 4], [Dataset 5], …
     • Community: Field campaigns, MEaSUREs, ACCESS, Validation
     • Personal: [Dataset 1], [Dataset 2], [Dataset 3], …

  8. Workflow Library
     • Discovery
     • Social: Sharing, Tagging, Discussion
     • Configuration Management: Testing, Versioning
     • Packager: Workflow editor
     • Provisioned: Processing Algorithms
     • Contributed: [Workflow 1], [Workflow 2], [Workflow 3], [Workflow 4], [Workflow 5], …
     • Community: GeoBrain, SciFlo, Data Mining, Giovanni
     • Personal: [Workflow 1], [Workflow 2], [Workflow 3], …

  9. Laboratory Notebook
     • Discovery
     • Social: Sharing, Tagging, Discussion
     • Configuration Management: Versioning
     • Packager: Project manager, Experiment manager, Notebook editor
     • Provisioned: Tutorials, User guides, Example uses, Educational packages
     • Project: [Project 1], [Project 2], [Project 3], [Project 4], [Project 5], …
     • Community: Project results, Publications, Example cases, Educational packages
     • Personal: Notes, Journals, …

  10. Mediator
     • Mediates tool interaction with data
     • OPeNDAP – a common data model (accessible by most tools)
     • Custom modules reformat data for the rest of the tools
     • Ontology matches tools with data, and vice versa
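The mediation logic on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not part of the proposal: the `Dataset` and `Tool` records, the `REFORMATTERS` registry, and the format names are hypothetical, and a plain set comparison of variable names stands in for a real ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    data_format: str        # illustrative format label
    variables: frozenset    # variable names the dataset provides


@dataclass(frozen=True)
class Tool:
    name: str
    reads_opendap: bool            # can the tool consume the common data model?
    native_formats: frozenset      # formats the tool reads without help
    required_variables: frozenset  # variables the tool needs as input


# Hypothetical registry of custom reformatting modules, as (source, target) pairs.
REFORMATTERS = {("HDF-EOS", "netCDF")}


def access_path(tool, dataset):
    """Return how a mediator could hand `dataset` to `tool`, or None if it cannot."""
    # Stand-in for the ontology: the dataset must provide every required variable.
    if not tool.required_variables <= dataset.variables:
        return None
    if tool.reads_opendap:
        return "opendap"
    if dataset.data_format in tool.native_formats:
        return "direct"
    # Fall back to a custom reformatting module, if one is registered.
    for target in sorted(tool.native_formats):
        if (dataset.data_format, target) in REFORMATTERS:
            return f"reformat:{dataset.data_format}->{target}"
    return None


def match(tools, datasets):
    """Map every usable (tool name, dataset name) pair to its access path."""
    pairs = {}
    for tool in tools:
        for dataset in datasets:
            path = access_path(tool, dataset)
            if path is not None:
                pairs[(tool.name, dataset.name)] = path
    return pairs
```

Because `match` walks every tool against every dataset, the same table answers both "which data can this tool use?" and "which tools can use this data?", which is the "and vice versa" on the slide.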

  11. Cyberinfrastructure: Services used by all other components
     • Security: authentication, authorization, code audit/padded cell, integrity checking
     • Social: tagging, sharing, discussions, groups
     • Cloud: elastic provisioned storage and computing
     • Discovery
       • data, tools, workflows, experiments
       • search by keyword, variable, time, author
     • Information Management: provenance, identifiers, archive
     • Semantic Web: data ontology, tools ontology
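As an illustration of the discovery service listed above, here is a minimal Python sketch of one search over all artifact kinds, filtered by keyword, variable, time, and author. The `Artifact` record, its field names, and the `search` function are hypothetical and exist only for this example; the slides do not specify a catalog schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Artifact:
    kind: str      # "data", "tool", "workflow", or "experiment"
    title: str
    author: str
    variables: frozenset = frozenset()
    start: Optional[date] = None   # temporal coverage, if any
    end: Optional[date] = None


def search(catalog, keyword=None, variable=None, when=None, author=None):
    """Return the artifacts that satisfy every filter that was supplied."""
    results = []
    for item in catalog:
        if keyword is not None and keyword.lower() not in item.title.lower():
            continue
        if variable is not None and variable not in item.variables:
            continue
        if author is not None and item.author != author:
            continue
        if when is not None:
            # Artifacts without temporal coverage cannot match a time query.
            if item.start is None or item.end is None:
                continue
            if not (item.start <= when <= item.end):
                continue
        results.append(item)
    return results
```

Keeping all four artifact kinds in one catalog is a design choice of this sketch: a single query can then return a dataset together with the tools and workflows that mention the same variable.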

  12. Key Advantages of ESC
     • Tool availability will be a force multiplier
       • More tools will be usable with more datasets
       • More tools will be more available to more users
     • Knowledge sharing evolves from text on paper to a rich mixture of data, tools, workflows and articles
       • A “wikihow” for Earth Science data analysis
       • Incorporating live data, services and workflows
     • ESC maintains a record of the analysis process
       • Share, repeat, build upon analysis techniques
       • Transparency of the process is built in
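One way to picture "a record of the analysis process" is a log in which each step names its tool, version, parameters, and inputs. The Python sketch below is an assumption-laden illustration only: the `Notebook` class, the `step_id` function, and the hashing scheme are hypothetical and are not described in the slides.

```python
import hashlib
import json


def step_id(tool, version, params, input_ids):
    """Derive a deterministic identifier from everything that defines a step."""
    payload = json.dumps(
        {"tool": tool, "version": version, "params": params,
         "inputs": sorted(input_ids)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


class Notebook:
    """Hypothetical record of analysis steps, kept in the order they were run."""

    def __init__(self):
        self.steps = []

    def record(self, tool, version, params, input_ids):
        sid = step_id(tool, version, params, input_ids)
        self.steps.append({"id": sid, "tool": tool, "version": version,
                           "params": params, "inputs": list(input_ids)})
        return sid

    def lineage(self, sid):
        """Return `sid` followed by the ids of all recorded steps it depends on."""
        by_id = {step["id"]: step for step in self.steps}
        seen, stack = [], [sid]
        while stack:
            current = stack.pop()
            # Inputs that are raw dataset ids are not recorded steps; skip them.
            if current in by_id and current not in seen:
                seen.append(current)
                stack.extend(by_id[current]["inputs"])
        return seen
```

Since the identifier depends only on the tool, version, parameters, and inputs, repeating a step with the same settings yields the same identifier, which is what makes a shared record something others can repeat and build upon.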

  13. Prior Art
     • Talkoot, myExperiment.org – workflow sharing, virtual notebooks
     • Earth System Grid – provisioned tools, format standards/checkers
     • NASA Earth Exchange (NEX)
     • Land Information System – OPeNDAP as access infrastructure
     • Earth System Modeling Framework – programmatic approach to integration
     • Giovanni, LAS – community services/tools
     • Canadian Space Science Data Portal (EOS, Feb. 22, 2011)
     • Nebula – cloud provisioning

  14. A Use Case: GPM Precipitation Retrieval Algorithm Development
     • GPM Core Satellite: Dual-Frequency Precipitation Radar (JAXA) and GPM Microwave Imager (NASA)
     • GPM Constellation: International partner satellites with mostly microwave radiometers
     • Retrieval algorithms – 3 types
       • Radar-only
       • Radiometer-only
       • Radar-radiometer-combined
     • Participants in algorithm development are distributed across Japan, NASA centers (GSFC, MSFC, JPL), NCAR, and universities (FSU, UWisc, etc.)

  15. A Use Case: GPM Algorithm Development – Current Situation
     • Interdependence among 3 types of algorithms
     • Communication/Coordination – Narrow bandwidth
       • Periodic workshop meetings and teleconferences
     • Data access – Duplicative
       • Each location/group has a copy or subset of required data
     • Sharing of data/tools – Individual, not concerted
       • Through ftp/email
     • Knowledge sharing – Delayed

  16. A Use Case: GPM Algorithm Development – with ESC
     [Diagram; labels: Cloud, ESC, Community Catalog, and per-user ESC Client, VM Image, Tools, Data, and mySci Cat. for users A, B, …, Z]

  17. A Use Case: GPM Algorithm Development – Multi-level Membership
     [Diagram: members A through M; group labels: GPM, Combined Algorithm, Radar-Only, Radiometer-Only]

  18. A Use Case: GPM Algorithm Development – in ESC
     • Enhanced communication/coordination – wide bandwidth
     • Efficient data access – less duplication
     • Improved sharing – more pervasive
     • Effective knowledge sharing – immediate

  19. Thank you!

  20. Why now?
     • Because we can do it (finally)!
       • Advances in standards acceptance and implementation (OPeNDAP, autoconf)
       • A consistent, loosely coupled architecture encapsulates complexity and maximizes flexibility
       • Social networking has reached the mainstream
       • Key lessons can be learned from prior efforts
     • The need is growing
       • Interest in working with multiple datasets is growing
       • Calls for transparency and reproducibility are growing

  21. What’s New?
     • Macro View (forest-level)
       • Systematic approach to making data available to services and vice versa
       • Integration of all major analysis components
       • Consistent view of all architectural components
       • Cyberinfrastructure services for all architectural components
     • Micro View (tree-level): Nothing!

  22. How to move forward?
     • Option 1
       • RFC to community on feasibility, challenges, approach
       • Followed by RFPs for component and integration
     • Option 2
       • Narrow end-to-end prototype
       • Followed by refactoring and broadening
