
Towards Self-Describing Workflows for Climate Models

Kathy Saint (UCAR), Ufuk Utku Turuncoglu (ITU), Sylvia Murphy (NCAR), Cecelia DeLuca (NCAR)



Presentation Transcript


  1. Towards Self-Describing Workflows for Climate Models
     • Kathy Saint – UCAR
     • Ufuk Utku Turuncoglu – ITU
     • Sylvia Murphy – NCAR
     • Cecelia DeLuca – NCAR

  2. Outline
     • Motivation
     • Application
     • Implementation
     • Collecting Provenance
     • Future Steps
     • Analysis of Kepler

  3. Motivation
     • Problems in typical Earth System Modeling Application
       • Changing the science in complex Earth system models can involve numerous parameter changes that are hard to record and track
       • HPC is complex and involves many technologies each with its own learning curve
       • Reproducibility is becoming increasingly important
       • It is not easy to share information (configuration parameters, results, post-processing scripts)

  4. Motivation (cont.)
     • Approach:
       • The user can create a different case with only minor changes in the workflow
       • The workflow layer can hide the details of different technologies, such as the computing environment, the model, and post-processing tools
       • Users can query the collected, standardized provenance information to compare, debug, or reproduce results
       • Users can share information easily:
         • They can run the same case with different inputs and parameters
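The idea of querying provenance to compare runs can be illustrated with a minimal sketch. It assumes each run's configuration has been captured as a flat dictionary; the function name `diff_parameters` and all parameter names and values below are hypothetical and are not taken from the presentation or from any model.

```python
def diff_parameters(run_a: dict, run_b: dict) -> dict:
    """Return {parameter: (value_in_a, value_in_b)} for every parameter that differs.

    A parameter missing from one run is reported with None on that side.
    """
    keys = sorted(set(run_a) | set(run_b))
    return {
        key: (run_a.get(key), run_b.get(key))
        for key in keys
        if run_a.get(key) != run_b.get(key)
    }


# Hypothetical provenance records for two cases (illustrative values only).
case_1 = {"resolution": "coarse", "timestep_seconds": 1800, "compiler": "A"}
case_2 = {"resolution": "coarse", "timestep_seconds": 900, "postprocess": "plots"}

changes = diff_parameters(case_1, case_2)
for name, (old, new) in changes.items():
    print(f"{name}: {old!r} -> {new!r}")
```

A comparison like this is only as useful as the provenance is standardized: if two runs record the same setting under different keys, the diff reports it as two unrelated changes.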

  5. Components of Workflow Environment The workflow encapsulates the technical details of the compute platform and allows the user to focus on the science of the model.

  6. Conceptual Workflow The workflow includes uploading the source code; creating, building, and running the case; and collecting provenance data.
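The sequence of steps named on this slide can be sketched as a simple ordered pipeline. This is an illustrative stand-in, not the Kepler workflow from the presentation: every function name and state key here is hypothetical, and each step only records that it ran instead of touching a real compute platform.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]


def upload_source(state: State) -> State:
    state["source_uploaded"] = True  # placeholder for a real file transfer
    return state


def create_case(state: State) -> State:
    state["case_created"] = True
    return state


def build_case(state: State) -> State:
    state["case_built"] = True
    return state


def run_case(state: State) -> State:
    state["case_ran"] = True
    return state


def collect_provenance(state: State) -> State:
    # Snapshot which steps completed before provenance collection started.
    state["provenance"] = {"steps_completed": list(state["log"])}
    return state


# Ordered steps, mirroring the conceptual workflow on the slide.
STEPS: List[Tuple[str, Callable[[State], State]]] = [
    ("upload_source", upload_source),
    ("create_case", create_case),
    ("build_case", build_case),
    ("run_case", run_case),
    ("collect_provenance", collect_provenance),
]


def run_workflow(case_name: str) -> State:
    state: State = {"case": case_name, "log": []}
    for name, step in STEPS:
        state = step(state)
        state["log"].append(name)  # record the step only after it succeeds
    return state


final_state = run_workflow("test_case")
print(final_state["log"])
```

Keeping the steps in a plain list makes it easy to create a variant case by swapping or inserting a single step, which is the kind of minor change the motivation slides describe.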

  7. Implementation The implementation can be mapped back directly to the conceptual workflow.

  8. Collecting Provenance
     • Provenance is defined as structured information that keeps track of the origin and derivation of the workflow.
     • The basic types of provenance information:
       • System (system environment, OS, CPU architecture, compiler versions etc.)
       • Data (history or lineage of data, data flows, input and outputs, data transformations)
       • Process (statistics about workflow run, transferred data size, elapsed time etc.)
       • Workflow (version, modifications etc.)
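The four provenance types listed above could be captured in a single record, as in the minimal sketch below. The class `ProvenanceRecord`, the helper functions, and the field names are assumptions made for illustration; they do not reproduce the provenance framework used in the presentation, and the system section records only what the Python standard library can report.

```python
import json
import platform
import time
from dataclasses import asdict, dataclass, field


@dataclass
class ProvenanceRecord:
    """Hypothetical container with one section per provenance type."""
    system: dict = field(default_factory=dict)    # environment, OS, CPU architecture
    data: dict = field(default_factory=dict)      # inputs, outputs, lineage
    process: dict = field(default_factory=dict)   # run statistics such as elapsed time
    workflow: dict = field(default_factory=dict)  # version, modifications


def collect_system_provenance() -> dict:
    """Gather basic system information from the standard library."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "cpu_architecture": platform.machine(),
        "python_version": platform.python_version(),
    }


def run_with_provenance(step, inputs: dict, workflow_version: str):
    """Run one step and return (outputs, provenance record for that run)."""
    record = ProvenanceRecord(
        system=collect_system_provenance(),
        workflow={"version": workflow_version},
    )
    start = time.perf_counter()
    outputs = step(inputs)
    record.process = {"elapsed_seconds": time.perf_counter() - start}
    record.data = {"inputs": inputs, "outputs": outputs}
    return outputs, record


# Usage: a stand-in "model step" that just sums its input values.
outputs, record = run_with_provenance(
    lambda inp: {"total": sum(inp["values"])},
    {"values": [1, 2, 3]},
    workflow_version="0.1",
)
print(json.dumps(asdict(record), indent=2))
```

Serializing the record to JSON is one simple way to make it shareable and queryable; a real deployment would also need compiler versions and data lineage across steps, which this sketch does not attempt.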

  9. Collecting Provenance
     • CCSM is a multi-component model, which makes it complicated to collect provenance information.
     • pymake: provided by ORNL and NCSU [2,7,10]
     • tgwrapper.pl: uses the SoftEnv [9] and Modules [8] applications

  10. Future Steps
     • Integration with Web Services
       • Move logic from Kepler platform to Web Server platform
       • Simplifies client, so user doesn’t have to build a custom Kepler with custom actors
       • Takes advantage of existing actors for communicating with SOAP services
         • WebServices: for handling simple message types
         • WSWithComplexType: for handling complex message types
     • An extension of the ESMF Web Services
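To make the distinction between the workflow client and a SOAP service more concrete, the sketch below builds a SOAP 1.1 request envelope using only the Python standard library. It is not the ESMF Web Services interface and does not show how the Kepler actors work internally: the service namespace `urn:example:climate-workflow`, the operation `RunCase`, and the parameter `caseName` are all hypothetical, and nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:climate-workflow"                # hypothetical service namespace


def build_soap_request(operation: str, params: dict) -> str:
    """Build a SOAP 1.1 envelope whose body holds one operation element
    with one child element per simple (string-convertible) parameter."""
    ET.register_namespace("soap", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(op, f"{{{SERVICE_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameter, for illustration only.
request_xml = build_soap_request("RunCase", {"caseName": "test_case"})
print(request_xml)
```

A request like this carries only simple message types (flat string values). Nested or structured parameters would need a schema-aware encoding, which is the harder case the slide associates with the WSWithComplexType actor.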

  11. Future Steps An idea of what the new, simplified workflow will look like, utilizing web service actors.

  12. Analysis of Kepler
     • Pros
       • Ease of Use
       • Customization
     • Cons
       • WSWithComplexType limited & hard to debug
     • Suggestions
       • Better discussion boards (searchable)

  13. References
     [1] Altintas, I., Berkley, C., Jaeger, E., Jones, M., Ludäscher, B., Mock, S., 2004, Kepler: An Extensible System for Design and Execution of Scientific Workflows, 16th Intl. Conf. on Scientific and Statistical Database Management (SSDBM'04), 21-23 June 2004, Santorini Island, Greece.
     [2] Altintas, I., Chin, G., Crawl, D., Critchlow, T., Koop, D., Ligon, J., Ludaescher, B., Mouallem, P., Nagappan, M., Podhorszki, N., Silva, C., Vouk, M., 2007, Provenance in Kepler-based Scientific Workflow Systems. Microsoft e-Science Workshop, poster.
     [3] Barton, T., Basney, J., Freeman, T., Scavo, T., Siebenlist, F., Welch, V., Ananthakrishnan, R., Baker, B., Goode, M., and Keahey, K., 2006, Identity Federation and Attribute-based Authorization through the Globus Toolkit, Shibboleth, Gridshib, and MyProxy. 5th Annual PKI R&D Workshop, April 2006.
     [4] Catlett, C. et al., "TeraGrid: Analysis of Organization, System Architecture, and Middleware Enabling New Types of Applications," HPC and Grids in Action, Ed. Lucio Grandinetti, IOS Press 'Advances in Parallel Computing' series, Amsterdam, 2007.
     [5] Furlani, J. L., "Modules: Providing a Flexible User Environment", Proceedings of the Fifth Large Installation Systems Administration Conference (LISA V), pp. 141-152, San Diego, CA, September 30 - October 3, 1991.
     [6] Hill, C., C. DeLuca, V. Balaji, M. Suarez, and A. da Silva, 2004, Architecture of the Earth System Modeling Framework. Computing in Science and Engineering, Volume 6, Number 1.
     [7] Klasky, S., Barreto, R., Kahn, A., Parashar, M., Podhorszki, N., Parker, S., Silver, D., Vouk, M. A., "Collaborative visualization spaces for petascale simulations," International Symposium on Collaborative Technologies and Systems (CTS 2008), pp. 203-211, 19-23 May 2008.
     [8] Modules, http://modules.sourceforge.net/
     [9] SoftEnv, http://www.mcs.anl.gov/hs/software/systems/msys/
     [10] Vouk, M., Altintas, I., Klasky, S., Ludaescher, B., Silva, C., 2008, On SDM Provenance Framework, SDM Provenance White Paper, V3.
