
A Semantic Type System and Propagation Mechanism for Scientific Workflows





  1. A Semantic Type System and Propagation Mechanism for Scientific Workflows
Shawn Bowers (2) and Bertram Ludäscher (1,2,3)
(1) Dept. of Computer Science, (2) Genome Center, UC Davis; (3) San Diego Supercomputer Center, UC San Diego
www.kepler-project.org

GEON: Cyberinfrastructure for the Geosciences

[Figure: Workflow design space. Workflows W0, W1, W2, …, Wm, Wn are connected by transformations t, leading from Workflow Design to Workflow Implementation. Design-strategy dimensions: Top-Down, Bottom-Up, Task Driven, Data Driven, Structure Driven, Semantic Driven, Input Driven, Output Driven.]
[Figure panels: GEON Web Service Based Information Integration; Scientific Workflow with Semantic Query Annotations; (1) Actor-oriented SWF Modeling & Design; (1) Semantic Extensions for SWF Design]

• The Problem: Design and Reuse of Scientific Workflows and their Components
Scientific workflows are becoming increasingly important as a unifying mechanism for interlinking scientific data management, analysis, simulation, and visualization tasks. While current systems like Kepler permit the creation of executable workflows (e.g., from local components and web services), conceptual modeling and design of scientific workflows has been largely neglected so far. As a result, the design and reuse of (possibly thousands of legacy) components, actors, and workflows is difficult.

• Our Approach:
We have developed a formal model for scientific workflows based on an actor-oriented modeling and design approach, originally developed for studying models of complex concurrent systems. Actor-oriented modeling separates two modeling concerns: component communication (dataflow) and overall workflow coordination (orchestration). Our framework includes a novel hybrid type system that further separates the concerns of conventional data modeling (structural data type) and conceptual modeling (semantic type). In our design methodology, semantic and structural mismatches can be handled independently or simultaneously via different types of adapters, giving rise to new methods of workflow design.
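The hybrid type system described above can be illustrated with a minimal sketch. It is not the Kepler implementation: the `Port` class, the toy `SUBCLASS_OF` ontology, the concept names, and the `adapters_needed` function are all hypothetical, and structural compatibility is reduced to plain name equality. The point is only that the structural check and the semantic check run independently, so a connection may need a structural adapter, a semantic adapter, both, or neither.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy ontology: child concept -> parent concept.
SUBCLASS_OF = {
    "SpeciesAbundance": "Measurement",
    "Measurement": "Observation",
}


def is_subconcept(concept: Optional[str], ancestor: str) -> bool:
    """Return True if `concept` equals `ancestor` or lies below it in the toy ontology."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


@dataclass(frozen=True)
class Port:
    structural_type: str  # stand-in for a schema, e.g. "table" or "xml"
    semantic_type: str    # stand-in for an ontology concept


def adapters_needed(out_port: Port, in_port: Port) -> List[str]:
    """List which kinds of adapter a connection out_port -> in_port would require."""
    needed = []
    # Structural check (assumption: exact match on the schema name).
    if out_port.structural_type != in_port.structural_type:
        needed.append("structural")
    # Semantic check: the produced concept must be subsumed by the expected one.
    if not is_subconcept(out_port.semantic_type, in_port.semantic_type):
        needed.append("semantic")
    return needed


compatible = adapters_needed(Port("table", "SpeciesAbundance"), Port("table", "Observation"))
mismatched = adapters_needed(Port("xml", "Observation"), Port("table", "Measurement"))
```

Here `compatible` is empty because the schemas match and `SpeciesAbundance` sits below `Observation`, while `mismatched` names both adapter kinds. A real system would replace the equality test with schema subtyping and the dictionary walk with a reasoner over an OWL ontology.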
• The Benefits:
- Separation of modeling concerns: transport, structure, and semantics port types
- "Smart" discovery and linkage of components and data sets
- The workflow graph is an artifact that can be described, analyzed, and shared
- More independently reusable components
- Mix of design strategies: step-wise refinement, bottom-up, top-down, data-oriented, task-oriented, …
- Some costly semantic annotations can be automatically derived

• Annotation Propagation Problem: Given
- structural schemas S (input) and S' (output) and an ontology O,
- a semantic annotation α,
- a query annotation q,
Goal: compute α'

Specific Challenges in Scientific Workflow Design: How to support ...
(1) ... the scientific workflow design process in general?
(2) ... "smart" discovery of components (out of thousands ...)
(3) ... "smart" linking of data to components (data binding)
(4) ... "smart" linking of components to one another (service composition)
(5) ... overall orchestration semantics
(6) ... propagation of (semantic) type information

Approach: Separation of Concerns in SWF Modeling and Design:
Data ports have ...
- a transport type (move data via: object, reference, SRB, GridFTP, scp, ...)
- a structural type (XML DTD-ish)
- a semantic type (OWL-ish)
- a token consumption type (in/out rates of tokens per actor firing)

(1) Design methodology based on an abstract model of SWFs; allows mixes of top-down (stepwise refinement), bottom-up, data-driven, task-driven, structure-driven, semantics-driven ... design
(2) Concept/ontology-based actor discovery
(3) Semantic annotations of data and actors
(4) Use of both structural and semantic types to type-check desired connections and guide the choice of suitable pre-/post-actors; introduction of structural and/or semantic adapters ("shims"); basic idea: use logic constraints to express types
(5) Employ Ptolemy's Models of Computation/Directors: Process Network, SDF, ...
(6) Use query annotations of actors and a procedure similar to the "Chase" (resolution)

[Figure: Interplay between structural and semantic type information]

Future Plans:
Workflow engineers evolve workflows by applying design primitives (left), shown as transformations t; certain primitives can be grouped to form design strategies (right), where each design strategy is shown as a distinct dimension of a design space.

Kepler contributors include GEON, SEEK, SDM Center and Ptolemy II, supported by NSF ITRs 022567 (SEEK), EAR-0225673 (GEON), DOE DE-FC02-01ER25486 (SciDAC/SDM), and DARPA F33615-00-C-1703 (Ptolemy).
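Returning to the Annotation Propagation Problem stated above, the following is a deliberately simplified sketch and not the Chase-like procedure the poster refers to. It assumes that the semantic annotation α maps elements of the input schema S to ontology concepts, that the query annotation q only records which input elements each element of the output schema S' is copied or renamed from, and that α' is the resulting annotation on S'. The function name `propagate`, the dictionary encoding, and the example element and concept names are all hypothetical.

```python
from typing import Dict, Set


def propagate(alpha: Dict[str, Set[str]], q: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Compute alpha' for the output schema.

    alpha: input schema element  -> ontology concepts annotating it
    q:     output schema element -> input elements it is derived from
           (assumption: pure projection/renaming, no joins or computed values)
    """
    alpha_out: Dict[str, Set[str]] = {}
    for out_elem, sources in q.items():
        concepts: Set[str] = set()
        for src in sources:
            # Inherit every concept attached to a contributing input element.
            concepts |= alpha.get(src, set())
        if concepts:
            alpha_out[out_elem] = concepts
    return alpha_out


# Hypothetical example: the actor renames "site" to "place" and "count" to "n",
# and emits a fresh "id" that depends on no input element.
alpha = {"site": {"Location"}, "count": {"Abundance"}}
q = {"place": {"site"}, "n": {"count"}, "id": set()}
alpha_prime = propagate(alpha, q)
```

Under these assumptions `alpha_prime` annotates `place` with `Location` and `n` with `Abundance`, and leaves `id` unannotated. Handling queries that restructure data, or that interact with constraints in the ontology O, is where a resolution-style procedure would be needed instead of this direct lookup.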
