SDM Center’s Scientific Workflow Environment
http://sdm.lbl.gov/sdmcenter/

I. Altintas⁵, S. Bhagwanani⁴, D. Buttler³, S. Chandra⁴, Z. Cheng⁴, M. Coleman³, D. Colonnese⁴, T. Critchlow³, A. Gupta⁵, W. Han¹, L. Liu¹, B. Ludäscher⁵, C. Pu¹, R. Moore⁵, A. Shoshani², M. Vouk⁴

1. Georgia Institute of Technology
2. Lawrence Berkeley National Laboratory
3. Lawrence Livermore National Laboratory
4. North Carolina State University
5. San Diego Supercomputer Center, UCSD

A modeling and execution environment for scientific workflows

Scientific Workflow Example: Promoter Identification Workflow (simplified)
Gene workflow for understanding co-regulation
[Figure labels: AAV rules; Data Conversion Rules & Tools; Run or Resume the Model; SDM Center; Gene Sequence Processing]

“OLD APPROACH”: Custom-built Software

“NEW APPROACH”: Easy to Customize, Generic Approach to Scientific Process Automation
• Extends Ptolemy II for scientific workflows:
    • a library of bioinformatics modules
    • a generic WSDL module
    • user-driven workflow steering
    • local and web service-based tasks
    • a GUI for designing and executing workflows and for tracking workflow execution
    • an XML-based workflow description and exchange language (currently MoML)
    • Java-based open source software
    • easy packaging and installation on diverse platforms
• Planned extensions:
    • an extended suite of data transformation modules
    • Grid service-based tasks and data streaming
    • XML Schema and OWL-typed workflow signatures
    • data provenance

SYSTEM ARCHITECTURE COMPONENTS
[Diagram labels, in reading order]
• User
• Workflow Support Layer: Process Monitoring; Construction; Execution Model: Ptolemy II Process Network Domain
• SciDAC Extensions to Ptolemy II: Web Service adapter; Grid Service adapter
• Flows between components: AWF; Web service invocation; Valid-AWF, EWF; Grid service invocation / data streaming; Validation Errors
• Web Service; Workflow Planner & (Data) Converters; Grid Service
• Query Rewriting; Service Matching; Semantic Type Checking; Data Type Conversion; XWrap
• Data-intensive sources, sensors: Genbank, Blast, etc.
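The architecture lists semantic type checking and data type conversion among the services behind the workflow planner. As an illustration only, the following minimal sketch shows one way a planner could bridge a type mismatch between two connected tasks: a breadth-first search over a registry of converters, with a failure reported when no chain exists. The type names, the `CONVERTERS` registry, and the functions `find_conversion_chain` and `convert` are all hypothetical; none of them is taken from the SDM Center software, Ptolemy II, or XWrap.

```python
from collections import deque

# Hypothetical converter registry: (source_type, target_type) -> function.
# Type names and converter bodies are illustrative assumptions only.
CONVERTERS = {
    # Stand-in for a real sequence lookup; returns a fixed toy sequence.
    ("genbank_id", "fasta"): lambda v: f">{v}\nACGT",
    # Drop the FASTA header line and join the remaining sequence lines.
    ("fasta", "raw_sequence"): lambda v: "".join(v.splitlines()[1:]),
    ("raw_sequence", "length"): len,
}


def find_conversion_chain(source, target, converters=CONVERTERS):
    """Return the shortest list of (src, dst) steps, or None if unreachable."""
    if source == target:
        return []
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        current, path = queue.popleft()
        for (src, dst) in converters:
            if src != current or dst in seen:
                continue
            new_path = path + [(src, dst)]
            if dst == target:
                return new_path
            seen.add(dst)
            queue.append((dst, new_path))
    return None


def convert(value, source, target, converters=CONVERTERS):
    """Apply the converter chain; raise TypeError if no chain exists."""
    chain = find_conversion_chain(source, target, converters)
    if chain is None:
        # Loosely analogous to a validation error returned to the user.
        raise TypeError(f"no conversion from {source} to {target}")
    for step in chain:
        value = converters[step](value)
    return value
```

In this sketch, a request such as `convert("X1", "genbank_id", "length")` walks three converters in sequence, while a request with no path in the registry fails before anything runs, which is the behavior one would want from a validation step that precedes execution.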
Custom-built software (“old approach”):
• E.g., Perl-based ad hoc CGI application
• Difficult to reuse, change, optimize, and maintain
• Dependent on third-party software

Legend:
• ET: Executable task (service)
• AT: Abstract task (“mini workflow” of ETs; composition of ETs and ATs)
• AWF: Abstract workflow support
• EWF: Executable workflow support

Repositories:
• ET schemas
• Abstract Task (AT) Repository
• Data & Parameter Ontologies
• Datatype & Conversion Repository, including XWrap, plus future extensions
• Executable Task (ET) Repository, Registries, and Libraries (e.g., generic WSDL plugin), plus future extensions

(*) Planned items are shown in red.

This work was partially performed under the auspices of the U.S. Department of Energy by the University of California, Lawrence Livermore National Laboratory, under Contract W-7405-Eng-48 and under Contract No. DE-FC02-01ER25486 for SciDAC/SDM.