GRADD: Scientific Workflows
Scientific Workflow E. Science laboris
• Workflows are the new rock and roll of eScience
• Machinery for coordinating the execution of (scientific) services and linking together (scientific) resources
• Era of service-oriented apps (SOA)
• Repetitive, mundane tasks made easier (data cleaning...)
• Facilitates sharing of science
Trident Scientific Workflow Workbench
• Visually program workflows through a web browser
• Libraries of activities, workflows and services
• Social annotations and search
• Abstract parallelism for HPC and many-core (CCR)
• Adaptive workflows that detect and respond to events
• Automatic provenance capture, based on the Open Provenance Model
• Costing model; resources include time, power and data transfer
• Integrated data storage and access
• Integrated visualization tools
• Fault tolerance to facilitate smart reruns and what-if analysis
• Factory scheduling of workflows
Trident Implementation
Built on top of an industrial workflow engine, Windows Workflow Foundation
• Workflow in a general-purpose framework
• Part of Microsoft’s .NET Framework 3.5
Trident Logical Architecture
• Domain-specific custom activities
• Visual Workflow Designer
• Runtime Services
• Provenance
• Fault Tolerance
• HPC Scheduling Service
• Monitoring Service
• Registry
• Runtime Admin Tools
• Community Site
Activities: An Extensible Approach
[Diagram: layered activity stack. Layers: Domain-Specific Workflow Packages; Custom Activity Libraries; Base Activity Library; Out-of-Box Activities. Example domains: RosettaNet, Biology, CRM, Oceanography. Callouts: Extend activity, Compose activities, Read from Sensor.]
• Create/extend/compose activities
• Read from sensors
• Data pipelines, etc.
• First-class citizens
• OOB (out-of-box) activities, workflow types
• General-purpose
• Basic workflow constructs
• Domain-specific activities
• Domain-specific workflow packages, e.g. oceanography
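
The create/extend/compose idea above can be sketched in a few lines. This is only an illustration under stated assumptions: Trident activities are Windows Workflow Foundation activities in .NET, and Python is used here as a stand-in. Every name below (`Activity`, `Filter`, `DropMissing`, `Sequence`, the `-999.0` sentinel) is hypothetical and not part of Trident's API.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, List


class Activity(ABC):
    """Hypothetical base activity: one unit of work in a workflow."""

    @abstractmethod
    def run(self, data: Any) -> Any:
        ...


class ReadFromSensor(Activity):
    """Illustrative source activity; returns canned readings, not real sensor I/O."""

    def __init__(self, readings: List[float]) -> None:
        self.readings = readings

    def run(self, data: Any = None) -> List[float]:
        return list(self.readings)


class Filter(Activity):
    """General-purpose activity: keep the values that satisfy a predicate."""

    def __init__(self, keep: Callable[[float], bool]) -> None:
        self.keep = keep

    def run(self, data: List[float]) -> List[float]:
        return [x for x in data if self.keep(x)]


class DropMissing(Filter):
    """Extend: a domain-specific filter that drops an assumed sentinel value."""

    def __init__(self, sentinel: float = -999.0) -> None:
        super().__init__(lambda x: x != sentinel)


class Mean(Activity):
    """General-purpose activity: arithmetic mean of the incoming values."""

    def run(self, data: List[float]) -> float:
        return sum(data) / len(data)


class Sequence(Activity):
    """Compose: a chain of activities that is itself an activity."""

    def __init__(self, steps: List[Activity]) -> None:
        self.steps = steps

    def run(self, data: Any = None) -> Any:
        for step in self.steps:
            data = step.run(data)
        return data


pipeline = Sequence([ReadFromSensor([12.5, -999.0, 13.5]), DropMissing(), Mean()])
result = pipeline.run()
print(result)
```

Because `Sequence` is itself an `Activity`, a composed pipeline can be nested inside a larger workflow, which is one way to read "first-class citizens" on the slide.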
Trident Workflow Designer
Visually compose, search and archive (share)
Workflow Execution Provenance
For a workflow management system, provenance identifies which activities were executed, the parameters supplied at runtime, the data passed between activities, the intermediate results generated, etc.
• Explains how a workflow result was created, in enough detail to establish trust
• Provides a replication recipe
• Guides development of future experiments
Scientists routinely record the provenance of bench experiments in lab notebooks; this is essential for computational experiments as well.
Provenance in Trident
The enactment engine documents all steps linking the original inputs with the final result, so execution can be verified, reproduced or rerun. Provenance is a first-class data product in Trident.
• Provenance capture is automatic and transparent
• Will persist provenance data for a fixed period of time
• Supports multiple levels of representation
• Storage provided by the underlying system
• Interface to query and reason over provenance data
• Efficient storage representation and query performance
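
As a rough illustration of automatic provenance capture, the sketch below records, for each step, the activity name, the runtime parameters, the input and the output. It is a minimal Python stand-in, not Trident's provenance service: `ProvenanceRecord`, `ProvenanceLog`, `run_with_provenance` and the two toy activities are all hypothetical names, and the log is kept in memory rather than persisted.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ProvenanceRecord:
    """One executed step: what ran, with which parameters, on what data."""
    activity: str
    parameters: Dict[str, Any]
    input: Any
    output: Any


@dataclass
class ProvenanceLog:
    """In-memory log (an assumption; a real system would persist this)."""
    records: List[ProvenanceRecord] = field(default_factory=list)

    def lineage(self) -> List[str]:
        """Ordered activity names linking the original input to the final result."""
        return [r.activity for r in self.records]


Step = Tuple[str, Callable[..., Any], Dict[str, Any]]


def run_with_provenance(steps: List[Step], initial: Any, log: ProvenanceLog) -> Any:
    """Run steps in order; the caller's activities need no logging code of their own."""
    data = initial
    for name, func, params in steps:
        result = func(data, **params)
        log.records.append(ProvenanceRecord(name, dict(params), data, result))
        data = result
    return data


def scale(values: List[int], factor: int) -> List[int]:
    return [v * factor for v in values]


def total(values: List[int]) -> int:
    return sum(values)


log = ProvenanceLog()
final = run_with_provenance(
    [("scale", scale, {"factor": 2}), ("total", total, {})],
    [1, 2, 3],
    log,
)
print(final, log.lineage())
```

Replaying the recorded parameters and inputs in order gives the "replication recipe" described on the previous slide, and the intermediate outputs are available for inspection or partial reruns.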
Trident Registry
Applications and scientists need a curated registry of services
Just having a workflow system isn’t enough, and it’s not just about workflows...
Note: Registry, not repository. Services are hosted elsewhere.
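
The registry-versus-repository distinction can be made concrete with a small sketch: the registry stores only metadata and a pointer to where each service is hosted, never the service itself. This is a hedged Python illustration, not the Trident Registry's data model; `ServiceEntry`, `Registry`, the `curated` flag and the `example.org` endpoints are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ServiceEntry:
    """Metadata about a service; the service itself lives at `endpoint`."""
    name: str
    endpoint: str            # hypothetical URL of the externally hosted service
    tags: Tuple[str, ...]
    curated: bool = False    # assumed flag: has a curator vetted this entry?


class Registry:
    """Holds descriptions and locations of services, not their code or data."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceEntry] = {}

    def register(self, entry: ServiceEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, tag: str) -> List[ServiceEntry]:
        """Return curated entries carrying the tag, sorted by name."""
        hits = (e for e in self._entries.values() if e.curated and tag in e.tags)
        return sorted(hits, key=lambda e: e.name)


registry = Registry()
registry.register(ServiceEntry(
    "salinity-grid", "https://example.org/salinity", ("oceanography",), curated=True))
registry.register(ServiceEntry(
    "draft-tides", "https://example.org/tides", ("oceanography",)))
registry.register(ServiceEntry(
    "blast-search", "https://example.org/blast", ("biology",), curated=True))

found = registry.search("oceanography")
print([e.endpoint for e in found])
```

A workflow would look up an entry here and then call the service at its endpoint; removing or moving the service changes only the registry record.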