Environments COOperation
An Overview of Scientific Workflows: Domains & Applications
Presented by Khaled Gaaloul
Laboratoire Lorrain de Recherche en Informatique et ses Applications
Plan
• Context & Problem Statement
• State of the Art
• Work in Progress
• Conclusion & Perspectives
Context: Scientific applications
• Need for a workflow management system (WfMS) to orchestrate and optimize scientific endeavors
• Collecting, generating, and analyzing large data flows
• Need for mechanisms supporting interactions between heterogeneous applications
Context: Scientific applications integration
[Figure: Dynamic Scheduling of a Scientific Process. Elements shown: Labo.1 to Labo.4; Step1 to Step6; AND and XOR connectors. Labels: definition & specification of processes; data flow management; process orchestration.]
Requirements for scientific applications
• High degree of flexibility
• High-performance resource distribution
• Ad hoc workflow architecture: evolving and hierarchical
• Data flow management:
- Automating data streaming
- Enriching the semantic level
- Documentation & reusability
Problem statement: How to optimize and orchestrate the execution of scientific processes?
• Problems in managing shared resources: heterogeneous environments, virtual organizations (VO), etc.
• Evolving applications: non-determinism
• Current approaches: lack of reusability and documentation, business-process oriented
• Format evolution within data exchanges
Problem statement: New requirements
[Figure: Elements shown: Step1 to Step6; AND and XOR connectors; designers; sub process1, sub process2, sub process3. Annotations: to deal with heterogeneity; to deal with data exchange.]
Scientific Workflow GRID PBIO
Scientific workflow
• Definition: the application of workflow technology to scientific endeavors, recognized as a valuable approach for assisting scientists in accessing and analyzing data.
• Features:
- Support for large data flows
- Dynamic environment
- Incomplete workflow: partial definition
- Ad hoc planning
- Reusability, documentation, etc.
Scientific Workflow vs. Business Workflow
Scientific workflow
• Scientific domain: dedicated to data flow management
• More dynamic: workflow not predefined
• Traceability and documentation: enriching the semantic level within data exchanges
Business workflow
• Business domain: dedicated to process management and optimization
• Many constraints: predefined workflow, satisfying end, execution constraints, etc.
• Lack of formalism: syntactic level
GRID (Globalization of Informatics' Resources and Data)
• Solution for compute-intensive applications
• Virtual organization (VO)
- Including different user communities
- Sharing global resources (storage, processing)
- Strong impact on organizational structure, networks, security
GridFlow (1): GRID and Workflow?
• GRID complexity
- Virtual organization
- Need for visualization, management, and simulation
• WfMS as a Grid service
- Transparent access to one or more grids grouping heterogeneous machines
- Portals for users
GridFlow (2): Architecture
[Figure: GridFlow architecture]
PBIO: or how to deal with format evolution?
• Heterogeneous environment, ad hoc solutions
- Data exchanges and complex communication
- Format evolution: lack of standardization of data streaming
• PBIO (Portable Binary Input/Output)
- Approach for handling binary data in storage and transmission
- Record-oriented binary communication mechanism
- Data meta-representation
- Optimizing data storage/transmission
- Improving the communication between processes
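The idea of a record-oriented binary exchange that carries its own meta-representation can be illustrated with a small Python sketch. This is not the PBIO API: the header layout, the type codes, and every function name below are hypothetical and chosen only for illustration. The sender transmits the record layout ahead of the binary data, so a receiver written against an older format can still decode a message whose format has evolved.

```python
import struct

# Hypothetical type codes mapped to struct format characters (not PBIO's own).
TYPE_CODES = {"int32": "i", "float64": "d"}


def encode_format(fields):
    """Serialize the record layout (the meta-representation) as bytes."""
    text = ",".join(f"{name}:{ftype}" for name, ftype in fields)
    payload = text.encode("utf-8")
    return struct.pack("<H", len(payload)) + payload


def decode_format(buf):
    """Read a layout header; return (fields, number of bytes consumed)."""
    (length,) = struct.unpack_from("<H", buf, 0)
    text = buf[2:2 + length].decode("utf-8")
    fields = [tuple(item.split(":")) for item in text.split(",")]
    return fields, 2 + length


def encode_record(fields, values):
    """Pack one record as fixed-layout little-endian binary data."""
    fmt = "<" + "".join(TYPE_CODES[ftype] for _, ftype in fields)
    return struct.pack(fmt, *(values[name] for name, _ in fields))


def decode_record(fields, buf, offset=0):
    """Unpack one record using the layout the sender transmitted."""
    fmt = "<" + "".join(TYPE_CODES[ftype] for _, ftype in fields)
    raw = struct.unpack_from(fmt, buf, offset)
    return {name: value for (name, _), value in zip(fields, raw)}


def read_known(record, wanted, defaults):
    """Receiver-side view: keep the fields it knows, default the rest."""
    return {name: record.get(name, defaults[name]) for name in wanted}


# The sender uses a newer format with an extra field ("pressure").
sender_fields = [
    ("step", "int32"),
    ("temperature", "float64"),
    ("pressure", "float64"),
]
message = encode_format(sender_fields) + encode_record(
    sender_fields, {"step": 3, "temperature": 21.5, "pressure": 1.25}
)

# The receiver was written against an older format; it still decodes the message.
fields, consumed = decode_format(message)
record = decode_record(fields, message, consumed)
view = read_known(record, ["step", "temperature"], {"step": 0, "temperature": 0.0})
print(view)
```

In this sketch the layout is resent with every message for simplicity; a real system would more plausibly send it once per format and cache it on the receiving side.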
Cooperative processes for scientific workflows
• Cooperation between applications
- More flexible applications
- Working and communicating within the same virtual workspace
- Performing common tasks synchronously or asynchronously
• BONITA: a flexible system for cooperative workflow
- Defines, specifies, executes, and coordinates different flows of work
- Based on the anticipation model
- Provides an interface for modeling and visualizing processes
- Manages flexible data
Deploying the scenario into Bonita
• Enhanced execution flexibility
• Anticipation: process optimization
Mapping Data-Intensive Science into BONITA
• Considerable data flows
• Goal: optimize data streaming & enhance the data exchange mechanism
[Figure: Data flow computing]
Discussions
• Existing approach: Flow-Based Programming (FBP)
- A new/old approach to scientific application development
- Data flow vs. workflow: which one fits our needs?
- Is it possible to anticipate an activity with a partial result?
• PBIO implementation
- Interactivity with Bonita service calls
- Need for middleware such as Echo Event to support message exchange
- Portability of the PBIO approach to existing platforms
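The question of anticipating an activity with a partial result can be made concrete with a minimal data flow sketch in Python. It is an illustration only, not Bonita's anticipation mechanism and not a full FBP runtime: FBP components normally run concurrently and communicate through bounded buffers, whereas this sketch approximates that with single-threaded generators. The component names and the sample values are hypothetical.

```python
trace = []  # records the order in which events happen


def reader(samples):
    """Source component: emits information packets one at a time."""
    for sample in samples:
        trace.append(("read", sample))
        yield sample


def calibrate(packets, offset):
    """Transform component: corrects each packet as soon as it arrives."""
    for value in packets:
        yield value + offset


def running_mean(packets):
    """Aggregating component: publishes a partial result after every packet."""
    total = 0.0
    for count, value in enumerate(packets, start=1):
        total += value
        yield total / count


# Wire the components into a pipeline: reader -> calibrate -> running_mean.
pipeline = running_mean(calibrate(reader([10.0, 20.0, 30.0]), offset=1.0))

partial_results = []
for mean in pipeline:
    # A downstream activity could start here, before upstream has finished.
    trace.append(("partial", mean))
    partial_results.append(mean)

print(partial_results)
print(trace)
```

The trace interleaves "read" and "partial" events, which shows that a downstream step receives a usable intermediate value before the source has delivered all of its data. Whether such a partial value is meaningful enough to start an activity depends on the application.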
Conclusions
• Cooperative aspect for scientific applications
• Combining strong concepts (GRID & workflows)
• Developing a new middleware for scientific processes
Perspectives
• Application to the GRID: Bonita as a GRID service
• Adding non-intrusive and user-friendly aspects
• Collaboration with AURARYD on other scenarios (Volkswagen, BP)