
WS-VLAM: Towards a Scalable Workflow System on the Grid

V. Korkhov, D. Vasyunin, A. Wibisono, V. Guevara-Masis, A. Belloum (vkorkhov@science.uva.nl), Institute of Informatics, Faculty of Science, University of Amsterdam


Presentation Transcript


  1. WS-VLAM: Towards a Scalable Workflow System on the Grid
     V. Korkhov, D. Vasyunin, A. Wibisono, V. Guevara-Masis, A. Belloum
     vkorkhov@science.uva.nl
     Institute of Informatics, Faculty of Science, University of Amsterdam

  2. Outline
     • Introduction: what is WS-VLAM?
     • Architecture of the WS-VLAM
     • Large-scale workflow support:
       • Distributed workflow engine and multi-cluster execution support
       • Hierarchical resource management and workload balancing
       • Workflow farming
       • Semantic workflow support
     • Conclusions

  3. Introduction
     WS-VLAM (Virtual Lab AMsterdam) concepts:
     • Data-driven workflow system
     • Data streaming between workflow components running on the Grid
     • Components: input and output ports for data exchange; parameters for control (adjustable at runtime as well); graphical output (X11) supported
     • GUI and engine decoupled, interfaced using WS-RF
     Engine (RTS, Run Time System):
     • Implemented as a GT4 WS-RF service
     • Uses GT4 features (delegation service, GSI, notifications, etc.)

  4. WS-VLAM architecture

  5. Large-scale distributed workflow support
     • Multi-cluster distributed experiments: distributed workflow engine
     • Heterogeneous resources: workload balancing and resource management
     • Complex workflows with parameter sweeps and iterative processing: workflow farming
     • Semantic support

  6. Distributed workflow engine
     [Figure: architecture diagram. Labels: WS-VLAM GUI; EPR; Resource Manager; and, for each of Cluster 1 and Cluster 2: GT4 Service Container, WS-RTSM Factory, WS-RTSM Instance, GRAM, Distributed RTSM, GUI proxy, Data proxy, Worker nodes, Workflow components]

  7. Hierarchical resource management and workload balancing
     • Task level: adaptive workload balancing for parallel applications (MPI) on heterogeneous resources
     • Job level: inter-task workload distribution and balancing for multi-task applications (DIANE user-level scheduling environment)
     • Workflow level: workflow farming

  8. Workload balancing strategy (parallel and multi-task applications)
     • Divisible workload is distributed between tasks based on application characteristics (communication/computation ratio) and resource characteristics (CPU, memory, bandwidth)
     • Weights are assigned to all resources that execute tasks, according to their capacities
     • A fast heuristic algorithm gives an approximate weighting of the resources processing the workload
     • Iterative processing of similar data: execution performance is measured for each iteration, and the weights (and thus the workload distribution) are adapted on the fly
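The weighting loop on slide 8 can be illustrated with a minimal sketch. It assumes that a resource's weight is simply its measured throughput in the previous iteration; the slides do not give the actual heuristic, and the function names `split_workload` and `adapt_weights` are hypothetical.

```python
def split_workload(total_units, weights):
    """Split an integer, divisible workload in proportion to the weights."""
    total_weight = sum(weights)
    shares = [int(total_units * w / total_weight) for w in weights]
    # Units lost to rounding go to the heaviest-weighted resources first.
    leftover = total_units - sum(shares)
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def adapt_weights(weights, shares, elapsed_seconds):
    """Assumed rule: new weight = units processed per second in the last iteration.

    A resource that received no work keeps its previous weight.
    """
    return [
        units / t if units > 0 and t > 0 else old
        for old, units, t in zip(weights, shares, elapsed_seconds)
    ]


# Iteration 1: no measurements yet, so both resources start with equal weights.
weights = [1.0, 1.0]
shares = split_workload(100, weights)
# Suppose the first resource needed 10 s and the second 5 s for their shares.
weights = adapt_weights(weights, shares, [10.0, 5.0])
# Iteration 2: the faster resource now receives twice as much work.
next_shares = split_workload(90, weights)
```

Using measured throughput keeps the sketch independent of any particular model of CPU, memory, or bandwidth; a real weighting would also account for the communication/computation ratio mentioned on the slide.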

  9. Workflow farming: adaptive data distribution
     [Figure: a WF Distributor and an Estimator feed two farmed workflows during iterative processing of independent data or parameters. WF 1 is twice as slow and has weight W=1; WF 2 has weight W=2.]
     Each farmed workflow first gets a single data element to process so that its performance can be assessed. The processing speed is evaluated, and the future workload distribution is determined from this information. Weights reflecting the performance are assigned to the workflows.
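The probe-then-distribute idea on slide 9 can be sketched as follows. This is an illustration under stated assumptions, not the WS-VLAM implementation: weights are taken as speed relative to the slowest workflow, and `estimate_weights` and `distribute` are hypothetical names.

```python
def estimate_weights(probe_seconds):
    """Weight each farmed workflow by its speed relative to the slowest one.

    probe_seconds[i] is the time workflow i needed for its single probe element.
    """
    slowest = max(probe_seconds)
    return [slowest / t for t in probe_seconds]


def distribute(items, weights):
    """Give each item to the workflow whose weighted load would be smallest."""
    queues = [[] for _ in weights]
    for item in items:
        target = min(
            range(len(weights)),
            key=lambda i: (len(queues[i]) + 1) / weights[i],
        )
        queues[target].append(item)
    return queues


# WF 1 took 2 s for its probe element, WF 2 took 1 s: WF 1 is twice as slow.
weights = estimate_weights([2.0, 1.0])   # W=1 for WF 1, W=2 for WF 2
queues = distribute(range(6), weights)   # WF 2 ends up with twice as many items
```

With the figure's values, the six remaining elements split 2 to 4 between WF 1 and WF 2, matching the 1:2 weights.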

  10. Workflow farming: WF service
     [Figure: the GUI passes an XML topology to the RTSM Factory, which creates WS-RTSM 1, 2, and 3 running workflows WF 1, WF 2, and WF 3. The Resource Manager holds the list of WS-RTSM EPRs and the numbered data elements to farm (1 to 6), and receives performance data (Perf) from each workflow.]
     Workflows WF 1, 2, and 3 are running with a WS interface, ready to process data from the RM on demand.
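One way to read "on demand" on slide 10 is that the Resource Manager hands the next data element to whichever workflow becomes free first. The sketch below simulates that reading with fixed per-item processing times; the function `farm_on_demand` and the timing model are assumptions for illustration and do not describe the actual WS-VLAM protocol.

```python
import heapq


def farm_on_demand(items, seconds_per_item):
    """Simulate handing each item to the workflow that becomes free first.

    seconds_per_item[i] is the (assumed constant) time workflow i needs per item.
    Returns the per-workflow assignments and the simulated total time.
    """
    # Heap entries: (time at which the workflow becomes free, workflow index).
    free_at = [(0.0, i) for i in range(len(seconds_per_item))]
    heapq.heapify(free_at)
    assigned = [[] for _ in seconds_per_item]
    for item in items:
        now, wf = heapq.heappop(free_at)
        assigned[wf].append(item)
        heapq.heappush(free_at, (now + seconds_per_item[wf], wf))
    makespan = max(t for t, _ in free_at)
    return assigned, makespan


# Two workflows; the first needs 2 s per element, the second 1 s.
assigned, makespan = farm_on_demand(range(6), [2.0, 1.0])
```

In this simulation no weights are computed explicitly: the faster workflow simply asks for work more often, so the load balances itself.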

  11. Semantic workflow support

  12. Conclusions
     WS-VLAM features supporting large-scale, data-driven workflows:
     • Multi-cluster support for a single workflow, with data exchange between internal nodes of different clusters
     • Adaptive workload balancing for parallel applications (workflow components) on heterogeneous resources
     • Workload balancing at the workflow level: parameter/data sweeps over farmed workflows
     • Semantic support for workflow composition

  13. Links
     • http://www.vl-e.nl/
     • http://www.science.uva.nl/~gvlam/wsvlam
