
Pegasus on the Virtual Grid: A Case Study of Workflow Planning over Captive Resources

Pegasus on the Virtual Grid: A Case Study of Workflow Planning over Captive Resources. Yang-Suk Kee, Eun-Kyu Byun, Ewa Deelman, Karan Vahi, Jin-Soo Kim. Oracle US Inc; Korea Advanced Institute of Science and Technology; Information Sciences Institute/University of Southern California



Presentation Transcript


  1. Pegasus on the Virtual Grid: A Case Study of Workflow Planning over Captive Resources • Yang-Suk Kee, Eun-Kyu Byun, Ewa Deelman, Karan Vahi, Jin-Soo Kim • Oracle US Inc • Korea Advanced Institute of Science and Technology • Information Sciences Institute/University of Southern California • Sungkyunkwan University

  2. Overview • Motivation • Background • Pegasus • Virtual Grid • Pegasus-VG Proxy • Conclusion • Discussion

  3. Motivation • Challenges in scientific application development • Data/control flow, task scheduling, data replication, fault tolerance, etc. • Challenges in resource management • Availability, performance, cost, reliability, fault tolerance, etc. • How can existing cyberinfrastructures be leveraged for easy and efficient scientific computing?

  4. Separation of Concerns • Application domain • Workflow management: application management can be conducted independently of the target execution environment • E.g., Pegasus, Askalon, Triana • Resource domain • Resource provisioning: resource management can be encapsulated beneath abstractions or virtualization • E.g., Virtual Grid, virtual cluster, cloud

  5. Workflow planning and execution over provisioned resources

  6. Pegasus • A framework for workflow planning and execution • Workflow lifecycle • Design: describe the data/control flow of the application as an abstract workflow • Planning: map the workflow tasks onto physical resources • Execution: schedule and run the workflow tasks on the mapped resources

  7. Pegasus Workflow Management • [Figure labels: abstract workflow, Pegasus mapper, executable workflow, Condor DAGMan, Condor, tasks, monitoring information, provenance, computing environment (Condor pool)]

  8. Virtual Grid • A programmable virtualized resource provisioning framework • Components • vgDL (Virtual Grid Description Language) • Specifies resource requirements • vgES (Virtual Grid Execution System) • Compiles and coordinates resources • PC (Personal Cluster) • Provides uniform job management

  9. Virtual Grid Resource Abstraction • [Figure labels: application workflow with tasks A, B, C, D; vgDL request: vgdl = ClusterOf (node) [2] { node = [Processor == “P4”] }; classification, selection, binding, environment; VG lease; timeshare and batch (PBS) resources; two P4 nodes bound to the VG; program run]

  10. Pegasus on Virtual Grid • Scope • A basic integration for workflow planning and execution over provisioned resources • Issues • Resource capacity estimation • Resource specification (vgDL) synthesis for Virtual Grid • Resource information publication • Site catalog generation for Pegasus

  11. Resource Capacity Estimation • What Virtual Grid expects from Pegasus • vgDL description • Available information • Task execution time, data transfer time, performance metrics, minimum memory capacity, cost, deadline, etc. • Unknown information • Number of virtual processors • Resource capacity estimate • Find the minimum number of processors that can execute the workflow within the deadline

  12. BTS (Balanced Time Scheduling) • [Figure: example workflow with a table of task IDs and execution times (ET), and its schedule on processors p1 and p2 along a time axis] • How many processors do we need to run this workflow within 7 time units? • Ref: E.-K. Byun, Y.-S. Kee, et al., e-Science ’08
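The question on slide 12 can be explored with a short sketch. This is not the BTS algorithm from the cited paper; it is a plain greedy list scheduler combined with a linear search over processor counts, and it ignores data transfer time. The task names follow the diamond workflow of slide 13, while the execution times (in minutes) are made-up values for illustration only.

```python
import heapq
from typing import Dict, List, Optional


def makespan(exec_time: Dict[str, int], parents: Dict[str, List[str]], num_procs: int) -> int:
    """Finish time of the workflow under greedy list scheduling on num_procs processors."""
    remaining = {t: len(parents.get(t, [])) for t in exec_time}
    children: Dict[str, List[str]] = {t: [] for t in exec_time}
    for task, deps in parents.items():
        for dep in deps:
            children[dep].append(task)

    ready = sorted(t for t, n in remaining.items() if n == 0)
    running: list = []  # min-heap of (finish_time, task)
    now, free = 0, num_procs
    while ready or running:
        # Start ready tasks while processors are free.
        while ready and free > 0:
            task = ready.pop(0)
            heapq.heappush(running, (now + exec_time[task], task))
            free -= 1
        # Advance the clock to the next task completion.
        now, done = heapq.heappop(running)
        free += 1
        for child in children[done]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)
    return now


def min_processors(exec_time: Dict[str, int], parents: Dict[str, List[str]], deadline: int) -> Optional[int]:
    """Smallest processor count whose greedy schedule meets the deadline, or None."""
    for n in range(1, len(exec_time) + 1):
        if makespan(exec_time, parents, n) <= deadline:
            return n
    return None


# Hypothetical execution times (minutes) for the diamond workflow.
exec_time = {"preprocess": 10, "findrange_1": 20, "findrange_2": 20, "analyze": 10}
parents = {
    "findrange_1": ["preprocess"],
    "findrange_2": ["preprocess"],
    "analyze": ["findrange_1", "findrange_2"],
}
print(min_processors(exec_time, parents, deadline=45))
```

With these assumed times, one processor needs 60 minutes and two processors need 40, so a 45-minute deadline calls for two processors, and no processor count can meet a deadline shorter than the 40-minute critical path.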

  13. Example • Execution time of each task: on a Xeon processor • Data transfer time: over a network with 1 Gbps bandwidth • Deadline: 1 hour • Workflow (diamond): f.input → preprocess → findrange (×2) → analyze → f.output • vgDL: Diamond = ClusterOf [2] (nd) [, 0:30:00] { nd = [Processor == “Xeon”] }
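Once a processor count is known, synthesizing the vgDL request is mostly string formatting. The sketch below is an assumption-laden illustration: the function name `synthesize_vgdl` and its parameters are hypothetical, the output mimics the `ClusterOf (nd) [2] { … }` form shown on the slides rather than the full vgDL grammar, and the duration field that appears on slide 13 is left out because its exact syntax is not clear from the transcript.

```python
def synthesize_vgdl(name: str, num_nodes: int, processor: str, node_var: str = "nd") -> str:
    """Format a cluster request in the shape shown on the slides (not a full vgDL generator)."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return (
        f"{name} = ClusterOf ({node_var}) [{num_nodes}] "
        f'{{ {node_var} = [Processor == "{processor}"] }}'
    )


# Two Xeon nodes for the diamond workflow, as in the example slide.
request = synthesize_vgdl("Diamond", 2, "Xeon")
print(request)
```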

  14. Resource Information Publication • What Pegasus expects from Virtual Grid • Site catalog • Virtual Grid • VG instance • Resource information publication • Devirtualize a VG instance and generate a site catalog for Pegasus

  15. Virtual Grid Resource Abstraction • [Figure labels: application workflow with tasks A, B, C, D; vgDL request: vgdl = ClusterOf (node) [2] { node = [Processor == “P4”] }; classification, selection, binding, environment; VG lease; timeshare and batch (PBS) resources; two P4 nodes bound to the VG; program run]

  16. Personal Cluster • A partition of resources dedicated to a user, under the control of a user-level resource manager, for a limited time period • [Figure labels: GT4/PBS] • Ref: Y.-S. Kee and C. Kesselman, HCW ’08

  17. Site Catalog Publication
      <sitecatalog xmlns="http://pegasus.isi.edu/schema/sitecatalog" …>
        …
          <profile namespace="env" key="PEGASUS_HOME">/home/globus/pegasus-2.1.0</profile>
          <profile namespace="condor" key="grid_type">gt4</profile>
          <profile namespace="condor" key="jobmanager_type">PBS</profile>
          <lrc url="rlsn://cat7.kaist.ac.kr" />
          <gridftp url="gsiftp://cat7.kaist.ac.kr:2811" storage="/home/globus" major="4" minor="0" patch="7" />
          <jobmanager universe="transfer" url="https://cat7.kaist.ac.kr:9000/wsrf/services/ManagedJobFactoryService" major="4" minor="0" patch="7" total-nodes="2" />
          <jobmanager universe="vanilla" url="https://cat7.kaist.ac.kr:9000/wsrf/services/ManagedJobFactoryService" major="4" minor="0" patch="7" total-nodes="2" />
          <workdirectory>$HOME/workdir</workdirectory>
        </site>
        …
      </sitecatalog>
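A site entry like the one above could be generated from a devirtualized VG instance along these lines. This is only a sketch using Python's standard `xml.etree.ElementTree`: the function `build_site_entry` and its parameters are hypothetical, the element and attribute names, ports, and version numbers are copied from the slide, and the attributes of the `<site>` element are left to the caller because the slide elides them.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Globus Toolkit version fields as shown on the slide.
GT_VERSION = {"major": "4", "minor": "0", "patch": "7"}


def build_site_entry(
    host: str,
    total_nodes: int,
    jobmanager_type: str = "PBS",
    site_attrs: Optional[Dict[str, str]] = None,
) -> ET.Element:
    """Build a <site> element for one personal-cluster head node (illustrative only)."""
    # The slide does not show the <site> attributes, so none are assumed here.
    site = ET.Element("site", dict(site_attrs or {}))

    for namespace, key, value in [
        ("condor", "grid_type", "gt4"),
        ("condor", "jobmanager_type", jobmanager_type),
    ]:
        ET.SubElement(site, "profile", {"namespace": namespace, "key": key}).text = value

    ET.SubElement(
        site, "gridftp",
        {"url": f"gsiftp://{host}:2811", "storage": "/home/globus", **GT_VERSION},
    )

    factory = f"https://{host}:9000/wsrf/services/ManagedJobFactoryService"
    for universe in ("transfer", "vanilla"):
        ET.SubElement(
            site, "jobmanager",
            {"universe": universe, "url": factory, **GT_VERSION, "total-nodes": str(total_nodes)},
        )

    ET.SubElement(site, "workdirectory").text = "$HOME/workdir"
    return site


entry = build_site_entry("cat7.kaist.ac.kr", total_nodes=2)
print(ET.tostring(entry, encoding="unicode"))
```

The `total_nodes` argument is where the processor count obtained from the capacity estimate would be published back to Pegasus.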

  18. Workflow Planning over Provisioned Resources • [Figure: Pegasus, VG-Pegasus Proxy, and Virtual Grid] • Abstract workflow → BTS → vgDL creation: vgdl = ClusterOf (nd) [2] { nd = [Proc==“Xeon”] } → VG → devirtualization → site catalog → planning → executable workflow → scheduling/execution on GT4+PBS

  19. Conclusion • Pegasus on Virtual Grid • Implements workflow planning and execution over on-demand captive resources • Enables easy and efficient application development and execution • Issues • Resource capacity estimation • Site catalog publication

  20. Discussion • Effective performance • What is the cost that a user has to pay for a successful execution? • Ongoing studies • Fine-grained planning for resource provisioning • Performance, cost, reliability • Workflow execution for virtualization • Recovery of failed tasks

  21. Need More Information? • Pegasus • http://pegasus.isi.edu • VGrADS • Tuesday, 11:30am, RENCI booth (2633) • Wednesday, noon, GCAS booth (285) • Wednesday, 2:00pm, SDSC booth (568) • Wednesday, 4:00pm, RENCI booth (2633)

  22. Q & A • Questions and Answers
