
Pegasus: Running Large-Scale Workflows on the TeraGrid

Pegasus automates the mapping of abstract scientific workflows onto grid resources, manages data, and handles job recovery and retries in dynamic grid environments. It uses Globus services for resource monitoring and replica location, and a Transformation Catalog to locate executables. The example workflows are Montage (astronomy) and CyberShake (earthquake hazard analysis for SCEC), both run on the TeraGrid. The slides also cover workflow reduction, provenance information, and the distribution of seismogram jobs on the TeraGrid.


Presentation Transcript


  1. Pegasus: Running Large-Scale Scientific Workflows on the TeraGrid • Ewa Deelman, USC Information Sciences Institute • http://pegasus.isi.edu • www.isi.edu/~deelman

  2. Acknowledgements • Carl Kesselman, Gaurang Mehta, Gurmeet Singh, Mei-Hui Su, Karan Vahi (Center for Grid Technologies, ISI) • James Blythe, Yolanda Gil (Intelligent Systems Division, ISI) • http://pegasus.isi.edu • Research funded as part of the NSF GriPhyN, NVO and SCEC projects, NIH-funded CRCNS project and EU-funded GridLab • Thanks for the use of the TeraGrid Ewa Deelman, deelman@isi.edu www.isi.edu/~deelman pegasus.isi.edu

  3. Outline • Applications as workflows • Pegasus (Planning for Execution in Grids) • Montage application (Astronomy, NSF & NASA) • CyberShake (Southern California Earthquake Center) • Results from running on the TeraGrid • Conclusions

  4. Today’s Scientific Applications • Applications • Increasing in the level of complexity • Use of individual application components • Components are supplied by various individuals • Reuse of individual intermediate data products (files) • Execution environment is complex and very dynamic • Resources come and go • Data is replicated • Components can be found at various locations or staged in on demand • Separation between • the application description • the actual execution description • Applications being described in terms of workflows


  7. Executable Workflow Generation and Mapping • WINGS and CAT, developed at ISI by Y. Gil • VDL, developed at ANL & U of C by I. Foster, J. Voeckler & M. Wilde

  8. Pegasus: Planning for Execution in Grids • Maps from abstract to executable workflow • Automatically finds the physical locations of both workflow components and data • Finds appropriate resources to execute the components • Reuses existing data products where applicable • Publishes newly derived data products • Provides provenance information

  9. Information Components used by Pegasus • Globus Monitoring and Discovery Service (MDS) (or static file) • Locates available resources • Finds resource properties • Dynamic: load, queue length • Static: location of GridFTP server, RLS, etc. • Globus Replica Location Service (RLS) • Locates data that may be replicated • Registers new data products • Transformation Catalog (TC) • Locates installed executables

  10. Example Workflow Reduction • Original abstract workflow • If “b” already exists (as determined by query to the RLS), the workflow can be reduced • Also useful in case of failures
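The reduction described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Pegasus code: the `Job` type, the `reduce_workflow` function, and the use of a plain set `existing` to stand in for the result of an RLS query are all assumptions made for the example. The sketch walks backward from the requested output files and keeps only the jobs whose outputs are not already available.

```python
from typing import Iterable, List, NamedTuple, Set


class Job(NamedTuple):
    """Hypothetical abstract-workflow job: a name plus logical file names."""
    name: str
    inputs: List[str]
    outputs: List[str]


def reduce_workflow(jobs: List[Job], targets: Iterable[str],
                    existing: Set[str]) -> List[Job]:
    """Return the jobs still needed to produce `targets`.

    `existing` stands in for the set of files a replica catalog reports
    as already available (an assumption of this sketch).
    """
    producer = {f: job for job in jobs for f in job.outputs}
    needed: Set[str] = set()
    seen: Set[str] = set()
    stack = list(targets)
    while stack:
        f = stack.pop()
        if f in seen or f in existing:
            continue  # file already handled, or already available
        seen.add(f)
        job = producer.get(f)
        if job is None:
            raise ValueError(f"no replica and no producer for file {f!r}")
        if job.name not in needed:
            needed.add(job.name)
            stack.extend(job.inputs)  # the job's inputs must be available too
    # Preserve the original job order in the reduced workflow.
    return [job for job in jobs if job.name in needed]
```

With a two-job chain in which `j1` turns `a` into `b` and `j2` turns `b` into `c`, asking for `c` when `b` is already registered leaves only `j2` to run. The same pruning applies after a failure: files produced before the failure count as existing.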

  11. Mapping from abstract to executable • Query RLS, MDS, and TC; schedule computation and data movement
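As a rough sketch of this mapping step, the Python below plans a single abstract job using three in-memory dictionaries that stand in for the RLS, MDS, and TC. Everything here is an assumption for illustration: the dictionary layouts, the `map_job` function, the site names, and the rule of picking the candidate site with the shortest queue are not taken from Pegasus itself. The job's inputs are assumed to be registered already.

```python
from typing import Dict, List, Tuple


def map_job(job: Dict, rls: Dict[str, List[Dict]],
            tc: Dict[str, Dict[str, str]],
            mds: Dict[str, Dict]) -> List[Tuple]:
    """Turn one abstract job into a list of executable steps.

    rls: logical file name -> list of replicas ({"site": ..., "url": ...})
    tc:  transformation name -> {site: path of the installed executable}
    mds: site -> properties, here only {"queue_length": int}
    (All three layouts are hypothetical stand-ins for the real services.)
    """
    installed = tc.get(job["transformation"], {})
    candidates = [site for site in installed if site in mds]
    if not candidates:
        raise LookupError(f"no site offers {job['transformation']!r}")
    # Illustrative selection policy: shortest queue wins.
    site = min(candidates, key=lambda s: mds[s]["queue_length"])

    plan: List[Tuple] = []
    for lfn in job["inputs"]:
        replicas = rls.get(lfn, [])
        if not replicas:
            raise LookupError(f"no replica registered for {lfn!r}")
        if not any(r["site"] == site for r in replicas):
            # Data movement: stage the file in from the first known replica.
            plan.append(("transfer", lfn, replicas[0]["site"], site))
    plan.append(("compute", installed[site], site))
    for lfn in job["outputs"]:
        # New data products are registered so later workflows can reuse them.
        plan.append(("register", lfn, site))
    return plan
```

The returned plan makes the slide's three concerns visible as separate steps: data movement (`transfer`), computation (`compute`), and publication of new data products (`register`).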

  12. Mosaic of M42 created on TeraGrid resources using Pegasus • Pegasus improved the runtime of this application by 90% over the baseline case • Workflow with 4,500 nodes • Bruce Berriman, John Good (Caltech); Joe Jacob, Dan Katz (JPL); Gurmeet Singh, Mei Su (ISI)

  13. Small Montage Workflow • ~1,200 nodes

  14. Initial prototype implemented and tested on the TeraGrid • Montage performance evaluations • Production Montage portal open to the astronomy community this year • Montage collaboration with JPL & IPAC

  15. SCEC • Derive probabilistic hazard curves & maps for the Los Angeles area: 6 sites in 2005, 625 in 2006, and 10,000 in 2007 • Probability of a certain ground motion during a certain period of time • Figure: hazard map

  16. SCEC workflows on the TeraGrid (TG) • Executable workflow

  17. SCEC workflows on the TG • Local machine • Gaurang Mehta at ISI ran the experiments

  18. SCEC computations so far • Pasadena: 33 workflows • USC: 26 workflows • Each workflow: between 11 and 1,000 jobs • 23 days total runtime • NCSA & SDSC TG • Failed job recovery: retries, rescue DAG
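The two recovery mechanisms named on this slide, retries and a rescue DAG, can be illustrated with a small Python sketch. It is a simplification under stated assumptions: `run_with_rescue` and its parameters are hypothetical names, jobs are given as a list already in dependency order, and execution stops at the first job that exhausts its retries, whereas a real workflow engine may keep running independent branches. The "rescue" result is simply the list of jobs that still have to run, so a later resubmission can skip everything that already completed.

```python
from typing import Callable, Iterable, List, Set, Tuple


def run_with_rescue(order: List[str], run: Callable[[str], bool],
                    max_retries: int = 2,
                    done: Iterable[str] = ()) -> Tuple[Set[str], List[str]]:
    """Run jobs in dependency order, retrying failures.

    order:       job names, assumed to be topologically sorted
    run:         callable returning True on success, False on failure
    max_retries: extra attempts allowed after the first failure
    done:        jobs completed by an earlier run (skipped here)

    Returns (completed jobs, rescue list of jobs still to run).
    """
    completed = set(done)
    for name in order:
        if name in completed:
            continue  # finished in a previous run; do not repeat the work
        for _attempt in range(1 + max_retries):
            if run(name):
                completed.add(name)
                break
        else:
            # Retries exhausted: hand back everything that remains.
            return completed, [n for n in order if n not in completed]
    return completed, []
```

A first call that fails permanently on one job returns the completed set and the rescue list; passing that completed set back as `done` on the next call resumes from the failed job instead of starting over.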

  19. So far, 2 SCEC sites done (Pasadena and USC)

  20. Distribution of seismogram jobs • 70 hours

  21. Observations from working with the scientists • Two-way street: they give us feedback on our technologies, we show them how things run (break) at scale • We have seen great performance improvements in the codes

  22. Some other Pegasus Application Domains • Laser Interferometer Gravitational-Wave Observatory (LIGO) • Galaxy morphology (NVO) • Tomography for neural structure reconstruction (NIH) • High-energy physics • Gene alignment • Natural language processing

  23. Courtesy of David Meyers, Caltech • LIGO has used Pegasus to run on the Open Science Grid at SC’05

  24. Benefits of the workflow & Pegasus approach • Pegasus can run the workflow on a variety of resources • Pegasus can run a single workflow across multiple resources • Pegasus can opportunistically take advantage of available resources (through dynamic workflow mapping) • Pegasus can take advantage of pre-existing intermediate data products • Pegasus can improve the performance of the application

  25. Benefits of the workflow & Pegasus approach • Pegasus shields the user from the Grid details • The workflow exposes • the structure of the application • maximum parallelism of the application • Pegasus can take advantage of the structure to • Set a planning horizon (how far into the workflow to plan) • Cluster a set of workflow nodes to be executed as one (for performance)
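One way to picture clustering workflow nodes is to group jobs that sit at the same depth of the workflow graph, since such jobs cannot depend on each other. The Python sketch below does this under stated assumptions: the slide does not say which clustering strategy is used, so level-based grouping is chosen here only as an example, and `cluster_by_level`, the `parents` mapping, and the `size` parameter are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List


def cluster_by_level(parents: Dict[str, List[str]], size: int) -> List[List[str]]:
    """Group jobs of equal depth into clusters of at most `size` jobs.

    parents: job name -> names of the jobs it depends on; every job must
             appear as a key (an assumption of this sketch).
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    memo: Dict[str, int] = {}

    def level(job: str) -> int:
        # Depth = 1 + deepest parent; jobs without parents are at level 1.
        if job not in memo:
            memo[job] = 1 + max((level(p) for p in parents.get(job, [])),
                                default=0)
        return memo[job]

    by_level: Dict[int, List[str]] = defaultdict(list)
    for job in parents:
        by_level[level(job)].append(job)

    clusters: List[List[str]] = []
    for lvl in sorted(by_level):
        jobs = by_level[lvl]
        # Jobs on one level are mutually independent, so any chunk of them
        # can be submitted together as a single unit.
        for i in range(0, len(jobs), size):
            clusters.append(jobs[i:i + size])
    return clusters
```

For many short jobs, submitting each cluster as one unit reduces the number of separate submissions, which is the performance motivation the slide mentions; the right cluster size depends on the workload and is left as a parameter.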

  26. Pegasus Research • resource discovery and assessment • resource selection • resource provisioning • workflow restructuring • tasks merged together or reordered to improve overall performance • adaptive computing • workflow refinement adapts to changing execution environment • workflow debugging

  27. Software releases • Pegasus http://pegasus.isi.edu • released as part of the GriPhyN Virtual Data System (VDS) • Collaborators in VDS: Ian Foster (ANL), Mike Wilde (ANL), and Jens Voeckler (U of C) • http://vds.isi.edu
