Pegasus: Running Large-Scale Scientific Workflows on the TeraGrid Ewa Deelman USC Information Sciences Institute http://pegasus.isi.edu www.isi.edu/~deelman
Acknowledgements • Carl Kesselman, Gaurang Mehta, Gurmeet Singh, Mei-Hui Su, Karan Vahi (Center for Grid Technologies, ISI) • James Blythe, Yolanda Gil (Intelligent Systems Division, ISI) • http://pegasus.isi.edu • Research funded as part of the NSF GriPhyN, NVO, and SCEC projects, the NIH-funded CRCNS project, and the EU-funded GridLab project • Thanks for the use of the TeraGrid Ewa Deelman, deelman@isi.edu www.isi.edu/~deelman pegasus.isi.edu
Outline • Applications as workflows • Pegasus (Planning for Execution in Grids) • Montage application (Astronomy, NSF & NASA) • CyberShake (Southern California Earthquake Center) • Results from running on the TeraGrid • Conclusions
Today’s Scientific Applications • Applications are increasing in complexity • Built from individual application components • Components are supplied by various individuals • Intermediate data products (files) are reused • The execution environment is complex and very dynamic • Resources come and go • Data is replicated • Components can be found at various locations or staged in on demand • Separation between the application description and the actual execution description • Applications are described in terms of workflows
Executable Workflow Generation and Mapping • WINGS and CAT, developed at ISI by Y. Gil • VDL, developed at ANL & U of C by I. Foster, J. Voeckler & M. Wilde
Pegasus: Planning for Execution in Grids • Maps from an abstract to an executable workflow • Automatically locates physical locations for both workflow components and data • Finds appropriate resources to execute the components • Reuses existing data products where applicable • Publishes newly derived data products • Provides provenance information
Information Components used by Pegasus • Globus Monitoring and Discovery Service (MDS) (or a static file) • Locates available resources • Finds resource properties • Dynamic: load, queue length • Static: location of GridFTP server, RLS, etc. • Globus Replica Location Service (RLS) • Locates data that may be replicated • Registers new data products • Transformation Catalog (TC) • Locates installed executables
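The three lookups above can be sketched in a few lines. This is a minimal illustration of the *kinds* of queries described on this slide; the dictionary layouts and function names are invented for the example and are not the real Pegasus or Globus APIs.

```python
# Hypothetical stand-ins for the three information sources Pegasus consults.
MDS = {  # resource properties: static (GridFTP server) and dynamic (queue load)
    "ncsa": {"gridftp": "gsiftp://gridftp.ncsa.example.org", "queue_length": 12},
    "sdsc": {"gridftp": "gsiftp://gridftp.sdsc.example.org", "queue_length": 3},
}
RLS = {  # logical file name -> physical replica locations
    "b": ["gsiftp://gridftp.sdsc.example.org/data/b"],
}
TC = {  # transformation name -> sites where the executable is installed
    "mProject": ["ncsa", "sdsc"],
}

def candidate_sites(transformation):
    """Sites where the executable is installed, least-loaded queue first."""
    sites = TC.get(transformation, [])
    return sorted(sites, key=lambda s: MDS[s]["queue_length"])

print(candidate_sites("mProject"))  # sdsc first: shorter queue
```

The point is only that static catalog data (where things are) and dynamic monitoring data (how busy things are) combine to rank execution sites.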
Example Workflow Reduction • Original abstract workflow • If “b” already exists (as determined by a query to the RLS), the workflow can be reduced • Also useful in case of failures
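The reduction idea above can be sketched as a simple prune: any job all of whose outputs are already registered in the replica catalog need not run, because downstream jobs can stage those files in instead. This is an illustrative sketch, not the Pegasus data model; the job-dictionary fields are invented for the example.

```python
def reduce_workflow(jobs, catalog):
    """Drop any job whose outputs are all already in the replica catalog.
    (A fuller version would iterate, also pruning producers of data that
    no surviving job still needs.)"""
    return [j for j in jobs if not set(j["outputs"]) <= catalog]

workflow = [
    {"name": "gen_b", "inputs": ["a"], "outputs": ["b"]},
    {"name": "use_b", "inputs": ["b"], "outputs": ["c"]},
]
# "b" already exists, so its producer is pruned; only "use_b" remains.
print([j["name"] for j in reduce_workflow(workflow, {"a", "b"})])
```

This also explains the failure-recovery point on the slide: after a partial run, outputs that did get registered make the retried workflow smaller.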
Mapping from abstract to executable • Query the RLS, MDS, and TC; schedule computation and data movement
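One step of that mapping can be sketched as follows: given a chosen execution site, expand an abstract job into stage-in transfers for inputs held elsewhere, the compute job itself, and registration of new outputs. All names and record shapes here are illustrative assumptions, not the actual Pegasus planner.

```python
def map_job(job, site, replicas):
    """Expand one abstract job into transfer, compute, and register jobs."""
    plan = []
    for f in job["inputs"]:
        src = replicas.get(f)
        if src is not None and not src.startswith(site):
            # input exists at another site: insert a data-movement job
            plan.append({"type": "transfer", "file": f, "from": src, "to": site})
    plan.append({"type": "compute", "name": job["name"], "site": site})
    for f in job["outputs"]:
        # publish the newly derived product in the replica catalog
        plan.append({"type": "register", "file": f, "site": site})
    return plan

job = {"name": "use_b", "inputs": ["b"], "outputs": ["c"]}
plan = map_job(job, "ncsa", {"b": "sdsc:/data/b"})
# -> stage "b" in from sdsc, run "use_b" at ncsa, register "c"
```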
Mosaic of M42 created on TeraGrid resources using Pegasus. Pegasus improved the runtime of this application by 90% over the baseline case. Workflow with 4,500 nodes. Credits: Bruce Berriman, John Good (Caltech); Joe Jacob, Dan Katz (JPL); Gurmeet Singh, Mei Su (ISI)
Small Montage Workflow ~1200 nodes
Montage: collaboration with JPL & IPAC • Initial prototype implemented and tested on the TeraGrid • Montage performance evaluations • Production Montage portal open to the astronomy community this year
SCEC • Derive probabilistic hazard curves and maps for the Los Angeles area: 6 sites in 2005, 625 in 2006, and 10,000 in 2007 • Probability of a certain ground motion occurring during a certain period of time (Hazard Map)
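The hazard-curve quantity above, the probability of exceeding a given ground motion within a time window, is commonly computed from an annual exceedance rate under a Poisson assumption. This is the standard probabilistic-seismic-hazard formulation, sketched here for context rather than taken from the slides:

```python
import math

def exceedance_probability(annual_rate, years):
    """P(at least one exceedance in `years` years) under a Poisson model:
    P = 1 - exp(-rate * t)."""
    return 1.0 - math.exp(-annual_rate * years)

# An event with a mean recurrence of 475 years over a 50-year window:
p = exceedance_probability(1.0 / 475.0, 50.0)
print(round(p, 3))  # about 0.10 -- the familiar "10% in 50 years" level
```

A hazard curve is this probability evaluated across a range of ground-motion levels, each with its own rate.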
SCEC workflows on the TG Executable workflow
SCEC Workflows on the TG Local machine Gaurang Mehta at ISI ran the experiments
SCEC computations so far • Pasadena: 33 workflows • USC: 26 workflows • Each workflow: between 11 and 1,000 jobs • 23 days total runtime • Run on NCSA & SDSC TeraGrid resources • Failed-job recovery: retries, rescue DAG
So far, two SCEC sites are done (Pasadena and USC)
Distribution of seismogram jobs (70 hours)
Observations from working with the scientists • It is a two-way street: they give us feedback on our technologies; we show them how things run (and break) at scale • We have seen great performance improvements in the codes
Some other Pegasus application domains • Laser Interferometer Gravitational-Wave Observatory (LIGO) • Galaxy morphology (NVO) • Tomography for neural structure reconstruction (NIH) • High-energy physics • Gene alignment • Natural language processing
Courtesy of David Meyers, Caltech. LIGO has used Pegasus to run on the Open Science Grid at SC’05.
Benefits of the workflow & Pegasus approach • Pegasus can run the workflow on a variety of resources • Pegasus can run a single workflow across multiple resources • Pegasus can opportunistically take advantage of available resources (through dynamic workflow mapping) • Pegasus can take advantage of pre-existing intermediate data products • Pegasus can improve the performance of the application
Benefits of the workflow & Pegasus approach • Pegasus shields the user from Grid details • The workflow exposes • the structure of the application • the maximum parallelism of the application • Pegasus can take advantage of the structure to • set a planning horizon (how far into the workflow to plan) • cluster a set of workflow nodes to be executed as one (for performance)
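The clustering idea above can be sketched very simply: group independent sibling nodes into fixed-size clusters so the grid scheduler sees fewer, larger submissions. This is only an illustration of the concept on this slide, not Pegasus's actual clustering implementation.

```python
def cluster(jobs, size):
    """Partition a list of independent jobs into clusters of at most `size`
    jobs, each cluster to be submitted and executed as a single grid job."""
    return [jobs[i:i + size] for i in range(0, len(jobs), size)]

# Ten independent jobs become three cluster submissions (sizes 4, 4, 2),
# amortizing per-job queueing and scheduling overhead.
clusters = cluster([f"job{i}" for i in range(10)], 4)
print([len(c) for c in clusters])
```

The trade-off is throughput versus granularity: larger clusters reduce scheduling overhead but mean a single failure reruns more work.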
Pegasus Research • Resource discovery and assessment • Resource selection • Resource provisioning • Workflow restructuring • Tasks merged together or reordered to improve overall performance • Adaptive computing • Workflow refinement adapts to the changing execution environment • Workflow debugging
Software releases • Pegasus: http://pegasus.isi.edu • Released as part of the GriPhyN Virtual Data System (VDS) • Collaborators in VDS: Ian Foster (ANL), Mike Wilde (ANL), and Jens Voeckler (U of C) • http://vds.isi.edu