Pegasus: Running Large-Scale Scientific Workflows on the TeraGrid. Ewa Deelman, USC Information Sciences Institute. http://pegasus.isi.edu www.isi.edu/~deelman
Acknowledgements
• Carl Kesselman, Gaurang Mehta, Gurmeet Singh, Mei-Hui Su, Karan Vahi (Center for Grid Technologies, ISI)
• James Blythe, Yolanda Gil (Intelligent Systems Division, ISI)
• http://pegasus.isi.edu
• Research funded as part of the NSF GriPhyN, NVO and SCEC projects, the NIH-funded CRCNS project and the EU-funded GridLab
• Thanks for the use of the TeraGrid
Ewa Deelman, deelman@isi.edu www.isi.edu/~deelman pegasus.isi.edu
Outline
• Applications as workflows
• Pegasus (Planning for Execution in Grids)
• Montage application (Astronomy, NSF & NASA)
• CyberShake (Southern California Earthquake Center)
• Results from running on the TeraGrid
• Conclusions
Today’s Scientific Applications
• Applications
  • Increasing in the level of complexity
  • Use of individual application components
  • Components are supplied by various individuals
  • Reuse of individual intermediate data products (files)
• Execution environment is complex and very dynamic
  • Resources come and go
  • Data is replicated
  • Components can be found at various locations or staged in on demand
• Separation between
  • the application description
  • the actual execution description
• Applications being described in terms of workflows
Executable Workflow Generation and Mapping. WINGS and CAT, developed at ISI by Y. Gil; VDL, developed at ANL & U of C by I. Foster, J. Voeckler & M. Wilde.
Pegasus: Planning for Execution in Grids
• Maps from abstract to executable workflow
• Automatically determines the physical locations of both workflow components and data
• Finds appropriate resources to execute the components
• Reuses existing data products where applicable
• Publishes newly derived data products
• Provides provenance information
Information Components used by Pegasus
• Globus Monitoring and Discovery Service (MDS) (or static file)
  • Locates available resources
  • Finds resource properties
    • Dynamic: load, queue length
    • Static: location of GridFTP server, RLS, etc.
• Globus Replica Location Service (RLS)
  • Locates data that may be replicated
  • Registers new data products
• Transformation Catalog (TC)
  • Locates installed executables
Example Workflow Reduction
• Original abstract workflow
• If “b” already exists (as determined by query to the RLS), the workflow can be reduced
• Also useful in case of failures
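The reduction step above can be sketched in a few lines. This is a minimal illustration, not Pegasus code: the workflow is a plain dictionary, `existing` stands in for the answer to an RLS query, and the names `reduce_workflow`, `job1`..`job3` and files `a`..`d` are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical abstract workflow: job name -> logical input and output files.
Workflow = Dict[str, Dict[str, List[str]]]


def reduce_workflow(jobs: Workflow, requested: Set[str], existing: Set[str]) -> Set[str]:
    """Return the names of the jobs that still have to run.

    `existing` is the set of logical files already registered somewhere
    (a stand-in for a replica catalog lookup, assumed for this sketch).
    """
    # Map every file to the job that produces it.
    producer = {f: name for name, job in jobs.items() for f in job["outputs"]}
    needed: Set[str] = set()
    pending = [f for f in requested if f not in existing]
    while pending:
        f = pending.pop()
        job = producer.get(f)
        if job is None or job in needed:
            continue  # raw input with no producer, or job already selected
        needed.add(job)
        # Walk upstream only through inputs that do not exist yet.
        pending.extend(i for i in jobs[job]["inputs"] if i not in existing)
    return needed


workflow: Workflow = {
    "job1": {"inputs": ["a"], "outputs": ["b"]},
    "job2": {"inputs": ["b"], "outputs": ["c"]},
    "job3": {"inputs": ["c"], "outputs": ["d"]},
}

# "b" already exists, so job1 drops out of the plan.
reduced = reduce_workflow(workflow, requested={"d"}, existing={"a", "b"})
full = reduce_workflow(workflow, requested={"d"}, existing={"a"})
```

The same walk explains why reduction helps after failures: any output that was produced and registered before the failure removes its producing job from the rerun.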
Mapping from abstract to executable
• Query RLS, MDS, and TC; schedule computation and data movement
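As a rough sketch of the mapping step, the three information sources can be modeled as dictionaries and the plan as a list of transfer and compute steps. Everything here is an assumption made for illustration: the function `map_workflow`, the site names, and in particular the least-loaded site selection rule, which is a simple greedy choice and not necessarily what Pegasus does.

```python
from typing import Dict, List, Tuple

Step = Tuple[str, str, str]  # (kind, job or file name, target site)


def map_workflow(
    jobs: Dict[str, dict],
    order: List[str],                      # job names in topological order
    replicas: Dict[str, List[str]],        # stand-in for RLS: file -> sites
    transformations: Dict[str, List[str]], # stand-in for TC: executable -> sites
    site_load: Dict[str, float],           # stand-in for MDS: site -> load
) -> List[Step]:
    plan: List[Step] = []
    # Copy the replica table so planned transfers and outputs can be tracked.
    location = {f: list(sites) for f, sites in replicas.items()}
    for name in order:
        job = jobs[name]
        candidates = transformations.get(job["transformation"], [])
        if not candidates:
            raise ValueError(f"no site provides {job['transformation']}")
        # Assumed policy: pick the least-loaded site that has the executable.
        site = min(candidates, key=lambda s: site_load.get(s, 0.0))
        for f in job["inputs"]:
            holders = location.get(f, [])
            if not holders:
                raise ValueError(f"no replica of {f} is known")
            if site not in holders:
                plan.append(("transfer", f, site))  # data movement step
                location[f].append(site)
        plan.append(("compute", name, site))
        for f in job["outputs"]:
            location.setdefault(f, []).append(site)
    return plan


jobs = {
    "extract": {"transformation": "extract", "inputs": ["a"], "outputs": ["b"]},
    "analyze": {"transformation": "analyze", "inputs": ["b"], "outputs": ["c"]},
}
plan = map_workflow(
    jobs,
    order=["extract", "analyze"],
    replicas={"a": ["siteA"]},
    transformations={"extract": ["siteA", "siteB"], "analyze": ["siteB"]},
    site_load={"siteA": 0.2, "siteB": 0.7},
)
```

In this example `extract` runs where its input already lives, and a single transfer of `b` is scheduled before `analyze` because that executable is only installed on the second site.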
Mosaic of M42 created on the TeraGrid resources using Pegasus. Pegasus improved the runtime of this application by 90% over the baseline case. Workflow with 4,500 nodes. Bruce Berriman, John Good (Caltech); Joe Jacob, Dan Katz (JPL); Gurmeet Singh, Mei Su (ISI).
Small Montage Workflow ~1200 nodes
Initial prototype implemented and tested on the TeraGrid. Montage performance evaluations. Production Montage portal open to the astronomy community this year. Montage: collaboration with JPL & IPAC.
SCEC
• Derive probabilistic hazard curves & maps for the Los Angeles area: 6 sites in 2005, 625 in 2006, and 10,000 in 2007
• Probability of a certain ground motion during a certain period of time
[Figure: Hazard Map]
SCEC Workflows on the TG. [Figure: Executable workflow]
SCEC Workflows on the TG. [Figure label: Local machine] Gaurang Mehta at ISI ran the experiments.
SCEC computations so far
• Pasadena
  • 33 workflows
• USC
  • 26 workflows
• Each workflow
  • [11, 1000] jobs
• 23 days total runtime
• NCSA & SDSC TG
• Failed job recovery
  • Retries
  • Rescue DAG
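The two recovery mechanisms listed above can be illustrated with a small executor. This is a sketch under stated assumptions: `run_with_recovery` and the job names are hypothetical, jobs run sequentially, and the "rescue" result is simply the list of jobs left to run after retries are exhausted, which is one reading of what a rescue DAG records.

```python
from typing import Callable, Dict, List, Set, Tuple


def run_with_recovery(
    order: List[str],                 # job names in topological order
    parents: Dict[str, List[str]],    # job -> jobs it depends on
    run_job: Callable[[str], bool],   # returns True on success
    max_retries: int = 2,
) -> Tuple[Set[str], List[str]]:
    """Run jobs with retries; return (completed jobs, jobs left for a rescue run)."""
    done: Set[str] = set()
    rescue: List[str] = []
    for name in order:
        if any(p not in done for p in parents.get(name, [])):
            rescue.append(name)  # blocked by a failed or skipped ancestor
            continue
        for _attempt in range(1 + max_retries):
            if run_job(name):
                done.add(name)
                break
        else:
            rescue.append(name)  # every attempt failed
    return done, rescue


attempts: Dict[str, int] = {}


def fake_job(name: str) -> bool:
    """Toy job runner: 'flaky' succeeds on its second try, 'broken' never does."""
    attempts[name] = attempts.get(name, 0) + 1
    if name == "flaky":
        return attempts[name] >= 2
    return name != "broken"


done, rescue = run_with_recovery(
    order=["prep", "flaky", "broken", "post"],
    parents={"flaky": ["prep"], "broken": ["prep"], "post": ["flaky", "broken"]},
    run_job=fake_job,
)
```

Resubmitting only the `rescue` list, with `done` treated as already complete, resumes the workflow without repeating finished work.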
So far 2 SCEC sites done (Pasadena and USC)
Distribution of seismogram jobs 70 hours
Observations from working with the Scientists
• Two-way street: they give us feedback on our technologies, we show them how things run (break) at scale
• We have seen great performance improvements in the codes
Some other Pegasus Application Domains
• Laser Interferometer Gravitational-Wave Observatory (LIGO)
• Galaxy morphology (NVO)
• Tomography for neural structure reconstruction (NIH)
• High-energy physics
• Gene alignment
• Natural language processing
Courtesy of David Meyers, Caltech. LIGO has used Pegasus to run on the Open Science Grid at SC’05.
Benefits of the workflow & Pegasus approach
• Pegasus can run the workflow on a variety of resources
• Pegasus can run a single workflow across multiple resources
• Pegasus can opportunistically take advantage of available resources (through dynamic workflow mapping)
• Pegasus can take advantage of pre-existing intermediate data products
• Pegasus can improve the performance of the application
Benefits of the workflow & Pegasus approach
• Pegasus shields the user from the Grid details
• The workflow exposes
  • the structure of the application
  • the maximum parallelism of the application
• Pegasus can take advantage of the structure to
  • Set a planning horizon (how far into the workflow to plan)
  • Cluster a set of workflow nodes to be executed as one (for performance)
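One way to picture the clustering idea above is to group jobs that sit at the same depth of the workflow, since such jobs cannot depend on one another. The sketch below is only an illustration under that assumption; `cluster_by_level`, the job names and the fixed `cluster_size` are hypothetical, and the actual clustering policy may differ.

```python
from typing import Dict, List


def cluster_by_level(
    order: List[str],               # job names in topological order
    parents: Dict[str, List[str]],  # job -> jobs it depends on
    cluster_size: int,
) -> List[List[str]]:
    """Group jobs of the same workflow level into clusters of at most cluster_size."""
    level: Dict[str, int] = {}
    for name in order:
        # A job sits one level below its deepest parent; roots are level 1.
        level[name] = 1 + max((level[p] for p in parents.get(name, [])), default=0)
    by_level: Dict[int, List[str]] = {}
    for name in order:
        by_level.setdefault(level[name], []).append(name)
    clusters: List[List[str]] = []
    for lvl in sorted(by_level):
        names = by_level[lvl]
        for i in range(0, len(names), cluster_size):
            clusters.append(names[i:i + cluster_size])
    return clusters


# Four independent projection jobs feeding one final job (names are made up).
clusters = cluster_by_level(
    order=["p1", "p2", "p3", "p4", "add"],
    parents={"add": ["p1", "p2", "p3", "p4"]},
    cluster_size=2,
)
```

Each cluster would then be submitted as a single job, trading some parallelism for lower per-job scheduling overhead, which matters when a workflow contains thousands of short tasks.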
Pegasus Research
• Resource discovery and assessment
• Resource selection
• Resource provisioning
• Workflow restructuring
  • Tasks merged together or reordered to improve overall performance
• Adaptive computing
  • Workflow refinement adapts to a changing execution environment
• Workflow debugging
Software releases
• Pegasus http://pegasus.isi.edu
  • Released as part of the GriPhyN Virtual Data System (VDS)
  • Collaborators in VDS: Ian Foster (ANL), Mike Wilde (ANL) and Jens Voeckler (U of C)
  • http://vds.isi.edu