Globus Grid Tutorial, Part 2: Running Programs Across Multiple Resources
Goals of this Tutorial • Learn how to couple resources at multiple sites and use them for a single application • Required by very large applications • Also by applications that need a heterogeneous mix of resources • Learn how to run existing parallel codes under Globus • By using MPICH-G, a grid-enabled MPI • Other application examples include • SF-Express, climate models, etc.
Gas Dynamics (PPM) • Tightly coupled CFD problem • Needs large computational power • Mask latency by overlapping communication and computation • Move data a brick at a time • Size bricks to CPU and network [Diagram: an Archiver and a Task Manager coordinate Brick Managers, which feed Brick Updaters] Woodward, U Minn.
Problems • How do we start a program running across multiple machines? • Co-allocation and scheduling • Different schedulers and security systems • What programming model should be used? • Can we run existing applications?
Globus Advantages • Resource management architecture provides co-allocation tools • Can mix communication methods • Nexus multimethod communication • MPI, sockets, etc. • Uniform access to local services • Security, resource management, etc. • Architecture promotes building high-level programming tools • E.g., MPICH-G, a grid-enabled MPI
Programming Tools: Approaches • “Hand coded” applications, combining existing tools with Globus calls • Use sockets, MPI, threads, SHM, etc. • Globus security and resource management still provide added value • Grid-enabled libraries • Manage both communication and resource management • Provide uniform programming environment across resources
MPICH-G • A complete implementation of the Message Passing Interface (MPI) • Passes the MPICH regression tests without change • MPI is the de facto standard for message-passing parallel programs • Enables existing MPI programs to run within a grid environment without change • Documentation • http://www.globus.org/mpi
Running a Program • Goal: Run a Message Passing Interface (MPI) program on multiple computers • MPICH-G uses Globus for authentication, resource allocation, executable staging, output redirection, etc. % mpirun -np 4 my_app [Diagram: a single mpirun command launches processes 1–4 across several machines]
Running an MPICH-G Program • Create a file named “machines” • A list of Globus resource managers and task counts sp2.sdsc.edu-loadleveler 4 neptune.cacr.caltech.edu-lsf 4 jupiter.isi.edu-fork % mpirun -np 12 my_app • Creates a total of 12 tasks, allocated in a round-robin fashion with up to “count” tasks per allocation request (sp2 4) (neptune 4) (jupiter 1) (sp2 3)
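The round-robin allocation described above can be sketched as follows. This is an illustrative model, not the actual mpirun implementation; the machine list mirrors the example “machines” file:

```python
def round_robin(machines, np):
    """Model of mpirun's round-robin allocation: cycle through the
    machines file, granting up to each entry's count per pass,
    until np tasks have been placed."""
    allocations = []
    placed = 0
    i = 0
    while placed < np:
        contact, count = machines[i % len(machines)]
        grant = min(count, np - placed)
        allocations.append((contact, grant))
        placed += grant
        i += 1
    return allocations

machines = [
    ("sp2.sdsc.edu-loadleveler", 4),
    ("neptune.cacr.caltech.edu-lsf", 4),
    ("jupiter.isi.edu-fork", 1),  # no count in the file: defaults to 1
]
# Produces (sp2 4) (neptune 4) (jupiter 1) (sp2 3), as on the slide
print(round_robin(machines, 12))
```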
How MPICH-G Works • mpirun: • Locates complete Globus resource manager information for the specified resources (via MDS) • Creates a resource specification request • Calls globusrun to execute the program • Uses Nexus for communication • Delivers enhanced performance by using multiple communication protocols
Starting Multiple Jobs • The globusrun command: • Submits multiple simultaneous job requests • Stages executables (GASS) • Waits for termination (GRAM/DUROC) • Forwards stdout/stderr (GASS) • Convenient wrapper around several Globus services: • DUROC, GASS, GRAM, GSI, MDS
Globus Resource Managers • Every resource is controlled by a resource manager called a GRAM • Interfaces to the local resource management system, e.g., LoadLeveler, NQE, LSF • Every resource manager has a unique distinguished name, or DN • A DN is a sequence of attribute-value pairs: /C=US/O=Globus/O=USC/OU=ISI/CN=jupiter.isi.edu-fork • The MDS stores information about each resource manager
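Since a DN is just a slash-separated list of attribute=value pairs, it can be taken apart mechanically. A minimal parser sketch (illustrative only, not Globus code):

```python
def parse_dn(dn):
    """Split a distinguished name such as /C=US/O=Globus/... into a
    list of (attribute, value) pairs, in order."""
    pairs = []
    for component in dn.strip("/").split("/"):
        attr, _, value = component.partition("=")
        pairs.append((attr, value))
    return pairs

dn = "/C=US/O=Globus/O=USC/OU=ISI/CN=jupiter.isi.edu-fork"
# [('C', 'US'), ('O', 'Globus'), ('O', 'USC'),
#  ('OU', 'ISI'), ('CN', 'jupiter.isi.edu-fork')]
print(parse_dn(dn))
```

Note that attributes such as O can repeat, so the DN is a sequence, not a dictionary.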
Limitations of Simple mpirun • Limitations of “machines” file • Executable staging only for homogeneous sets of machines • For heterogeneous sets, executables must be placed in the same location on every machine • More general MPICH-G startup is possible • Dynamic discovery of resources • Specify name of the executable at each site • Specify location of executables and data files • Currently achieved by passing RSL string
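Passing an RSL string directly lets each subjob name its own resource manager and executable location, which removes the homogeneity limitation above. A sketch of a DUROC-style multi-request (the paths are illustrative, not from the source):

```
+( &(resourceManagerContact="sp2.sdsc.edu-loadleveler")
    (count=4)
    (executable=/home/user/sp2/my_app) )
 ( &(resourceManagerContact="jupiter.isi.edu-fork")
    (count=1)
    (executable=/home/user/sun/my_app) )
```

The leading “+” joins the two allocation requests into a single co-allocated job; each parenthesized subrequest is an ordinary RSL attribute-value list.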
Exercise 2Introduction to MPI • Use mpirun to run an MPI program % mpirun -np 2 program • Use globus-rcp to copy files remotely % globus-rcp filename host:filename
Globus Components in Action [Diagram: mpirun calls globusrun, which co-allocates through DUROC to three GRAMs (fork, LSF, LoadLeveler); each GRAM starts local processes (P1, P2), which communicate via Nexus]
Summary • Using multiple resources located in multiple domains is a basic grid operation • Globus supports this operation via core services and high-level tools • Standard MPI programming environment provides a convenient way of building grid applications • Must be careful about configuration and latency