

  1. Condor and MPI • Paradyn/Condor Week • Madison, WI 2001

  2. Overview • MPI and Condor: Why Now? • Dedicated and Opportunistic Scheduling • How Does it All Work? • Specific MPI Implementations • Future Work

  3. What is MPI? • MPI is the “Message Passing Interface” • Basically, a library for writing parallel applications that use message passing for inter-process communication • MPI is a standard with many different implementations
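  To make the model concrete, here is a minimal message-passing program in C (illustrative, not from the original slides): rank 0 sends a single integer to rank 1. Run it with at least two processes.

      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv)
      {
          int rank, msg;
          MPI_Status status;

          MPI_Init(&argc, &argv);                /* start the MPI runtime */
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's id */

          if (rank == 0) {
              msg = 42;
              /* rank 0 sends one int to rank 1 with message tag 0 */
              MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
          } else if (rank == 1) {
              /* rank 1 receives the int from rank 0 */
              MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
              printf("rank 1 received %d from rank 0\n", msg);
          }

          MPI_Finalize();                        /* shut down cleanly */
          return 0;
      }

  Because MPI is a standard, the same source compiles and runs under any implementation (MPICH, LAM, etc.).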

  4. MPI and Condor: Why Haven’t We Supported it Until Now? • MPI's model is a static world • We have always seen the world as dynamic, opportunistic, ever-changing • We focused our parallel support on PVM, which supports a dynamic environment

  5. MPI With Condor: Why Now? • More and more Condor pools are being formed from dedicated resources • MPI's API is also starting to move towards supporting a dynamic world (e.g. LAM, MPI-2) • Few schedulers (if any) handle both opportunistic and dedicated resources at the same time

  6. Dedicated and Opportunistic Scheduling • Resources can move between 'dedicated' and 'opportunistic' status • Users submit jobs that are either dedicated (e.g. Universe = MPI) or opportunistic (e.g. Universe = standard)
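  For example, a submit description file for a dedicated MPI job might look like this sketch (the executable name and node count are placeholders):

      universe      = MPI
      executable    = my_mpi_job
      machine_count = 4
      queue

  An opportunistic job would instead say universe = standard and be handled by the normal matchmaking cycle.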

  7. Dedicated and Opportunistic (Cont'd) • Condor leaves all resources as opportunistic unless it sees dedicated jobs to service • The Dedicated Scheduler ('DS') claims opportunistic resources and turns them into dedicated ones to schedule into the future

  8. Dedicated and Opportunistic (Cont'd) • When the DS has no more jobs, it releases the resources which go back to serving opportunistic jobs

  9. Dedicated Scheduling and "Back-Filling" • There will always be "holes" in the dedicated schedule: sets of resources that can't be filled with dedicated jobs for certain periods of time • The traditional solution is "back-filling" the holes with smaller dedicated jobs • However, these back-fill jobs might not be preemptable when the resources are needed again

  10. Back-Filling (Cont’d) • Instead of back-filling with dedicated jobs, we give the resources to Condor’s opportunistic scheduler • Condor runs preemptable opportunistic jobs until the DS decides it needs the resources again and reclaims them

  11. Dedicated Resources are Opportunistic Resources • Even “dedicated” resources are really opportunistic • Hardware failures, software failures, etc. can take them away at any time • Condor handles these failures better than traditional dedicated schedulers, since years of opportunistic scheduling experience have taught our system to deal with them

  12. How Does MPI Support in Condor Really Work? • Changes to the resource agent (condor_startd) • Changes to the job scheduling agent (condor_schedd) • Changes to the rest of the Condor system

  13. How Do You Make a Resource Dedicated in Condor? • Just change a few config file settings... no new startd binary is required • Add an attribute to the resource's classad saying which scheduler, if any, it is willing to become dedicated to
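  A sketch of that startd configuration, following the pattern in the Condor manual (the scheduler name is a placeholder, and the exact macro names can vary across Condor versions):

      ## advertise which dedicated scheduler may claim this machine
      DedicatedScheduler = "DedicatedScheduler@condor.example.edu"
      STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler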

  14. Other Configuration Changes for the startd • In addition, you must change the policy expressions: • The startd must always be willing to run jobs from the DS • While the resource is claimed by the DS, the startd should never suspend or preempt jobs
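  A hedged sketch of those policy expressions for a fully dedicated machine; a mixed-mode pool would instead guard each expression on Scheduler =?= $(DedicatedScheduler):

      ## always willing to start jobs, and prefer the dedicated scheduler
      START   = True
      RANK    = Scheduler =?= $(DedicatedScheduler)
      ## never suspend or preempt while claimed by the DS
      SUSPEND = False
      PREEMPT = False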

  15. Submitting Dedicated Jobs • Requires a new "contrib" version of the condor_schedd • Condor "wakes up" the dedicated scheduler logic inside the condor_schedd when MPI jobs are submitted

  16. How Does Your Job Get Resources? • The DS does a query to find all resources that are willing to become dedicated to it • DS sends out "resource request" classads and negotiates for resources with the negotiator (the opportunistic scheduler)
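  You can approximate the DS's query by hand with condor_status; this one-liner (illustrative, using standard classad constraint syntax) lists machines that advertise a dedicated scheduler:

      % condor_status -constraint 'DedicatedScheduler =!= UNDEFINED'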

  17. How Does Your Job Get Resources? (Cont’d) • DS then claims resources directly • Once resources are available, the DS schedules and spawns jobs • When jobs complete, if more MPI jobs can be serviced with the same resources, the DS holds onto them and uses them immediately

  18. Changes to the rest of Condor? • Very few other changes required • Users can use all the same tools, interfaces, etc. • Just need a new condor_starter to actually spawn MPI jobs (will also be offered as a contrib module)

  19. Specific MPI Implementations • MPICH • LAM • Others?

  20. Condor and MPICH • Currently we support MPICH on Unix • Working on adding MPICH-NT support • NT’s MPICH has a different mechanism for spawning jobs than the Unix MPICH...

  21. Condor + LAM = "LAMdor" • LAM's API is better suited for a dynamic environment, where hosts can come and go from your MPI universe • LAM has a different mechanism for spawning jobs than MPICH • Condor is working to support its spawning methods

  22. LAMdor (Cont’d) • LAM is working to understand, expand, and fully implement the dynamic scheduling calls in its API • LAM is also considering using Condor’s libraries to support checkpointing of MPI computations

  23. MPI-2 Standard • The MPI-2 standard contains calls to handle dynamic resources • Not yet fully implemented by anyone • When it is, we'll support it
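  The central MPI-2 call for dynamic resources is MPI_Comm_spawn, which starts new processes at run time; a minimal C sketch (the "worker" executable name is a placeholder):

      #include <mpi.h>

      int main(int argc, char **argv)
      {
          MPI_Comm children;

          MPI_Init(&argc, &argv);
          /* ask the runtime for 4 new worker processes while running */
          MPI_Comm_spawn("worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                         0, MPI_COMM_WORLD, &children, MPI_ERRCODES_IGNORE);
          MPI_Finalize();
          return 0;
      }

  A scheduler like Condor could, in principle, service such a call by claiming additional opportunistic resources on demand.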

  24. Other MPI implementations • What are people using? • Do you want to see Condor support any other MPI implementations? • If so, send email to condor@cs.wisc.edu and let us know

  25. Future work • Implementing more advanced dedicated scheduling algorithms • Support for all sorts of MPI implementations (LAM, MPICH-NT, MPI-2, others)

  26. More Future work • Solving problems with MPI on the Grid • "Flocking" MPI jobs to remote pools, or even spanning pools with a single computation • Solving issues of resource ownership on the Grid (i.e. how do you handle multiple dedicated schedulers on the Grid wanting to control a given resource?)

  27. More Future work • Checkpointing entire MPI computations • An "MW" (master-worker) implementation on top of Condor-MPI

  28. More Future work • Support for other kinds of dedicated jobs • Generic dedicated jobs (we just gather and schedule the resources, then call your program, give it the list of machines, and let the program spawn itself) • LINDA

  29. How do I start using MPI with Condor? • MPI support is still alpha, not quite ready for production use • A beta release should be out soon as a contrib module • Check the web site www.cs.wisc.edu/condor

  30. Thanks for Listening! • Questions? • For more information: • http://www.cs.wisc.edu/condor • mailto:condor@cs.wisc.edu
