Processor Co-Allocation in Multicluster Systems

DAS-2 Workshop, Amsterdam, June 6, 2002. Anca Bucur and Dick Epema, Parallel and Distributed Systems Group, Delft University of Technology.


Presentation Transcript


  1. Processor Co-Allocation in Multicluster Systems
     DAS-2 Workshop, Amsterdam, June 6, 2002
     Anca Bucur and Dick Epema
     Parallel and Distributed Systems Group, Delft University of Technology
     D.H.J. Epema/PDS/TUD

  2. Introduction (1)
     • In multicluster systems (like the DAS, in GRIDs), jobs may use co-allocation (i.e., span multiple clusters):
       • to use available capacity
       • to process geographically spread data
     • Single-application performance issues:
       • application restructuring
       • wide-area runtime systems (e.g., optimize collective communication operations)
     • Multiple-application performance issues:
       • design/analyze scheduling policies
       • minimize response time, maximize maximal utilization

  3. Introduction (2): Example
     • In April 2001, the Cactus Computational Toolkit was used for four-hour astrophysics simulations involving Einstein’s General Relativity equations
     • Equipment:
       • At NCSA: 480 CPUs of three SGI Origin2000 systems
       • At SDSC: 1020 CPUs of Blue Horizon
       • OC-12 network (622 Mbit/s)

  4. Introduction (3): Problems
     [Figure: processor usage over time in clusters 1, 2, and 3 (pattern: idle), with a job annotated "fits if unordered" and "fits if flexible".]

  5. System Model
     • Multicluster system consisting of clusters of processors of equal speed
     • Communication speed ratio: the ratio of the wide-area and local message transfer times

  6. Job Components
     • A job consists of job components that each go to a single cluster, one task per processor
     • Distributions of job-component sizes:
       • Uniform: U[a,b]
       • Truncated and adapted geometric (favors small sizes and powers of 2): D(q) on [1,b]
     [Figure: the components of a job and the clusters of the system.]
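
The two job-component-size distributions could be sampled roughly as follows. This is a minimal sketch: the slide does not say how D(q) is "adapted", so the code assumes probabilities proportional to q^i on [1,b], with powers of 2 scaled by a hypothetical factor `pow2_boost`. The function names and the default boost value are illustrative, not taken from the presentation.

```python
import random


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def geometric_weights(q, b, pow2_boost=3.0):
    """Normalized probabilities for sizes 1..b, proportional to q**i.

    Powers of 2 are scaled by pow2_boost; both the scaling scheme and its
    default value are assumptions, not given on the slide.
    """
    raw = [(pow2_boost if is_power_of_two(i) else 1.0) * q ** i
           for i in range(1, b + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def sample_uniform(a, b, rng=random):
    """Draw a component size from U[a,b] (both ends inclusive)."""
    return rng.randint(a, b)


def sample_geometric(q, b, rng=random, pow2_boost=3.0):
    """Draw a component size from the assumed D(q) on [1,b]."""
    sizes = list(range(1, b + 1))
    weights = geometric_weights(q, b, pow2_boost)
    return rng.choices(sizes, weights=weights)[0]
```

With q below 1 the weights decrease with size, and the boost makes sizes such as 4 and 8 more likely than their non-power-of-2 neighbors.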

  7. Job Request Types (1)
     • Ordered and unordered requests specify their job-component sizes
     [Figure: an ordered request and an unordered request.]

  8. Job Request Types (2)
     • Flexible and total requests only specify the total number of processors needed
     [Figure: a flexible request and a total request.]
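
For the two request types whose fit test is straightforward, a sketch might look like this. It assumes that an ordered request places component i on cluster i and that a total request must be served by a single cluster; the transcript does not spell out either definition, so both are labeled as assumptions. `idle` is a hypothetical list with the number of idle processors per cluster.

```python
def fits_ordered(components, idle):
    """Ordered request: component i must fit on cluster i (assumed semantics)."""
    return len(components) <= len(idle) and all(
        size <= free for size, free in zip(components, idle))


def fits_total(total, idle):
    """Total request: all processors come from one cluster (assumed semantics)."""
    return any(total <= free for free in idle)
```

For example, with `idle = [8, 2, 5]` the ordered request `(3, 2, 5)` fits, while `(2, 3, 5)` does not because cluster 1 has only 2 idle processors.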

  9. Fitting a Job (1)
     • It is clear when an ordered or a total request fits
     • For an unordered request:
       • order components according to decreasing sizes
       • use First-Fit (FF) or Worst-Fit (WF)
     [Figure: a job placed with WF on a system with idle and in-use processors.]
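
The unordered placement described above can be sketched as follows. The code assumes that each component goes to a distinct cluster, that FF picks the lowest-numbered cluster that fits, and that WF picks the cluster with the most idle processors; the function name and the return format are illustrative.

```python
def fit_unordered(components, idle, rule="WF"):
    """Place the components of an unordered request.

    Returns {component_index: cluster_index}, or None if the job does not fit.
    Assumes one component per cluster (not stated on the slide).
    """
    free = dict(enumerate(idle))  # clusters still available to this job
    placement = {}
    # Handle the largest components first.
    order = sorted(range(len(components)), key=lambda i: -components[i])
    for i in order:
        candidates = [c for c, n in free.items() if n >= components[i]]
        if not candidates:
            return None
        if rule == "FF":
            chosen = min(candidates)  # first cluster, by index, that fits
        else:
            chosen = max(candidates, key=lambda c: free[c])  # most idle processors
        placement[i] = chosen
        del free[chosen]
    return placement
```

With `idle = [4, 10, 6]` and components `(5, 3)`, both rules put the size-5 component on cluster 1; FF then puts the size-3 component on cluster 0, whereas WF puts it on cluster 2.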

  10. Fitting a Job (2)
      • For a flexible request:
        • determine minimal number of clusters needed
        • fill least-loaded clusters (CF) completely, or balance load (LB) (variation: LB-A)
      [Figure: a job placed with CF and with LB on clusters with idle and in-use processors.]
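
A possible reading of the two flexible placement rules, as a sketch. It assumes that "least loaded" means "most idle processors", that the minimal set of clusters is found by taking clusters in that order until the job fits, and that LB spreads the job so the chosen clusters end up with as equal a number of idle processors as possible. The LB-A variation is not described in the transcript and is not modeled.

```python
def place_flexible(total, idle, rule="CF"):
    """Split a flexible request of `total` processors over clusters.

    Returns {cluster_index: processors_taken}, or None if the job does not fit.
    """
    if total > sum(idle):
        return None
    # Least-loaded clusters first (most idle processors); ties broken by index.
    by_idle = sorted(range(len(idle)), key=lambda c: (-idle[c], c))
    chosen, capacity = [], 0
    for c in by_idle:  # minimal number of clusters that can hold the job
        if capacity >= total:
            break
        chosen.append(c)
        capacity += idle[c]
    if rule == "CF":
        # Fill the chosen clusters completely, one after another.
        taken, remaining = {}, total
        for c in chosen:
            taken[c] = min(idle[c], remaining)
            remaining -= taken[c]
        return taken
    # LB: hand out processors one at a time from the chosen cluster
    # that currently has the most idle processors left.
    left = {c: idle[c] for c in chosen}
    taken = {c: 0 for c in chosen}
    for _ in range(total):
        c = max(chosen, key=lambda k: left[k])
        left[c] -= 1
        taken[c] += 1
    return taken
```

For a job of 12 processors and `idle = [10, 4, 8]`, two clusters suffice; CF takes 10 from cluster 0 and 2 from cluster 2, while LB takes 7 and 5 so that both clusters keep 3 idle processors.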

  11. Scheduling Policies
      • First Come First Served (FCFS)
      • Fit Processors First Served (FPFS): search queue for jobs that fit
      [Figure: a job queue feeding the system.]
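
The difference between the two policies can be shown with a small sketch. To keep it short, it assumes a single pool of idle processors and jobs described only by their total size; in the multicluster setting the `size <= idle` test would be replaced by the fit test for the job's request type. The function name and return format are illustrative.

```python
def schedule(queue, idle, policy="FCFS"):
    """Start jobs from the queue (list of sizes, head of the queue first).

    FCFS stops at the first job that does not fit; FPFS ("Fit Processors
    First Served") keeps searching the queue for jobs that fit.
    Returns (started, waiting, idle_left).
    """
    started, waiting = [], []
    blocked = False
    for size in queue:
        if not blocked and size <= idle:
            started.append(size)
            idle -= size
        else:
            waiting.append(size)
            if policy == "FCFS":
                blocked = True  # jobs behind the head of the queue must wait
    return started, waiting, idle
```

With queue `[2, 6, 1]` and 5 idle processors, FCFS starts only the first job, while FPFS also starts the job of size 1 that sits behind the blocked job of size 6.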

  12. Interarrival/Service Times
      • Poisson arrival process in simulations
      • All tasks in a job have the same service time
      • Service-time distributions used:
        • Deterministic (mean 1)
        • Exponential (mean 1)
        • Hyperexponential (mean 1, coeff. of var. 3)
        • Derived from the DAS
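
The three synthetic service-time distributions could be sampled as sketched below. The slide gives only the mean and coefficient of variation of the hyperexponential distribution; the sketch assumes a two-phase hyperexponential with balanced means, which is one common way to match those two moments and is not necessarily the parameterization used in the presentation.

```python
import math
import random


def hyperexp_params(mean=1.0, cv=3.0):
    """Two-phase hyperexponential with balanced means (an assumed choice).

    Returns ((p1, rate1), (p2, rate2)) matching the given mean and
    coefficient of variation (cv must be at least 1).
    """
    c2 = cv * cv
    p1 = 0.5 * (1.0 + math.sqrt((c2 - 1.0) / (c2 + 1.0)))
    p2 = 1.0 - p1
    return (p1, 2.0 * p1 / mean), (p2, 2.0 * p2 / mean)


def sample_service_time(kind, rng, mean=1.0, cv=3.0):
    """Draw one service time; all tasks of a job would share this value."""
    if kind == "deterministic":
        return mean
    if kind == "exponential":
        return rng.expovariate(1.0 / mean)
    if kind == "hyperexponential":
        (p1, rate1), (_, rate2) = hyperexp_params(mean, cv)
        return rng.expovariate(rate1 if rng.random() < p1 else rate2)
    raise ValueError(f"unknown distribution: {kind}")
```

Because all tasks in a job have the same service time, a simulator would draw one value per job rather than one per task.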

  13. Communication
      • We model jobs without and with communication
      • With communication:
        • tasks alternate between compute and communication phases
        • communication phase: all-to-all personalized communication
        • time for a single local synchronous message send operation: 0.001
        • communication speed ratios considered: 1-100
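
To give a feel for how the communication speed ratio enters, here is a deliberately simplified cost sketch for one all-to-all personalized communication phase. It assumes that every task sends one message to every other task, one after another, that a wide-area send costs the local send time multiplied by the ratio, and that the phase lasts as long as the slowest task needs. Blocking and waiting effects of synchronous sends are ignored, so this is not the model used in the presentation.

```python
LOCAL_SEND = 0.001  # time for a single local send, from the slide


def comm_phase_time(components, ratio):
    """Estimated duration of one all-to-all phase (assumed cost model).

    components: number of tasks of the job in each cluster it uses.
    ratio: communication speed ratio (wide-area time / local time).
    """
    total = sum(components)
    # A task in a component of n tasks sends n - 1 local messages and
    # total - n wide-area messages.
    return max(LOCAL_SEND * ((n - 1) + ratio * (total - n)) for n in components)
```

Under these assumptions a job of 8 tasks split as `(4, 4)` with ratio 10 needs 0.043 per phase, against 0.007 when all 8 tasks run in one cluster.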

  14. Single-cluster DAS Statistics
      [Figure: two histograms of the number of jobs. Nodes requested: mean 23.34, coeff. of var. 1.11. Service time: mean 356.45 (62.66), coeff. of var. 5.37.]

  15. Performance Evaluation
      • Parameters we vary:
        • job request structure
        • job-component-size distribution
        • service-time distribution
        • number and sizes of clusters (base case: 4x32)
        • placement of unordered and flexible jobs
        • scheduling policy
        • communication speed ratio
        • co-allocation versus no co-allocation
        • queueing structure (global/local)
      • Performance metrics:
        • mean response time (only simulation)
        • maximal utilization (analysis and simulation)

  16. Influence of Structure and Size
      [Figure: response time versus utilization for ordered, unordered, and total requests.]

  17. Influence of Communication Speed Ratio
      [Figure: response time versus utilization for communication speed ratios 10 and 100. Curves from right to left: total, flexible, unordered, ordered.]

  18. Co-Allocation versus no Co-Alloc. (1)
      • no communication
      • unordered jobs
      • job size: 4xD(0.9) on [1,8] (fits on a single cluster)
      [Figure: response time versus utilization for flexible requests and for jobs with 1, 2, and 4 components.]

  19. Co-Allocation versus no Co-Alloc. (2)
      • communication
      • flexible jobs
      • job size: 4xD(0.9) on [1,8]
      [Figure: response time versus utilization for LB-A with ratio 5, LB-A with ratio 50, and no co-allocation with FF.]

  20. An Application on the DAS (1)
      • Solves the Poisson equation with a red-black Gauss-Seidel scheme
      • Measurements on the DAS (times in ms):
      • Time for diffusing local errors and computing the global error: 14 ms
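
For readers unfamiliar with the scheme, a sequential red-black Gauss-Seidel sweep for the 2D Poisson equation might look as follows. This is a textbook sketch, not the parallel DAS application: it assumes a square grid with fixed boundary values, the discretization of -Δu = f with grid spacing `h`, and it uses the largest update in a sweep as a stand-in for the error that the application computes globally.

```python
def red_black_sweep(u, f, h):
    """One red-black Gauss-Seidel sweep on a square grid (lists of lists).

    Updates interior points of u in place, first those with (i + j) even,
    then those with (i + j) odd. Returns the largest change made.
    """
    n = len(u)
    max_change = 0.0
    for color in (0, 1):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if (i + j) % 2 != color:
                    continue
                new = 0.25 * (u[i - 1][j] + u[i + 1][j]
                              + u[i][j - 1] + u[i][j + 1]
                              + h * h * f[i][j])
                max_change = max(max_change, abs(new - u[i][j]))
                u[i][j] = new
    return max_change


def solve(u, f, h, tol=1e-10, max_sweeps=10000):
    """Repeat sweeps until the largest change drops below tol."""
    for sweep in range(1, max_sweeps + 1):
        if red_black_sweep(u, f, h) < tol:
            return sweep
    return max_sweeps
```

Points of one color depend only on points of the other color, so each half-sweep can be done in parallel; in a multicluster run, only the grid rows on the border between clusters and the error value would need to cross the wide-area links.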

  21. An Application on the DAS (2)
      [Figure: response time versus utilization for total and ordered requests, with an equal mix of jobs of sizes (2,2,2,2) and (4,4,4,4).]

  22. Maximal Utilization (1)
      • Assume: constant backlog, ordered jobs, exponential service (no communication)
      • Consider: the joint probability distribution of the sizes of jobs in the system
      • Result: this distribution is the same
        • when the system runs for a long time
        • when the system is filled from the empty state
      • Use the convolution of the job-size distribution to determine the distribution of the numbers of jobs in the system
      • Compute the maximal utilization
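
The maximal utilization under these assumptions can also be estimated by simulation. The sketch below is a simulation estimate, not the convolution-based analysis the slide describes. It assumes FCFS, ordered jobs with one component per cluster, uniform component sizes on [a,b], and exponential service with mean 1; all parameter names and defaults other than the 4x32 base case are illustrative.

```python
import random


def max_utilization(clusters=4, size=32, a=1, b=16, departures=20000, seed=0):
    """Estimate maximal utilization under constant backlog (simulation sketch)."""
    assert 1 <= a <= b <= size
    rng = random.Random(seed)
    idle = [size] * clusters
    running = []  # component tuples of the jobs in service

    def new_job():
        return tuple(rng.randint(a, b) for _ in range(clusters))

    head = new_job()  # constant backlog: there is always a job waiting
    now = busy_area = 0.0
    for _ in range(departures):
        # FCFS: start jobs from the head of the queue while they fit.
        while all(c <= free for c, free in zip(head, idle)):
            idle = [free - c for c, free in zip(head, idle)]
            running.append(head)
            head = new_job()
        # Exponential service with mean 1 is memoryless: the next departure
        # comes after Exp(number of running jobs) and is a uniformly chosen job.
        dt = rng.expovariate(len(running))
        busy_area += dt * (clusters * size - sum(idle))
        now += dt
        done = running.pop(rng.randrange(len(running)))
        idle = [free + c for c, free in zip(done, idle)]
    return busy_area / (now * clusters * size)
```

A quick sanity check: if every component has size 17 on clusters of size 32, only one job can run at a time, and the estimate is 17/32.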

  23. Maximal Utilization (2)
      • We have an approximation for the maximal utilization for unordered jobs with WF
      • We use simulations to validate this approximation
      • Capacity loss (1-max. util.) for 4 clusters of size 32, uniform job-component sizes:
