Coscheduling in Clusters: Is it a Viable Alternative?

Coscheduling in Clusters: Is it a Viable Alternative? Gyu Sang Choi, Jin-Ha Kim, Deniz Ersoz, Andy B. Yoo, Chita R. Das. Presented by: Richard Huang. Outline: Evaluation of scheduling alternatives; Proposed HYBRID Coscheduling; Evaluation; Conclusions; Discussion.

Presentation Transcript


  1. Coscheduling in Clusters: Is it a Viable Alternative? Gyu Sang Choi, Jin-Ha Kim, Deniz Ersoz, Andy B. Yoo, Chita R. Das Presented by: Richard Huang

  2. Outline
     • Evaluation of scheduling alternatives
     • Proposed HYBRID Coscheduling
     • Evaluation
     • Conclusions
     • Discussion

  3. Evaluation of Scheduling Alternatives
     • Local Scheduling
        • Processes of a parallel job are scheduled independently
     • Batch Scheduling
        • Most popular (Maui, PBS, etc.)
        • Avoids memory swapping, but has low utilization and high completion time
     • Gang Scheduling
        • All processes of a job (a gang) are scheduled together for simultaneous execution
        • Faster completion time, but incurs global synchronization costs

  4. Communication-Driven Coscheduling
     • Dynamic Coscheduling (DCS)
        • Uses an incoming message to schedule the process for which the message is destined
     • Spin Block (SB)
        • A process waiting for a message spins for a fixed amount of time before blocking itself
     • Periodic Boost (PB)
        • Periodically boosts the priority of a process with unconsumed messages
     • Co-ordinated Coscheduling (CC)
        • Optimizes spinning time to improve performance at both sender and receiver
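The Spin Block rule on this slide can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `spin_block_wait`, the use of `threading.Event` as a stand-in for message arrival, and the returned labels are all assumptions made for illustration.

```python
import threading
import time


def spin_block_wait(message_ready: threading.Event, spin_seconds: float) -> str:
    """Wait for a message: spin for at most spin_seconds, then block.

    Returns "spin" if the message arrived during the spin phase and
    "block" if the caller had to block before it arrived.
    (Hypothetical helper; the Event stands in for a real message queue.)
    """
    deadline = time.monotonic() + spin_seconds
    # Spin phase: keep polling without voluntarily giving up the CPU.
    while time.monotonic() < deadline:
        if message_ready.is_set():
            return "spin"
    # Block phase: sleep until the sender signals that the message arrived.
    message_ready.wait()
    return "block"
```

A short spin suits messages that arrive quickly; a longer wait falls through to blocking, which frees the CPU for other ready processes.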

  5. HYBRID Coscheduling
     • Idea:
        • Combines the merits of gang scheduling and communication-driven coscheduling
        • Coschedule ALL processes, as a gang scheduler does
        • Boost process priority during the communication phase
     • Issues:
        • How to differentiate between computation and communication phases?
        • How to ensure fairness during boosting?

  6. HYBRID Coscheduling
     • Boost priority whenever a parallel process enters a collective communication phase
     • Immediate blocking is used at both sender and receiver
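The two HYBRID rules (boost on entering a collective communication phase, block immediately instead of spinning) can be sketched as a toy model. Everything named here is hypothetical: `Proc`, the priority constants, and the `pick_next` scheduler are illustrative assumptions, not the paper's code, and the sketch does not address the fairness issue raised on the previous slide.

```python
from dataclasses import dataclass
from typing import List, Optional

BASE_PRIORITY = 0      # assumed value, not taken from the paper
BOOSTED_PRIORITY = 10  # assumed value, not taken from the paper


@dataclass
class Proc:
    name: str
    priority: int = BASE_PRIORITY
    blocked: bool = False


def enter_collective(p: Proc, message_arrived: bool) -> None:
    """Boost at the start of a collective; block at once if data is missing."""
    p.priority = BOOSTED_PRIORITY
    p.blocked = not message_arrived  # immediate blocking, no spin phase


def on_message(p: Proc) -> None:
    """Message arrival makes the process runnable again; the boost remains."""
    p.blocked = False


def leave_collective(p: Proc) -> None:
    """Back in a computation phase: return to the base priority."""
    p.priority = BASE_PRIORITY


def pick_next(ready: List[Proc]) -> Optional[Proc]:
    """Toy local scheduler: the highest-priority runnable process runs next."""
    runnable = [p for p in ready if not p.blocked]
    return max(runnable, key=lambda p: p.priority, default=None)
```

In this model a blocked parallel process yields the CPU to any other ready process, and because it is still boosted when its message arrives, it is picked ahead of base-priority processes.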

  7. Traditional and Generic Coscheduling Framework

  8. Evaluation
     • 16-node Linux cluster connected through a 16-port Myrinet switch
     • 100 mixed applications from NAS
     • Two job allocation schemes
        • PACKING: contiguous nodes are assigned to a job to reduce system fragmentation and increase system utilization
        • NO PACKING: the parallel processes of a job are randomly allocated to available nodes in the system
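The difference between the two allocation schemes can be shown with a small sketch. It assumes nodes are numbered 0..N-1, that "contiguous" means consecutive node numbers, and that a job requests k >= 1 nodes; the function names and the first-fit choice for PACKING are illustrative assumptions, not details from the paper.

```python
import random
from typing import List, Optional


def allocate_packing(free: List[bool], k: int) -> Optional[List[int]]:
    """PACKING sketch: return the first run of k contiguous free nodes, or None."""
    run_start = 0
    run_len = 0
    for i, is_free in enumerate(free):
        if is_free:
            if run_len == 0:
                run_start = i  # a new run of free nodes begins here
            run_len += 1
            if run_len == k:
                return list(range(run_start, run_start + k))
        else:
            run_len = 0  # a busy node breaks the run
    return None


def allocate_no_packing(
    free: List[bool], k: int, rng: random.Random
) -> Optional[List[int]]:
    """NO PACKING sketch: pick k free nodes at random, or None if too few."""
    available = [i for i, is_free in enumerate(free) if is_free]
    if len(available) < k:
        return None
    return sorted(rng.sample(available, k))
```

With the free map `[True, False, True]` and k = 2, the PACKING sketch finds no contiguous pair and returns `None`, while the NO PACKING sketch returns nodes 0 and 2. This shows how the contiguity requirement can make a job wait even when enough nodes are free.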

  9. Performance Comparison

  10. Observations
     • Average performance gain for PACKING is about 20% compared to NO PACKING
     • Under high load, the large differences are due to waiting times
     • Under light load, the difference in execution time is more pronounced
     • The batch scheduler has the lowest execution time, followed by HYBRID
     • HYBRID has the lowest completion time among all scheduling schemes

  11. Explanations
     • HYBRID avoids unnecessary spinning
        • A process is blocked immediately if its communication operation is not complete
     • HYBRID reduces communication delay
        • A process wakes up immediately upon receipt of a message (since its priority is boosted)
     • HYBRID avoids interrupt overheads
        • CC, DCS, and PB need frequent interrupts from the NIC to the CPU to boost a process’s priority
        • HYBRID boosts priority only at the beginning of an MPI collective communication
     • HYBRID avoids the global synchronization overhead of gang scheduling
        • HYBRID follows implicit coscheduling

  12. Other Results
     • Completing jobs faster can lead to energy savings through dynamic voltage scaling or shutting down machines
     • Communication-driven coscheduling should deploy a memory-aware allocator to avoid expensive disk activity

  13. Conclusions
     • Coscheduling mechanisms such as HYBRID, SB, or CC can give significant performance improvements
     • Block-based scheduling techniques had better results because other processes in the ready state can proceed
     • The HYBRID scheme is the best performer and can be implemented easily on any platform, with modification only in the message-passing layer
     • New techniques deployed on a cluster should avoid expensive memory swapping
     • Improved efficiency in the scheduling algorithm can translate to a better performance-energy ratio

  14. Discussion
     • Can it be true that blocking is always better than spinning?
     • How likely is a move away from batch scheduling in clusters and supercomputers?
     • Do people try to save energy by improving scheduling algorithms?
