Coscheduling in Clusters: Is it a Viable Alternative? Gyu Sang Choi, Jin-Ha Kim, Deniz Ersoz, Andy B. Yoo, Chita R. Das Presented by: Richard Huang
Outline
• Evaluation of scheduling alternatives
• Proposed HYBRID Coscheduling
• Evaluation
• Conclusions
• Discussion
Evaluation of Scheduling Alternatives
• Local Scheduling
  • Processes of a parallel job are scheduled independently on each node
• Batch Scheduling
  • Most popular (Maui, PBS, etc.)
  • Avoids memory swapping, but gives low utilization and high completion time
• Gang Scheduling
  • All processes of a job (a gang) are scheduled together for simultaneous execution
  • Faster completion time, but incurs global synchronization costs
Communication-Driven Coscheduling
• Dynamic Coscheduling (DCS)
  • Uses an incoming message to schedule the process for which the message is destined
• Spin Block (SB)
  • A process waiting for a message spins for a fixed amount of time before blocking itself
• Periodic Boost (PB)
  • Periodically boosts the priority of a process with unconsumed messages
• Co-ordinated Coscheduling (CC)
  • Optimizes spinning time to improve performance at both sender and receiver
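The Spin Block idea above can be sketched in a few lines. This is only an illustration in user-level Python, not the implementation evaluated in the paper: the function name `spin_block_wait`, the use of `threading.Event` to stand in for "a message has arrived", and the `spin_seconds` parameter are all assumptions made for the example.

```python
import threading
import time


def spin_block_wait(message_ready: threading.Event, spin_seconds: float) -> str:
    """Wait for a message: spin for a bounded time, then block.

    Returns "spin" if the message arrived during the spin phase,
    "block" if the caller had to block. All names are hypothetical.
    """
    deadline = time.monotonic() + spin_seconds
    # Spin phase: poll without giving up the CPU voluntarily.
    while time.monotonic() < deadline:
        if message_ready.is_set():
            return "spin"
    # Block phase: sleep until the sender signals arrival,
    # so other ready processes can use the CPU meanwhile.
    message_ready.wait()
    return "block"
```

A short spin pays off when the sender is already coscheduled and the reply is imminent; otherwise the spin time is wasted and blocking lets other ready processes run. Setting `spin_seconds` to 0 gives the immediate-blocking behaviour.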
HYBRID Coscheduling
• Idea:
  • Combines merits of both gang scheduling and communication-driven coscheduling
  • Coschedule ALL processes like gang scheduler
  • Boost process priority during communication phase
• Issues:
  • How to differentiate between computation and communication phases?
  • How to ensure fairness during boosting?
HYBRID Coscheduling
• Boost priority whenever a parallel process enters a collective communication phase
• Immediate blocking used at sender and receiver
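A toy model of the two rules on this slide might look like the following. It is a simulation sketch under stated assumptions, not the authors' code: `SimProcess`, `collective_phase`, `pick_next`, and the `BOOST` value are hypothetical names and numbers, and a real system would change the operating system priority from inside the message-passing layer instead.

```python
from contextlib import contextmanager
from dataclasses import dataclass

BOOST = 10  # hypothetical boost amount, chosen only for illustration


@dataclass
class SimProcess:
    name: str
    priority: int = 0
    blocked: bool = False


@contextmanager
def collective_phase(proc: SimProcess):
    """Boost priority on entry to a collective operation, restore on exit."""
    base = proc.priority
    proc.priority = base + BOOST
    try:
        yield proc
    finally:
        # Restoring the base priority keeps the boost limited to the
        # communication phase, so compute phases compete normally.
        proc.priority = base


def pick_next(procs):
    """Pick the highest-priority process that is not blocked, or None."""
    ready = [p for p in procs if not p.blocked]
    return max(ready, key=lambda p: p.priority) if ready else None
```

In this model a boosted process wins the CPU as soon as it is ready, while a process that blocks immediately on an incomplete operation hands the CPU to whichever other process is ready.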
Evaluation
• 16-node Linux cluster connected through a 16-port Myrinet switch
• 100 mixed applications from NAS
• Two job allocation schemes
  • PACKING: contiguous nodes assigned to a job to reduce system fragmentation and increase system utilization
  • NO PACKING: parallel processes of a job randomly allocated to available nodes in the system
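The difference between the two allocation schemes can be illustrated with a small sketch. The function names, the first-fit choice for PACKING, and the representation of nodes as integer ids are assumptions made for this example; the slides do not say how the allocator in the study picks among candidate node ranges.

```python
import random


def allocate_packing(free, size):
    """Return the first run of `size` contiguous free node ids, or None.

    Assumes size >= 1 and that node ids are integers.
    """
    run = []
    for node in sorted(free):
        if run and node == run[-1] + 1:
            run.append(node)   # extends the current contiguous run
        else:
            run = [node]       # gap found: start a new run
        if len(run) == size:
            return run
    return None


def allocate_no_packing(free, size, rng):
    """Return `size` randomly chosen free node ids, or None if too few."""
    if size > len(free):
        return None
    return sorted(rng.sample(sorted(free), size))
```

For example, with free nodes {0, 1, 4, 5, 6, 7} a 3-process job gets nodes [4, 5, 6] under the packing sketch, whereas the random variant may scatter it across both free regions.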
Observations
• Average performance gain for PACKING is about 20% compared to NO PACKING
• Under high load, big differences are due to waiting times
• Under light load, the difference in execution time is more pronounced
• Batch scheduler has the lowest execution time, followed by HYBRID
• HYBRID has the lowest completion time among all scheduling schemes
Explanations
• HYBRID avoids unnecessary spinning
  • A process is blocked immediately if the communication operation is not complete
• HYBRID reduces communication delay
  • A process wakes up immediately upon receipt of a message (since its priority is boosted)
• HYBRID avoids interrupt overheads
  • CC, DCS, and PB need frequent interrupts from NIC to CPU to boost a process's priority
  • HYBRID boosts priority only at the beginning of an MPI collective communication
• HYBRID avoids the global synchronization overhead of gang scheduling
  • HYBRID follows implicit coscheduling
Other Results
• Completing jobs faster can lead to energy savings by using dynamic voltage scaling or shutting down machines
• Communication-driven coscheduling should deploy a memory-aware allocator to avoid expensive disk activity
Conclusions
• Coscheduling mechanisms like HYBRID, SB, or CC can give significant performance improvement
• Block-based scheduling techniques had better results because other processes in the ready state can proceed
• HYBRID is the best performer and can be easily implemented on any platform with modifications only to the message-passing layer
• New techniques deployed on clusters should avoid expensive memory swapping
• Improved efficiency in the scheduling algorithm can translate to a better performance-energy ratio
Discussion
• Can it be true that blocking is always better than spinning?
• How likely is a move away from batch scheduling in clusters and supercomputers?
• Do people try to save energy by improving scheduling algorithms?