This study focuses on predicting MPI collective communication execution times on heterogeneous clusters using switched Ethernet. It analyzes the significant increase in execution times for medium-sized messages in many-to-one communications across different platforms and MPI implementations. The work proposes a model for many-to-one communications for medium-sized messages, based on empirical data and existing performance models. Moreover, it suggests a redesign of parallel applications based on the constructed many-to-one model. Supported by Science Foundation Ireland.
A Performance Model of Many-to-One Collective Communications for Parallel Computing
Alexey Lastovetsky, Maureen O’Flynn
UCD School of Computer Science and Informatics, Belfield, Dublin 4, Ireland
alexey.lastovetsky@ucd.ie, maureen.oflynn@ucd.ie
Objectives • Goal: prediction of the execution time of MPI collective communications on a heterogeneous cluster based on a switched Ethernet • Background: performance models of single point-to-point, simultaneous independent point-to-point, one-to-many communications • Observation: a significant increase in the execution time of many-to-one communication for medium-sized messages on all platforms and MPI implementations • Problem: to model many-to-one communications for medium-sized messages
Performance models for point-to-point and one-to-many communications • point-to-point: - execution time - message size - fixed delays - variable delays - transmission rate • one-to-many:
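The slide names the point-to-point parameters, but its formula did not survive extraction. The sketch below assumes a simple linear form in which the fixed delays of both processors are added to a per-byte cost made up of their variable delays and the inverse of the transmission rate. The function and argument names are hypothetical, and this form is an assumption rather than the authors' stated equation.

```python
def point_to_point_time(message_bytes, fixed_src, fixed_dst,
                        var_src, var_dst, rate):
    """Estimate the execution time (seconds) of one point-to-point transfer.

    Assumed linear form (not taken from the slides):
        time = fixed_src + fixed_dst
               + message_bytes * (var_src + var_dst + 1 / rate)

    fixed_src, fixed_dst : fixed processing delays of sender and receiver (s)
    var_src, var_dst     : variable (per-byte) delays of sender and receiver (s/byte)
    rate                 : transmission rate of the link (bytes/s)
    """
    if message_bytes < 0:
        raise ValueError("message_bytes must be non-negative")
    if rate <= 0:
        raise ValueError("rate must be positive")
    per_byte_cost = var_src + var_dst + 1.0 / rate
    return fixed_src + fixed_dst + message_bytes * per_byte_cost


# Illustrative values only; they are not measurements from the study.
example_time = point_to_point_time(
    message_bytes=1000,
    fixed_src=1e-4, fixed_dst=2e-4,
    var_src=1e-8, var_dst=2e-8,
    rate=1e7,
)
```

Under this assumed form, the execution time grows linearly with message size, which is the baseline against which the many-to-one escalations for medium-sized messages stand out.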
Many-to-one collective communications: non-linear and non-deterministic escalations [Figure: execution time in seconds plotted against message size in KB, 0 to 90 KB]
Parameters of many-to-one model for medium-sized messages [Figure: diagram labelled T1, T2, T3 and M1, MC, M2] • message size where escalations begin • message size where escalations stop occurring • message size from which escalations occur with 100% certainty • probabilities of escalations
Probability of escalation • Discrete constant levels of escalation of values of 40, 200 and 250 times • Probability of escalation to level is found
Many-to-one model for small messages
Multi-spectral satellite application • A typical real-time satellite imaging application (512x512 bytes) • A sequence of raw data images divided into partitions for parallel processing by a cluster [Figure: diagram labelled M1, Mc, M2 and n1, n2]
Redesigning Application • Calculate the number of sub-partitions m of a partition of the medium size M so that: • Replace MPI_Gather with sequence of MPI_Gather for smaller messages
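The redesign step can be sketched as follows. The condition on m is missing from the slide, so this sketch assumes the aim is the smallest m for which every sub-partition fits under a small-message threshold. The names `count_subpartitions`, `split_plan` and `threshold_bytes` are hypothetical, and the threshold value is illustrative rather than measured. Each (offset, size) pair in the plan would correspond to one MPI_Gather call on that slice of the buffer.

```python
def count_subpartitions(partition_bytes, threshold_bytes):
    """Smallest m such that m sub-partitions are each <= threshold_bytes.

    Assumption: the elided condition on the slide bounds the sub-partition
    size by a small-message threshold.
    """
    if partition_bytes <= 0 or threshold_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division, avoiding floating-point rounding.
    return -(-partition_bytes // threshold_bytes)


def split_plan(partition_bytes, m):
    """Return m (offset, size) pairs that cover the partition exactly.

    Sizes differ by at most one byte; the first few slices take the remainder.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    base, extra = divmod(partition_bytes, m)
    plan = []
    offset = 0
    for k in range(m):
        size = base + (1 if k < extra else 0)
        plan.append((offset, size))
        offset += size
    return plan


# Illustrative numbers only: a 64 KB partition and a hypothetical 5 KB threshold.
partition = 64 * 1024
threshold = 5 * 1024
m = count_subpartitions(partition, threshold)
plan = split_plan(partition, m)
```

In an MPI program, the single gather of M bytes would become a loop over `plan`, issuing one MPI_Gather per slice so that each message stays in the small-message range.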
Conclusion • Results • previously undocumented non-linear non-deterministic behaviour for medium messages is analysed • many-to-one model is built on the empirical data and point-to-point model • parallel application is redesigned in accordance with many-to-one model The work was supported by Science Foundation Ireland.