Adaptive MPI is designed for highly dynamic parallel applications, addressing issues such as load imbalance and programming complexity. With features like virtual processors and automatic load balancing, it offers a standard MPI programming experience on a wide range of computing platforms. This approach allows adaptive overlapping of computation and communication, improving performance and scalability. Ongoing work includes automatic checkpointing and performance prediction for robustness and optimization.
Adaptive MPI
Chao Huang, Parallel Programming Lab, UIUC
Motivation • Issues • Highly dynamic parallel applications • Limited availability of supercomputing platforms • Load imbalance and programming complexity • The AMPI approach • Little change to standard MPI programs • Virtual processors: the +vp option allows execution on any desired number of virtual processors • Adaptive overlapping of computation and communication • Automatic load balancing Adaptive MPI
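As a sketch of the usage described above (exact compiler and launcher names depend on the AMPI installation; ampicc and charmrun are the names used in the AMPI distribution, and the file and processor counts here are illustrative):

```
# Compile an ordinary MPI program with the AMPI toolchain
ampicc -o jacobi jacobi.c

# Run on 4 physical processors with 16 virtual processors (+vp)
./charmrun ./jacobi +p4 +vp16
```

The same binary can then be rerun with a different +vp value without recompilation.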
Outline • Motivation • Implementation • Features • Ongoing Work • Future Work Adaptive MPI
AMPI: MPI with Virtualization • MPI “processes” implemented as virtual processes (user-level migratable threads) mapped onto real processors • Each virtual process is a user-level thread associated with a message-driven object Adaptive MPI
Virtualization • Basic idea • Virtual MPI processors mapped onto physical processors • Typically, # virtual processors > # physical processors • Advantages • Run the program on any given number of processors • Adaptive overlapping of computation and communication • Mapping strategy helps load balancing Adaptive MPI
Adaptive Overlapping Problem setup: Jacobi 3D, problem size 240³, run on LeMieux with virtualization ratios 1 and 8 (p=8; vp=8 and 64). Adaptive MPI
Speedup Problem setup: Jacobi 3D, problem size 240³, run on LeMieux. Shows AMPI with virtualization ratios of 1 and 8. Adaptive MPI
Virtualization Problem setup: Jacobi 3D, problem size 240³, run on LeMieux. AMPI runs on any given number of PEs (e.g. 19, 33, 105), whereas native MPI needs a cube number of processors. Adaptive MPI
Load Balancing • Dynamic load balancing • Maps and re-maps objects as needed • Re-mapping strategies help adapt to dynamic variations • Load balancing by object migration: MPI_Migrate() • A collective call informing the load balancer that the thread is ready to be migrated, if needed • The load balancer migrates the objects • Packing, transferring, and unpacking (PUP) Adaptive MPI
Load Balancing Example An AMR (adaptive mesh refinement) application; the load balancer is activated at time steps 20, 40, 60, and 80. Adaptive MPI
Asynchronous Communications • Collective communications in MPI are complex and time-consuming • MPI_Alltoall, etc., implemented as blocking calls in MPI • Asynchronous calls enable overlapping computation with communication • Powered by a communication optimization library
MPI_Ialltoall(...);
/* some computation here */
MPI_Waitall(...);
Adaptive MPI
Communication Optimization Alltoall time on 1K processors Adaptive MPI
Communication Optimization Alltoall CPU Overhead on 1K processors Adaptive MPI
Ongoing Work • Automatic checkpointing • Improves robustness of large-scale applications • Performance prediction via direct simulation • Facilitates performance tuning without continuous access to large machines Adaptive MPI
Future Work • Support for visualization • Complete compliance with the MPI-1.1 standard • Support for the MPI-2 standard Adaptive MPI