
AstroBEAR Parallelization Options



Presentation Transcript


  1. AstroBEAR Parallelization Options

  2. Areas With Room For Improvement
     • Ghost Zone Resolution
     • MPI Load-Balancing
     • Re-Gridding Algorithm
     • Upgrading MPI Library

  3. Ghost Zone Resolution
     • Can exceed 30% of total program execution time.
     • Affects fixed grid as well as AMR.
     • For runs using >2 processors, 98-99% of ghost zone execution time is MPI processing.
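
The two figures on this slide can be combined into a rough estimate of what is at stake. The sketch below is only back-of-the-envelope arithmetic: it assumes ghost zone resolution takes exactly 30% of the run time (the slide says it can exceed that) and that 98% of that time is MPI processing. The function names are hypothetical.

```python
def mpi_ghost_fraction(ghost_fraction, mpi_share):
    """Fraction of total run time spent in MPI during ghost zone resolution."""
    return ghost_fraction * mpi_share


def speedup_ceiling(removed_fraction):
    """Best possible speedup if `removed_fraction` of the run time vanished entirely."""
    return 1.0 / (1.0 - removed_fraction)


# Assumed inputs taken from the slide: 30% of run time, 98% of that in MPI.
frac = mpi_ghost_fraction(0.30, 0.98)
print(f"MPI share of total run time: {frac:.1%}")
print(f"Speedup ceiling if it were free: {speedup_ceiling(frac):.2f}x")
```

Under these assumptions, MPI work inside ghost zone resolution accounts for a little under 30% of the run, so even eliminating it completely would speed the code up by less than a factor of 1.5.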

  4. Ghost Zone Resolution Options
     • Duplex Transmission
       • Old version swaps ghost zone data serially between two processors.
       • Duplex transmission would have the two processors handle sending, receiving and copying concurrently.
     Pros:
       • Reduces the amount of duplicated overhead.
       • Makes more efficient use of worker processors.
     Cons:
       • Little reduction in the amount of MPI overhead.
       • Still has a high computation cost relative to the number of nodes.
     Status: In progress
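
The difference between the serial swap and the duplex swap can be sketched without MPI at all. The example below is an illustration only, not AstroBEAR's implementation: two Python threads stand in for the two processors, queues stand in for the message channel, and every name is hypothetical. In real MPI code the duplex pattern is usually written with `MPI_Sendrecv` or with paired `MPI_Isend`/`MPI_Irecv` calls.

```python
import queue
import threading


def serial_swap(ghost_a, ghost_b):
    """Old scheme: the A -> B transfer finishes before B -> A starts."""
    recv_b = list(ghost_a)  # step 1: A sends, B receives and copies
    recv_a = list(ghost_b)  # step 2: B sends, A receives and copies
    return recv_a, recv_b


def duplex_swap(ghost_a, ghost_b):
    """Duplex scheme: both sides send and receive concurrently."""
    inbox = {"A": queue.Queue(), "B": queue.Queue()}
    received = {}

    def worker(me, peer, my_ghosts):
        inbox[peer].put(list(my_ghosts))  # post the send without waiting on the peer
        received[me] = inbox[me].get()    # then block until the peer's data arrives

    threads = [
        threading.Thread(target=worker, args=("A", "B", ghost_a)),
        threading.Thread(target=worker, args=("B", "A", ghost_b)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received["A"], received["B"]


if __name__ == "__main__":
    print(duplex_swap([1.0, 2.0], [3.0, 4.0]))
```

Both functions deliver the same data; the duplex version simply lets the two transfers overlap. That matches the slide's assessment: idle time is removed, but the number of messages is not.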

  5. Alternate option: Ghost Zone broadcast
     • Use the MPI Broadcast routines to have a grid send all its ghost zones to its neighbors at once; the neighbors then process that data and broadcast their own ghost zones when it is their turn.
     Pros:
       • Eliminates need for pairwise iteration over level (i.e., transfer would only be done once per grid).
     Cons:
       • Potential congestion if all of a grid’s neighbors are on the same processor.
       • No guarantee that it’s an improvement over pairwise duplex transmission.
     Status: Speculative
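
The trade-off on this slide can be made concrete with a toy count. The sketch below assumes a hypothetical level of four grids that all touch one another and a hypothetical grid-to-processor assignment; it counts operations only and does not model MPI's actual broadcast semantics.

```python
from collections import Counter


def pairwise_exchanges(neighbors):
    """Number of distinct neighbor pairs that a pairwise scheme must visit."""
    pairs = {frozenset((g, n)) for g, ns in neighbors.items() for n in ns}
    return len(pairs)


def broadcast_sends(neighbors):
    """Number of send operations if each grid with neighbors broadcasts once."""
    return sum(1 for ns in neighbors.values() if ns)


def inbound_from_broadcast(grid, neighbors, owner):
    """How many ghost zone messages each processor receives from one grid's broadcast."""
    return Counter(owner[n] for n in neighbors[grid])


# Hypothetical level: a 2x2 arrangement where every grid touches the other three.
neighbors = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
# Hypothetical distribution: grid 0 on processor 0, all of its neighbors on processor 1.
owner = {0: 0, 1: 1, 2: 1, 3: 1}

print(pairwise_exchanges(neighbors), broadcast_sends(neighbors))
print(inbound_from_broadcast(0, neighbors, owner))
```

In this toy case the broadcast scheme needs fewer send operations than there are neighbor pairs, but all of grid 0's ghost data lands on a single processor, which is the congestion risk listed under Cons.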

  6. Load Balancing
     • Does it need to be done as often?
     • Ramses code only rebalances every ten frames.
     • Re-gridding happens locally as usual, but it is assumed that the AMR structure does not change enough between two iterations to warrant a load-rebalance.
     Pros:
       • Significant reduction in MPI overhead (BalanceLoads() gets called a lot).
       • Non-MPI overhead will likely be reduced as well, as the current load-balancing scheme recalculates the load across the entire Forest.
     Cons:
       • “Patch-based AMR” vs. “tree-based AMR”: can it be adapted to AstroBEAR?
       • Requires implementation of some Hilbert space-filling-curve algorithm. How complex or computationally intensive is it?
     Status: Speculative
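
To give a feel for the complexity question raised under Cons, here is a minimal 2-D sketch of load balancing along a Hilbert space-filling curve. It assumes a uniform n-by-n grid of cells with n a power of two, which is far simpler than either a patch-based or a tree-based AMR hierarchy; the function names and the splitting rule are hypothetical and are not taken from Ramses or AstroBEAR.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve on an n x n grid (n a power of 2)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the next level is traversed consistently.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def partition_by_curve(loads, n, nprocs):
    """Assign cells to processors as contiguous, roughly equal-load runs of the curve.

    loads maps (x, y) -> estimated cost; the total cost must be positive.
    """
    ordered = sorted(loads, key=lambda c: hilbert_index(n, c[0], c[1]))
    total = float(sum(loads.values()))
    owner = {}
    running = 0.0
    for cell in ordered:
        # Use the midpoint of this cell's slice of the cumulative load to pick its owner.
        mid = running + loads[cell] / 2.0
        owner[cell] = min(nprocs - 1, int(mid * nprocs / total))
        running += loads[cell]
    return owner


if __name__ == "__main__":
    n = 4
    loads = {(x, y): 1.0 for x in range(n) for y in range(n)}
    print(partition_by_curve(loads, n, nprocs=4))
```

The index computation costs one pass over the bits of the coordinates per cell, and the partition is a sort plus a single sweep. Because cells that are adjacent on the curve are adjacent in space, each processor receives a compact region, which keeps the ghost zone surface small.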

  7. Re-Gridding Parallelization
     • Parallelization of re-gridding is handled using MPI and OpenMP.
     • Problem: MPI-1 limits thread usage.
       • Only one thread for the worker processors and two for the master processor.
       • Only one thread on each processor is MPI-capable.
     • Performance bottlenecks happen if one processor gets tied up.

  8. Advantage of Multiple Threads
     [Figure: two panels, "MPI with OpenMP, single thread" and "MPI with OpenMP, multi-thread"; labels 0, 1, 2, 3 in each]

  9. Unfortunately...
     • LAM MPI is not thread-safe.
       • You can write multi-threaded applications using LAM MPI, but it is explicitly not thread-safe and so we would be responsible for maintaining MPI exclusion.
       • In a collaborative development environment like AstroBEAR, this is a bad idea.
       • LAM is making noise about supporting this eventually, but they're not there yet.
     • Alternatives:
       • Improve efficiency of pairwise message passing.
       • Offload more re-gridding computation to worker processors.
     Status: We're looking at it.
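
"Maintaining MPI exclusion" means that the application itself must guarantee that no two threads are ever inside the library at the same time. A minimal sketch of that discipline follows; it uses a hypothetical stand-in class rather than any real MPI binding, and a single lock wrapped around every call.

```python
import threading
import time


class FakeNonThreadSafeComm:
    """Hypothetical stand-in for a library that breaks if entered concurrently."""

    def __init__(self):
        self.busy = False
        self.overlaps = 0  # times a call began while another was still in progress
        self.calls = 0

    def send(self, payload):
        if self.busy:
            self.overlaps += 1
        self.busy = True
        time.sleep(0.0005)  # pretend to do communication work
        self.calls += 1
        self.busy = False


class ExclusiveComm:
    """Wrapper that serializes every call to the underlying library with one lock."""

    def __init__(self, comm):
        self._comm = comm
        self._lock = threading.Lock()

    def send(self, payload):
        with self._lock:
            self._comm.send(payload)


def run_threads(comm, n_threads=4, sends_each=5):
    """Have several threads issue sends through the same communicator object."""
    def work(tid):
        for i in range(sends_each):
            comm.send((tid, i))

    threads = [threading.Thread(target=work, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


raw = FakeNonThreadSafeComm()
run_threads(ExclusiveComm(raw))
print(raw.calls, raw.overlaps)
```

The wrapper only works if every call site goes through it. One contributor calling the library directly is enough to reintroduce the race, which is why the slide calls this approach a bad idea for a collaboratively developed code.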
