
Brad Whitlock October 14, 2009

Learn about the process of porting VisIt to the Blue Gene/P platform, improvements made, impact on performance, and future work planned for optimization.



Presentation Transcript


  1. Porting VisIt to BG/P
     Brad Whitlock, October 14, 2009
     www.vacet.org

  2. Overview
     • Objectives
     • Building 3rd party libraries
     • Building VisIt
     • Running VisIt on BG/P
     • Improvements
     • Impact
     • Future work

  3. Objectives
     • Port VisIt to IBM’s BlueGene/P platform so VisIt can run on LLNL’s Dawn and eventually Sequoia
     • Dawn is a 500-teraflop, 36,864-node, 147,456-CPU IBM BG/P system
     • Four 850 MHz PowerPC cores per node, 4 GB memory per node
     • Compute nodes run the CNK OS
     • Cross-compile code for CNK
     • Identify weaknesses in VisIt that prevent it from scaling to tens or hundreds of thousands of processors
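The Dawn figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only the node count, core count, clock rate, and memory size quoted above. The one value not taken from the slide is `FLOPS_PER_CYCLE`, an assumption of 4 floating-point operations per core per cycle (dual FPU with fused multiply-add).

```python
# Figures quoted on the slide.
NODES = 36_864
CORES_PER_NODE = 4
CLOCK_HZ = 850e6
MEMORY_PER_NODE_GB = 4

# Assumption (not on the slide): each core can retire 4 floating-point
# operations per cycle (dual FPU, fused multiply-add).
FLOPS_PER_CYCLE = 4


def total_cores(nodes=NODES, cores_per_node=CORES_PER_NODE):
    """Total number of CPU cores in the machine."""
    return nodes * cores_per_node


def peak_teraflops(nodes=NODES, cores_per_node=CORES_PER_NODE,
                   clock_hz=CLOCK_HZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in teraflops under the stated assumption."""
    cores = total_cores(nodes, cores_per_node)
    return cores * clock_hz * flops_per_cycle / 1e12


def memory_per_core_gb(memory_per_node_gb=MEMORY_PER_NODE_GB,
                       cores_per_node=CORES_PER_NODE):
    """Memory share per core if a node's memory is split evenly."""
    return memory_per_node_gb / cores_per_node


if __name__ == "__main__":
    print(total_cores())         # core count
    print(peak_teraflops())      # peak, in teraflops
    print(memory_per_core_gb())  # GB per core before OS overhead
```

Under that assumption the core count matches the 147,456 CPUs on the slide and the peak lands near the quoted 500 teraflops. The even memory split gives 1 GB per core before any system overhead.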

  4. Building 3rd party libraries
     • Built all libraries on login nodes for the regular Linux PowerPC version of VisIt
     • Ran into runtime problems using the xlC compiler, so reverted to g++ for the time being
     • Cross-compiled all libraries for CNK
     • No support for this platform in VisIt’s 3rd party libraries, so special builds were required
     • Mesa built unmangled and without X11
     • VTK tricky to build
     • No OpenGL, so VTK built with Mesa as its OpenGL
     • No X11, so created a custom render window
     • Used a CMake toolchain file

  5. Building VisIt
     • No X11, so graphical components can’t be built for CNK (don’t build the GUI)
     • Added a new --enable-engine-only build mode to VisIt’s build system that builds only the compute engine and its plugins
     • VisIt always used to require mangled Mesa
     • This support had to become conditional on VTK having mangled Mesa support

  6. Running VisIt on Dawn
     • Dawn uses mpirun to start VisIt on compute nodes
     • Minor differences required environment variables to be exported via the mpirun command, which could be handled via a host profile in VisIt
     • VisIt ran at 1k, 2k, 4k, 8k, and 16k nodes
     • VisIt ran with 1 and 4 trillion zone datasets (June 2009)
     • Encountered scaling problems early
     • Launch time was slow because each processor was reading the plugin directory to obtain plugin information
     • VisIt commands were sent from rank 0 to other ranks 1 KB at a time until the whole message had been sent
     • The non-spinning bcast substitute used for sending commands relied on point-to-point messages that performed poorly at scale
     • Certain metadata consumed too much memory (each processor has only ~700 MB)
     • Synchronization step for SR mode used slow point-to-point communication
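To get a feel for why sending commands 1 KB at a time hurts at these processor counts, the following counting sketch may help. It is not VisIt or MPI code. It assumes a simple linear fan-out in which rank 0 sends every chunk directly to each other rank, which is an illustrative guess at how a point-to-point broadcast substitute might behave, and the function names are hypothetical.

```python
import math

CHUNK_BYTES = 1024  # the 1 KB chunk size quoted on the slide


def chunks_needed(command_bytes, chunk_bytes=CHUNK_BYTES):
    """Number of fixed-size chunks needed to carry one command."""
    return max(1, math.ceil(command_bytes / chunk_bytes))


def rank0_sends(num_ranks, command_bytes, chunk_bytes=CHUNK_BYTES):
    """Point-to-point sends rank 0 issues to deliver one command.

    Assumption: linear fan-out, so rank 0 sends every chunk to each of
    the other num_ranks - 1 ranks.
    """
    return (num_ranks - 1) * chunks_needed(command_bytes, chunk_bytes)


if __name__ == "__main__":
    command = 10 * 1024  # a hypothetical 10 KB command
    for ranks in (1024, 4096, 16384):
        print(ranks, rank0_sends(ranks, command))
```

In this model the send count grows with both the rank count and the command size, so a command that is cheap at 1k processors becomes noticeably more expensive at 16k.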

  7. Improvements
     • Broadcast plugin information from rank 0 to other ranks, improving plugin loading time 9x
     • Broadcast VisIt commands from rank 0 in a single chunk instead of 1 KB at a time
     • Use the standard bcast in the engine main loop instead of the poorly performing non-spinning substitute, which was geared towards shared nodes
     • Switched to an alternate metadata representation to free up most available memory for calculations
     • Mark Miller was able to replace the SR mode synchronization step with a much faster version that reduced the time from 20 minutes to 2 seconds (a factor of 600)
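The first improvement on this slide, having rank 0 gather plugin information once and share it, can be illustrated with a small in-process model. This is a sketch only: `CountingPluginDir` and both launch functions are hypothetical stand-ins, a list copy stands in for the broadcast, and the plugin names are just sample values.

```python
class CountingPluginDir:
    """Hypothetical stand-in for a plugin directory that counts scans."""

    def __init__(self, plugin_names):
        self._names = list(plugin_names)
        self.scans = 0

    def scan(self):
        """Return the plugin names and record one directory scan."""
        self.scans += 1
        return list(self._names)


def launch_every_rank_scans(plugin_dir, num_ranks):
    """Original behavior: every rank reads the plugin directory itself."""
    return [plugin_dir.scan() for _ in range(num_ranks)]


def launch_rank0_broadcasts(plugin_dir, num_ranks):
    """Improved behavior: rank 0 scans once and shares the result.

    The per-rank list copy stands in for a broadcast from rank 0.
    """
    info = plugin_dir.scan()
    return [list(info) for _ in range(num_ranks)]


if __name__ == "__main__":
    names = ["Pseudocolor", "Mesh", "Silo"]  # sample plugin names
    before = CountingPluginDir(names)
    after = CountingPluginDir(names)
    launch_every_rank_scans(before, 4096)
    launch_rank0_broadcasts(after, 4096)
    print(before.scans, after.scans)
```

Every rank ends up with the same plugin list either way. What changes is the number of directory scans hitting the file system, which drops from one per rank to one in total.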

  8. Impact
     • So far this project’s impact has been small for customers
     • They do not yet run on Dawn
     • They might not notice small improvements at today’s everyday processor counts (<2k)
     • At higher processor counts (>4k), optimizations added by this work prevent bottlenecks in the compute engine, improving scalability

  9. Future work
     • Resolve load problems with the xlC compiler so we can use the best optimizations, including BG/P’s dual FPUs
     • Improve the 3rd party library build process for BG/P by adding support in the build_visit script
     • Continue profiling plots and improving performance
     • Reduce memory usage where possible
     • Investigate I/O patterns and attempt optimizations
