This presentation explores the buffer management approach in Java High Performance Computing (HPC) messaging, focusing on the MPJ Express implementation. It includes performance evaluation and a use case from computational cosmology.
An Approach to Buffer Management in Java HPC Messaging. Mark Baker, Bryan Carpenter, and Aamir Shafi. Distributed Systems Group, http://dsg.port.ac.uk
Presentation outline • Introduction, • The Buffering Layer in MPJ Express, • Performance Evaluation, • A Use Case from Computational Cosmology, • Conclusions.
Introduction • The “traditional” language-based approach to high-level parallel programming: • HPF, UPC, Co-Array Fortran etc. • Recently, practical parallel programming has focused on commodity languages supplemented by libraries like MPI: • C, C++, Fortran …, • Cost-effective. • Therefore, a practical approach to “raising the level” of parallel programming: • Use an advanced commodity language.
Introduction • There are several arguments for using Java for scientific computing, including: • Portability, • Compile-time and runtime safety, • Built-in multi-threading, • Rapid development, • Built-in libraries, • Performance via JIT compilers. • MPI was finalized in June 1994 as a standard message passing API for technical parallel computing: • The ‘Java Grande Message Passing Workgroup’ defined Java bindings in 1998.
Introduction • MPJ Express is a high-quality implementation of a Java messaging system, based on the Java bindings: • Released September 2005, • Thread-safe communication devices for TCP and Myrinet, • Implements derived datatypes, communicators, and virtual topologies, • Runtime system for portable bootstrapping, • Web: http://dsg.port.ac.uk/projects/mpj
Introduction: Java NIO • Java New I/O: • Non-blocking communication, • Introduces direct and indirect ByteBuffers: • Direct byte buffers reside in native OS memory, unlike normal Java objects, • Creating a direct byte buffer is costly, but direct buffers provide faster I/O.
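The difference between the two buffer kinds can be illustrated with the standard `java.nio.ByteBuffer` API. This is a minimal sketch rather than MPJ Express code: the class name `BufferKinds` and the helper `roundTrip` are hypothetical names chosen for this example.

```java
import java.nio.ByteBuffer;

class BufferKinds {
    /** Writes the ints 0..n-1 into the buffer, flips it, and returns the sum read back. */
    static long roundTrip(ByteBuffer buf, int n) {
        buf.clear();
        for (int i = 0; i < n; i++) {
            buf.putInt(i);
        }
        buf.flip(); // switch the buffer from writing to reading
        long sum = 0;
        while (buf.remaining() >= Integer.BYTES) {
            sum += buf.getInt();
        }
        return sum;
    }

    public static void main(String[] args) {
        // Indirect (heap) buffer: backed by a byte[] inside the Java heap.
        ByteBuffer heap = ByteBuffer.allocate(1024);
        // Direct buffer: backed by memory outside the Java heap.
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);

        System.out.println("heap.isDirect()   = " + heap.isDirect());
        System.out.println("direct.isDirect() = " + direct.isDirect());
        System.out.println("sum via heap      = " + roundTrip(heap, 10));
        System.out.println("sum via direct    = " + roundTrip(direct, 10));
    }
}
```

Both kinds expose the same read and write methods, so code written against `ByteBuffer` does not need to know which kind it was given. Because allocation of direct buffers is the expensive step, they are usually created once and reused.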
Introduction: The Buffering Layer in MPJ Express • MPJ Express requires a buffering layer: • To use Java NIO: • NIO channels read data from the wire into byte buffers and write data from byte buffers onto the wire. • To use proprietary networks like Myrinet efficiently: • Direct byte buffers have memory pointers that can be used for Direct Memory Access (DMA) transfers, • To avoid JNI overheads: • No data copying between JVM and native OS memory. • The buffering layer does incur an additional copying overhead, though.
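The pack-then-write pattern described above, including the extra copy it implies, can be sketched with standard NIO channels. This is an illustration under stated assumptions, not the MPJ Express device code: `PackAndSend`, `send`, `receive`, and `roundTrip` are hypothetical names, the length-prefixed message layout is invented for the example, and an in-process `java.nio.channels.Pipe` stands in for a network connection.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.util.Arrays;

class PackAndSend {
    /** Packs the array into a direct buffer (the additional copy), then writes it to the channel. */
    static void send(WritableByteChannel out, int[] data) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(Integer.BYTES * (data.length + 1));
        buf.putInt(data.length);          // assumed layout: element count, then payload
        for (int v : data) {
            buf.putInt(v);                // copy from the user array into the buffer
        }
        buf.flip();
        while (buf.hasRemaining()) {
            out.write(buf);               // the channel reads straight from the buffer
        }
    }

    /** Reads one length-prefixed message and unpacks it into a new array. */
    static int[] receive(ReadableByteChannel in) throws IOException {
        ByteBuffer header = ByteBuffer.allocateDirect(Integer.BYTES);
        readFully(in, header);
        header.flip();
        int n = header.getInt();
        ByteBuffer body = ByteBuffer.allocateDirect(Integer.BYTES * n);
        readFully(in, body);
        body.flip();
        int[] result = new int[n];
        body.asIntBuffer().get(result);   // copy from the buffer into the user array
        return result;
    }

    private static void readFully(ReadableByteChannel in, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
            if (in.read(buf) < 0) {
                throw new EOFException("channel closed before message was complete");
            }
        }
    }

    /** Single-threaded demo; only suitable for small messages that fit in the pipe's buffer. */
    static int[] roundTrip(int[] data) {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink(); Pipe.SourceChannel source = pipe.source()) {
                send(sink, data);
                return receive(source);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(roundTrip(new int[] {1, 2, 3, 4})));
    }
}
```

The two loops that move data between the `int[]` and the `ByteBuffer` are the copying overhead the slide refers to. A real implementation would also reuse buffers instead of allocating a direct buffer per message.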
Buffering Layer • An extensible buffering layer (mpjbuf): • Supports the higher and lower levels of MPJ Express, • Various implementations based on the actual storage medium: • Direct or indirect ByteBuffers, • Arrays, • Native C memory. • An mpjbuf buffer object consists of: • A static buffer to store primitive datatypes, • A dynamic buffer to store serialized Java objects. • Creating ByteBuffers on the fly is costly: • Memory management is based on Knuth’s buddy algorithm, • Two implementations: • Buddy1: stores offsets, with a smaller memory footprint, • Buddy2: stores objects, with a bigger memory footprint.
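The buddy scheme can be sketched as follows. This is a minimal illustration of the general algorithm, not the mpjbuf implementation: the class `BuddyPool`, its methods, and the 128-byte minimum block size are assumptions made for this example. It tracks free blocks by offset into one pre-allocated direct buffer, loosely in the spirit of the "stores offsets" variant mentioned above.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class BuddyPool {
    private static final int MIN_ORDER = 7;  // smallest block is 2^7 = 128 bytes (arbitrary choice)
    private final int maxOrder;
    private final ByteBuffer region;         // allocated once, then carved into blocks
    private final List<TreeSet<Integer>> free = new ArrayList<>();   // free.get(k): offsets of free 2^k blocks
    private final Map<Integer, Integer> allocated = new HashMap<>(); // offset -> order of live blocks

    BuddyPool(int maxOrder) {
        if (maxOrder < MIN_ORDER || maxOrder > 30) {
            throw new IllegalArgumentException("maxOrder out of range: " + maxOrder);
        }
        this.maxOrder = maxOrder;
        this.region = ByteBuffer.allocateDirect(1 << maxOrder);
        for (int k = 0; k <= maxOrder; k++) {
            free.add(new TreeSet<>());
        }
        free.get(maxOrder).add(0);           // initially one free block spanning the whole region
    }

    private static int orderFor(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        int order = 32 - Integer.numberOfLeadingZeros(size - 1);  // ceil(log2(size))
        return Math.max(order, MIN_ORDER);
    }

    /** Returns the offset of a block holding at least size bytes, or -1 if none is available. */
    int allocate(int size) {
        int order = orderFor(size);
        int k = order;
        while (k <= maxOrder && free.get(k).isEmpty()) {
            k++;                             // look for the smallest free block that is big enough
        }
        if (k > maxOrder) {
            return -1;
        }
        int offset = free.get(k).pollFirst();
        while (k > order) {
            k--;
            free.get(k).add(offset + (1 << k));  // split: keep the lower half, free the upper half
        }
        allocated.put(offset, order);
        return offset;
    }

    /** Frees a block and merges it with its buddy for as long as the buddy is also free. */
    void release(int offset) {
        Integer order = allocated.remove(offset);
        if (order == null) {
            throw new IllegalArgumentException("not an allocated block: " + offset);
        }
        int k = order;
        while (k < maxOrder) {
            int buddy = offset ^ (1 << k);   // the buddy differs only in bit k of the offset
            if (!free.get(k).remove(buddy)) {
                break;
            }
            offset = Math.min(offset, buddy);
            k++;
        }
        free.get(k).add(offset);
    }

    /** Returns a ByteBuffer view of an allocated block. */
    ByteBuffer view(int offset) {
        Integer order = allocated.get(offset);
        if (order == null) {
            throw new IllegalArgumentException("not an allocated block: " + offset);
        }
        ByteBuffer dup = region.duplicate();
        dup.position(offset);
        dup.limit(offset + (1 << order));
        return dup.slice();
    }
}
```

With a 1024-byte pool (`new BuddyPool(10)`), two 100-byte requests are served from adjacent 128-byte blocks at offsets 0 and 128, and releasing every block merges the pool back into a single 1024-byte block. The point of the design is that the costly direct-buffer allocation happens once, and later requests are served by splitting and merging.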
Presentation outline • Introduction, • The Buffering Layer in MPJ Express, • Performance Evaluation: • MPJ Express compared to MPICH and mpiJava on Fast Ethernet and Myrinet, • A Use Case from Computational Cosmology, • Conclusions.
Gadget-2 • An experiment to understand Java’s performance. • Gadget-2 is a massively parallel cosmological structure-formation simulation code written in C: • Simulates the evolution of large systems under the influence of gravitational and hydrodynamic forces, • A version was used in the “Millennium Simulation”, which evolved 10^10 dark matter particles from the early Universe to the present day: • Executed on 512 nodes using a terabyte of distributed memory, • Used 350,000 CPU hours over 28 days of elapsed time. • Uses the Barnes-Hut tree algorithm for calculating gravitational forces, • Domain decomposition is based on a Peano-Hilbert space-filling curve. • We produced a Java version of Gadget-2 using MPJ Express for messaging. • Benchmarking comparison for “Colliding Galaxies”.
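The idea behind a space-filling-curve domain decomposition can be sketched in two dimensions. This is a simplified illustration, not Gadget-2's code: Gadget-2 works in three dimensions and balances by work, whereas this sketch uses a 2D Hilbert curve and splits cells by count. The class `HilbertDecomposition` and both method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

class HilbertDecomposition {
    /** Position of cell (x, y) along a 2D Hilbert curve over an n-by-n grid; n must be a power of two. */
    static int hilbertIndex(int n, int x, int y) {
        int d = 0;
        for (int s = n / 2; s > 0; s /= 2) {
            int rx = (x & s) > 0 ? 1 : 0;
            int ry = (y & s) > 0 ? 1 : 0;
            d += s * s * ((3 * rx) ^ ry);    // which quadrant at this level, in curve order
            if (ry == 0) {                   // rotate or flip so the sub-curve has standard orientation
                if (rx == 1) {
                    x = n - 1 - x;
                    y = n - 1 - y;
                }
                int t = x;
                x = y;
                y = t;
            }
        }
        return d;
    }

    /** Assigns each cell to one of `parts` owners by cutting the curve into contiguous segments. */
    static int[] assignOwners(int n, int[][] cells, int parts) {
        Integer[] order = new Integer[cells.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort cell indices by their position along the curve.
        Arrays.sort(order, Comparator.comparingInt(i -> hilbertIndex(n, cells[i][0], cells[i][1])));
        int[] owner = new int[cells.length];
        for (int rank = 0; rank < order.length; rank++) {
            owner[order[rank]] = (int) ((long) rank * parts / order.length);
        }
        return owner;
    }

    public static void main(String[] args) {
        int[][] cells = {{0, 0}, {1, 0}, {1, 1}, {0, 1}};
        System.out.println(Arrays.toString(assignOwners(2, cells, 2)));
    }
}
```

Cells that are close along the curve tend to be close in space, so giving each process a contiguous segment of the curve yields compact domains.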
Summary • MPJ Express is becoming a production-quality Java messaging system: • Communication devices for TCP and Myrinet. • MPJ Express relies on an intermediate buffering layer: • Avoid JNI overheads, • Possible to use direct ByteBuffers, • Memory management using Knuth’s buddy algorithm. • Java version of Gadget-2: • A use-case from computational cosmology.
Conclusions • We have shown that Java has the potential to be a good HPC language. • The additional overhead of copying can be avoided by extending the MPJ API: • Support for sending from and receiving into ByteBuffers. • Future Work: • Exploit the thread-safety of MPJ Express by using OpenMP parallelism, • Next release with the Myrinet device in the third quarter of 2006.