JavaGrande Forum: An Overview Vladimir Getov University of Westminster
Background Observations • The Java thread model alone is insufficient • Support for the message-passing model is important • Performance is critical • Many applications need “high” performance • Proper numerical computing • Complex numbers, arrays, performance, reproducibility
Background • main motivation - need to solve bigger problems with resource requirements beyond the current limits • recent advances in computer communications make it possible to couple geographically distributed resources - Grid computing • in contrast with low-level approaches Java can support a single object-oriented communication framework for Grande applications
Java Grande Forum (JGF) • Goal: make Java the best ever environment for grande applications† • Open to all: .com, .edu, .gov • Established March 1998 • Conferences, working groups • http://www.javagrande.org/ † e.g. large-scale, large-distance, resource-“hungry”
Getting Better Performance • native methods (JNI) • stand-alone compilers (.java -> .exe) • modified JVMs • fused multiply-adds, bypass array bounds checks • aggressive bytecode optimization • JITs, flash compilers, HotSpot • bytecode transformers • parallel Java and concurrency
SciMark Benchmark http://math.nist.gov/scimark/ • five key numerical kernels • fast Fourier transform • successive over-relaxation (SOR) • Monte Carlo integration • sparse matrix multiply • dense LU factorization • two sizes: in cache and out-of-cache • run as applet; Java, C source available • results in Mflop/s; posted on SciMark page
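To make the kernel list concrete, here is a minimal sketch of a Monte Carlo integration kernel that estimates pi from the area of a quarter circle and reports a rate in Mflop/s. It is not SciMark's source code: the class name, the fixed seed, and the cost model of 4 floating-point operations per sample are all assumptions made for illustration.

```java
import java.util.Random;

class MonteCarloSketch {
    /** Estimates pi by sampling the unit square and counting points inside the quarter circle. */
    static double estimatePi(int samples, long seed) {
        Random rng = new Random(seed);
        int inside = 0;
        for (int i = 0; i < samples; i++) {
            double x = rng.nextDouble();
            double y = rng.nextDouble();
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        return 4.0 * inside / samples;
    }

    /** Converts a floating-point operation count and an elapsed time in nanoseconds to Mflop/s. */
    static double mflops(double flops, long nanos) {
        return flops / (nanos / 1e9) / 1e6;
    }

    public static void main(String[] args) {
        int samples = 1_000_000;
        long start = System.nanoTime();
        double pi = estimatePi(samples, 42L);
        long elapsed = System.nanoTime() - start;
        // Assumed cost model: 4 flops per sample (two multiplies, one add, one compare).
        System.out.printf("pi ~ %.4f, %.1f Mflop/s%n", pi, mflops(4.0 * samples, elapsed));
    }
}
```

A single short run like this mostly measures JIT warm-up, so a real measurement would repeat the kernel until the timing stabilizes.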
JVMs Are Improving • SciMark: 333 MHz Sun Ultra 10
SciMark: C Beats Java • Sun UltraSPARC 60, SunOS 5.7 • Sun JDK 1.3 (HotSpot), javac -O • Sun cc -O
SciMark: Java Beats C • Intel PIII, 500 MHz, Win98 • Sun JDK 1.2, javac -O • Microsoft VC++ 5.0, cl -O
Multidimensional arrays • In Java an “n-dimensional array” is equivalent to a one-dimensional array of (n - 1)-dimensional arrays. • In the proposal, message buffers are always one-dimensional arrays, but element type may be an object, which may have array type - hence multidimensional arrays can appear as message buffers.
Java multidimensional arrays • Array of arrays
Java multidimensional arrays • Java multidimensional arrays are not indivisible objects: they can exhibit intra-array aliasing and "partial overlaps" with other arrays
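The aliasing and overlap described above can be reproduced in a few lines of plain Java. The sketch below is illustrative only; the class and method names are hypothetical and do not come from the MPJ proposal.

```java
class ArrayOfArraysDemo {
    /** Builds a 3x3 "matrix" whose first and last rows are the same row object. */
    static double[][] buildAliased() {
        double[][] a = new double[3][3];
        a[2] = a[0]; // intra-array aliasing: rows 0 and 2 now share storage
        return a;
    }

    /** Builds a second array that reuses one row of the given array (a partial overlap). */
    static double[][] overlapWith(double[][] a) {
        double[][] b = new double[2][];
        b[0] = a[1];                    // shared with a
        b[1] = new double[a[1].length]; // private to b
        return b;
    }

    public static void main(String[] args) {
        double[][] a = buildAliased();
        a[0][1] = 7.0;
        System.out.println(a[2][1]); // prints 7.0, because a[2] and a[0] are one object

        double[][] b = overlapWith(a);
        b[0][0] = -1.0;
        System.out.println(a[1][0]); // prints -1.0, because b[0] and a[1] are one object
    }
}
```

Because each row is a separate object, a message-passing layer cannot treat a `double[][]` as one contiguous block; it has to walk the rows, which is one reason the proposal keeps message buffers one-dimensional.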
Java in Distributed Computing • main motivation - need to solve bigger problems with resource requirements beyond the current limits • recent advances in computer communications make it possible to couple geographically distributed resources - Grid computing • in contrast with low-level approaches Java can support a single object-oriented communication framework for Grande applications
Message Passing - Motivation • The existing communication packages in Java - RMI, API to BSD sockets - are optimized for Client/Server programming • The symmetric model of communication is captured in the MPI standard - MPI-1 and MPI-2 • An MPI-like message-passing API specification is needed to enable the development of portable JavaGrande applications
Early MPI-like Efforts - 1 • mpiJava - Modeled after the C++ binding for MPI. Implementation through JNI wrappers to native MPI software. • JavaMPI - Automatic generation of wrappers to legacy MPI libraries. C-like implementation based on the JCI code generator. • MPIJ - Pure Java implementation of MPI closely based on the C++ binding. A large subset of MPI is implemented using native marshaling of primitive Java types.
Early MPI-like Efforts - 2 • JMPI - MPI Soft Tech Inc. have announced a commercial effort under way to develop a message passing environment for Java. • Others • Current ports - Linux, Solaris (both WS clusters and SMPs), AIX (both WS clusters and SP2), Windows NT clusters, Origin-2000, Fujitsu AP3000, and Hitachi SR2201. • Java + MPI codes - growing variety including full applications
MPJ API Specification • First phase in our work on Message Passing for Java. • Builds on MPI-1 Specification and the current Java Specification. • Immediate standardization for common message passing programs in Java • Basis for conversion between C, C++, Fortran and Java. • Eventually, support for aspects of MPI-2 as well as possible improvements to the Java language.
Naming Conventions • All MPI classes belong to the package mpi. • Conventions for capitalization, etc, in class and member names generally follow the recommendations of Sun's Java code conventions • consistent with the MPI C++ binding
Error codes • Unlike the C and Fortran interfaces, the Java interfaces to MPI calls will not return explicit error codes. • Instead, the Java exception mechanism will be used to report errors
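The difference between the two reporting styles can be sketched without any MPI library. Everything below is hypothetical: `MessagingException`, `send`, `sendWithCode`, and the constant values are stand-ins invented for illustration, not names from the MPJ specification or from any existing binding.

```java
class ErrorReportingSketch {
    /** Hypothetical checked exception; a real binding would define its own type. */
    static class MessagingException extends Exception {
        MessagingException(String message) {
            super(message);
        }
    }

    static final int SUCCESS = 0;  // illustrative values only
    static final int ERR_RANK = 1;

    /** C/Fortran style: the caller must remember to inspect the returned status. */
    static int sendWithCode(int[] buf, int dest, int worldSize) {
        return (dest < 0 || dest >= worldSize) ? ERR_RANK : SUCCESS;
    }

    /** Java style: an invalid rank is reported through the exception mechanism. */
    static void send(int[] buf, int dest, int worldSize) throws MessagingException {
        if (dest < 0 || dest >= worldSize) {
            throw new MessagingException("invalid destination rank " + dest);
        }
        // A real implementation would transmit buf here; this sketch only validates.
    }

    /** Returns true if send() rejects the given destination rank. */
    static boolean sendFails(int dest, int worldSize) {
        try {
            send(new int[] {1, 2, 3}, dest, worldSize);
            return false;
        } catch (MessagingException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sendWithCode(new int[] {1}, 5, 4) == ERR_RANK); // prints true
        System.out.println(sendFails(5, 4)); // prints true
        System.out.println(sendFails(2, 4)); // prints false
    }
}
```

With a checked exception the compiler forces the caller to handle or propagate the failure, whereas a returned status code can be silently ignored.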
O-O Java-Centric Message Passing • Current task - to offer a first principles study of MPI-like services in an upward compatible fashion • Goal - performance and portability • Fundamental look at data marshaling • Preference for Java-natural mechanisms
MPJ-Related Documents and URLs • E-discussion - java-mpi@csit.fsu.edu or e-mail v.s.getov@wmin.ac.uk • MPJ API Specification - http://www.javagrande.org/reports • mpiJava - MPJ reference implementation http://mailer.csit.fsu.edu/mailman/listinfo/java-mpi/
Mixed-Language Programming with Java • Java is a highly-portable language • Java adheres to the “Write once, run anywhere” philosophy • Java has a well-established collection of scientific library bindings • Java’s executional speed is suitable for HPC • C/Fortran are highly-portable languages • C/Fortran adhere to the “Write once, run anywhere” philosophy • C/Fortran have well-established scientific libraries • C/Fortran executional speeds are suitable for HPC
So, What Language to Use? • C/Fortran have well-established scientific library bindings • C/Fortran executional speeds are suitable for HPC • Java is a highly-portable language • Java adheres to the “Write once, run anywhere” philosophy • Utilize Java for its portability and standardization, but focus on using Java as a wrapper for porting of native code in the form of shared libraries. This involves the least amount of work and guarantees maximum performance on different platforms.
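The wrapper approach can be sketched as a Java class that delegates to a native shared library through JNI when one is available. The library name `hypothetical_blas` and the native method `nativeDot` are assumptions made for illustration; no such library ships with Java, so on an ordinary system the sketch falls back to the pure-Java loop.

```java
class NativeWrapperSketch {
    // True only if the (hypothetical) shared library could be loaded.
    private static final boolean NATIVE_AVAILABLE = tryLoad("hypothetical_blas");

    private static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;
        }
    }

    /** Would be implemented in C or Fortran inside the shared library (hypothetical symbol). */
    private static native double nativeDot(double[] x, double[] y);

    /** Portable entry point: uses native code when present, otherwise plain Java. */
    static double dot(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        if (NATIVE_AVAILABLE) {
            return nativeDot(x, y);
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(dot(new double[] {1, 2}, new double[] {3, 4}));
    }
}
```

Callers see only `dot`, so the same application code runs on every platform while each platform can supply its own tuned library.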
Fast RMI - Motivation • Serialization and RMI are too slow for Grande Applications: • Improvements are needed in three areas: • Faster serialization for Java • Faster RMI for Java • Use of non-TCP/IP networks with RMI • JavaParty project works on all three areas
Faster Serialization: UKA-Serialization • Drop-in replacement (plus class file retrofitter) • Save 76%-96% of the time needed for serialization • Minor incompatibilities: • targeted towards fast communication, not made for persistent objects (store objects now and reload them in x years with some future release of Java) • not yet: remote loading of byte code • Some impact on Sun
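For reference, the standard `java.io` serialization path that a drop-in replacement of this kind targets looks roughly as follows. This is a baseline sketch only: it uses the stock JDK classes, shows nothing of the UKA-Serialization API, and the class and helper names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationBaseline {
    /** Serializes an object to a byte array with standard Java serialization. */
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    /** Reconstructs an object from bytes produced by serialize(). */
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    /** Convenience round trip for a numeric payload, with checked exceptions wrapped. */
    static double[] roundTrip(double[] payload) {
        try {
            return (double[]) deserialize(serialize(payload));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        double[] payload = new double[1024];
        long start = System.nanoTime();
        double[] copy = roundTrip(payload);
        long elapsed = System.nanoTime() - start;
        System.out.println(copy.length + " doubles round-tripped in " + elapsed + " ns");
    }
}
```

Timing this round trip for representative message payloads gives a baseline against which any faster serialization scheme can be compared.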
Faster RMI: KaRMI • Drop-in replacement with almost the same API • Can exploit non-TCP/IP networks • Saves up to 96% of the time needed for a remote method invocation (including UKA serialization): 80 µs on Digital Alphas connected by Myrinet • Minor incompatibilities: • no sockets & ports at user-level • no support of undocumented RMI classes • Some impact on Sun
Programming Models • SPMD • multithreading on SMPs or clusters (fat JVMs) • symmetric message passing on clusters • Client-Server • RMI • JINI • Peer-to-Peer • JXTA and others • Mobile agents • Components