
Java Grande Update

Explore the evolution of Java Grande since 1996, its role in HPCC, and the motivation to improve its performance and usability in diverse computing fields.


Presentation Transcript


1. Sun Labs, December 4 2002
Java Grande Update
http://www.javagrande.org
PTLIU Laboratory for Community Grids
Geoffrey Fox
Computer Science, Informatics, Physics
Indiana University, Bloomington IN 47404
(Technology Officer, Anabas Corporation, San Jose)
http://grids.ucs.indiana.edu/ptliupages
gcf@indiana.edu

2. Java Grande in a Nutshell
• Concept started in December 1996 with the first meeting on Java for Science and Engineering
• Forum established in February 1998
• Multiple forum activities in numerics, message passing and parallel/distributed systems
• Ongoing set of workshops sponsored by ACM
• Bill Joy spoke in 2000, Guy Steele in 2001
• Multiple useful web sites and papers/presentations
• JSR activities, with probably insufficient momentum
• No institutional contact with Sun for 2 years
• No impressive support for Java on HPC machines; the relevant research compilers were not productized

3. Java Grande Concept
• Use of Java for “Performance” and “Usability” in:
• High Performance Network Computing
• Scientific and Engineering Computation
• (Distributed) Modeling and Simulation
• Parallel and Distributed Computing
• Data Intensive Computing
• HPCC
• Computational Grids
• The above is the classic “small” technical computing area. There is a much larger Grande problem:
• Communication and Computing Intensive Commercial Applications
• Large-scale Enterprise Software (iPlanet, J2EE etc.)

4. Java Grande Motivation I: Users
• We have rather different drivers from HPCC (parallel computing) and Enterprise Systems
• In Enterprise software, Java is well established but the architectures (J2EE and messaging) are new, so there are new performance and scaling issues (Enterprise systems are large, as in Grids/Autonomic computing)
• In HPCC we failed to produce good computing environments in the HPCC Initiative, and there is a possibly serious gap between the field (use of Fortran/C/C++) and the next generation of potential Science and Engineering users (Java, C#, Python ...)
• Opportunity to deliver on high-productivity HPCC environments

5. Java Motivation II: Language
• The Java language has several good design features
• secure, safe (with respect to bugs), object-oriented, familiar (to C, C++ and even Fortran programmers)
• Java has a very good set of libraries covering everything from commerce, multimedia and images to math functions (under development at http://math.nist.gov/javanumerics)
• Java has the best available electronic and paper training resources
• Java has excellent integrated program development environments
• Java is naturally integrated with the network, and its universal virtual machine supports a potentially powerful “write once, run anywhere” model
• There is a large and growing trained labor force

6. Java Grande Forum
• Groups meet either at the annual meeting or separately
• Forum coordinated by Fox
• Numerics Group led by Boisvert and Pozo
• Concurrency and Applications (Benchmarks) Group led by Caromel and Gannon
• MPI subgroup led by Getov
• Annual ACM-sponsored workshops were held in the Bay Area just before JavaOne up to 2001
• In 1999 merged with ISCOPE (Object Methods in Scientific Computing, e.g. C++), but JG dominates
• The 2002 workshop was held just before OOPSLA with 90 attendees and good-quality papers (peak attendance some 220)
• No meeting planned for 2003

7. JG Workshop 2002 I
• KEYNOTE: Pratap Pattnaik, IBM, Autonomic Computing
• Session II: Grid and Parallel Computing
• The Ninf Portal: An Automatic Generation Tool for Computing Portals
• JavaSymphony: New Directives to Control and Synchronize Locality, Parallelism, and Load Balancing for Cluster and GRID-Computing
• Ibis: an Efficient Java-based Grid Programming Environment
• Efficient, Flexible and Typed Group Communications for Java
• JOPI: A Java Object-Passing Interface
• Session III: Grid and Peer-to-peer Computing
• Abstracting Remote Object Interaction in a Peer-2-Peer Environment
• Advanced Eager Scheduling for Java-Based Adaptively Parallel Computing
• A Scaleable Event Infrastructure for Peer to Peer Grids
• Session IV: Java Compilation
• Elimination of Java Array Bounds Checks in the Presence of Indirection
• Simple and Effective Array Prefetching in Java
• Fast Subtype Checking in the HotSpot JVM
• Almost-whole-program Compilation

8. JG Workshop 2002 II
• Session V: Object-based Computing
• KEYNOTE: Alexander Stepanov, The Future of Abstraction
• Generic Programming for High Performance Scientific Applications
• Session VI: Object-based Computing and Applications
• Higher-Order Functions and Partial Applications for a C++ Skeleton Library
• Ravenscar-Java: A High Integrity Profile for Real-Time Java
• Parsek: Object Oriented Particle in Cell. Implementation and Performance Issues
• inAspect - Interfacing Java and VSIPL
• Session VII: Node Java I
• Open Runtime Platform: Flexibility with Performance Using Interfaces
• Aggressive Object Combining
• Run-time Evaluation of Opportunities for Object Inlining in Java
• Session VIII: Node Java II
• Jeeg: A Programming Language for Concurrent Objects Synchronization
• Specifying Java Thread Semantics Using a Uniform Memory Model
• Immutability Specification and its Applications
• Adding Tuples to Java: a Study in Lightweight Data Structures

9. Disappointing Comment
• I have not seen strong interest in Java from HPCC users and HPCC purchasers
• Possibly a chicken-and-egg situation ...
• 2 years ago, Sun offered poor Java support on HPC
• Not certain of the current situation
• IBM Research produced several interesting HPC compilers supporting, for example, high-performance arrays
• These were not, I think, offered on IBM HPC machines
• However, the people voting on this are not from the Internet generation, and the “alternatives” are not good!
• However, one of the largest pure-Java science applications is from Los Alamos – CartaBlanca, for heat transfer and multiphase fluid flow

10. Types of Activity
• Java on the Node
• Compilers and language issues
• Parallel Computing
• Thread and message-passing models
• Very little academic work for any language!
• Distributed Computing
• RMI
• Jini, JXTA
• Grid and Web Services
• High-performance enterprise Java
• Combines node and distributed computing issues

11. Java on the Node
• Numerics subgroup of the Java Grande Forum focused on “node issues”
• Floating point
• Java math libraries
• Arrays – efficiency of >1D arrays and support of Fortran 90 style array functions (see the sketch below)
• Convenience and natural syntax – complex arithmetic notation and multi-type libraries
• Very good SciMark node kernel benchmark at http://math.nist.gov/javanumerics/
• Broader range of benchmarks at http://www.epcc.ed.ac.uk/javagrande/
• Typical compiler work (from IBM): http://www.research.ibm.com/ninja/
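
A minimal sketch, assuming only standard Java, of why ">1D array" efficiency is an issue: a double[][] is an array of separately allocated row objects, whereas Fortran-style multiarrays are contiguous. The class and method names here are illustrative only.

    // Illustrative sketch (not from the slides): why ">1D array" efficiency matters in Java.
    // A Java double[][] is an array of row objects, so each element access follows an extra
    // pointer and rows need not be contiguous. A flattened 1D array with manual index
    // arithmetic is closer to what a Fortran-style multiarray would provide.
    public class ArrayLayoutDemo {
        static double sumNested(double[][] a) {
            double s = 0.0;
            for (int i = 0; i < a.length; i++)          // each a[i] is a separate object
                for (int j = 0; j < a[i].length; j++)
                    s += a[i][j];
            return s;
        }

        static double sumFlat(double[] a, int rows, int cols) {
            double s = 0.0;
            for (int i = 0; i < rows; i++)
                for (int j = 0; j < cols; j++)
                    s += a[i * cols + j];               // contiguous, Fortran-like layout
            return s;
        }

        public static void main(String[] args) {
            int n = 512;
            double[][] nested = new double[n][n];
            double[] flat = new double[n * n];
            System.out.println(sumNested(nested) + " " + sumFlat(flat, n, n));
        }
    }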

12. SciMark
• http://math.nist.gov/scimark2/
• FFT
• SOR (a kernel sketch of this style follows below)
• Monte Carlo
• Sparse matrix multiply
• Dense LU
• Available as a downloadable applet
• Today's peak performance is the IBM VM on OS/2, at 380 megaflops average
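
For illustration, a minimal SOR-style relaxation sweep of the kind SciMark times. This sketch is not the actual SciMark 2.0 code; the grid size, omega value and class name are invented.

    // Minimal sketch of an SOR relaxation sweep in the style of the SciMark SOR kernel;
    // illustration only, see http://math.nist.gov/scimark2/ for the real benchmark.
    public class SorSketch {
        // One SOR sweep over the interior points of grid g with relaxation factor omega.
        static void sorSweep(double omega, double[][] g) {
            int n = g.length;
            for (int i = 1; i < n - 1; i++)
                for (int j = 1; j < n - 1; j++)
                    g[i][j] = (1.0 - omega) * g[i][j]
                            + omega * 0.25 * (g[i - 1][j] + g[i + 1][j]
                                            + g[i][j - 1] + g[i][j + 1]);
        }

        public static void main(String[] args) {
            double[][] grid = new double[100][100];
            grid[50][50] = 1.0;                          // a point source to relax
            for (int k = 0; k < 100; k++) sorSweep(1.25, grid);
            System.out.println(grid[50][50]);
        }
    }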

13. Edinburgh Benchmark Set I
• http://www.epcc.ed.ac.uk/javagrande/index_1.html
• Sequential, multi-threaded, mpiJava, C versus Java
• Low-Level
• Arith: execution of arithmetic operations
• Assign: variable assignment
• Cast: casting
• Create: creating objects and arrays
• Loop: loop overheads
• Math: execution of math library operations
• Method: method invocation (a timing sketch of this style follows below)
• Serial: serialisation
• Exception: exception handling
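
As a concrete illustration of the low-level category, here is a hypothetical micro-benchmark in the style of the "Method" kernel. It is not the EPCC source; the class name and iteration count are invented, and System.nanoTime/printf come from later JDKs than the 2002-era platforms discussed here.

    // Hypothetical sketch (not the EPCC code): time many invocations of a trivial method
    // and report calls per second, as a low-level "Method" style benchmark would.
    public class MethodBench {
        private int counter;

        private void tick() { counter++; }               // the trivial method being timed

        public static void main(String[] args) {
            MethodBench b = new MethodBench();
            long iterations = 100000000L;
            long start = System.nanoTime();
            for (long i = 0; i < iterations; i++) b.tick();
            double seconds = (System.nanoTime() - start) / 1e9;
            System.out.printf("%d calls in %.3f s (%.1f Mcalls/s), counter=%d%n",
                    iterations, seconds, iterations / seconds / 1e6, b.counter);
        }
    }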

14. Edinburgh Benchmark Set II
• Medium Size
• Series: Fourier coefficient analysis
• LUFact: LU factorisation
• SOR: successive over-relaxation
• HeapSort: integer sorting
• Crypt: IDEA encryption
• FFT: FFT
• Sparse: sparse matrix multiplication
• “Real” Applications
• Search: alpha-beta pruned search
• Euler: computational fluid dynamics
• MD: molecular dynamics simulation
• MC: Monte Carlo simulation
• RayTracer: 3D ray tracer

15. Java Performance I

16. Java Performance II

17. Numerics I
• Initially focused on the “Java floating point rules” that guaranteed the same (bad) result on all processors
• strictfp: this has been part of Java for some time now. It is a keyword specifying that the original strict (slow) semantics for Java floating point should be followed.
• The new default allows 15-bit exponents for anonymous (temporary) variables. This tiny change allows Java implementations on the x86 family of processors to run at (nearly) full speed.
• Also, in default mode the specification of the elementary functions is relaxed to allow any result within one unit in the last place of the correctly rounded exact result. This allows more efficient algorithms to be used (including hardware sin/cos).
• There is a separate java.lang.StrictMath library with a specific implementation of the functions that produces exactly the same results on all machines. One must call the strict version explicitly to get the slower but certain result. (A small usage sketch follows below.)
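
A small sketch, assuming only standard Java, of the distinction above: strictfp requests the original strict semantics for a class or method, and java.lang.StrictMath must be called explicitly to get reproducible (but possibly slower) elementary functions.

    // strictfp forces the original strict floating-point semantics for this class;
    // StrictMath gives bit-for-bit reproducible results across platforms, while
    // java.lang.Math may use faster platform-specific implementations.
    public strictfp class StrictDemo {
        public static void main(String[] args) {
            double x = 1e-3;
            double fast = Math.sin(x);                   // may use a faster, looser algorithm
            double reproducible = StrictMath.sin(x);     // same bits on every JVM/platform
            System.out.println(fast + " vs " + reproducible);
        }
    }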

18. Some More on Numerics
• fastfp modifier: there was a JSR for this that was withdrawn. Obvious goals include support for fused multiply-add (illustrated below).
• Marc Snir was the lead and IBM could not find a replacement, so this is not being pursued.
• At some point we'd like to resubmit. We are hoping that Joe Darcy would be the lead.
• You can see info on fastfp at http://math.nist.gov/javanumerics/reports/jgfnwg-minutes-6-00.html
• Joe Darcy: Joe has the title of Java Floating-Point Czar (it actually says this on his business card).
• Joe is working on floating-point issues within Sun and now serves as our main technical contact.
• He has proposed the inclusion of additional methods in java.lang.Math with the goal of making this library on par with the C math library (libm).
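
Purely to illustrate the fused multiply-add semantics the fastfp proposal targets: a*b + c computed with one rounding rather than two. Note that Math.fma is a much later JDK addition (Java 9) and was not available on the platforms discussed here; it is used below only to show the difference.

    // Illustration only: fused multiply-add rounds a*b + c once, while plain a*b + c in
    // Java rounds twice. The inputs are chosen so the two answers differ.
    public class FmaDemo {
        public static void main(String[] args) {
            double a = 1.0 + Math.ulp(1.0);
            double b = 1.0 - Math.ulp(1.0);
            double c = -1.0;
            double twoRoundings = a * b + c;             // round(a*b), then round(+c) -> 0.0
            double oneRounding  = Math.fma(a, b, c);     // round(a*b + c) in one step -> tiny negative
            System.out.println(twoRoundings + " vs " + oneRounding);
        }
    }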

19. Numerics III
• http://math.nist.gov/javanumerics/reports/jgfnwg-minutes-11-02.html – November 2 2002 update
• True multidimensional arrays indexed using specialized notation. This is JSR 83.
• Operator overloading to support the easy expression of alternate arithmetics.
• Complex numbers that are as efficient as primitive types. (A sketch of why this needs language support follows below.)
• A new floating-point mode (i.e., fastfp) that admits the use of fused multiply-add operations in Java, and possibly admits additional compiler optimizations, such as the use of associativity.
• Expect to meet every 6 months
• Note http://www.vni.com/jmsl/ – JMSL Java math library
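
A hypothetical sketch of why efficient complex numbers and operator overloading are on this list: without language support, every complex operation is a method call that typically allocates a temporary object, and the arithmetic is hard to read. The Complex class below is illustrative, not a proposed API.

    // Without operator overloading or value types, complex arithmetic in Java is verbose
    // and allocates an object per operation.
    public final class Complex {
        public final double re, im;

        public Complex(double re, double im) { this.re = re; this.im = im; }

        public Complex plus(Complex o)  { return new Complex(re + o.re, im + o.im); }
        public Complex times(Complex o) {
            return new Complex(re * o.re - im * o.im, re * o.im + im * o.re);
        }

        public static void main(String[] args) {
            Complex a = new Complex(1, 2), b = new Complex(3, -1), c = new Complex(0, 1);
            // Desired:  d = a * b + c        (value semantics, no per-operation allocation)
            // Actual:   d = a.times(b).plus(c), creating two temporary objects
            Complex d = a.times(b).plus(c);
            System.out.println(d.re + " + " + d.im + "i");
        }
    }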

20. Java and Parallelism
• Message-passing systems
• mpiJava from the Community Grids Lab – an “oldie but goodie” (a usage sketch follows below)
• Pure Java version MPJ planned but not (well) implemented
• OpenMP in Java
• JOMP from Edinburgh has its own version of the Java Grande benchmarks
• http://www.epcc.ed.ac.uk/computing/research_activities/JOMP/index_1.html
• Thread- and RMI-based libraries
• JavaParty (http://www.ipd.uka.de/JavaParty/) from Michael Philippsen (active in the JG Forum), which also has
• an optimized RMI (KaRMI)
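
A hedged usage sketch in the style of the mpiJava binding (package mpi, class MPI). The method names follow the published mpiJava 1.2 API (Init, Rank, Size, Finalize) but should be checked against the actual release; the program itself is invented.

    // Hedged sketch of an mpiJava-style SPMD "hello world"; each process prints its rank.
    import mpi.MPI;

    public class HelloMpiJava {
        public static void main(String[] args) throws Exception {
            MPI.Init(args);                               // start the MPI runtime
            int rank = MPI.COMM_WORLD.Rank();             // this process's id
            int size = MPI.COMM_WORLD.Size();             // total number of processes
            System.out.println("Hello from rank " + rank + " of " + size);
            MPI.Finalize();
        }
    }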

21. HPJava
• Conceived as a language for parallel programming, especially suitable for massively parallel, distributed-memory computers.
• Takes various ideas (hopefully the good ones) from High Performance Fortran – the distributed array model and parallel constructs.
• But in many respects HPJava is a lower-level parallel programming language than HPF (it takes the best of the MPI and HPF style programming models)
• Explicitly SPMD, requiring the parallel programmer to insert calls to collective communication libraries like MPI or Adlib (a library developed originally to support general distributed-memory parallel compilers)
• More or less as a by-product, HPJava also has a useful “sequential” subset that just adds scientific multidimensional arrays (à la Fortran 90) to Java (c.f. Java Grande numerics).
• http://www.hpjava.org

22. HPspmd Model
• HPJava was originally intended as a first demonstration of a parallel programming model we called the HPspmd model (Single Program Multiple Data).
• Java was chosen as the base language for this demo (instead of Fortran 90 or C++) partly because of the Java Grande philosophy – we expected Java to be a more productive high performance computing environment
• Actually, it took so long to finish the HPJava preprocessor that in the meantime Java has become comparable in speed with those languages.
• Because HPJava uses standard JVMs, it leverages all the progress with Java.

23. An HPF-like Program in HPJava

    Procs p = new Procs2(P, P) ;                       // Declare 2d group of processes
    on(p) {                                            // Enclosed code executed by that group
      Range x = new ExtBlockRange(N, p.dim(0), 1, 1) ; // Distributed index ranges, in this
      Range y = new ExtBlockRange(N, p.dim(1), 1, 1) ; // case extended with ghost regions
      float [[-,-]] u = new float [[x, y]] ;           // A distributed array
      for(int iter = 0 ; iter < NITER ; iter++) {
        Adlib.writeHalo(u) ;                           // Communication – edge exchange
        overall(i = x for 1 : N - 2)                   // Distributed, parallel looping construct
          overall(j = y for 1 + (i` + iter) % 2 : N - 2 : 2)
            u [i, j] = 0.25 * (u [i - 1, j] + u [i + 1, j] +
                               u [i, j - 1] + u [i, j + 1]) ;
      }
    }

24. HPJava vs HPF
• This HPJava program looks like HPF, but the programming model is one of multiple, interacting processes or threads
• “Loosely synchronous”, not HPF's single-threaded semantics.
• We invoke the communication library explicitly to update the ghost regions in the array.
• But because we have high-level collective libraries, this isn't particularly onerous.
• Can “break out” of the collective mindset at any time and resort to low-level MIMD node processing and/or message exchange if the algorithm demands it (which is quite common).

25. Benefits of the HPspmd Model
• Translators are much easier to implement than full parallel compilers. No compiler magic is needed, and we immediately inherit the features of the best “standard compilers”.
• The current HPJava compiler is just a preprocessor converting to standard Java, using a simple translation scheme with essentially no optimization. But performance is not embarrassing (see later).
• Of course, later we can do optimizations and (hand-coding suggests) improve performance significantly.
• Good (object-oriented) framework for developing specialized parallel libraries.
• HPspmd is designed to have the “ease of writing” of HPF but allow clearer control of the parallel implementation for somebody who understands the parallel algorithm
• HPF is criticized by some as too automatic

26. HPJava Architecture
[Layer diagram] Full HPJava (Group, Range, on, overall, …) and the “sequential Java with multiarrays” subset (int[[*,*]]) go through a source-to-source translator emitting standard Java; the generated code runs over the libraries Adlib, OOMPH¹ and MPJ¹ on the mpjdev portability layer, which targets Jini¹ or native MPI. (¹ Not yet implemented)

27. HPJava Preprocessor Features
• Input language is a strict extension of Java 2.
• Multiarrays are translated into conventional Java 1D arrays
• Front end implements all compile-time checks required by the Java Language Spec (currently testing against the Jacks suite).
• Goal: if the preprocessor accepts the source, it never outputs a program the javac back end will reject.
• Carefully preserves line numbering, so run-time exception messages usually point accurately back into the original HPJava source code – makes debugging HPJava “easy”.
• Full source of the preprocessor + libraries will be placed in the public domain. No license agreement (we expect).
• Good framework for other experiments with Java language extensions ...
• Release, very soon, at http://www.hpjava.org.

28. Libraries
• Adlib is a comprehensive library for collective operations on distributed arrays, implementing operations like reductions, shifts, edge exchange for stencil updates, etc.
• Invoked like MPI, but higher level.
• Originally implemented to support HPF translation (shpf, PCRC projects).
• Originally C++, now Java, implemented on the mpjdev portability layer.
• MPJ is a proposed Java binding of standard MPI
• OOMPH is an envisaged HPJava binding of MPI-level operations, taking advantage of multiarrays to simplify the API.

29. Low-Level Messaging for HPC
• mpiJava is our own binding of MPI for Java, implemented as native method wrappers for “real” MPI (MPICH, Sun HPC and IBM MPIs)
• Several other groups developed similar APIs, but mpiJava is probably the most cited today. Still maintained ...
• Pugh claims Java's new I/O is very fast
• Distinguish from “MPI on the Grid” (MPICH-G2)
• MPJ was put forward as a unified “standard” by a small group (including Vladimir Getov, Tony Skjellum and Carpenter from Indiana), but the activity appears to be dormant.
• The API is quite large and inherits some ugly features from MPI and mpiJava. A smaller, more focused OOMPH API might be more attractive.
• Note that MPI datatypes are very unsuitable for object-based languages
• Sun HPC was interested in MPJ for a while

30. Level of Interest in mpiJava?
• Slow but steady uptake of our mpiJava software over four years:

31. Example HPJava Benchmarks on IBM SP3
• Results obtained December 3 2002 and not yet clear!
• Speed-up, not scaled speed-up

32. Distributed Computing
• Much of the work of the Forum was in distributed computing
• It included several pure Java frameworks, such as those from European groups (Bal, Caromel, Philippsen)
• The Forum initially focused on fast RMI and a “Java Framework for Computing”, or Java Computing Portals
• Most workers in this field now position their research in a Grid context
• RMI becomes GridRPC
• Portals become Grid Computing Environments (co-chaired by Fox and Gannon to reflect the JG heritage)

33. Background on Indiana Community Grids Laboratory Research
Geoffrey Fox, Director
http://grids.ucs.indiana.edu/ptliupages/

34. 6 Activity Areas in CG Laboratory I
• HPJava: Parallel Programming with Java
• MPI and HPF style programming in Java (multiarrays)
• http://www.hpjava.org
• Build on this for “HPSearch”, with Java bound to “Grid/Web/XPATH/Google” handles
• Available December 2002; mpiJava available for 3 years
• NaradaBrokering publish/subscribe distributed event/message system
• http://www.naradabrokering.org
• “MQSeries/JMS”-style publish/subscribe applied to Collaboration, Grid, P2P (JXTA) – a generic JMS-style sketch follows below
• Supports UDP, TCP/IP, firewalls (actual transport independent of the user call)
• Used in other projects: Collaboration, Portal and Handheld
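
An illustrative publish/subscribe sketch using the generic javax.jms API (JMS 1.1 unified interfaces) of the kind a NaradaBrokering-style broker supports; it does not use NaradaBrokering's own classes, and the JNDI name, topic name and message text are hypothetical.

    // Generic JMS publish/subscribe: look up a factory, publish a text message to a topic,
    // and receive it through a consumer on the same topic.
    import javax.jms.*;
    import javax.naming.InitialContext;

    public class PubSubSketch {
        public static void main(String[] args) throws Exception {
            InitialContext jndi = new InitialContext();                   // provider-specific setup assumed
            ConnectionFactory factory =
                    (ConnectionFactory) jndi.lookup("ConnectionFactory"); // hypothetical JNDI name
            Connection connection = factory.createConnection();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Topic topic = session.createTopic("grid/events");             // hypothetical topic

            MessageConsumer consumer = session.createConsumer(topic);     // subscriber
            MessageProducer producer = session.createProducer(topic);     // publisher
            connection.start();

            producer.send(session.createTextMessage("hello, subscribers"));
            TextMessage received = (TextMessage) consumer.receive(5000);  // wait up to 5 s
            System.out.println("got: " + (received == null ? "nothing" : received.getText()));

            connection.close();
        }
    }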

35. 6 Activity Areas in CG Laboratory II
• Online Knowledge Center – DoD HPCC support portal: http://ptlportal.communitygrids.iu.edu/portal/
• Portal, database, XML metadata tools
• Jetspeed and portlet architecture
• http://www.xmlnuggets.org is an “email group” interface for browsing multiple instances of a Schema (also XML-based news groups)
• Schema wizard gives general user interfaces for each Schema
• Gateway Computing Portal
• DoD HPCMO, Geoscience, (Bioinformatics, particle physics) applications
• Web Service based (originally CORBA)
• Kerberos, SAML security, GCE Shell – 70 functions
• Integrates data and compute Grids
• http://www.gatewayportal.org/ and http://www.servogrid.org/
• Portlets in the NCSA Alliance Portal

36. 6 Activity Areas in CG Laboratory III: Components of an Education Grid
• Anabas provides the base JMS-based collaborative e-learning service (Fox co-founded it 2 years ago)
• Collaboration as a Web Service
• General XGSP specification of a collaborative session, capturing H.323, SIP, JXTA
• Audio-video conferencing as a web service – Admire (Beihang), Access Grid, VoIP, Polycom, desktop USB
• Move all tools and shared applications (Word, PowerPoint)
• General scheme to make Web Services collaborative using NaradaBrokering
• Carousel handheld collaborative environments
• iPAQ running the Savaje Java OS linked to PCs; adding cell-phone/PDA tandems
• SVG as a Web Service demonstrated
• Universal access
• http://grids.ucs.indiana.edu/ptliupages/projects/carousel/
