One Language to Rule Them All, or Ada Strikes Back?
-- An Update on the DARPA HPCS Language Project
Rusty Lusk
Mathematics and Computer Science Division, Argonne National Laboratory
Outline
• The DARPA HPCS language project
• Why it might not work
  • This has been tried before…
  • The “local maximum” problem
• Why it might work
• The HPCS languages as a group
• The plan
The Lay of the Land in Scalable Parallel Programming
• The current standard: Fortran-77, Fortran-90 (2003), C, and C++ programs with calls to MPI. Use of MPI-2 is increasing, especially for I/O. Message-passing model.
• The PGAS (Partitioned Global Address Space) languages: UPC, Co-Array Fortran, and Titanium. Each implements a similar programming model inside a different base language (C, Fortran-90, and Java, respectively). Static number of processes. Global view of data, but an explicit local/remote distinction for performance. (2nd Annual PGAS Conference, Oct 3–4, Washington)
• The HPCS languages: no fixed number of processes, many advanced features, and a global view of data with locality hints.
• http://crd.lbl.gov/~parry/hpcs_resources.html
The DARPA HPCS Language Project (in a nutshell)
• The DARPA High Productivity Computing Systems (HPCS) project is a 10-year, three-phase hardware/software effort to transform the productivity of the HPC enterprise.
• In Phase II, now ending, three vendors were funded to develop high-productivity language systems, and all three are at work on them:
  • IBM: X10
  • Cray: Chapel
  • Sun: Fortress
DARPA HPCS Language Project (continued)
• These languages are expected to run well on, and exploit, the HPCS hardware platforms being developed by the three vendors (soon to be fewer than three).
• But it is recognized that, in order to be adopted, any HPCS language will also have to be effective on other parallel architectures.
• (And, ironically, the HPCS hardware will have to run Fortran + MPI programs well.)
• DARPA has also recently funded a small academic effort at Argonne (Gropp & Lusk), Berkeley (Yelick), Rice (Mellor-Crummey), and Oak Ridge (Harrison) to:
  • connect the HPCS languages to the compiler and runtime research currently being done for the PGAS languages and MPI, and
  • promote the eventual convergence of the language efforts into a single language.
“HPCS” languages have been tried before…
• The Japanese Fifth Generation project had many similarities and near-similarities to the DARPA program:
  • a 10-year program
  • a new parallel language (concurrent logic programming) expressing a new programming model
  • the new language presented as more productive
  • multiple vendors, each implementing hardware support for the language to address performance problems
• Ada was intended to replace both Fortran and Cobol, and it included built-in parallelism. It was successful in its way: many (defense) applications were implemented in Ada, and a large community of Ada programmers was created.
• Neither language is with us today. It is not clear whether either even influenced our current parallel programming landscape.
Why It Might Not Work – Part 1: The Problem Is Hard
• Programmers do value productivity, but reserve the right to define it.
• Past experience says that ambitious new languages are a risky business.
• The HPCS languages have research issues in their paths:
  • They will need not only heroic compilers but heroic runtimes.
  • Performance has not been seriously addressed yet.
  • Applications are not an inherent part of the language design teams.
• More later on how these problems are being addressed.
Why It Might Not Work – Part 2: The Competition Is Tough
• MPI represents a very complete definition of a well-defined programming model.
• There are many implementations:
  • vendor
  • open source
• It enables high performance for a wide class of architectures.
• A small subset is easy to learn and use.
• Expert MPI programmers are needed mostly for libraries, which the MPI design encourages.
Why Is It So Hard to Get Past MPI? (Important to understand when contemplating alternatives)
• Open process of definition
  • All were invited, but hard work was required.
  • All drafts and deliberations were open at all times.
  • The first decision was not to standardize on any existing system.
• Portability
  • Need not lead to a lowest-common-denominator approach.
  • MPI semantics allow aggressive implementations.
  • Ubiquitous; applications can be developed anywhere.
• Performance
  • MPI can help manage the memory hierarchy.
  • Collective operations provide scalability.
  • It cooperates with optimizing compilers.
• Simplicity
  • MPI (i.e., MPI-2) has 275 functions; is that a lot? (MFC)
  • Serious MPI applications can be written with just 6 functions.
  • Economy of concepts: for example, communicators and datatypes.
Why Is It So Hard to Get Past MPI? (continued)
• Modularity
  • MPI supports component-based software through communicators.
  • Support for libraries means some applications may contain no MPI calls at all.
• Composability
  • MPI works with other tools (compilers, debuggers, profilers).
  • It provides precise interactions with multithreaded programs.
• Completeness
  • Any parallel algorithm can be expressed.
  • Easy things are not always easy in MPI, but hard things are possible.
• Transparency
  • What you code is what you get.
• Implementation research continues in both the vendor and open-source communities.
MPICH2 – A Second-Generation MPI-2 Implementation
• Both research and software
• Focus on performance, completeness, and scalability:
  • algorithms for collective operations
  • new algorithms for handling datatypes
  • collaborations with HPC vendors
• Recent MPICH2 work has focused on improving the low-level runtime interface and its implementation.
  • This benefits all the portability variants of MPICH2.
Why It Might Work (Why the HPCS Languages Might Have an Impact)
• There is widespread interest in the parallel programming issue, and MPI can be painful.
• There is a transition path through the PGAS languages, where significant compiler and runtime research has already been done.
• There is a plan.
• The vendors have all done very interesting work.
Looking at the HPCS Languages as a Group
• Base language
• Creation of parallelism
• Communication and/or data sharing
• Synchronization among threads/processes
• Locality (associating computation with data)
Base Language
• The languages use different sequential bases:
  • X10 uses an existing OO language, Java.
    • Inherits both good and bad features
    • Adds support for arrays, value types, and parallelism
    • Gains tool support from the Java environment
  • Chapel and Fortress each use their own OO language.
    • Can tailor it to science (Fortress explores new character sets)
    • Large intellectual effort in getting the base right
    • No specific analysis of the base language yet
• Question: how large a step can a new language be?
  • Successful new languages have been small steps: C, C++, Java.
Creation of Parallelism
• All three HPCS languages have parallel semantics:
  • not just data parallelism with serial semantics
  • no reliance on automatic parallelization
• All have:
  • dynamic parallelism for loops as well as tasks
  • mixed task and data parallelism
  • encouragement for the programmer to specify as much parallelism as possible, with the idea that the compiler/runtime will control how much actually executes in parallel
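That last idea — expose all the parallelism and let the runtime throttle it — can be sketched with the Python standard library: one task per element is exposed, and the pool's worker count plays the role of the runtime deciding how much actually runs concurrently. The function names here are illustrative, not from any HPCS language.

```python
# Sketch: the program declares maximal parallelism (one task per element);
# the pool size, not the task count, bounds actual concurrency.
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

def parallel_map(func, data, max_workers=4):
    # max_workers stands in for the runtime's decision about how much
    # of the declared parallelism to actually execute at once
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, data))

print(parallel_map(square, range(5)))  # -> [0, 1, 4, 9, 16]
```

The division of labor is the point: the programmer states what may run in parallel; the system chooses what does.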
Sharing and Communication
• Chapel and Fortress are global-address-space languages:
  • can directly read/write remote variables
  • similar to the PGAS languages
• X10 is a “parallel OO” language:
  • remote execution of methods
• All are implicitly asynchronous:
  • sequential consistency within a “place”
  • no collective communication
• Distributed arrays:
  • In Chapel, the distribution is a separate annotation.
  • In X10, there is no direct remote access to distributed array data.
  • In Fortress, there are user-defined distributed data structures without explicit layout control.
Synchronization
• All three languages support atomic blocks.
• None of the languages has locks.
  • (Semantically, locks are more powerful, but harder to manage, than atomic sections.)
• Other mechanisms:
  • X10 has “clocks” (barriers with dynamically attached tasks), conditional atomic sections, and synchronization variables.
  • Chapel has “single” (single-writer) and “sync” (multiple readers and writers) variables.
  • Fortress has abortable atomic sections and a mechanism for waiting on individual spawned threads.
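What an atomic block buys the programmer can be seen by emulating one with an explicit lock in standard Python. In the HPCS languages the critical section below would simply be written as an atomic block; with locks, correctness depends on every thread remembering to take the same lock, which is exactly the manageability point above.

```python
# Emulating an atomic section with an explicit lock: without the lock,
# the read-modify-write of `counter` could interleave across threads.
import threading

counter = 0
lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        with lock:           # stands in for `atomic { counter += 1 }`
            counter += 1

threads = [threading.Thread(target=increment, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4000: the lock makes the increments race-free
```

An atomic block gives the same guarantee declaratively; the language, not the programmer, is responsible for the mutual exclusion.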
Locality
• All three languages have a way to associate computation with data, for performance.
• It looks a little different in each language (“places” vs. “locales”).
• Explicitly distributed data structures enable automating this, especially for arrays.
• Fortress, in particular, delegates the problem to libraries.
Recent Activities on the HPCS Front
• The HPCS/PGAS project has led to collaboration with the vendors on several compiler/runtime issues.
• A workshop was held at Oak Ridge in July 2006 for assessment, comparison, and future planning.
  • A tentative plan was made for near- and medium-term developments.
  • It is being incorporated into the larger HPCS productivity plan.
Thoughts from the Recent Workshop
• A main purpose of the workshop was to explicitly explore converging the HPCS languages into a single language.
• Perhaps life would be tidier if there were only one HPCS language being discussed at this point.
  • But it would be less interesting, and perhaps less effective in the long run.
• Premature convergence might be a bad idea, for reasons both technical and philosophical.
• Diversity is good:
  • One summary of the current situation is that there is a lot of diversity.
  • Challenge: how to make sure this remains a good thing.
Diversity in the Vendor Approaches
• The HPCS vendors are not just designing three different versions of the same thing:
  • Chapel: “to boldly go where no programming language has gone before”
  • X10: extending an existing language environment, fitting into a larger structure
  • Fortress: providing a framework for parallel language design; taking risks with the visual aspects of program text
• It would be a shame to lose any of these points of view.
But Some Uniformity in What Is Missing So Far
• Language support for high-performance I/O
  • coalescing through collectivity
• Tackling performance
  • necessary in order to attract attention from applications
Diversity in Applications
• One extreme: “current practice is perceived as working well”
  • MPI being used as intended
• The other extreme: critical applications cannot be done at all without a new way of programming.
  • More dynamism and expressivity are needed.
• In between: no one is against productivity, but new approaches have to be strictly better than current ones to justify the investment in change.
Diversity in Relevant CS Research
• PGAS languages – UPC, Co-Array Fortran, Titanium:
  • language design
  • compilation techniques applicable to the HPCS languages
  • runtime system requirements
• Deep collaboration with applications
  • happening with PGAS, not yet with HPCS
• Runtime system implementation
  • Could this level be standardized? Its semantics? Its syntax?
  • That would make it easier for computer scientists to design new languages, which is not necessarily a bad thing.
Some Paths Forward
• Completion of the current language design projects
• Focus on performance credibility, at least for part of each language
  • the UPC experience
• Convergence
  • At all? On what? When?
• Some lessons from a prior convergence effort:
  • Could we carry out an MPI Forum-like process?
The MPI Forum
• A convergence process to produce portability among message-passing libraries
• Regarded as successful in meeting its original goals
  • but not certain to succeed as it went along
  • and standardized some new ideas along the way
• Message-passing libraries already existed:
  • from vendors: mpl, eui, cmmd, nx, ncube, meiko, and others
  • open source: pvm, p4, and others
  • They had similar semantics, and there was application experience with all of them.
• Vendors were pre-committed to adopting the MPI Forum result in their mainstream parallel computer products.
• The Forum decided up front not to pick an existing system.
• An implementation tracked the standard's development.
Could Such a Process Be Used in the HPCS Language Project?
• It depends on what, and when.
• It is certainly too early to try to produce a single language now.
• Perhaps it could work for a common runtime system, if everyone participated.
A Proposed Schedule
• Next 18 months — vendors try to achieve:
  • a “frozen” version of the syntax
  • some workshops on common technical implementation issues
  • preliminary work on a common runtime system by CS researchers and vendors
  • a performance demo of some subset on some parallel machine
• Next three years:
  • Vendors improve performance.
  • Applications gain experience.
  • Some evolution is inevitable as implementers gain experience.
• 2010–11: a new common HPCS language design effort begins.
  • The shared experience of the preceding three years makes this go smoothly, and vendors track the design with implementations.
• 2013: all applications become happily productive and remain so forever.
Some Possible Workshop Topics
• Distributions
• Memory models
• Arrays and regions
• Locales, places, etc.
• Task parallelism
• Common runtime
• Synchronization
• Atomic transactions
• Parallel I/O embedded in the language
• Types
• Interactions with other languages and libraries
• Applications
• Tools for the HPCS languages explicitly, especially debuggers
• Just starting to plan some of these
Summary
• The DARPA HPCS language project is active and healthy.
• Excellent work is being done by the vendors.
• It is time to get the larger community involved.
• Getting a new language adopted, much less a new paradigm, is an admitted challenge.
• The current situation is not all that bad, which doesn’t help.
• There is a reasonable plan for moving forward at a measured pace.
• There are a lot of languages…