1 / 10

Next Generation Petascale Programming: High Productivity Language and Systems

This project aims to guide the development of high productivity, high-performance scientific programming environments for Petascale machines by the year 2010. It explores new programming languages and systems, emphasizing performance, productivity, and advancements in concurrency.

dhelen
Download Presentation

Next Generation Petascale Programming: High Productivity Language and Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. High Productivity Language and Systems: Next Generation Petascale Programming Wael R. Elwasif, David E. Bernholdt, and Robert J. Harrison Computer Science & Mathematics Division Oak Ridge National Laboratory

  2. Guiding the development of the next generation HPC programming environments Joint project with: • Argonne National Laboratory • Lawrence Berkeley National Laboratory • Rice University Part of DARPA HPCS program Chart the course towards high productivity, high performance scientific programming on Petascale machines by the year 2010. Candidate Languages: • Chapel (Cray) • Fortress (Sun) • X10 (IBM)

  3. Revolutionary approach to large scale parallel programming • Emphasis on performance and productivity • Not SPMD • Lightweight “threads”, LOTS of them • Different approaches to locality awareness/management • High level (sequential) language constructs • Rich array data types (part of the base languages) • Strongly typed object oriented base design • Extensible language model • Generic programming

  4. Concurrency: the next generation • Single initial thread of control • Parallelism through language constructs • True global view of memory, one sided access model • Support task and data parallelism • “Threads” grouped by “memory locality” • Extensible, rich distributed array capability • Advanced concurrency constructs: • Parallel loops • Generator-based looping and distributions • Local and remote futures

  5. What about productivity? • Index sets/regions for arrays • “Array language” (Chapel, X10) • Safe(r) and more powerful language constructs • Atomic sections vs locks • Sync variables and futures • Clocks (X10) • Type inference • Leverage advanced IDE capabilities • Units and dimensions (Fortress) • Component management, testing, contracts (Fortress) • Math/science-based presentation (Fortress)

  6. An exercise in MADNESS!! • Multiresolution ADaptive NumErical Scientific Simulation • Multiresolution, multiwavelet representation for quantum chemistry and other domains • Adaptive mesh in 1-6+ dimensions • Very dynamic refinement • Distribution by subtrees == spatial decomposition • Sequential code very compactly written using recursion • Initial parallel code using MPI is about 10x larger

  7. The MADNESS continues Explicit MPI template <typename T> void Function<T>::_refine(OctTreeTPtr& tree) { if (tree->islocalsubtreeparent() && isremote(tree)) { FOREACH_CHILD(OctTreeTPtr, tree, bool dorefine; comm()->Recv(dorefine, tree->rank(), REFINE_TAG); if (dorefine) { set_active(child); set_active(tree); _project(child); } _refine(child);); } else { TensorT* t = coeff(tree); if (t) { TensorT d = filter(*t); d(data->cdata->s0) = 0.0; bool dorefine = (d.normf() > truncate_tol(data->thresh,tree->n())); if (dorefine) unset_coeff(tree); FOREACH_REMOTE_CHILD(OctTreeTPtr, tree, comm()->Send(dorefine, child->rank(), REFINE_TAG); set_active(child);); FORIJK(OctTreeTPtr child = tree->child(i,j,k); if (!child && dorefine) child = tree->insert_local_child(i,j,k); if ( child && islocal(child)) { if (dorefine) { _project(child); set_active(child); } _refine(child); }); } else { FOREACH_REMOTE_CHILD(OctTreeTPtr, tree, comm()->Send(false, child->rank(), REFINE_TAG);); FOREACH_LOCAL_CHILD(OctTreeTPtr, tree, _refine(child);); } } } Cilk-like multithreaded all of the HPLS solutions look like this template <typename T> void Function::_refine (OctTreeTPtr tree) { FOREACH_CHILD(OctTreeTPtr, tree, _project(child);); TensorT* t = coeff(tree); TensorT d = filter(*t); if (d.normf() > truncate_tol(data->thresh,tree->n())) { unset_coeff(tree); FOREACH_CHILD(OctTreeTPtr, tree, spawn _refine(child);); } } The complexity of the MPI code mostly arises from using an SPMD model to maintain a consistent distributed data structure. The explicitly distributed code must work on active and inactive nodes – not necessary with global-view and remote activity creation.

  8. Tradeoffs in HPLS language design • Emphasis on parallel safety (X10) vs. expressivity (Chapel, Fortress) • Locality control and awareness: • X10: explicit placement and access • Chapel: user-controlled placement, transparent access • Fortress: placement “guidance” only, local/remote access blurry (data may move!!!) • What about mental performance models? • Programming language representation: • Fortress: Allow math-like representation • Chapel, X10: Traditional programming language front end • How much do developers gain from mathematical representation? • Productivity/Performance tradeoff • Different users have different “sweet-spots”

  9. Remaining challenges • (Parallel) I/O model • Interoperability with (existing) languages and programming models • Better (preferably portable) performance models and scalable memory models • Especially for machines with 1M+ processors • Other considerations: • Viable gradual adoption strategy • Building a complete development ecosystem

  10. Contacts Wael R. Elwasif Computer Science and Mathematics Division (865) 241-0002 elwasifwr@ornl.gov David E. Bernholdt Computer Science and Mathematics Division (865) 574-3147 bernholdtde@ornl.gov Robert J. Harrison Computer Science and Mathematics Division (865) 241-3937 harrisonrj@ornl.gov 10 Elwasif_0611

More Related