Chapel: The Cascade High Productivity Language. Ting Yang University of Massachusetts Amherst. Context. HPCS = High Productivity Computing Systems Programmability Performance Portability Robustness Cascade = Cray’s HPCS Project System-wide consideration of productivity impacts
Chapel: The Cascade High Productivity Language
Ting Yang, University of Massachusetts Amherst

Context
HPCS = High Productivity Computing Systems
• Programmability
• Performance
• Portability
• Robustness
Cascade = Cray’s HPCS Project
• System-wide consideration of productivity impacts
• Processors, memory, network, OS
• Runtime, compilers, languages
Chapel = Cascade High-Productivity Language
[Figure: DARPA HPCS Program > Cray’s Cascade Project > Chapel Language; also shown: Sun, IBM]

Introduction – Why Chapel
• Fragmented model: MPI, SHMEM, UPC
  • Code is written on a processor-by-processor basis
    • Breaks up data structures
    • Breaks up control flow
  • Algorithms are mixed with per-processor management details in the computation
    • Virtual processor topology
    • Communication details
    • Choice of data structures, memory layout
  • Fails to support composition of parallelism
  • Lacks productivity, flexibility, portability
  • Difficult to understand and maintain

Introduction
• Global-view model: HPF, OpenMP, ZPL, NESL
  • No need to decompose data and control flow by hand
  • Decomposition is done by the compiler and runtime
  • Users provide high-level guidance
  • Natural and intuitive
  • Lacks abstractions: set, hash, graph
  • Performance is not as good as MPI
  • Difficult to compile

Introduction – Chapel
• Chapel: Cascade High-Productivity Language
  • Built from HPF and ZPL
  • Strictly typed
• Overall goal:
  • Simplify the creation of parallel programs
  • Provide high-performance production-grade codes
  • More generality
• Motivating language technologies:
  • Multithreaded parallel programming
  • Locality-aware programming
  • Object-oriented programming
  • Generic programming and type inference

Outline
• Introduction
• Multithreaded Parallel Programming
  • Data Parallel
  • Task Parallel
• Locality-aware Programming
  • Data Distribution
  • Computation Distribution
• Other Features
• Summary

Multithreaded Parallel Programming
• Provide a global view of computation and data structures
• Composition of parallelism
• Abstraction of data and task parallelism
  • Data: domains, arrays, graphs
  • Task: cobegins, atomic, sync variables
• Virtualization of threads
  • locales

Data Parallelism: Domains
• Domain: an index set (first class)
  • Specifies the size and shape of “arrays”
  • Supports sequential and parallel iteration
  • Potentially decomposed across locales
  • Each domain has an index type: index(domain)
• Fundamental concept of data parallelism
  • Generalization of ZPL’s region
• Important domains
  • Arithmetic: indices are Cartesian tuples
    • Arrays, multidimensional arrays
    • Can be strided and arbitrarily sparse
  • Indefinite: indices are hash keys
    • Maps, hash tables, associative arrays
  • Opaque: indices are anonymous
    • Sets, trees, graphs
  • Others: enumerated

Domain Uses
• Declaring arrays
    var A, B: [D] float;
• Sub-array references
    A(DInner) = B(DInner);
• Sequential iteration
    for (i,j) in DInner { … A(i,j) … }
    or: for ij in DInner { … A(ij) … }
• Parallel iteration
    forall (i,j) in DInner { … A(i,j) … }
    or: forall ij in DInner { … A(ij) … }
• Array re-allocation
    D = [1..2*m, 1..2*n];
[Figure: arrays A and B over domain D, with sub-arrays A(DInner) and B(DInner) highlighted]
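The domain uses listed above can be tried out in a current Chapel release. The following is a minimal sketch, not the slide's original code: it assumes present-day syntax, in which domain literals use braces, arrays are indexed with square brackets, and `real` takes the place of the slide's `float`. The sizes `m` and `n` are hypothetical.

```chapel
// Sketch in current Chapel syntax; the slide uses an earlier draft syntax.
config const m = 4, n = 4;          // hypothetical problem size

var D = {1..m, 1..n};               // 2D rectangular ("arithmetic") domain
const DInner = {2..m-1, 2..n-1};    // interior index set
var A, B: [D] real;                 // two arrays declared over the same domain

B = 1.0;                            // whole-array assignment
A[DInner] = B[DInner];              // sub-array reference through a domain

// Sequential iteration, destructuring each index into (i, j)
for (i, j) in DInner do
  A[i, j] += 1.0;

// Parallel iteration, using the index as a single 2-tuple
forall ij in DInner do
  A[ij] *= 2.0;

// Assigning to the domain re-allocates every array declared over it
D = {1..2*m, 1..2*n};
writeln(A.size);                    // number of elements after re-allocation
```

Because `A` and `B` are both declared over `D`, the single assignment to `D` resizes both arrays, which is the point of the "array re-allocation" bullet.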

Indefinite Domains
    var People: domain(string);
    var Age: [People] integer;
    var Birthdate: [People] string;
    Age("john") = 60;
    Birthdate("john") = "12/11/1946";
    forall person in People {
      if (Birthdate(person) == today) {
        Age(person) += 1;
      }
    }
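Current Chapel releases call this kind of domain an associative domain. The sketch below restates the slide's example in present-day syntax under a few assumptions: `int` replaces `integer`, the index is added to the domain explicitly before it is used, and `today` is a hypothetical constant holding only month and day so that the comparison can succeed.

```chapel
// Sketch in current Chapel syntax; names mirror the slide.
var People: domain(string);         // associative domain keyed by strings
var Age: [People] int;
var Birthdate: [People] string;

People += "john";                   // add the index before using it
Age["john"] = 60;
Birthdate["john"] = "12/11";        // month/day only, a simplification

const today = "12/11";              // hypothetical stand-in for the current date

// Parallel loop over every index currently in the domain
forall person in People do
  if Birthdate[person] == today then
    Age[person] += 1;

writeln(Age["john"]);
```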

Opaque Domains
    var Vertices: domain(opaque);
    for i in (1..5) {
      Vertices.newIndex();
    }
    var AV, BV: [Vertices] float;
[Figure: domain Vertices with arrays AV and BV]

Building A Tree
    var Vertices: domain(opaque);
    var left, right: [Vertices] index(Vertices);
    var root: index(Vertices);
    root = Vertices.newIndex();
    left(root) = Vertices.newIndex();
    right(root) = Vertices.newIndex();
    left(right(root)) = Vertices.newIndex();
[Figure: the resulting tree, labeled at root]

The Domain/Index Hierarchy
• Every domain has an index type
• Eliminates most runtime bounds checks

Task Parallelism
• co-begins: statements that may run in parallel
    cobegin {
      ComputeTaskA(…);
      ComputeTaskB(…);
    }
    ComputeTaskA() {
      cobegin {
        ComputeTaskC(…);
        ComputeTaskD(…);
      }
      ComputeTaskE(…);
    }
• atomic blocks
    atomic {
      newnode.next = insertpt;
      newnode.prev = insertpt.prev;
      insertpt.prev.next = newnode;
      insertpt.prev = newnode;
    }
• sync and single-assignment variables
  • Synchronize tasks
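The nested cobegin structure and the role of sync variables can be sketched in current Chapel as follows. The task bodies are hypothetical placeholders. The `atomic int` used for the shared counter is an atomic variable, a different feature from the `atomic { … }` block on the slide, which current releases do not provide.

```chapel
// Sketch in current Chapel syntax; task bodies are placeholders.
var total: atomic int;              // shared counter (atomic variable, not an atomic block)
var cDone: sync bool;               // sync variable, starts out empty

proc taskC() { total.add(1);  cDone.writeEF(true); }   // fill: signals that C is done
proc taskD() { cDone.readFE(); total.add(10); }        // blocks until C has signaled
proc taskE() { total.add(100); }

proc taskA() {
  cobegin {                         // C and D may run in parallel
    taskC();
    taskD();
  }
  taskE();                          // runs only after both C and D finish
}

proc taskB() { total.add(1000); }

cobegin {                           // A and B may run in parallel
  taskA();
  taskB();
}
writeln(total.read());
```

A cobegin does not complete until all of its statements have completed, so `taskE` can rely on the work of `taskC` and `taskD`, and the final `writeln` sees all four contributions.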

Locality-aware programming
• locale: machine unit of storage and processing
• Specify the number of locales on the command line
    ./myProgram -nl 8
• Chapel provides a built-in locale array:
    const Locales: [1..numLocales] locale;
• Users may define their own locale arrays:
    var CompGrid: [1..GridRows, 1..GridCols] locale = …;
    var TaskALocs: [1..numTaskALocs] locale = …;
    var TaskBLocs: [1..numTaskBLocs] locale = …;
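A short sketch of the built-in locale array in a current Chapel release. One difference from the slide is worth noting: in current releases `Locales` is indexed from 0, not from 1. Running on more than one locale assumes a multi-locale build of Chapel.

```chapel
// Sketch in current Chapel syntax.
// Run with, for example:  ./myProgram -nl 4
writeln("running on ", numLocales, " locale(s)");

// Visit each locale in turn and report from it
for loc in Locales do
  on loc do
    writeln("hello from locale ", here.id);
```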

Data Distribution
• Domains can be distributed across locales
    var D: domain(2) distributed(block(2) to CompGrid) = …;
• Distributions are specified by
  • Mapping of indices to locales
  • Per-locale storage layout of domain indices and array elements
• Distributions are implemented as a class hierarchy
  • Chapel provides a group of standard distributions
  • Users may also write their own ???
• Supports reduce and scan (parallel prefix)
  • Including user-defined operations
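As a rough illustration of a distributed domain together with reduce and scan, here is a sketch in current Chapel syntax. It assumes Chapel 2.0 or later, where the standard block distribution is provided by the `BlockDist` module as `blockDist`; the slide's `distributed(block(2) to CompGrid)` clause belongs to the earlier draft syntax. The size `n` and the sample values are hypothetical.

```chapel
// Sketch in current Chapel syntax, assuming Chapel 2.0 or later.
use BlockDist;

config const n = 8;                 // hypothetical problem size

// A 2D domain whose indices are block-distributed across all locales
const D = blockDist.createDomain({1..n, 1..n});
var A: [D] int;

// Each iteration runs on the locale that owns index (i, j)
forall (i, j) in D do
  A[i, j] = here.id;

writeln(+ reduce A);                // sum of the owning locale ids

// Built-in reduce and scan on a small local array
var V = [3, 1, 4, 1, 5];
writeln(+ reduce V);                // 14
writeln(max reduce V);              // 5
writeln(+ scan V);                  // inclusive prefix sums: 3 4 8 9 14
```

On a single locale every element of `A` is 0; with more locales the array records which locale owns each block.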

Computation Distribution
• The “on” keyword associates tasks with locale(s)
• “on” can also be used in a data-driven manner
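Both uses of "on" can be sketched in current Chapel syntax. The first form names a locale directly; the second names a variable and runs the statement wherever that variable is stored. The variable `x` is a hypothetical example.

```chapel
// Sketch in current Chapel syntax.
var x = 0;                          // x is stored on locale 0, where the program starts

// Locale-driven: run the statement on an explicitly named locale
on Locales[numLocales-1] do
  x = here.id;

// Data-driven: run the statement on whichever locale stores x
on x do
  writeln("x = ", x, ", running on locale ", here.id);
```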

Other Features
• Object-oriented interface
  • Optional OO style
  • Overloading
  • Advanced language features expressed in classes
• Generics and type inference
  • Type variables and parameters
  • Similar to class templates in C++
• Sequences (“seq”), iterators
  • “ordered” keyword suppresses parallelism
• Modules (for namespace management)
• Parallel garbage collection ???
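Generics, type inference, and iterators can be sketched briefly in current Chapel syntax. The names `larger`, `Pair`, and `evens` are hypothetical, and the sketch does not cover the slide's `seq` type or `ordered` keyword.

```chapel
// Sketch in current Chapel syntax; all names are illustrative.

// Generic function: ?t queries the type of x, and y must have the same type
proc larger(x: ?t, y: t): t {
  return if x > y then x else y;
}

// Generic record with a type field, comparable to a C++ class template
record Pair {
  type eltType;
  var first, second: eltType;
}

// Iterator that yields the even numbers in 1..n
iter evens(n: int) {
  for i in 1..n do
    if i % 2 == 0 then yield i;
}

var p = new Pair(real, 1.5, 2.5);   // the type of p is inferred
writeln(larger(p.first, p.second)); // instantiated for real
writeln(larger("ab", "cd"));        // instantiated for string
for e in evens(6) do
  writeln(e);
```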

Outline
• Introduction
• Multithreaded Parallel Programming
  • Data Parallel
  • Task Parallel
• Locality-aware Programming
  • Data Distribution
  • Computation Distribution
• Other Features
• Chapel Status

Chapel Status
• First sequential prototype on one locale
  • Not finished yet
• Currently can run programs with
  • simple domains of up to 2 dimensions
  • partial type inference
• Threads locales processors
• A full prototype in one or two years