NESL: Revisited


  1. NESL: Revisited
     Guy Blelloch, Carnegie Mellon University

  2. Experiences from the Lunatic Fringe
     Guy Blelloch, Carnegie Mellon University
     Title: 1995 Talk on NESL at ARPA PI Meeting

  3. NESL: Motivation
     Language for describing parallel algorithms
       • Ability to analyze runtime
       • To describe known algorithms
     Portable across different architectures
       • SIMD and MIMD
       • Shared and distributed memory
     Simple
       • Easy to program, analyze and debug

  4. NESL: In a nutshell
     Simple call-by-value functional language
       + Built-in parallel type (nested sequences)
       + Parallel map (apply-to-each)
       + Parallel aggregate operations
       + Cost semantics (work and depth)
     *Sequential Semantics*
     Some non-pure features at “top level”

  5. NESL: History
       • Developed in 1990
       • Implemented on CM, Cray, MPI, and sequentially using a stack-based intermediate language
       • Interactive environment with remote calls
       • Over 100 algorithms and applications written; used to teach parallel algorithms
       • Mostly dormant since 1997

  6. Original “mapquest”
     Web-based interface for finding addresses
     Zooming, panning, finding restaurants

  7. NESL: Nested Sequences
     Built-in “parallel” type:
       [3.0, 1.0, 2.0] : [float]
       [[4, 5, 1, 6], [2], [8, 11, 3]] : [[int]]
       "Yoknapatawpah County" : [char]
       ["the", "rain", "in", "Spain"] : [[char]]
       [(3, "Italy"), (1, "sun")] : [int*[char]]

  8. NESL: Parallel Map
       A = [3.0, 1.0, 2.0]
       B = [[4, 5, 1, 6], [2], [8, 11, 3]]
       C = "Yoknapatawpah County"
       D = ["the", "rain", "in", "Spain"]
     Sequence comprehensions:
       {x + .5 : x in A} -> [3.5, 1.5, 2.5]
       {sum(b) : b in B} -> [16, 2, 22]
       {c in C | c < 'n} -> "kaaaahc"
       {w[0] : w in D} -> "triS"
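
For readers without a NESL interpreter, the comprehensions on this slide can be approximated with Python list comprehensions. This is only a sequential sketch: Python evaluates the elements one at a time, whereas NESL's apply-to-each is parallel. The variable names on the left are illustrative, and the last predicate is an assumption added to show the filter form.

```python
# Sequential Python approximation of the NESL comprehensions above.
A = [3.0, 1.0, 2.0]
B = [[4, 5, 1, 6], [2], [8, 11, 3]]
D = ["the", "rain", "in", "Spain"]

# {x + .5 : x in A}
shifted = [x + 0.5 for x in A]

# {sum(b) : b in B}  (one reduction per inner sequence)
row_sums = [sum(b) for b in B]

# {w[0] : w in D}  (first character of each word, joined back into a string)
initials = "".join(w[0] for w in D)

# {x in A | x < 2.5}  (filter form; this predicate is not on the slide)
small = [x for x in A if x < 2.5]
```

The nested case (`row_sums`) is the one that matters for NESL: the inner `sum` is itself a parallel operation, so the parallelism is nested rather than flat.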

  9. NESL: Aggregate Operations
       A = [3.0, 1.0, 2.0]
       D = ["the", "rain", "in", "Spain"]
       E = [(3, "Italy"), (1, "sun")]
     Parallel write : ['a] * [int*'a] -> ['a]
       D <- E -> ["the", "sun", "in", "Italy"]
     Prefix sum : ('a*'a->'a) * 'a * ['a] -> ['a]*'a
       scan('+, 2.0, A) -> ([2.0, 5.0, 6.0], 8.0)
       plus_scan(A) -> [0.0, 3.0, 4.0]
       sum(A) -> 6.0
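
The semantics of these aggregate operations can be modelled sequentially in a few lines of Python. The sketch below is not NESL's implementation; the function names `write`, `scan` and `plus_scan` are chosen to echo the slide, and the "last write wins" behaviour for duplicate indices is an assumption of this model.

```python
import operator
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")

def write(dest: Sequence[T], updates: Sequence[Tuple[int, T]]) -> List[T]:
    """Model of `dest <- updates`: copy dest, then apply each (index, value)."""
    out = list(dest)
    for i, v in updates:
        out[i] = v  # assumption: with duplicate indices, the last update wins
    return out

def scan(op: Callable[[T, T], T], identity: T, xs: Sequence[T]) -> Tuple[List[T], T]:
    """Exclusive prefix scan: returns (running prefixes, overall total)."""
    prefixes: List[T] = []
    acc = identity
    for x in xs:
        prefixes.append(acc)  # prefix excludes the current element
        acc = op(acc, x)
    return prefixes, acc

def plus_scan(xs: Sequence[float]) -> List[float]:
    """Exclusive prefix sum starting from 0.0, dropping the total."""
    return scan(operator.add, 0.0, xs)[0]

A = [3.0, 1.0, 2.0]
D = ["the", "rain", "in", "Spain"]
E = [(3, "Italy"), (1, "sun")]
```

With these definitions, `write(D, E)`, `scan(operator.add, 2.0, A)` and `plus_scan(A)` reproduce the three results shown on the slide.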

  10. NESL: Cost Model
      Combining for parallel map:
        pexp = {exp(e) : e in A}
      Can prove runtime bounds for PRAM (W = work, D = depth, P = processors):
        T = O(W/P + D log P)
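
The combining rule for the parallel map did not survive in the transcript. In the usual work-depth model it reads as below: work adds up across the elements, while depth is the maximum over them. This is a reconstruction from the standard model rather than text recovered from the slide, and the additive constant 1 (the overhead of the map itself) is an assumption.

```latex
\begin{align*}
W(\texttt{pexp}) &= 1 + \sum_{e \in A} W\bigl(\texttt{exp}(e)\bigr) \\
D(\texttt{pexp}) &= 1 + \max_{e \in A} D\bigl(\texttt{exp}(e)\bigr)
\end{align*}
```

As a rough worked instance of the PRAM bound, taking $W = 10^6$, $D = 20$ and $P = 64$ (hypothetical numbers, logarithm base 2) gives $W/P + D \log P = 15625 + 120$, so the work term dominates and the computation scales almost linearly at that processor count.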

  11. NESL: Other Libraries
       • String operations
       • Graphical interface
       • CGI interface for web applications
       • Dictionary operations (hashing)
       • Matrices

  12. Example: Quicksort (Version 1)
        function quicksort(S) =
          if (#S <= 1) then S
          else
            let a  = S[rand(#S)];
                S1 = {e in S | e < a};
                S2 = {e in S | e == a};
                S3 = {e in S | e > a};
            in quicksort(S1) ++ S2 ++ quicksort(S3);
      D = O(n)   W = O(n log n)

  13. Example: Quicksort (Version 2)
        function quicksort(S) =
          if (#S <= 1) then S
          else
            let a  = S[rand(#S)];
                S1 = {e in S | e < a};
                S2 = {e in S | e == a};
                S3 = {e in S | e > a};
                R  = {quicksort(v) : v in [S1, S3]};
            in R[0] ++ S2 ++ R[1];
      D = O(log n)   W = O(n log n)
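
The difference between the two versions is only in how the depths of the recursive calls combine: one after the other in Version 1, inside a parallel map in Version 2. The Python sketch below makes that explicit by returning a depth alongside the sorted list. It is a simplified cost model, not NESL's: charging depth 1 per call for the filters is an assumption chosen to match the slide's bounds, and the pivot is the middle element rather than a random one so that the result is reproducible.

```python
from typing import List, Tuple

def quicksort_depth(s: List[int], parallel_calls: bool) -> Tuple[List[int], int]:
    """Return (sorted list, depth) under a simplified work-depth model.

    parallel_calls=False models Version 1: the two recursive calls run one
    after the other, so their depths add.
    parallel_calls=True models Version 2: the calls run in a parallel map,
    so only the larger depth counts.
    """
    if len(s) <= 1:
        return list(s), 1
    a = s[len(s) // 2]  # assumption: middle element as pivot (slide uses rand)
    s1 = [e for e in s if e < a]
    s2 = [e for e in s if e == a]
    s3 = [e for e in s if e > a]
    r1, d1 = quicksort_depth(s1, parallel_calls)
    r3, d3 = quicksort_depth(s3, parallel_calls)
    below = max(d1, d3) if parallel_calls else d1 + d3
    # The constant 1 stands for the three filters and the concatenation.
    return r1 + s2 + r3, 1 + below

data = list(range(1023))  # already sorted, so the middle pivot splits evenly
_, depth_v1 = quicksort_depth(data, parallel_calls=False)
_, depth_v2 = quicksort_depth(data, parallel_calls=True)
```

On this perfectly balanced 1023-element input the modelled depth grows linearly with the input size for Version 1 and logarithmically for Version 2, which is the gap the two slides point at.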

  14. Example: Representing Graphs
      [Figure: example graph on vertices 0, 1, 2, 3, 4]
      Edge list representation:
        [(0,1), (0,2), (2,3), (3,4), (1,3), (1,0), (2,0), (3,2), (4,3), (3,1)]
      Adjacency list representation:
        [[1,2], [0,3], [0,3], [1,2,4], [3]]
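
The two representations carry the same information, and converting between them is mechanical. The following Python sketch shows one way to do it; the helper names `edges_to_adjacency` and `adjacency_to_edges` are hypothetical, and sorting each neighbour list is a choice made here so the output matches the slide's ordering.

```python
from typing import List, Tuple

def edges_to_adjacency(n: int, edges: List[Tuple[int, int]]) -> List[List[int]]:
    """Group the directed edges (u, v) by source vertex u."""
    adj: List[List[int]] = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    return [sorted(neighbours) for neighbours in adj]

def adjacency_to_edges(adj: List[List[int]]) -> List[Tuple[int, int]]:
    """Flatten the nested sequence back into (u, v) pairs."""
    return [(u, v) for u, neighbours in enumerate(adj) for v in neighbours]

EDGES = [(0, 1), (0, 2), (2, 3), (3, 4), (1, 3),
         (1, 0), (2, 0), (3, 2), (4, 3), (3, 1)]
ADJ = edges_to_adjacency(5, EDGES)
```

Here `ADJ` comes out as the adjacency list shown on the slide, and `adjacency_to_edges(ADJ)` contains the same ten directed edges as `EDGES`, in a different order.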

  15. Example: Graph Connectivity
      L = vertex labels, E = edge list
        function randomMate(L, E) =
          if #E == 0 then L
          else
            let FL = {randBit(.5) : x in [0:#L]};
                H  = {(u,v) in E | FL[u] and not(FL[v])};
                L  = L <- H;
                E  = {(L[u], L[v]) : (u,v) in E | L[u] /= L[v]};
            in randomMate(L, E);
      Use hashing to avoid non-determinism
      D = O(log n)   W = O(m log n)
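
A sequential Python sketch of the random-mate contraction may help in following the slide. It mirrors the slide's four steps in a loop instead of recursion. Three things are assumptions of this sketch rather than part of the slide: the seeded `random.Random` generator, the "last write wins" rule when several hooks target the same vertex, and the separate `find_roots` pass that follows label pointers to a representative.

```python
import random
from typing import List, Tuple

def random_mate(n: int, edges: List[Tuple[int, int]], seed: int = 0) -> List[int]:
    """Contract the graph until no edges remain; returns a pointer forest."""
    rng = random.Random(seed)  # assumption: seeded for reproducibility
    labels = list(range(n))    # L: every vertex starts as its own label
    while edges:
        # FL: one random bit per vertex
        flags = [rng.random() < 0.5 for _ in range(n)]
        # H: edges whose source is flagged and whose target is not
        hooks = [(u, v) for (u, v) in edges if flags[u] and not flags[v]]
        # L <- H: hook u onto v (here the last write to an index wins)
        for u, v in hooks:
            labels[u] = v
        # Relabel the endpoints and drop edges that became self-loops
        edges = [(labels[u], labels[v]) for (u, v) in edges
                 if labels[u] != labels[v]]
    return labels

def find_roots(labels: List[int]) -> List[int]:
    """Follow label pointers until a vertex that labels itself is reached."""
    def root(x: int) -> int:
        while labels[x] != x:
            x = labels[x]
        return x
    return [root(x) for x in range(len(labels))]
```

Two vertices are in the same connected component exactly when `find_roots(random_mate(n, edges))` gives them the same value. A hooked vertex disappears from the edge list after relabelling, so the pointers never form a cycle.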

  16. Lesson 1: Sequential Semantics
       • Debugging is much easier without non-determinism
       • Analyzing correctness is much easier without non-determinism
       • If it works on one implementation, it works on all implementations
       • Some problems are inherently concurrent; these aspects should be separated

  17. Lesson 2: Cost Semantics
       • Need a way to analyze cost, at least approximately, without knowing details of the implementation
       • Any cost model based on processors is not going to be portable: too many different kinds of parallelism

  18. Lesson 3: Too Much Parallelism
      Needed ways to back out of parallelism
       • Memory problem
       • The “flattening” compiler technique was too aggressive on its own
       • Need for depth-first schedules or other scheduling techniques
       • Various bounds shown on memory usage

  19. Limitations
      Communication was a bottleneck on machines available in the mid-1990s and required “micromanaging” data layout for peak performance.
      Language would need to be extended
       • PSCICO Project (Parallel Scientific Computing) was looking into this
      Hard to get users for a new language

  20. Relevance to Multicore Architecture
       • Communication is hopefully better than across chips
       • Can make use of multiple forms of parallelism (multiple threads, multiple processors, multiple function units)
       • Schedulers can take advantage of shared caching [SPAA04]
       • Aggregate operations can possibly make use of on-chip hardware support
