X10: IBM’s bid into parallel languages

Learn about X10, a new language based on Java, developed by IBM for non-uniform computing clusters (NUCCs). X10 uses a Partitioned Global Address Space (PGAS) to address scalability issues in parallel languages.



  1. X10: IBM’s bid into parallel languages Paul B Kohler Kevin S Grimaldi University of Massachusetts Amherst

  2. introduction • A new language based on Java • Developed in IBM’s PERCS project (Productive, Easy-to-use, Reliable Computing System), IBM’s entry in DARPA’s HPCS program • Built for NUCCs (Non-Uniform Computing Clusters), where different memory locations incur different access costs.

  3. intro continued • Will eventually be combined with new tools for Eclipse • Goals • Safe • Analyzable • Scalable • Flexible

  4. PGAS • Past attempts at parallel languages have used the illusion of a single shared memory • This does not represent the situation in a NUCC. • Problems occur when we try to divide memory among processors. • X10 uses PGAS to reveal the non-uniformity and make the language scalable.

  5. PGAS (cont) • PGAS = Partitioned Global Address Space • Memory is partitioned into places. Data is associated with a place and can only be read/changed locally. • Provided in X10 through the abstractions of places and activities.

  6. Places • Contain a collection of resident mutable data objects and associated activities • Places represent locality boundaries • Very efficient access to resident data • Set of places remains fixed at runtime • Places are virtual • Mapped to physical processors by runtime • Runtime may transparently migrate places

  7. Using Places • Accessible via place.places • First activity runs at place.FIRST_PLACE • Iterate over places with next() and prev() • here represents current place
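The place ordering described above can be pictured with a small Python model. This is only a sketch: `Place`, `PLACES`, `FIRST_PLACE`, `MAX_PLACES`, and `visit_all` are hypothetical names, and the wrap-around behaviour of `next()` and `prev()` is an assumption rather than something the slides state.

```python
from dataclasses import dataclass

MAX_PLACES = 4  # hypothetical number of places, fixed for the whole run


@dataclass(frozen=True)
class Place:
    """Hypothetical stand-in for an X10 place: a numbered partition."""
    id: int

    def next(self) -> "Place":
        # Assumption: the ordering wraps around after the last place.
        return PLACES[(self.id + 1) % MAX_PLACES]

    def prev(self) -> "Place":
        # Assumption: stepping back from the first place gives the last one.
        return PLACES[(self.id - 1) % MAX_PLACES]


PLACES = tuple(Place(i) for i in range(MAX_PLACES))  # role of place.places
FIRST_PLACE = PLACES[0]                              # role of place.FIRST_PLACE


def visit_all(start: Place) -> list:
    """Walk every place once, starting at `start`, using next()."""
    order, p = [], start
    for _ in range(MAX_PLACES):
        order.append(p.id)
        p = p.next()
    return order
```

Because the set of places is fixed, the tuple is built once and never changes, which mirrors the "set of places remains fixed at runtime" bullet.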

  8. Activities • Similar to Java threads. • Activities are associated with a place. • Activities never migrate places. • An activity may only read/modify mutable data that is local to its place. • However, immutable data (i.e. final or value) may be accessed by any activity.

  9. Activities (cont) • Activities are GALS (Globally Asynchronous Locally Synchronous) • Local data accesses are synchronized • Global data accesses are not synchronized by default. Synchronization can be explicitly forced.

  10. Activities: Syntax • It is very simple to spawn new activities: async(place) statement • This runs the specified statement at the specified place. • Example: • final int result; async(here.next()){result=a+b;} This would add two numbers at the adjacent place and store the result (since result is final it can be accessed by other places)
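A rough Python analogue of spawning an activity at another place is sketched below. It assumes each place can be modelled as one worker thread with its own private dictionary; `Place`, `run_async`, and `add_at` are hypothetical names, not X10 API.

```python
from concurrent.futures import ThreadPoolExecutor


class Place:
    """Hypothetical model of a place: private data plus one worker thread."""

    def __init__(self, ident):
        self.ident = ident
        self.data = {}  # mutable data resident at this place
        self._worker = ThreadPoolExecutor(max_workers=1)

    def run_async(self, fn, *args):
        # Rough analogue of `async(place) statement`: the call returns
        # immediately and `fn` runs later on the target place's worker.
        return self._worker.submit(fn, self, *args)

    def shutdown(self):
        self._worker.shutdown(wait=True)


def add_at(place, a, b):
    # Runs at `place`. The ints a and b are immutable, so any place may read them.
    place.data["result"] = a + b


places = [Place(i) for i in range(2)]
here, neighbour = places[0], places[1]
handle = neighbour.run_async(add_at, 3, 4)
handle.result()  # wait here only so that the example is deterministic
for p in places:
    p.shutdown()
```

The result lands in the neighbour's storage and `here.data` stays empty, which is the locality rule from the Activities slides: mutable data is only touched by activities running at its own place.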

  11. Type System • X10 is strongly typed • Unified type system • Everything is an object; no primitive types • Library supplies boolean, byte, short, char, int, long, float, double, complex, String classes • Borrows Java’s single inheritance combined with interfaces

  12. Reference vs Value Types • Two types of objects • Value types are immutable and can be freely copied • Reference types can contain mutable fields but cannot be migrated • Value classes are declared with the value keyword instead of class • Value classes can still contain fields that are of reference types • Allows them to refer to mutable data • Copying ‘bottoms out’ on reference fields

  13. Type System (cont) • Objects are either scalar or aggregate • Each of value and reference types can be either scalar or aggregate • Types consist of two parts • Data type – The set of values it can take • Place type – The place at which it resides • No generics (yet)

  14. Variables • Variables must be initialized (can never be observed without a value) • final variables cannot be changed after initialization • Declared by using the final keyword and/or using a variable name that starts with a capital letter

  15. Nullable Types • Designers view ability to hold null value as orthogonal to value vs reference type • Either reference or value types can be preceded by nullable • Adds a null value to the type • Multiple nullables are collapsed (i.e. nullable nullable T = nullable T) • Can cast between T and nullable T • (nullable T) v always succeeds • (T) null throws an exception if T is not nullable
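The casting rules above have a loose counterpart in Python's `Optional`, sketched here. The helper names `to_nullable` and `to_non_null` are hypothetical, and `TypeError` is only a stand-in for whatever exception X10 actually raises.

```python
from typing import Optional, TypeVar

T = TypeVar("T")

# Nested Optionals flatten, mirroring `nullable nullable T = nullable T`.
NULLABLES_COLLAPSE = Optional[Optional[int]] == Optional[int]


def to_nullable(v: T) -> Optional[T]:
    """Analogue of `(nullable T) v`: widening never fails."""
    return v


def to_non_null(v: Optional[T]) -> T:
    """Analogue of `(T) v`: narrowing fails at run time when v is null."""
    if v is None:
        raise TypeError("cannot cast null to a non-nullable type")
    return v
```

For instance, `to_non_null(to_nullable(5))` returns `5`, while `to_non_null(None)` raises.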

  16. Rooted exceptions • What should happen when a thread/activity terminates abnormally? • In Java it’s unclear, since the spawning thread may have already terminated. • X10 uses a rooted exception model. All uncaught exceptions get passed to the calling activity. • A new blocking command finish s is introduced. This command waits for all activities spawned in s to terminate before proceeding.

  17. Exceptions (cont) • Finish allows exceptions to travel back towards the root activity and possibly be caught and handled along the way. • Example: try{ finish async(here.next()){ throw new Exception(); } } catch(Exception e){ }
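The finish-plus-exception behaviour above can be imitated in Python with a context manager. This is a sketch under simplifying assumptions: activities are plain threads, only the first failure is re-raised, and `Finish`, `spawn`, and `failing_activity` are hypothetical names.

```python
import threading


class Finish:
    """Hypothetical analogue of `finish S`: wait for every activity spawned
    inside the block, then re-raise the first failure in the parent."""

    def __enter__(self):
        self._threads, self._errors = [], []
        return self

    def spawn(self, fn, *args):
        def body():
            try:
                fn(*args)
            except Exception as exc:  # capture the failure instead of losing it
                self._errors.append(exc)

        t = threading.Thread(target=body)
        self._threads.append(t)
        t.start()

    def __exit__(self, exc_type, exc, tb):
        for t in self._threads:
            t.join()  # block until all spawned activities have ended
        if exc_type is None and self._errors:
            raise self._errors[0]  # the exception travels towards the root
        return False


def failing_activity():
    raise ValueError("boom")


caught = None
try:
    with Finish() as f:
        f.spawn(failing_activity)
        f.spawn(lambda: None)
except ValueError as e:
    caught = str(e)
```

The parent cannot leave the `with` block before its children are done, so there is always a live ancestor to receive the exception. That is the point of the rooted model.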

  18. Arrays • X10 features an array sub-language similar to ZPL. • Arrays have: • Regions • Distributions • Arrays are operated on by: • for • foreach • ateach • And more!

  19. Even more arrays • Arrays may be value (immutable) or reference (mutable) • Keyword unsafe allows arrays that will play nice with Java code. • Arrays can run code as an initialization step.

  20. Arrays: Regions • Regions: As in ZPL, a region is a set of indexed data points. • Regions and distributions are first-class constructs. • Regions can be specified like this: • [0:128,0:256] creates a region of 129x257 points, since both bounds are inclusive

  21. Regions (cont.) • Regions can be modified by operations such as union (||), intersection (&&) and set difference (-). • Predefined region types can be constructed using factories. region R2 = region.factory.upperTriangular(25) • In the future users may be able to define their own regions.
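Since a region is just a set of index points, the region operations above map naturally onto Python sets. The sketch below assumes inclusive bounds. `region` and `upper_triangular` are hypothetical helpers, and the exact shape produced by the real `upperTriangular` factory (for example, whether it includes the diagonal) is an assumption.

```python
from itertools import product


def region(*bounds):
    """Build a rectangular region from (low, high) pairs.
    Assumption: both bounds are inclusive."""
    ranges = [range(lo, hi + 1) for lo, hi in bounds]
    return frozenset(product(*ranges))


def upper_triangular(n):
    """Hypothetical stand-in for region.factory.upperTriangular(n);
    assumes an n-by-n triangle that includes the diagonal."""
    return frozenset((i, j) for i in range(n) for j in range(i, n))


r1 = region((0, 3), (0, 3))  # like [0:3,0:3]
r2 = region((2, 5), (2, 5))  # like [2:5,2:5]

union = r1 | r2         # the slides write this as r1 || r2
intersection = r1 & r2  # the slides write this as r1 && r2
difference = r1 - r2
```

Treating regions as first-class values means they can be stored, passed around, and combined before any array is allocated over them.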

  22. Arrays: Distributions • Every array has a distribution. • A distribution is a mapping of array elements to places. • Distributions are over a particular region. • Arrays are typed by their distribution.

  23. Distributions cont. • Currently must use pre-defined distributions (unique, block, cyclic… etc.) • Have set operations like regions. • Can be used as functions, so for a point p and distribution d: d[p] = place which point p maps to (i.e. where the p’th element “lives”).
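A distribution can be modelled as a dictionary from points to place numbers, which makes the "distribution as a function" idea concrete. The helpers `block`, `cyclic`, and `unique` below are hypothetical Python sketches of the predefined distributions named above. The chunking rule used for `block` (ceiling division) is an assumption.

```python
def block(points, num_places):
    """Block distribution: contiguous chunks of points per place."""
    pts = sorted(points)
    chunk = -(-len(pts) // num_places)  # ceiling division (assumed rule)
    return {p: i // chunk for i, p in enumerate(pts)}


def cyclic(points, num_places):
    """Cyclic distribution: deal points out round-robin."""
    return {p: i % num_places for i, p in enumerate(sorted(points))}


def unique(num_places):
    """Unique distribution: exactly one point per place."""
    return {p: p for p in range(num_places)}


region = range(8)  # a one-dimensional region with points 0..7
d_block = block(region, 4)
d_cyclic = cyclic(region, 4)
```

Looking up `d_block[5]` or `d_cyclic[5]` plays the role of `d[p]`: it answers where the element at point 5 lives under each scheme.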

  24. Subarrays • Use various boolean operations on distributions to create subdistributions • To get the portion of a block distribution that is located here: block([1:100]) && [1:100]->here • a | D1 is the portion of array a corresponding to the subdistribution D1
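Restricting a distribution to one place, and then an array to that sub-distribution, might look like the following in Python. This is a sketch: `block`, `restrict_to_place`, `restrict_array`, and the choice of four places with `HERE = 1` are all hypothetical.

```python
def block(points, num_places):
    """Block distribution as a dict from point to place number."""
    pts = sorted(points)
    chunk = -(-len(pts) // num_places)  # ceiling division (assumed rule)
    return {p: i // chunk for i, p in enumerate(pts)}


def restrict_to_place(dist, place):
    """Sub-distribution holding only the points that live at `place`."""
    return {p: q for p, q in dist.items() if q == place}


def restrict_array(array, subdist):
    """Analogue of `a | D1`: the part of `array` covered by `subdist`."""
    return {p: array[p] for p in subdist}


HERE = 1                         # hypothetical id of the current place
dist = block(range(1, 101), 4)   # [1:100] spread over 4 places
local = restrict_to_place(dist, HERE)
a = {p: p * p for p in dist}     # an array holding squares
a_local = restrict_array(a, local)
```

With these assumptions the local portion covers points 26 through 50, so an activity at place 1 can work on `a_local` without touching remote data.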

  28. Array construction • Here is an example of array initialization: float [.] data= new[factory.cyclic([0:200,50:250])] (point [i, j]){return i+j}; • This specifies a 201x201 region (both bounds are inclusive). • This specifies a cyclic distribution over the region. • This code initializes each element to the sum of its i and j coordinates.
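The three ingredients of the initializer above (a region, a cyclic distribution, and a per-point initializer) can be mimicked in Python. This sketch uses a much smaller region to stay readable and assumes inclusive bounds. `make_array` and `NUM_PLACES` are hypothetical names.

```python
from itertools import product

NUM_PLACES = 4  # hypothetical number of places


def make_array(bounds, init):
    """Create a distributed array: returns (distribution, per-place storage).
    Assumption: region bounds are inclusive."""
    points = sorted(product(*[range(lo, hi + 1) for lo, hi in bounds]))
    # Cyclic distribution: deal points out to places round-robin.
    dist = {p: k % NUM_PLACES for k, p in enumerate(points)}
    # Each place keeps only the elements that the distribution assigns to it.
    storage = [dict() for _ in range(NUM_PLACES)]
    for p, place in dist.items():
        storage[place][p] = float(init(*p))  # run the initializer per point
    return dist, storage


# Like new[factory.cyclic([0:3,5:8])] (point [i, j]){return i+j}
dist, storage = make_array([(0, 3), (5, 8)], lambda i, j: i + j)
```

Reading an element means first asking the distribution where the point lives and then looking in that place's storage, for example `storage[dist[(2, 7)]][(2, 7)]`.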

  29. Array iteration • Once you have an array what can you do with it? • Array iterators: for, foreach, ateach • for: Sequentially iterates over a supplied region. At each point it binds the point to a variable and executes the accompanying statement. • foreach: As with for but operations are done in parallel. That is it spawns a new activity for each point. • ateach: takes a distribution instead of a region. Performs operations in parallel at the place specified by the distribution.

  30. Iteration example • Example: for(point p : A){ A[p]=A[p]*A[p]; }
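The difference between the three iterators can be sketched in Python by squaring every element of a dictionary-backed array in three ways. The function names are hypothetical. The `ateach` version simplifies matters by running one worker per place rather than one activity per point.

```python
from concurrent.futures import ThreadPoolExecutor


def square_for(a):
    """Like `for`: visit the points one after another."""
    for p in a:
        a[p] = a[p] * a[p]
    return a


def square_foreach(a):
    """Like `foreach`: one task per point, all run in parallel."""
    def body(p):
        a[p] = a[p] * a[p]

    with ThreadPoolExecutor() as pool:
        list(pool.map(body, list(a)))  # drain results so errors surface
    return a


def square_ateach(a, dist, num_places):
    """Like `ateach`: each point is handled at the place the distribution
    assigns to it (simplified to one worker per place)."""
    def at_place(place):
        for p, q in dist.items():
            if q == place:
                a[p] = a[p] * a[p]

    with ThreadPoolExecutor(max_workers=num_places) as pool:
        list(pool.map(at_place, range(num_places)))
    return a
```

All three produce the same array here because each point is written by exactly one task. They differ only in where and when the work runs.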

  31. More array ops • lift: Takes a binary function and two arrays of the same distribution. Produces a new array formed by a pointwise application of the function to the two arrays. • reduce: As in MPI applies a binary function to every element to produce a single value. • scan: Creates a new array where the i’th element is the result of reduction on the first i elements.
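These three operations have close relatives in the Python standard library, which the following sketch uses. Arrays are modelled as dictionaries keyed by point. `lift`, `reduce_array`, and `scan` are hypothetical wrappers, and the scan shown is inclusive (element i combines elements 0 through i), which is one reading of "the first i elements".

```python
from functools import reduce
from itertools import accumulate
import operator


def lift(fn, a, b):
    """Pointwise application of a binary function to two arrays
    defined over the same points."""
    if a.keys() != b.keys():
        raise ValueError("arrays must be defined over the same points")
    return {p: fn(a[p], b[p]) for p in a}


def reduce_array(fn, a):
    """Combine all elements into a single value."""
    return reduce(fn, (a[p] for p in sorted(a)))


def scan(fn, a):
    """Inclusive prefix reduction over the points in sorted order."""
    pts = sorted(a)
    return dict(zip(pts, accumulate((a[p] for p in pts), fn)))


a = {0: 1, 1: 2, 2: 3, 3: 4}
b = {0: 10, 1: 20, 2: 30, 3: 40}
summed = lift(operator.add, a, b)
total = reduce_array(operator.add, a)
prefix = scan(operator.add, a)
```

For a distributed array the interesting part is that `reduce` and `scan` need communication between places, while `lift` can run entirely locally when both arrays share a distribution.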

  32. Atomic Blocks • X10 allows you to define atomic blocks • The contents of a block are guaranteed to execute as a single atomic event. This holds only with respect to other activities in the same place. • While this is guaranteed to be atomic, the details are implementation specific. • Syntax: atomic S
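One possible implementation strategy for place-local atomicity is a single lock per place, sketched below in Python. This is an assumption for illustration only, since the slides say the mechanism is implementation specific. `PlaceCounter` and `run` are hypothetical names.

```python
import threading


class PlaceCounter:
    """Hypothetical place-local state guarded by one lock per place."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        # Analogue of `atomic { value = value + 1; }`: no other activity
        # in the same place can observe the intermediate state.
        with self._lock:
            current = self.value
            self.value = current + 1


def run(n_activities, per_activity):
    """Spawn activities that all increment the same counter."""
    counter = PlaceCounter()

    def body():
        for _ in range(per_activity):
            counter.increment()

    threads = [threading.Thread(target=body) for _ in range(n_activities)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Calling `run(8, 1000)` gives 8000 because every read-modify-write happens inside the lock, so no increment can be lost.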

  33. Conditional Atomic Blocks • Also provides: when(Cond) S • This blocks until Cond is true and then executes S atomically. • This allows the creation of a number of synchronization mechanisms. • Dangerous! If Cond is never true, or if there is a cycle, deadlock occurs.
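A bounded buffer is a classic synchronization mechanism that can be built from conditional atomic blocks. The Python sketch below uses `threading.Condition.wait_for` as a rough counterpart to `when(Cond) S`. `Buffer` and `demo` are hypothetical names.

```python
import threading
from collections import deque


class Buffer:
    """Bounded buffer built from two condition-guarded operations."""

    def __init__(self, capacity):
        self._cond = threading.Condition()
        self._items = deque()
        self._capacity = capacity

    def put(self, x):
        with self._cond:
            # Analogue of `when (size < capacity) { append }`.
            self._cond.wait_for(lambda: len(self._items) < self._capacity)
            self._items.append(x)
            self._cond.notify_all()

    def take(self):
        with self._cond:
            # Analogue of `when (size > 0) { remove }`.
            self._cond.wait_for(lambda: len(self._items) > 0)
            x = self._items.popleft()
            self._cond.notify_all()
            return x


def demo():
    """One producer and one consumer exchanging five items."""
    buf = Buffer(1)
    received = []

    def consume():
        for _ in range(5):
            received.append(buf.take())

    consumer = threading.Thread(target=consume)
    consumer.start()
    for i in range(5):
        buf.put(i)
    consumer.join()
    return received
```

The deadlock warning applies here too: if the consumer were never started, the second `put` would wait forever on a condition that nobody makes true.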

  34. Future and Force • As discussed before, futures allow the asynchronous computation of a value that may be used in the future. • Futures return an object of type Future<T> • Force is a blocking call that waits for a particular future to be finished

  35. Futures (cont.) • Can only access final variables. This prevents side effects. • Syntax: future(p)e • Example: Future <float> blah = future(here.next()){sqrt(a*a+b*b)};
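Python's `concurrent.futures` offers a close analogue of the future/force pair. In this sketch a single-worker executor stands in for the neighbouring place, `submit` plays the role of `future(p){e}`, and `result()` plays the role of force. The name `hypotenuse` and the input values are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def hypotenuse(a, b):
    # Reads only its (immutable) arguments, so it has no side effects.
    return math.sqrt(a * a + b * b)


# The single-worker executor is a hypothetical stand-in for here.next().
with ThreadPoolExecutor(max_workers=1) as neighbour:
    fut = neighbour.submit(hypotenuse, 3.0, 4.0)  # like future(p){e}
    value = fut.result()                          # like force: blocks until done
```

Between `submit` and `result()` the caller is free to do other work, which is the reason to use a future instead of a blocking remote call.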

  36. Clocks • Act as barriers • Much more flexible • Guarantee no deadlock • Dynamically associated with different sets of activities

  37. Clock Semantics • Activities register with zero or more clocks • Can register/unregister at any time • Clocks are always in some phase • Do not advance until all currently registered activities quiesce • Activities quiesce with next operation • Indicates they are ready for all their clocks to advance • Suspends until all clocks have advanced • This makes deadlock impossible
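The phase behaviour above can be approximated in Python with `threading.Barrier`, as sketched below. This is a deliberately limited model: a barrier has a fixed set of participants, whereas the slides describe clocks that activities can join and leave dynamically. `run_phases` is a hypothetical name.

```python
import threading


def run_phases(n_activities, n_phases):
    """Run activities in lock-step phases and return the order of events."""
    clock = threading.Barrier(n_activities)  # stand-in for a single clock
    log = []
    log_lock = threading.Lock()

    def activity(ident):
        for phase in range(n_phases):
            with log_lock:
                log.append((phase, ident))  # work done in this phase
            clock.wait()  # like `next`: suspend until everyone has arrived

    threads = [threading.Thread(target=activity, args=(i,))
               for i in range(n_activities)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

No activity can start phase k+1 before all of them have finished phase k, so the phase numbers in the returned log never decrease.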

  38. Status • IBM has supposedly built a single VM reference implementation • Language still under heavy revision • GPL’ed X10-XTC compiler available • Doesn’t conform to current language spec • Uses what will possibly be version 0.5 • Speculatively contains support for operator overloading and generics • Currently very poor performance

  43. conclusion • So is X10 the answer to all our parallel programming woes? • In my opinion probably not. • Parallelism still very explicit. Still opportunities for deadlock, race conditions etc. • Takes a “…and the kitchen sink” approach which makes learning the syntax a chore. • It’s not FORTRAN. Will people bother to use it?
