A Modular Implementation of Parallel Data Structures in Bulk-Synchronous Parallel ML Frédéric Gava F. Gava, HLPP 2005
Outline • Introduction; • The BSML language; • Implementation of parallel data structures in BSML: • Dictionaries; • Sets; • Load-balancing. • Application; • Conclusion and future work. F. Gava, HLPP 2005
Introduction (diagram: Automatic Parallelization, Structured Parallelism, Concurrent Programming; Algorithmic Skeletons, BSP, Data-Structure Skeletons) • Parallel computing for speed; • Too complex for many non-computer scientists; • Need for models/tools of parallelism. F. Gava, HLPP 2005
Introduction (bis) • Observations: • Data structures are as important as algorithms; • Symbolic computations make massive use of those data structures. • Suggested solution, parallel implementations of data structures: • Interfaces as close as possible to the sequential ones; • Modular implementations for straightforward maintenance; • Load-balancing of the data. F. Gava, HLPP 2005
BSML Outline: • Introduction; • BSML; • Parallel Data Structures in BSML; • Application; • Conclusion and future work. F. Gava, HLPP 2005
Bulk-Synchronous Parallelism + Functional Programming = BSML • Advantages of the BSP model: • Portability; • Scalability, deadlock freedom; • Simple cost model for performance prediction. • Advantages of functional programming: • High-level features (higher-order functions, pattern-matching, concrete types, etc.); • Safety of the environment; • Program proofs (proofs of BSML programs using Coq). F. Gava, HLPP 2005
The BSML Language • Confluent language: deterministic algorithms; • Library for the « Objective Caml » language (called BSMLlib); • Operations to access the BSP parameters; • 5 primitives on a parallel data structure called parallel vector: • mkpar: create a parallel vector; • apply: parallel point-wise application; • put: send values within a vector; • proj: parallel projection; • super: BSP divide-and-conquer. F. Gava, HLPP 2005
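To make the later slides easier to read, here is a sketch of the types these primitives can be given. The signatures are approximate, written from the descriptions above rather than copied from the BSMLlib interface, and bsp_p stands for one of the BSP-parameter accessors:

  (* 'a par: a parallel vector holding one value per processor. *)
  type 'a par

  val bsp_p : unit -> int                               (* BSP parameter: number of processors *)
  val mkpar : (int -> 'a) -> 'a par                     (* build a vector from a per-pid function *)
  val apply : ('a -> 'b) par -> 'a par -> 'b par        (* parallel point-wise application *)
  val put   : (int -> 'a option) par -> (int -> 'a option) par   (* data exchange (None = no message) *)
  val proj  : 'a par -> (int -> 'a)                     (* projection: make the vector globally readable *)
  val super : (unit -> 'a) -> (unit -> 'b) -> 'a * 'b   (* BSP superposition (divide-and-conquer) *)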
A BSML Program (diagram: a replicated sequential part and a parallel part holding f0, g0, …, fp-1, gp-1, one pair per processor) F. Gava, HLPP 2005
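As an illustration of the parallel part, a minimal example built only from the primitives sketched above (the names and values are invented for this sketch):

  (* One pair (pid, pid*pid) per processor. *)
  let squares : (int * int) par =
    mkpar (fun pid -> (pid, pid * pid))

  (* Point-wise application: sum the two components on every processor. *)
  let sums : int par =
    apply (mkpar (fun _ (a, b) -> a + b)) squares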
Parallel Data Structures in BSML Outline: • Introduction; • BSML; • Parallel Data Structures in BSML; • Application; • Conclusion and future work. F. Gava, HLPP 2005
General Points • 5 modules: Set, Map, Stack, Queue, Hashtable; • Interfaces: • Same as the O’Caml ones; • With some specific parallel functions (skeletons) such as parallel reduction; • Pure functional implementations (for functional data); • Manual or automatic load-balancing. F. Gava, HLPP 2005
Modules in O’Caml • Interface:

  module type Compare = sig
    type elt
    val compare : elt -> elt -> int
  end

• Implementation:

  module CompareInt = struct
    type elt = int
    let tools = ...
    let compare = ...
  end
  module AbstractCompareInt = (CompareInt : Compare)

• Functor:

  module Make (Ord : Compare) = struct
    type elt = Ord.elt
    type t = Empty | Node of t * elt * t * int
    let mem e s = ...
  end
Parallel Dictionaries • A parallel map (dictionary) = a map on each processor:

  module Make (Ord : OrderedType) (Bal : BALANCE)
              (MakeLocMap : functor (Ord : OrderedType) -> Map.S with type key = Ord.t) =
  struct
    module Local_Map = MakeLocMap (Ord)
    type key = Ord.t
    type 'a t = ('a Local_Map.t par) * int * bool
    type seq_t = Local_Map.t
    (* operators as skeletons *)
  end

• We need to re-implement all the operations (data skeletons). F. Gava, HLPP 2005
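A hypothetical instantiation of this functor, just to show how the pieces fit together; IntKey and DefaultBalance are invented names for this sketch, only Map.Make comes from the standard library:

  module IntKey = struct type t = int let compare = compare end

  (* DefaultBalance : BALANCE would supply the load-balancing strategy. *)
  module ParIntMap = Make (IntKey) (DefaultBalance) (Map.Make)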
Insert a Binding • add : key -> 'a -> 'a t -> 'a t (diagram: two cases, depending on whether the dictionary has been rebalanced or not) F. Gava, HLPP 2005
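A minimal sketch of such an insertion written as a data skeleton, assuming the simplified representation 'a Local_Map.t par (ignoring the size and rebalancing flag of the real type) and a hypothetical placement function Bal.where : key -> int choosing the owner processor:

  let add k v pmap =
    apply
      (mkpar (fun pid lmap ->
        if pid = Bal.where k               (* hypothetical owner of the key *)
        then Local_Map.add k v lmap        (* the owner inserts the binding locally *)
        else lmap))                        (* every other processor keeps its sub-map *)
      pmap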
Parallel Iterator

  let cardinal pmap = ParMap.fold (fun _ _ i -> i + 1) 0 pmap

• fold needs to respect the order of the keys; • the parallel map is traversed as a sequential map; • too many communications… • async_fold : (key -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b par

  let cardinal pmap =
    List.fold_left (+) 0 (total (ParMap.async_fold (fun _ _ i -> i + 1) pmap 0))

F. Gava, HLPP 2005
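A minimal sketch of what async_fold can look like under the simplified representation used above: each processor folds over its own sub-map only, so no communication and no global key order are needed (total is assumed to turn a parallel vector into the list of its per-processor values):

  let async_fold f pmap init =
    apply (mkpar (fun _ lmap -> Local_Map.fold f lmap init)) pmap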
Parallel Sets • A sub-set on each processor; • Insertion/iteration as for parallel maps; • But with some binary skeletons; • Load-balancing of pairs of parallel sets using the superposition. F. Gava, HLPP 2005
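A minimal sketch of a binary skeleton (point-wise union of the local sub-sets), assuming the simplified representation Local_Set.t par and ignoring rebalancing; apply2 is derived from apply:

  let apply2 vf v1 v2 = apply (apply vf v1) v2

  let union s1 s2 =
    apply2 (mkpar (fun _ x y -> Local_Set.union x y)) s1 s2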
Difference • 3 cases: • two normal parallel sets; • one of the parallel sets has been rebalanced; • both parallel sets have been rebalanced; • the rebalanced cases imply a problem with duplicate elements. F. Gava, HLPP 2005
Difference (third case) (diagram: computing the difference of S1 and S2 when both sets have been rebalanced) F. Gava, HLPP 2005
Load-Balancing (1) • « Same sizes » for the local data structures; • Better performance for parallel iterations; • Load-balancing in 2 super-steps (M. Bamha and G. Hains) using a histogram. F. Gava, HLPP 2005
Load-Balancing (2) • Generic code of the algorithm (diagram: build a histogram of the data, select « n » messages from the local data, then take the union of the received messages with the local data; rebalance is generic over the underlying data structure). F. Gava, HLPP 2005
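A minimal sketch of the first super-step (the histogram), assuming the simplified representation 'a Local_Map.t par and the primitive signatures sketched earlier; selecting the « n » messages and exchanging them with put would follow in the second super-step:

  (* Every processor learns the sizes of all local sub-structures. *)
  let sizes pmap =
    let local_size = apply (mkpar (fun _ m -> Local_Map.cardinal m)) pmap in
    Array.init (bsp_p ()) (proj local_size)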
Application Outline: • Introduction; • BSML; • Parallel Data Structures in BSML; • Application; • Conclusion and future work. F. Gava, HLPP 2005
Computation of the « nth » nearest-neighbor atoms in a molecule • Code from « Objective Caml for Scientists » (J. Harrop); • The molecule as an infinitely-repeated graph of atoms; • Computation of set differences (the neighbors); • Replace « fold » with « async_fold »; • Experiments with a silicate of 100,000 atoms and with a cluster of 5/10 machines (Pentium IV, 2.8 GHz, Gigabit Ethernet card). F. Gava, HLPP 2005
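A sketch of the shell-expansion step at the heart of this application, written here with a plain sequential set module; in the parallel version the set module is the parallel one and fold becomes async_fold. The Atoms module and the bonded function (the bond table of the molecule) are placeholders for this sketch:

  module Atoms = Set.Make (struct type t = int let compare = compare end)

  (* Atoms at distance n+1 = atoms bonded to the shell at distance n,
     minus the shells at distances n and n-1. *)
  let next_shell bonded shell_n shell_n_1 =
    let candidates =
      Atoms.fold (fun a acc -> Atoms.union (bonded a) acc) shell_n Atoms.empty
    in
    Atoms.diff (Atoms.diff candidates shell_n) shell_n_1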
Conclusion and Future Work Outline: • Introduction; • BSML; • Parallel Data Structures in BSML; • Application; • Conclusion and future work. F. Gava, HLPP 2005
Conclusion • BSML = BSP + ML; • Implementation of some data structures; • Modular, for simple development and maintenance; • Pure functional implementation; • Cost prediction with the BSP model; • Generic load-balancing; • Application. F. Gava, HLPP 2005
Future Work • Proofs of the implementations (pure functional); • Implementation of other data structures (trees, priority lists, etc.); • Application to other scientific problems; • Comparison with other parallel MLs (OCamlP3L, HirondML, OCaml-Flight, MSPML, etc.); • Development of a modular and parallel graph library: • edges as parallel maps; • vertices as parallel sets. F. Gava, HLPP 2005