Parallel I/O in Bulk-Synchronous Parallel ML
Frédéric Gava
PAPP 2004
Outline
• Introduction
• The BSP model
• The BSML language
• External Memory in BSML
• Cost model
• Problems and solutions
• Conclusion and Future Work
Introduction
Bulk Synchronous Parallelism + Functional Programming = BSML
• Advantages of the BSP model:
  • Portability
  • Scalability, deadlock freedom
  • Simple cost model => performance prediction
• Advantages of functional programming:
  • High-level features (higher-order functions, pattern matching, concrete types, etc.)
  • Safety of the environment
  • Program proofs
The Caraml Project
• Funded by the ACI Grid program (French national Grid program)
• Tools and applications
• Organized in 3 phases:
  • First phase: safety
  • Second phase: multiprogramming
  • Third phase: extensions for Grid computing
The BSP model
[Figure: one superstep across processes 0, 1, 2, 3, …, p-1]
$T(s) = \left(\max_{0 \le i < p} w_i\right) + h \cdot g + L$
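To make the formula concrete, here is a minimal OCaml sketch that evaluates the cost of one superstep. It assumes the usual BSP reading of the symbols (w_i is the local work of process i, h the largest number of words sent or received by any process, g the communication cost per word, L the synchronization cost). The function name and the sample numbers are hypothetical and are not part of BSMLlib.

```ocaml
(* Cost of one BSP superstep: T(s) = (max_i w_i) + h*g + L.
   Illustrative helper only; not a BSMLlib function. *)
let superstep_cost ~(work : float array) ~(h : float) ~(g : float) ~(l : float)
    : float =
  (* The slowest process determines the computation term. *)
  let w_max = Array.fold_left max 0.0 work in
  w_max +. (h *. g) +. l

(* Hypothetical numbers: 4 processes, h = 50 words, g = 2, L = 400. *)
let example_cost =
  superstep_cost ~work:[| 100.0; 250.0; 180.0; 90.0 |] ~h:50.0 ~g:2.0 ~l:400.0
(* 250 + 50*2 + 400 = 750 *)
```

The computation term is a maximum rather than a sum because all processes wait at the barrier that ends the superstep.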
The BSML language
• Library for the "Objective Caml" language (called BSMLlib)
• Operations on a parallel data structure called a parallel vector (type par)
• Operations to access the BSP parameters
• 4 operations on parallel vectors
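Global Conditional
[Figure: a parallel vector of booleans <b0, b1, …, true, …, bp-1> whose component at process n is true]
if vec at n then e1 else e2 evaluates to e1 when the component of vec at process n is true, and to e2 otherwise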
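The behaviour of parallel vectors and of the global conditional can be illustrated with a purely sequential simulation in plain OCaml, where a vector is an array with one component per process. This is only a sketch for intuition, not the BSMLlib implementation: `mkpar` is the name used in the talk, while `apply`, `if_at`, and the fixed process count `p` are assumptions made for this example.

```ocaml
(* Sequential simulation: a parallel vector is modelled as an array with
   one component per process. This is NOT how BSMLlib is implemented. *)
type 'a par = 'a array

(* Assumed number of processes for the simulation. *)
let p = 4

(* mkpar f builds the vector <f 0, f 1, ..., f (p-1)>. *)
let mkpar (f : int -> 'a) : 'a par = Array.init p f

(* apply <f0, ..., f(p-1)> <v0, ..., v(p-1)> = <f0 v0, ..., f(p-1) v(p-1)>. *)
let apply (fs : ('a -> 'b) par) (vs : 'a par) : 'b par =
  Array.init p (fun i -> fs.(i) vs.(i))

(* Global conditional "if vec at n then e1 else e2".
   The branches are thunks so that only the selected one is evaluated. *)
let if_at (vec : bool par) (n : int) (e1 : unit -> 'a) (e2 : unit -> 'a) : 'a =
  if vec.(n) then e1 () else e2 ()

(* <false, false, true, false>: only process 2 holds true. *)
let flags = mkpar (fun pid -> pid = 2)
let chosen = if_at flags 2 (fun () -> "e1") (fun () -> "e2")
let doubled = apply (mkpar (fun _ -> fun x -> 2 * x)) (mkpar (fun pid -> pid))
```

In this simulation `chosen` is "e1" because the component at process 2 is true. In a real BSP execution the boolean at process n has to be made known to every process, so the global conditional involves communication and a synchronization.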
External Memory
Model
• We have:
  • M = Size of the main memory
  • D = Number of disks
  • B = Size of one block on a disk
  • G = Time to read/write D blocks in parallel, one per disk (D*B data)
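As a rough illustration of how these parameters combine, the sketch below counts the parallel I/O operations needed to transfer n data items, assuming each operation moves D*B items and costs G. The function names and the sample values are hypothetical; the talk's actual cost model may include further terms.

```ocaml
(* Number of parallel I/O operations needed to move n items when one
   operation transfers d*b items (one block of size b on each of d disks). *)
let parallel_ios ~(n : int) ~(d : int) ~(b : int) : int =
  let per_op = d * b in
  (n + per_op - 1) / per_op (* ceiling of n / (d*b) *)

(* Estimated I/O time: number of operations times the cost g of one
   operation (written G on the slide). *)
let io_time ~n ~d ~b ~(g : float) : float =
  float_of_int (parallel_ios ~n ~d ~b) *. g

(* Hypothetical values: 1,000,000 items, 4 disks, blocks of 4096 items. *)
let ops = parallel_ios ~n:1_000_000 ~d:4 ~b:4096
(* 1_000_000 / 16_384 rounded up = 62 operations *)
```

Doubling D halves the number of operations in this estimate, which is why the model counts disks separately from block size.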
Problem
"Local side effects" => modification of the global environment
let bug = mkpar (fun pid -> if pid=0 then open_write "toto.dat" else NOTHING) in open_read "toto.dat"
Solutions
• Two file systems:
  • local files => one file system on each process
  • global files:
    • a shared file system
    • or local files replicated in a different directory
• New primitives for the different kinds of files
• Confluence of the semantics
• Compositional cost model
Example
scan_list:
scan_list (+) <[0;1], [2;3], [4]> = <[0; 0+1], [0+1+2; 0+1+2+3], [0+1+2+3+4]>
Read/write values in blocks using temporary files
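The result shown above can be reproduced with a sequential simulation in plain OCaml, where the parallel vector of lists is an array holding one list per process. This sketch only mirrors the input/output behaviour: it keeps everything in memory and is not the author's implementation, which works on blocks stored in temporary files.

```ocaml
(* Sequential simulation of scan_list: an inclusive prefix computation that
   runs through the lists of process 0, then process 1, and so on.
   Everything stays in memory; no files or communication are involved. *)
let scan_list (op : 'a -> 'a -> 'a) (vec : 'a list array) : 'a list array =
  (* Running result carried from one element, and one process, to the next. *)
  let acc = ref None in
  let scan_local local =
    let step rev_out x =
      let y = match !acc with None -> x | Some a -> op a x in
      acc := Some y;
      y :: rev_out
    in
    List.rev (List.fold_left step [] local)
  in
  let out = Array.make (Array.length vec) [] in
  (* Array.iteri visits the components in increasing process order. *)
  Array.iteri (fun i local -> out.(i) <- scan_local local) vec;
  out

let result = scan_list ( + ) [| [ 0; 1 ]; [ 2; 3 ]; [ 4 ] |]
(* result = [| [0; 1]; [3; 6]; [10] |] *)
```

Evaluating the sums on the slide gives the same vector: <[0;1], [3;6], [10]>.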
Conclusion
• BSML = BSP + ML
• External Memory in BSML
• New cost model
• New primitives
• Confluence
• Compositional cost model
Future Work
• Polymorphic type system for BSML with I/O
• Implementation of "big" applications
• Add to BSML:
  • Parallel composition
  • Exceptions
  • Pattern matching of parallel values