1 / 29

Problem

Problem. Parallelize (serial) applications that use files. Examples: compression tools, logging utilities, databases. In general applications that use files depend on sequential output, serial append is the usual file I/O operation. Goal: perform file I/O operations in parallel,

gerik
Download Presentation

Problem

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Problem • Parallelize (serial) applications that use files. • Examples: compression tools, logging utilities, databases. • In general • applications that use files depend on sequential output, • serial append is the usual file I/O operation. • Goal: • perform file I/O operations in parallel, • keep the sequential, serial append of the file.

  2. Results • Cilkruntime-support for serial append with good scalability. • Three serial append schemes and implementations for Cilk: • ported Cheerio, previous parallel file I/O API (M. Debergalis), • simple prototype (with concurrent Linked Lists), • extension, more efficient data structure (concurrent double-linked Skip Lists). • Parallel bz2 using PLIO.

  3. Single Processor Serial Append FILE (serial append) computation DAG 1 7 8 11 12 9 10 2 5 6 3 4

  4. Single Processor Serial Append FILE (serial append) 1 2 3 computation DAG 1 7 8 11 12 9 10 2 5 6 3 4

  5. Single Processor Serial Append FILE (serial append) 1 2 3 4 5 6 7 computation DAG 1 7 8 11 12 9 10 2 5 6 3 4

  6. Single Processor Serial Append FILE (serial append) 1 2 3 4 5 6 7 8 9 10 11 12 computation DAG 1 7 8 11 12 9 10 2 5 6 3 4

  7. Single Processor Serial Append FILE (serial append) 1 2 3 4 5 6 7 8 9 10 11 12 Why not in parallel?! computation DAG 1 7 8 11 12 9 10 2 5 6 3 4

  8. ParalleL file I/O (PLIO) support for Serial Append in Cilk Alexandru Caracaş Fast Serial Append

  9. Outline • Example • single processor & multiprocessor • Semantics • view of Cilk Programmer • Algorithm • modification of Cilk runtime system • Implementation • Previous work • Performance • Comparison

  10. Multiprocessor Serial Append FILE (serial append) computation DAG 1 7 8 11 12 9 10 2 5 6 3 4

  11. Multiprocessor Serial Append FILE (serial append) 1 2 7 computation DAG 1 7 8 11 12 9 10 2 5 6 3 4

  12. Multiprocessor Serial Append FILE (serial append) 1 2 3 5 7 8 9 computation DAG 1 7 8 11 12 9 10 2 5 6 3 4

  13. Multiprocessor Serial Append FILE (serial append) 6 1 2 3 4 5 7 8 9 10 computation DAG 1 7 8 11 12 9 10 2 5 6 3 4

  14. Multiprocessor Serial Append FILE (serial append) 1 2 3 4 5 6 7 8 9 10 11 12 computation DAG 1 7 8 11 12 9 10 2 5 6 3 4

  15. File Operations • open (FILE, mode) / close (FILE). • write (FILE, DATA, size) • processor writes to its PION. • read (FILE, BUFFER, size) • processor reads from PION. • Note: a seek operation may be required • seek (FILE, offset, whence) • processor searches for the right PION in the ordered data structure

  16. Semantics • View of Cilk programmer: • Write operations • preserve the sequential, serial append. • Read and Seek operations • can occur only after the file has been closed, • or on a newly opened file.

  17. Approach (for Cilk) • Bookkeeping (to reconstruct serial append) • Divide execution of the computation, • Meta-Data (PIONs) about the execution of the computation. • Observation • In Cilk,steals need to be accounted for during execution. • Theorem • expected # of steals = O ( PT∞ ). • Corollary (see algorithm) • expected # of PIONs = O ( PT∞ ).

  18. PION (Parallel I/ONode) • Definition: a PION represents all the write operations to a file performed by a processor in between 2 steals. • A PION contains: • # data bytes written, • victim processor ID, • pointer to written data. π1 π3 π2 π4 PION 1 2 3 4 5 6 7 8 9 10 11 12 FILE

  19. Algorithm • All PIONSs are kept in an ordered data structure. • very simple Example: Linked List. • On each steal operation performed by processor Pi from processor Pj: • create a new PION πi, • attach πi immediately after πj, the PION of Pj in the order data structure. PIONs πj π1 πk

  20. Algorithm • All PIONSs are kept in an ordered data structure. • very simple Example: Linked List. • On each steal operation performed by processor Pi from processor Pj: • create a new PION πi, • attach πi immediately after πj, the PION of Pj in the order data structure. πi PIONs πj π1 πk

  21. Algorithm • All PIONSs are kept in an ordered data structure. • very simple Example: Linked List. • On each steal operation performed by processor Pi from processor Pj: • create a new PION πi, • attach πi immediately after πj, the PION of Pj in the order data structure. PIONs π1 πj πk πi

  22. Implementation • Modified the Cilkruntime system to support desired operations. • implemented hooks on the steal operations. • Initial implementation: • concurrent Linked List (easier algorithms). • Final implementation: • concurrent double-linked Skip List. • Ported Cheerio to Cilk 5.4.

  23. Details of Implementation • Each processor has a buffer for the data in its own PIONs • implemented as a file. • Data structure to maintain the order of PIONs: • Linked List, Skip List. • Meta-Data (order maintenance structure of PIONs) • kept in memory, • saved to a file when serial append file is closed.

  24. Skip List • Similar performance with search trees: • O ( log (SIZE) ). NIL NIL NIL NIL

  25. Double-Linked Skip List • Based on Skip Lists (logarithmic performance). • Cilk runtime-support in advanced implementation of PLIO as rank order statistics. NIL NIL NIL NIL

  26. PLIO Performance • no I/O vs writing 100MB with PLIO (w/ linked list), • Tests were run on yggdrasil a 32 proc Origin machine. • Parallelism=32, • Legend: • black: no I/O, • red: PLIO.

  27. Improvements & Conclusion • Possible Improvements: • Optimization of algorithm: • delete PIONs with no data, • cache oblivious Skip List, • File system support, • Experiment with other order maintenance data structures: • B-Trees. • Conclusion: • Cilk runtime-support for parallel I/O • allows serial applications dependent on sequential output to be parallelized.

  28. References • Robert D. Blumofe and Charles E. Leiserson. Scheduling multithreaded computations by work stealing. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science, pages 356-368, Santa Fe, New Mexico, November 1994. • Matthew S. DeBergalis. A parallel file I/O API for Cilk. Master's thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, June 2000. • William Pugh. Concurrent Maintenance of Skip Lists. Departments of Computer Science, University of Maryland, CS-TR-2222.1, June, 1990.

  29. References • Thomas H. Cormen, Charles E. Leiserson, Donald L. Rivest and Clifford Stein. Introduction to Algorithms (2nd Edition). MIT Press. Cambridge, Massachusetts, 2001. • Supercomputing Technology Group MIT Laboratory for Computer Science. Cilk 5.3.2 Reference Manual, November 2001. Available at http://supertech.lcs.mit.edu/cilk/manual-5.3.2.pdf. • bz2 source code. Available at http://sources.redhat.com/bzip2.

More Related