
Opportunities in Parallel I/O for Scientific Data Management



Presentation Transcript


  1. Rajeev Thakur and Rob Ross Mathematics and Computer Science Division Argonne National Laboratory Opportunities in Parallel I/O for Scientific Data Management

  2. Outline • Brief review of our accomplishments so far • Thoughts on component coupling • Topics for future work

  3. PVFS2 [Diagram: compute clients (C) connected through a communication network to PVFS I/O servers (IOS)] • Collaborative effort among ANL, Clemson, Northwestern, Ohio State, and others • Very successful as a freely available parallel file system for Linux clusters • Also deployed on IBM BG/L • Used by many groups as a research vehicle for implementing new parallel file system concepts • True open source software with open source development • Tightly coupled MPI-IO implementation (ROMIO) • Forms the basis for higher layers to deliver high performance to applications

  4. PVFS2 Performance

  5. PVFS2 Performance Time to Create Files Through MPI-IO

  6. PnetCDF • Parallel version of the popular netCDF library • Major contribution of the SDM SciDAC (funded solely by it) • Collaboration between Argonne and Northwestern • Main implementers: Jianwei Li (NW) and Rob Latham (ANL) • Addresses lack of parallelism in serial netCDF without the difficulty of parallelism in HDF • Only minor changes to the standard netCDF API • Being used in many applications
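The "only minor changes to the standard netCDF API" point can be made concrete: PnetCDF keeps the familiar define-mode/data-mode flow and adds MPI-Offset start/count slabs plus collective `_all` calls. A minimal sketch, assuming an MPI environment with libpnetcdf installed; the file, dimension, and variable names are illustrative:

```c
/* Minimal PnetCDF write sketch: each rank collectively writes its slab
 * of a 2-D variable.  Assumes MPI and libpnetcdf; names and sizes are
 * illustrative, and nx is assumed divisible by the number of ranks. */
#include <mpi.h>
#include <pnetcdf.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, nprocs, ncid, dimids[2], varid;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Define a 1024 x 1024 double variable, partitioned by rows. */
    MPI_Offset nx = 1024, ny = 1024;
    ncmpi_create(MPI_COMM_WORLD, "mesh.nc", NC_CLOBBER, MPI_INFO_NULL, &ncid);
    ncmpi_def_dim(ncid, "x", nx, &dimids[0]);
    ncmpi_def_dim(ncid, "y", ny, &dimids[1]);
    ncmpi_def_var(ncid, "temperature", NC_DOUBLE, 2, dimids, &varid);
    ncmpi_enddef(ncid);                     /* leave define mode */

    /* Each rank owns nx/nprocs contiguous rows. */
    MPI_Offset start[2] = { rank * (nx / nprocs), 0 };
    MPI_Offset count[2] = { nx / nprocs, ny };
    double *buf = malloc(count[0] * count[1] * sizeof(double));
    for (MPI_Offset i = 0; i < count[0] * count[1]; i++)
        buf[i] = (double)rank;

    /* Collective write: PnetCDF maps this onto MPI-IO collective I/O. */
    ncmpi_put_vara_double_all(ncid, varid, start, count, buf);
    ncmpi_close(ncid);
    free(buf);
    MPI_Finalize();
    return 0;
}
```

The only departures from serial netCDF are the communicator/info arguments to `ncmpi_create` and the `_all` suffix marking the write as collective.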

  7. MPI-IO over Logistical Networking (LN) • LN is a technology that many applications are using to move data efficiently over the wide area • Implementing MPI-IO over LN enables applications to access their data directly from parallel programs • We are implementing a new ROMIO ADIO layer for LN (Jonghyun Lee, ANL) • Nontrivial because the LN API is unlike a traditional file system API • Collaboration between Argonne and Univ. of Tennessee [Diagram: the application calls MPI-IO; its ADIO layer dispatches to LN (remote storage) or to PVFS/UFS (local storage)]

  8. Fruitful Collaborations • Key to our successes in this SciDAC have been strong collaborations with other participants in the Center • Northwestern University • PnetCDF, PVFS2 • Jianwei, Avery, Kenin, Alok, Wei-keng • ORNL • Nagiza’s group • MPI-IO and PnetCDF for visualization (parallel VTK) • LBNL • Ekow (MPI-IO on SRM) • Ongoing collaboration with Univ. of Tennessee for MPI-IO/LN

  9. Component Coupling via Standard Interfaces • We believe that well-defined standard APIs are the right way to couple different components of the software stack • Having the right API at each level is crucial for performance [Diagram: software stack — applications use HDF-5 or PnetCDF, which are built on MPI-IO, which runs on Lustre, GPFS, or PVFS]

  10. Topics for Future Work

  11. Guiding Theme • How can we cater better to the needs of SciDAC applications?

  12. Make Use of Extended Attributes on Files • PVFS2 now allows users to store extended attributes along with files • Also available in local Linux file systems, so a standard is emerging • This feature has many applications: • Store metadata for high-level libraries as extended attributes instead of directly in the file • avoids the problem of unaligned file accesses • Store MPI-IO hints for persistence • Store provenance information [Diagram: a file with an attached extended attribute, Name=“Mesh Size”, Value=“1K x 1K”]

  13. Next Generation High-Level Library • HDF and netCDF were written 15-20 years ago as serial libraries • Explore the possibility of designing a new high-level library that is explicitly built as a parallel library for modern times • What features are needed? • Can we exploit extended attributes? • Can the data span multiple files instead of one file, with a directory as the object? • What is the portable file format? • New, more efficient implementation techniques

  14. Implement Using Combination of Database and Parallel I/O • Use a real database to store metadata and a parallel file system to store actual data • Flexible and high performance • Powerful search and retrieval capability • Prototype implemented in 1999-2000 (published in SC2000 and JPDC) • Jaechun No, Rajeev Thakur, Alok Choudhary • Needs more work; collaboration with application scientists • Serializability/portability of data is a challenge • What is the right API for this? [Diagram: the application calls an SDM layer that stores metadata in a database (Berkeley DB, Oracle, DB2) and data through MPI-IO on a parallel file system]

  15. Parallel File System Improvements • Autonomic • Self-tuning, self-maintaining, self-healing • Fault tolerant • Tolerate server failures • Scalability • Tens to hundreds of thousands of clients • Active storage • Run operations on the server, such as data reduction, filtering, transformation

  16. End-to-End Data and Performance Management • Applications run and write data at one site (say NERSC) • Scientists need to access the data at their home location, which is geographically distant • Need high performance and management across this whole process • We intend to focus on ensuring that our “local access” tools (PVFS, MPI-IO, PnetCDF) integrate well with other tools that access data over the wide area (SRM, Logistical Networking, GridFTP)

  17. Summary • Despite progress on various fronts, managing scientific data continues to be a challenge for application scientists • We plan to continue to tackle the important problems by • focusing on our strengths in the areas of parallel file systems and parallel I/O • collaborating with other groups doing complementary work in other areas to ensure that our tools integrate well with theirs
