Opportunities in Parallel I/O for Scientific Data Management Rajeev Thakur and Rob Ross, Mathematics and Computer Science Division, Argonne National Laboratory
Outline • Brief review of our accomplishments so far • Thoughts on component coupling • Topics for future work
PVFS2 • Collaborative effort among ANL, Clemson, Northwestern, Ohio State, and others • Very successful as a freely available parallel file system for Linux clusters • Also deployed on IBM BG/L • Used by many groups as a research vehicle for implementing new parallel file system concepts • True open source software with open source development • Tightly coupled MPI-IO implementation (ROMIO); see the sketch below • Forms the basis for higher layers to deliver high performance to applications [Diagram: clients (C) running PVFS connect over a communication network to I/O servers (IOS)]
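To make the ROMIO coupling concrete, here is a minimal sketch of a parallel program creating and writing a shared file on PVFS2 through MPI-IO. The "pvfs2:" file-name prefix is a ROMIO convention for selecting the PVFS2 driver explicitly; the mount path /pvfs/testfile is a hypothetical example, and error checking is omitted for brevity.

/* Each rank writes a contiguous block of a shared PVFS2 file. */
#include <mpi.h>

#define COUNT 1024

int main(int argc, char **argv)
{
    int rank, i, buf[COUNT];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (i = 0; i < COUNT; i++)
        buf[i] = rank;

    /* Collective create/open; ROMIO dispatches to its PVFS2 driver. */
    MPI_File_open(MPI_COMM_WORLD, "pvfs2:/pvfs/testfile",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Collective write: rank r writes its COUNT ints at offset
     * r * COUNT * sizeof(int). */
    MPI_File_write_at_all(fh, (MPI_Offset)rank * COUNT * sizeof(int),
                          buf, COUNT, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}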
PVFS2 Performance [Chart: time to create files through MPI-IO]
PnetCDF • Parallel version of the popular netCDF library • Major contribution of the SDM SciDAC (funded solely by it) • Collaboration between Argonne and Northwestern • Main implementers: Jianwei Li (Northwestern) and Rob Latham (ANL) • Adds the parallelism missing from serial netCDF while avoiding the complexity of parallel HDF • Requires only minor changes to the standard netCDF API (see the sketch below) • Being used in many applications
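A minimal sketch of the PnetCDF API, showing how closely it follows serial netCDF: the familiar create/define/write sequence gains only a communicator, an MPI_Info argument, and collective ("_all") data calls. The file name and array sizes are hypothetical, and error checking is omitted.

/* Ranks collectively create a file; each writes one row of a global
 * 2-D variable. Assumes the job is run with NX ranks. */
#include <mpi.h>
#include <pnetcdf.h>

#define NX 4
#define NY 8

int main(int argc, char **argv)
{
    int rank, j, ncid, dimids[2], varid;
    double row[NY];
    MPI_Offset start[2], count[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (j = 0; j < NY; j++)
        row[j] = rank;

    /* Collective create: like nc_create, plus communicator and info. */
    ncmpi_create(MPI_COMM_WORLD, "output.nc", NC_CLOBBER,
                 MPI_INFO_NULL, &ncid);
    ncmpi_def_dim(ncid, "x", NX, &dimids[0]);
    ncmpi_def_dim(ncid, "y", NY, &dimids[1]);
    ncmpi_def_var(ncid, "data", NC_DOUBLE, 2, dimids, &varid);
    ncmpi_enddef(ncid);

    /* Each rank writes its own row, collectively. */
    start[0] = rank; start[1] = 0;
    count[0] = 1;    count[1] = NY;
    ncmpi_put_vara_double_all(ncid, varid, start, count, row);

    ncmpi_close(ncid);
    MPI_Finalize();
    return 0;
}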
MPI-IO over Logistical Networking (LN) • LN is a technology that many applications use to move data efficiently over the wide area • Implementing MPI-IO over LN lets applications access their remote data directly from parallel programs • We are implementing a new ROMIO ADIO driver for LN (Jonghyun Lee, ANL); see the sketch below • Nontrivial because the LN API is unlike a traditional file system API • Collaboration between Argonne and Univ. of Tennessee [Diagram: Application calls MPI-IO; the ADIO layer dispatches to LN for remote storage or to PVFS/UFS for local storage]
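Because ADIO hides each back end behind the standard MPI-IO interface, application code need not change to use LN; only the file name selects the driver. A hypothetical sketch, where the "ln:" prefix and the remote path are assumptions made by analogy with ROMIO's existing "pvfs2:" and "ufs:" prefixes:

/* Hypothetical: read this rank's block of a remote, LN-managed file
 * through the standard MPI-IO interface. Only the file-name prefix
 * differs from local access. */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    double buf[1024];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_File_open(MPI_COMM_WORLD, "ln:/lbone/climate/run42.dat",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
    MPI_File_read_at_all(fh, (MPI_Offset)rank * sizeof(buf),
                         buf, 1024, MPI_DOUBLE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}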
Fruitful Collaborations • Key to our successes in this SciDAC have been strong collaborations with other participants in the Center • Northwestern University • PnetCDF, PVFS2 • Jianwei, Avery, Kenin, Alok, Wei-keng • ORNL • Nagiza’s group • MPI-IO and PnetCDF for visualization (parallel VTK) • LBNL • Ekow (MPI-IO on SRM) • Ongoing collaboration with Univ. of Tennessee for MPI-IO/LN
Component Coupling via Standard Interfaces • We believe that well-defined standard APIs are the right way to couple the different components of the software stack • Having the right API at each level is crucial for performance (see the sketch below) [Diagram: software stack with the Application on top of HDF-5 and PnetCDF, which sit on MPI-IO, which sits on Lustre, GPFS, or PVFS]
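One way the layers already cooperate through a standard interface is the MPI-IO hints mechanism: an application or high-level library attaches key-value pairs that the MPI-IO implementation and file system may use for tuning, and must ignore if not understood. A minimal sketch, assuming ROMIO-style hint names such as striping_factor and cb_buffer_size:

/* Pass tuning hints down the stack through the standard MPI-IO API. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Info info;
    MPI_File fh;

    MPI_Init(&argc, &argv);

    /* Hints travel through the standard API; unknown ones are ignored. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_factor", "16");     /* stripe across 16 servers */
    MPI_Info_set(info, "cb_buffer_size", "4194304"); /* 4 MB collective buffer */

    MPI_File_open(MPI_COMM_WORLD, "pvfs2:/pvfs/testfile",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);
    MPI_Info_free(&info);

    /* ... write data ... */

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}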
Guiding Theme • How can we cater better to the needs of SciDAC applications?
Make Use of Extended Attributes on Files • PVFS2 now allows users to store extended attributes along with files • Also available in local Linux file systems, so a standard is emerging • This feature has many applications (a usage sketch follows): • Store metadata for high-level libraries as extended attributes instead of directly in the file • avoids the problem of unaligned file accesses • Store MPI-IO hints for persistence • Store provenance information [Diagram: a file with an extended attribute, name "Mesh Size", value "1K x 1K"]
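A minimal sketch of storing high-level metadata as an extended attribute, using the Linux xattr interface on a file system that supports it; the file name, attribute name, and value are hypothetical examples.

/* Store and retrieve library metadata as an extended attribute. */
#include <stdio.h>
#include <string.h>
#include <sys/xattr.h>

int main(void)
{
    const char *path = "dataset.nc";
    const char *value = "1K x 1K";
    char buf[64];
    ssize_t len;

    /* User-namespace attributes ("user.*") are writable by file owners. */
    if (setxattr(path, "user.mesh_size", value, strlen(value), 0) != 0)
        perror("setxattr");

    len = getxattr(path, "user.mesh_size", buf, sizeof(buf) - 1);
    if (len >= 0) {
        buf[len] = '\0';
        printf("Mesh Size = %s\n", buf);
    }
    return 0;
}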
Next Generation High-Level Library • HDF and netCDF were written 15-20 years ago as serial libraries • Explore the possibility of a new high-level library designed from the outset as a parallel library • What features are needed? • Can we exploit extended attributes? • Can the data span multiple files instead of one, with a directory as the object? • What is the portable file format? • New, more efficient implementation techniques
Implement Using a Combination of Database and Parallel I/O • Use a real database to store metadata and a parallel file system to store the actual data • Flexible and high performance • Powerful search and retrieval capability • Prototype implemented in 1999-2000 (published at SC2000 and in JPDC) • Jaechun No, Rajeev Thakur, Alok Choudhary • Needs more work; collaboration with application scientists • Serializability/portability of data is a challenge • What is the right API for this? (see the sketch below) [Diagram: Application uses an SDM layer that routes metadata to a database (Berkeley DB, Oracle, DB2) and data through MPI-IO to a parallel file system]
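A minimal sketch of the split, assuming the Berkeley DB 4.x-style C API for the metadata side and MPI-IO for the bulk data; the file names, key, and value are hypothetical, and error checking is omitted.

/* Metadata goes to a database; bulk data goes through MPI-IO. */
#include <mpi.h>
#include <db.h>
#include <string.h>

int main(int argc, char **argv)
{
    int rank;
    double block[1024] = {0};
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Rank 0 records dataset metadata in the database. */
    if (rank == 0) {
        DB *dbp;
        DBT key, val;
        db_create(&dbp, NULL, 0);
        dbp->open(dbp, NULL, "meta.db", NULL, DB_BTREE, DB_CREATE, 0664);
        memset(&key, 0, sizeof(key));
        memset(&val, 0, sizeof(val));
        key.data = (void *)"mesh_size"; key.size = strlen("mesh_size");
        val.data = (void *)"1K x 1K";   val.size = strlen("1K x 1K");
        dbp->put(dbp, NULL, &key, &val, 0);
        dbp->close(dbp, 0);
    }

    /* All ranks write the bulk data in parallel through MPI-IO. */
    MPI_File_open(MPI_COMM_WORLD, "pvfs2:/pvfs/dataset",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at_all(fh, (MPI_Offset)rank * sizeof(block),
                          block, 1024, MPI_DOUBLE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}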
Parallel File System Improvements • Autonomic • Self-tuning, self-maintaining, self-healing • Fault tolerant • Tolerate server failures • Scalability • Scale to tens or hundreds of thousands of clients • Active storage • Run operations on the server, such as data reduction, filtering, and transformation
End-to-End Data and Performance Management • Applications run and write data at one site (say, NERSC) • Scientists need to access the data from their home institutions, which are geographically distant • Need high performance and coordinated management of this whole process • We intend to focus on ensuring that our "local access" tools (PVFS, MPI-IO, PnetCDF) integrate well with other tools that access data over the wide area (SRM, Logistical Networking, GridFTP)
Summary • Despite progress on various fronts, managing scientific data continues to be a challenge for application scientists • We plan to continue to tackle the important problems by • focusing on our strengths in the areas of parallel file systems and parallel I/O • collaborating with other groups doing complementary work in other areas to ensure that our tools integrate well with theirs