
Opportunities in Parallel I/O for Scientific Data Management



Presentation Transcript


  1. Rajeev Thakur and Rob Ross Mathematics and Computer Science Division Argonne National Laboratory Opportunities in Parallel I/O for Scientific Data Management

  2. Outline • Brief review of our accomplishments so far • Thoughts on component coupling • Topics for future work

  3. PVFS2 [Diagram: compute clients (C) connected through a communication network to PVFS I/O servers (IOS)] • Collaborative effort among ANL, Clemson, Northwestern, Ohio State, and others • Very successful as a freely available parallel file system for Linux clusters • Also deployed on IBM BG/L • Used by many groups as a research vehicle for implementing new parallel file system concepts • True open source software with open source development • Tightly coupled MPI-IO implementation (ROMIO) • Forms the basis for higher layers to deliver high performance to applications

  4. PVFS2 Performance

  5. PVFS2 Performance Time to Create Files Through MPI-IO

  6. PnetCDF • Parallel version of the popular netCDF library • Major contribution of the SDM SciDAC (funded solely by it) • Collaboration between Argonne and Northwestern • Main implementers: Jianwei Li (NW) and Rob Latham (ANL) • Addresses lack of parallelism in serial netCDF without the difficulty of parallelism in HDF • Only minor changes to the standard netCDF API • Being used in many applications
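The "only minor changes to the standard netCDF API" point can be made concrete: PnetCDF keeps the familiar define-mode/data-mode flow and adds MPI-Offset start/count slabs plus collective `_all` calls. A minimal sketch, assuming an MPI environment with libpnetcdf installed; the file, dimension, and variable names are illustrative:

```c
/* Minimal PnetCDF write sketch: each rank collectively writes its slab
 * of a 2-D variable.  Assumes MPI and libpnetcdf; names and sizes are
 * illustrative, and nx is assumed divisible by the number of ranks. */
#include <mpi.h>
#include <pnetcdf.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, nprocs, ncid, dimids[2], varid;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Define a 1024 x 1024 double variable, partitioned by rows. */
    MPI_Offset nx = 1024, ny = 1024;
    ncmpi_create(MPI_COMM_WORLD, "mesh.nc", NC_CLOBBER, MPI_INFO_NULL, &ncid);
    ncmpi_def_dim(ncid, "x", nx, &dimids[0]);
    ncmpi_def_dim(ncid, "y", ny, &dimids[1]);
    ncmpi_def_var(ncid, "temperature", NC_DOUBLE, 2, dimids, &varid);
    ncmpi_enddef(ncid);                     /* leave define mode */

    /* Each rank owns nx/nprocs contiguous rows. */
    MPI_Offset start[2] = { rank * (nx / nprocs), 0 };
    MPI_Offset count[2] = { nx / nprocs, ny };
    double *buf = malloc(count[0] * count[1] * sizeof(double));
    for (MPI_Offset i = 0; i < count[0] * count[1]; i++)
        buf[i] = (double)rank;

    /* Collective write: PnetCDF maps this onto MPI-IO collective I/O. */
    ncmpi_put_vara_double_all(ncid, varid, start, count, buf);
    ncmpi_close(ncid);
    free(buf);
    MPI_Finalize();
    return 0;
}
```

The only departures from serial netCDF are the communicator/info arguments to `ncmpi_create` and the `_all` suffix marking the write as collective.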

  7. MPI-IO over Logistical Networking (LN) • LN is a technology that many applications are using to move data efficiently over the wide area • Implementing MPI-IO over LN enables applications to access their data directly from parallel programs • We are implementing a new ROMIO ADIO layer for LN (Jonghyun Lee, ANL) • Nontrivial because the LN API is unlike a traditional file system API • Collaboration between Argonne and Univ. of Tennessee [Diagram: the application calls MPI-IO; its ADIO layer dispatches to LN (remote storage) or to PVFS/UFS (local storage)]

  8. Fruitful Collaborations • Key to our successes in this SciDAC have been strong collaborations with other participants in the Center • Northwestern University • PnetCDF, PVFS2 • Jianwei, Avery, Kenin, Alok, Wei-keng • ORNL • Nagiza’s group • MPI-IO and PnetCDF for visualization (parallel VTK) • LBNL • Ekow (MPI-IO on SRM) • Ongoing collaboration with Univ. of Tennessee for MPI-IO/LN

  9. Component Coupling via Standard Interfaces • We believe that well-defined standard APIs are the right way to couple different components of the software stack • Having the right API at each level is crucial for performance [Diagram: software stack — applications use HDF-5 or PnetCDF, which are built on MPI-IO, which runs on Lustre, GPFS, or PVFS]

  10. Topics for Future Work

  11. Guiding Theme • How can we cater better to the needs of SciDAC applications?

  12. Make Use of Extended Attributes on Files • PVFS2 now allows users to store extended attributes along with files • Also available in local Linux file systems, so a standard is emerging • This feature has many applications: • Store metadata for high-level libraries as extended attributes instead of directly in the file • avoids the problem of unaligned file accesses • Store MPI-IO hints for persistence • Store provenance information [Diagram: a file with an attached extended attribute, Name=“Mesh Size”, Value=“1K x 1K”]

  13. Next Generation High-Level Library • HDF and netCDF were written 15-20 years ago as serial libraries • Explore the possibility of designing a new high-level library that is explicitly built as a parallel library for modern times • What features are needed? • Can we exploit extended attributes? • Can the data span multiple files instead of one file, with a directory as the object? • What is the portable file format? • New, more efficient implementation techniques

  14. Implement Using Combination of Database and Parallel I/O • Use a real database to store metadata and a parallel file system to store actual data • Flexible and high performance • Powerful search and retrieval capability • Prototype implemented in 1999-2000 (published in SC2000 and JPDC) • Jaechun No, Rajeev Thakur, Alok Choudhary • Needs more work; collaboration with application scientists • Serializability/portability of data is a challenge • What is the right API for this? [Diagram: the application calls an SDM layer that stores metadata in a database (Berkeley DB, Oracle, DB2) and data through MPI-IO on a parallel file system]

  15. Parallel File System Improvements • Autonomic • Self-tuning, self-maintaining, self-healing • Fault tolerant • Tolerate server failures • Scalability • Tens to hundreds of thousands of clients • Active storage • Run operations on the server, such as data reduction, filtering, transformation

  16. End-to-End Data and Performance Management • Applications run and write data at one site (say NERSC) • Scientists need to access the data at their home location, which is geographically distant • Need high performance and management across this whole process • We intend to focus on ensuring that our “local access” tools (PVFS, MPI-IO, PnetCDF) integrate well with other tools that access data over the wide area (SRM, Logistical Networking, GridFTP)

  17. Summary • Despite progress on various fronts, managing scientific data continues to be a challenge for application scientists • We plan to continue to tackle the important problems by • focusing on our strengths in the areas of parallel file systems and parallel I/O • collaborating with other groups doing complementary work in other areas to ensure that our tools integrate well with theirs
