Improving I/O Throughput of Scientific Applications using Transparent Parallel Compression Tekin Bicer, Jian Yin‡ and Gagan Agrawal Ohio State University, ‡Pacific Northwest National Laboratory
Introduction • Increasing parallelism in HPC systems • Large-scale scientific simulations and instruments • Scalable computational throughput • Limited I/O performance • Example: PRACE-UPSCALE: 2 TB per day; expectation: 10-100 PB per day • Higher precision, i.e., more computation and more data • “Big Compute” opportunities → “Big Data” problems • Large volume of output data • Data read and analysis • Storage, management, and transfer of data • Compression
Introduction (cont.) • Community focus • Storing, managing, and moving scientific datasets • Compression can further help • Decreased amount of data • Increased I/O throughput • Better data transfer • Increased simulation and data analysis performance • But… • Can it really benefit application execution? • Tradeoff between CPU utilization and I/O idle time • What about integration with scientific applications? • Effort required by scientists to adapt their applications
Scientific Data Management Libs. • Widely used by the community • PnetCDF (NetCDF), HDF5… • NetCDF Format • Portable, self-describing, space-efficient • High Performance Parallel I/O • MPI-IO • Optimizations: Collective and Independent calls • Hints about file system • No Support for Compression
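As a minimal sketch of the file-system hints mentioned above, the snippet below passes striping hints to PnetCDF through a standard MPI_Info object at file creation; the hint names (striping_factor, striping_unit) are the conventional ROMIO/Lustre ones rather than anything specific to this work, and whether they take effect depends on the MPI-IO layer.

```c
#include <mpi.h>
#include <pnetcdf.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* File-system hints are passed through a standard MPI_Info object.
     * Hint names follow common ROMIO/Lustre conventions; whether they
     * take effect depends on the MPI-IO implementation and file system. */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_factor", "8");      /* number of OSTs */
    MPI_Info_set(info, "striping_unit", "1048576");  /* 1 MB stripes   */

    int ncid;
    ncmpi_create(MPI_COMM_WORLD, "output.nc", NC_64BIT_DATA, info, &ncid);
    ncmpi_close(ncid);

    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
```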
Parallel and Transparent Compression for PnetCDF • Parallel write operations • Sizes of data types and variables are known • Data item locations are known • Parallel write operations with compression • Variable-size chunks • No a priori knowledge of the locations • Many processes write at once
Parallel and Transparent Compression for PnetCDF Desired features while enabling compression: • Parallel compression and write • Sparse and dense storage • Transparency • Minimum effort from the application developer • Integration with PnetCDF • Performance • Different variables may require different compression algorithms • Domain-specific compression algorithms
Outline • Introduction • Scientific Data Management Libraries • PnetCDF • Compression Approaches • A Compression Methodology • System Design • Experimental Results • Conclusion
Compression: Sparse Storage • Chunks/splits are created • Compression layer applies user-provided algorithms • Compressed splits are written at their original offset addresses • I/O can still benefit • Only compressed data is transferred • No benefit for storage space
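As an illustrative example (the numbers are assumptions consistent with the 32 MB chunk size used later in the microbenchmarks): a 32 MB split that compresses to 18 MB is still written at the split's original offset, so roughly 14 MB of the allocated region is simply left unused on disk, which is why I/O shrinks but the storage footprint does not.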
Compression: Dense Storage • Generated compressed splits are appended locally • New offset addresses are calculated • Requires metadata exchange • All compressed data blocks are written using a collective call • Generated file is smaller • Advantages: I/O + storage space
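The metadata exchange can be pictured as an allgather of compressed split sizes followed by an exclusive prefix sum; the routine below is an illustrative sketch of that idea, not the library's internal code, and the function name and arguments are invented for the example.

```c
#include <stdlib.h>
#include <mpi.h>

/* Illustrative sketch (not the library's internal code): after each rank
 * compresses its split, the compressed sizes are exchanged so every rank
 * can derive the file offset of its block with an exclusive prefix sum. */
MPI_Offset dense_write_offset(MPI_Offset my_comp_size, MPI_Offset base,
                              MPI_Comm comm)
{
    int rank, nprocs;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &nprocs);

    /* Metadata exchange: every rank learns all compressed split sizes. */
    MPI_Offset *sizes = malloc(nprocs * sizeof(MPI_Offset));
    MPI_Allgather(&my_comp_size, 1, MPI_OFFSET, sizes, 1, MPI_OFFSET, comm);

    /* Exclusive prefix sum over the preceding ranks' sizes. */
    MPI_Offset off = base;
    for (int i = 0; i < rank; i++)
        off += sizes[i];

    free(sizes);
    return off;  /* the collective write of this rank's block starts here */
}
```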
Compression: Hybrid Method • Developer provides: • Compression ratio • Error ratio • Does not require metadata exchange • Error padding can be used for overflowed data • Generated file is smaller • Relies on user inputs • Off' = Off x (1 / (comp_ratio - err_ratio))
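As a rough worked example, assuming the user supplies comp_ratio = 4 and err_ratio = 0.5: a split whose uncompressed offset is 64 MB is placed at Off' = 64 MB x 1/(4 - 0.5) ≈ 18.3 MB, rather than the 16 MB an exact 4x compression would give; the err_ratio margin leaves padding for splits that compress worse than predicted, and the error padding above absorbs any overflow beyond that.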
System API • Scientific data management libraries are complex • Only trivial changes should be required in scientific applications • Hence the requirement for a system API: • Defining a compression function • comp_f (input, in_size, output, out_size, args) • Defining a decompression function • decomp_f (input, in_size, output, out_size, args) • Registering user-defined functions • ncmpi_comp_reg (*comp_f, *decomp_f, args, …)
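A minimal sketch of how an application might register its own (de)compression routines through this API follows. The slide gives only parameter names, so the concrete C types below, and the identity-copy compressor used as a placeholder, are assumptions for illustration; ncmpi_comp_reg is the extension proposed in this work, not part of stock PnetCDF.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical prototypes matching the API sketched above; the concrete
 * argument types are assumptions, since the slide lists only parameter names. */
typedef int (*comp_fn)(const void *input, size_t in_size,
                       void *output, size_t *out_size, void *args);
typedef int (*decomp_fn)(const void *input, size_t in_size,
                         void *output, size_t *out_size, void *args);

/* Proposed extension in this work; not part of stock PnetCDF. */
int ncmpi_comp_reg(comp_fn comp_f, decomp_fn decomp_f, void *args);

/* Trivial identity "compressor" and "decompressor", used only to show
 * how a user-defined algorithm plugs into the layer. */
static int my_comp(const void *in, size_t in_size,
                   void *out, size_t *out_size, void *args)
{
    (void)args;
    memcpy(out, in, in_size);
    *out_size = in_size;
    return 0;
}

static int my_decomp(const void *in, size_t in_size,
                     void *out, size_t *out_size, void *args)
{
    (void)args;
    memcpy(out, in, in_size);
    *out_size = in_size;
    return 0;
}

/* Registration, typically done once before the first write:
 *   ncmpi_comp_reg(my_comp, my_decomp, NULL);
 */
```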
Compression Methodology • Common properties of scientific datasets • Consist of floating-point numbers • Relationship between neighboring values • Generic compression algorithms do not perform well • Domain-specific solutions can help • Approach: • Differential compression • Predict the values of neighboring cells • Store the difference
Example: GCRM Temperature Variable Compression • E.g., a temperature record • The values of neighboring cells are highly related • X' table (after prediction) • X'' (compressed values) • 5 bits for prediction + difference • Lossless and lossy compression • Fast and good compression ratios
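The sketch below illustrates the differential idea on a 1-D array of doubles: each value is predicted from its left neighbor and only the XOR of the two bit patterns is kept, which is mostly zero bits when neighboring cells are close. This is a simplified stand-in for the talk's scheme; the 5-bit prediction header and the byte packing that yield the actual compression are omitted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified sketch of differential compression for a 1-D temperature
 * array: each value is predicted from its left neighbor and only the
 * XOR difference of the bit patterns is stored.  The real scheme in the
 * talk additionally packs a 5-bit header describing how many bytes of
 * the difference are significant; that packing is omitted here. */
size_t diff_encode(const double *x, size_t n, uint64_t *out)
{
    uint64_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t cur;
        memcpy(&cur, &x[i], sizeof cur);   /* bit pattern of the value    */
        out[i] = cur ^ prev;               /* neighbors are close, so the */
        prev = cur;                        /* XOR has many zero bits      */
    }
    return n;  /* a real encoder would now pack out[] into fewer bytes   */
}

void diff_decode(const uint64_t *in, size_t n, double *x)
{
    uint64_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t cur = in[i] ^ prev;
        memcpy(&x[i], &cur, sizeof cur);
        prev = cur;
    }
}
```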
PnetCDF Data Flow • Generated data is passed to PnetCDF lib. • Variable info. gathered from NetCDF header • Splits are compressed • User defined comp. alg. • Metadata info. exchanged • Parallel write ops. • Synch. and global view • Update NetCDF header
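From the application's point of view, the write path stays essentially the standard PnetCDF sequence. The sketch below assumes a 1-D temperature variable; only the commented-out ncmpi_comp_reg call (the extension proposed here) differs from a stock PnetCDF program, which is what makes the compression transparent.

```c
#include <mpi.h>
#include <pnetcdf.h>

/* Sketch of the application-side write path.  All calls except
 * ncmpi_comp_reg are standard PnetCDF; the registration call is the
 * extension proposed here, and its exact placement is illustrative. */
void write_timestep(MPI_Comm comm, const double *temp,
                    MPI_Offset start, MPI_Offset count, MPI_Offset glen)
{
    int ncid, dimid, varid;

    ncmpi_create(comm, "gcrm_temp.nc", NC_64BIT_DATA, MPI_INFO_NULL, &ncid);
    ncmpi_def_dim(ncid, "cells", glen, &dimid);
    ncmpi_def_var(ncid, "temperature", NC_DOUBLE, 1, &dimid, &varid);

    /* Proposed extension: plug in the domain-specific (de)compressor.   */
    /* ncmpi_comp_reg(my_comp, my_decomp, NULL);                         */

    ncmpi_enddef(ncid);

    /* Collective write is unchanged; splits are compressed underneath.  */
    ncmpi_put_vara_double_all(ncid, varid, &start, &count, temp);

    ncmpi_close(ncid);
}
```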
Outline • Introduction • Scientific Data Management Libraries • PnetCDF • Compression Approaches • A Compression Methodology • System Design • Experimental Results • Conclusion
Experimental Setup • Local cluster: • Each node has 8 cores (Intel Xeon E5630, 2.53 GHz) • Memory: 12 GB • InfiniBand network • Lustre file system: 8 OSTs, 4 storage nodes, 1 metadata server • Microbenchmarks: 34 GB dataset • Two data analysis applications: 136 GB dataset • AT, MATT • Scientific simulation application: 49 GB dataset • Mantevo Project: MiniMD
Exp: Simulation (MiniMD) • Figures: application execution times and application write times
Conclusion • Scientific data analysis and simulation applications • Deal with massive amounts of data • Management of “Big Data” • I/O throughput affects performance • Need for transparent compression • Minimum effort during integration • Proposed two compression methods • Implemented a compression layer in PnetCDF • Ported our proposed methods • Scientific data compression algorithm • Evaluated our system • MiniMD: 22% performance improvement, 25.5% storage space saving • AT, MATT: 45.3% performance improvement, 47.8% storage space saving
Exp: Microbenchmarks • Dataset size: 34 GB (timestep: 270 MB) • Compressed: 17.7 GB (timestep: 142 MB) • Chunk size: 32 MB • # Processes: 64 • Stripe count: 8 • Figure: comparing write times with varying stripe sizes
Outline • Introduction • Scientific Data Management Libraries • PnetCDF • Compression Approaches • A Compression Methodology • System Design • Experimental Results • Conclusion