
What will be new in HDF5?




Presentation Transcript


  1. What will be new in HDF5? HDF and HDF-EOS Workshop XII

  2. HDF5 Road Map: Performance, Ease of use, Robustness, Innovation

  3. Outline
     • Performance improvements
     • Fortran 2003 features
     • HDF5 file recovery (metadata journaling)

  4. Performance Improvements

  5. Performance Improvements in HDF5
     • Examples of completed work:
        • New implementation of the metadata cache to improve I/O performance and memory usage when accessing many objects (HDF5 1.8.0)
        • Faster, more scalable storage and access for large groups (HDF5 1.8.0)

  6. Performance Improvements in HDF5
     • Work in progress
        • New implementation of free-space management
           • Affects “dynamic” applications that add, delete, or modify existing objects
           • When HDF5 objects are created, space is allocated from the available space tracked by the free-space manager
           • When objects are deleted, the unused space is returned to the free-space pool via the free-space manager
           • The current implementation takes O(N²) operations for N allocations or frees
           • New implementation: O(log₂ N)
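To illustrate why the choice of data structure matters for free-space management, here is a minimal Python sketch. It is not HDF5's implementation: the class `FreeSpaceManager` and its methods are hypothetical, and the design (a list of free sections kept sorted by size) is only one possible approach. The best-fit lookup is a binary search, so finding a section takes logarithmic time instead of a linear scan. Note that Python list insertion still shifts elements, so a balanced tree would be needed for fully logarithmic updates.

```python
import bisect


class FreeSpaceManager:
    """Toy best-fit free-space tracker (hypothetical, not the HDF5 design)."""

    def __init__(self):
        # Sorted list of (size, address) tuples describing free sections.
        self._sections = []
        self._eof = 0  # next unused address at the end of the "file"

    def free(self, addr, size):
        """Return a section to the pool (no coalescing in this sketch)."""
        bisect.insort(self._sections, (size, addr))

    def alloc(self, size):
        """Allocate `size` bytes, reusing the smallest free section that fits."""
        # Binary search for the first section whose size is >= the request.
        i = bisect.bisect_left(self._sections, (size, -1))
        if i == len(self._sections):
            # Nothing fits: extend the file.
            addr = self._eof
            self._eof += size
            return addr
        sec_size, addr = self._sections.pop(i)
        if sec_size > size:
            # Put the unused tail of the section back into the pool.
            bisect.insort(self._sections, (sec_size - size, addr + size))
        return addr


fsm = FreeSpaceManager()
a = fsm.alloc(100)   # extends the file
b = fsm.alloc(50)    # extends the file
fsm.free(a, 100)     # the first 100 bytes become free again
c = fsm.alloc(30)    # reuses the start of the freed section
d = fsm.alloc(200)   # nothing fits, so the file grows again
print(a, b, c, d)
```

In this run the third allocation lands at the address released by `fsm.free`, which is the reuse behavior the slide describes for applications that create and delete objects.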

  7. Free-space Management
     • Test: creating/deleting attributes
        • Create first set of attributes
        • Delete odd-numbered attributes from the first set
        • Create second set of attributes
        • Delete all attributes from the second set

  8. Performance Improvements in HDF5
     • Work in progress
        • Fast data append (along the slowest-changing dimension)
     • Future areas of interest
        • Efficient chunking cache implementation (NetCDF4)
        • Efficient handling of variable-length data, including compression (will affect sizes of NPOESS files)

  9. Fortran 2003 features

  10. Status of the HDF5 Fortran Library
     • The HDF5 Fortran library is part of the standard HDF5 distribution
     • First release goes back to 1999
     • Implemented as Fortran 90 wrappers on top of the HDF5 C library
     • Supported on Linux, Windows, Mac Intel, Solaris, VMS, clusters, etc.
     • Compilers
        • Open source: gfortran, g95
        • Vendors: Sun, Intel, PGI, Absoft
     • 32- and 64-bit versions

  11. HDF5 Fortran Library
     • Mimics HDF5 C APIs
     • Fortran 90 features used
        • Modules
        • Function overloading
        • Function interfaces
        • Dynamic memory allocation
        • Optional parameters
     • Many Fortran APIs are simpler than their C counterparts

  12. Current Limitations
     • Supports only “native” Fortran types such as
        • INTEGER
        • REAL
        • CHARACTER
        • DOUBLE PRECISION (obsolete Fortran feature)
     • No support for INTEGER*1, INTEGER*2, INTEGER*8, INTEGER*16, REAL*8, REAL*16
        • Cannot write/read buffers of those types
     • Fortran types have to match C types
        • No support for -r8 and -r16 flags (Fortran REAL ≠ C float)

  13. Current Limitations
     • No support for INTEGER(KIND=n) and REAL(KIND=m)
     • The integers n and m are called KIND parameters
     • They are returned by the intrinsic functions
        • selected_int_kind(r): a kind that can represent all integer values i with -10^r < i < 10^r
        • selected_real_kind(p, r): p is the decimal precision, r is the decimal exponent range
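The range rule for selected_int_kind can be illustrated with a small calculation. The Python sketch below is only an analogy: the helper `smallest_int_bytes` is hypothetical, and the candidate widths (1, 2, 4, 8, 16 bytes) are an assumption, since the set of kinds a real compiler offers varies. It finds the smallest two's-complement width that holds every value strictly between -10^r and 10^r.

```python
def smallest_int_bytes(r, widths=(1, 2, 4, 8, 16)):
    """Return the smallest byte width able to hold all i with -10**r < i < 10**r.

    Returns -1 when no listed width is large enough, mirroring how
    SELECTED_INT_KIND returns -1 when no suitable kind exists.
    """
    largest = 10**r - 1  # largest magnitude that must be representable
    for nbytes in widths:
        max_value = 2 ** (8 * nbytes - 1) - 1  # two's-complement maximum
        if largest <= max_value:
            return nbytes
    return -1


for r in (2, 4, 9, 18, 38, 39):
    print(r, smallest_int_bytes(r))
```

For example, r = 2 fits in one byte because 99 is below 127, while r = 9 needs four bytes. This is why a program that requests a kind by range may end up with an integer size the Fortran 90 wrappers described on these slides cannot write.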

  14. Current Limitations
     • Limited support for derived types (compare with C structures and HDF5 compound datatypes)
        • Supports derived types with “native” fields only
        • Doesn’t support complex HDF5 datatypes (e.g., with an array member, or a nested compound)
        • Writes/reads HDF5 compound datasets by fields only
        • Cannot be used in parallel applications
     • No support for enum types
     • No support for callback functions

  15. Current Limitations - Summary
     • Any application written to the Fortran 95/2003 standard will struggle to use the HDF5 Fortran Library
     • Many HDF5 features are not available to Fortran applications

  16. Fortran 2003 Features
     • Fortran 2003 provides a standardized mechanism for interoperating with C
        • Module ISO_C_BINDING for interoperability of intrinsic types
        • C_PTR and C_FUNPTR for interoperability with C pointers
        • C_LOC(x) and C_FUNLOC(x) inquiry functions for getting addresses of variables and procedures
        • BIND attribute for interoperability of derived types and C structures and enumerated types

  17. Fortran 2003 Features and HDF5
     • The new 2003 features allowed us to support
        • Any Fortran INTEGER and REAL type data in HDF5 files
        • Fortran derived types and HDF5 compound datatypes
        • Fortran enumerated types and HDF5 enumerated types
        • HDF5 APIs with callbacks

  18. Fortran Compilers with 2003 Features

  19. Example
         PROGRAM main
           USE HDF5
           USE ISO_C_BINDING
           …..
           TYPE s1_t
             CHARACTER(LEN=1), DIMENSION(1:13) :: chr
             INTEGER(KIND=int_k1) :: a
             REAL(KIND=r_k4) :: b
             REAL(KIND=r_k8) :: c
           END TYPE s1_t
           TYPE(s1_t), TARGET :: s1(LENGTH)

  20. Example
         CALL h5tarray_create_f(H5T_NATIVE_CHARACTER, 1, tdims1, …)
         CALL h5tinsert_f(s1_tid, "chr_name", &
              H5OFFSETOF(C_LOC(s1(1)), C_LOC(s1(1)%chr)), tid3, hdferr)
         CALL h5tinsert_f(s1_tid, "a_name", &
              H5OFFSETOF(C_LOC(s1(1)), C_LOC(s1(1)%a)), &
              h5kind_to_type(int_k1, H5_INTEGER_KIND), hdferr)
         …..
         CALL h5dcreate_f(file, DATASETNAME, s1_tid, space, …)
         …..
         f_ptr = C_LOC(s1)
         CALL h5dwrite_f(dataset, s1_tid, f_ptr, hdferr)
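H5OFFSETOF in the example above yields the byte offset of a member from the start of the derived type, which is the value h5tinsert_f needs for each field of the compound datatype. As a language-neutral illustration of the same idea, the Python sketch below uses the standard `ctypes` module to lay out a C-style structure with the same four fields and report each member's offset. It does not call HDF5. The class name `S1` is hypothetical, and the mapping of the kinds int_k1, r_k4 and r_k8 to a 1-byte integer, a 4-byte real and an 8-byte real is an assumption based on their names. A Fortran compiler is also free to lay out a derived type differently from this C layout.

```python
import ctypes


class S1(ctypes.Structure):
    """C-style layout assumed to correspond to the Fortran type s1_t."""

    _fields_ = [
        ("chr", ctypes.c_char * 13),  # CHARACTER(LEN=1), DIMENSION(1:13)
        ("a", ctypes.c_int8),         # assumed 1-byte integer kind
        ("b", ctypes.c_float),        # assumed 4-byte real kind
        ("c", ctypes.c_double),       # assumed 8-byte real kind
    ]


# Each field descriptor exposes the member's byte offset and size, the same
# numbers a compound datatype definition needs for every inserted member.
for name, _ctype in S1._fields_:
    member = getattr(S1, name)
    print(f"{name}: offset={member.offset}, size={member.size}")

print("total size:", ctypes.sizeof(S1))
```

The offsets are generally not the running sum of the field sizes, because the compiler pads members to their alignment. That is why the slide computes offsets from actual addresses with C_LOC instead of hard-coding them.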

  21. Example
         HDF5 "SDScompound.h5" {
         GROUP "/" {
            DATASET "ArrayOfStructures" {
               DATATYPE H5T_COMPOUND {
                  H5T_ARRAY { [13] H5T_STRING {
                     STRSIZE 1;
                     STRPAD H5T_STR_SPACEPAD;
                     CSET H5T_CSET_ASCII;
                     CTYPE H5T_C_S1;
                  } } "chr_name";
                  H5T_STD_I8LE "a_name";
                  H5T_IEEE_F64LE "c_name";
                  H5T_IEEE_F32LE "b_name";
               }
               …..

  22. Information
     • Source code location
        http://svn.hdfgroup.uiuc.edu/hdf5/branches/F2003
        ./configure --enable-fortran2003
     • See examples for new features
     • Send comments to epourmal@hdfgroup.org
     • Documentation will be available in November 2008

  23. HDF5 file recovery, or Surviving a System Failure through Metadata Journaling

  24. Surviving a System Failure in HDF5
     Problem:
        • Data in an open HDF5 file is susceptible to corruption in the event of an application or system crash.
        • Corruption is possible if an open HDF5 file has been updated when the crash occurs.
     Initial objective:
        • Guarantee that an HDF5 file with consistent metadata can be reconstructed in the event of a crash.
        • No guarantee on the state of raw data: it contains whatever made it to disk prior to the crash.

  25. Crash Survivability in HDF5
     Approach: Metadata Journaling
        • When an HDF5 file is opened with Metadata Journaling enabled, a companion journal file is created.
        • When an HDF5 API function that modifies metadata completes, a transaction is recorded in the journal file.
        • If the application crashes, a recovery program can replay the journal, applying all metadata writes in order up to the end of the last completed transaction written to the journal file.
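The recovery rule described here (replay metadata writes in order, stopping at the end of the last completed transaction) can be sketched as follows. This is a toy model in Python, not the HDF5 journal file format: the record layout (`"begin"`, `"write"` and `"end"` tuples) and the function `replay_journal` are hypothetical names chosen for illustration.

```python
def replay_journal(journal, metadata):
    """Apply journaled metadata writes from completed transactions only.

    journal  -- list of records: ("begin", tid), ("write", tid, key, value),
                or ("end", tid), in the order they were written
    metadata -- dict standing in for the file's on-disk metadata
    Returns a new dict; the input dict is not modified.
    """
    recovered = dict(metadata)
    pending = {}  # tid -> writes seen so far for a transaction not yet ended

    for record in journal:
        kind, tid = record[0], record[1]
        if kind == "begin":
            pending[tid] = []
        elif kind == "write":
            _, _, key, value = record
            pending[tid].append((key, value))
        elif kind == "end":
            # Transaction completed: its writes are safe to apply, in order.
            for key, value in pending.pop(tid):
                recovered[key] = value
    # Anything still in `pending` belongs to a transaction that was cut
    # short by the crash, so it is discarded.
    return recovered


journal = [
    ("begin", 1), ("write", 1, "group:/a", "created"), ("end", 1),
    ("begin", 2), ("write", 2, "attr:/a/x", "created"), ("end", 2),
    ("begin", 3), ("write", 3, "group:/b", "created"),  # crash before "end"
]
recovered = replay_journal(journal, {})
print(recovered)
```

Transaction 3 never reached its end record, so its write is dropped and the recovered metadata reflects only the two completed API calls. As the slides note, nothing in this scheme covers raw data.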

  26. HDF5 Metadata Journaling Recovery
     [Diagram: after the application crashes, the h5recover tool takes the corrupted HDF5 file and its companion journal file and produces a restored HDF5 file.]

  27. Implementation Status
     • Serial HDF5 with synchronous write mode
        • Alpha1 released August 2008
        • User interface (API definition and h5recover tool) and file format may change

  28. Metadata Journaling Plans
     • Serial HDF5 with synchronous write mode
        • Finalize user interface definitions and file format
     • Serial HDF5 with asynchronous write mode
        • To improve journal file write speed
     • More features (need funding)
        • Make raw data operations atomic
        • Allow "super-transactions" to be created by applications
        • Enable journaling for Parallel HDF5

  29. Questions?

  30. Acknowledgement
     • This report is based upon work supported in part by a Cooperative Agreement with the National Aeronautics and Space Administration (NASA) under NASA Awards NNX06AC83A and NNX08AO77A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration.
