HDF Update • Mike Folk, National Center for Supercomputing Applications • HDF and HDF-EOS Workshop VIII, October 27, 2004
Topics • HDF Team and Supporters • HDF software update • Other Activities of Interest
The HDF Team • Xuan Bai, Frank Baker, Peter Cao, Vailin Choi, Mike Folk, Barbara Jones, Quincey Koziol, James Laird, Raymond Lu, John Mainzer, Robert McGrath, Pedro Nunes, Elena Pourmal, Binh-minh Ribler, Eric Shapiro, Rishi Sinha, Kent Yang • And all those wonderful folks out there who contribute ideas, requests, bug reports, code, and support.
Organization • Staff breakdown • User support, documentation • QA, maintenance, testing • Software development • System administration • Management • See Thursday tutorial on HDF Software Process [Org chart: HDF Project, with groups for support, doc, QA, and maintenance; basic library development; tools and Java; parallel I/O, Grid, and big machines]
Who is supporting HDF? • Organizations and communities with institutional and financial commitment to HDF • NCSA, NASA, State of IL, DOE, Boeing • Agencies supporting R&D • NCSA, NASA, NARA, DOE, NSF, ONR • Collaborators who make in-kind contributions • Cactus, PyTables, NeXUS, CGNS, many others
HDF software milestones in FY 2004 [Timeline, Dec 2003 to Oct 2004: HDF 4.2r0, HDF5 1.6.2, HDF5 Java 2.0, HDF5 High Level, Flexible parallel HDF5 (Alpha), HDF5 1.6.3]
HDF4.2 Release 0 – Dec. 2003 • Bug fixes • New features • Support for new platforms and compilers
HDF4.2r0 New Features • Tools (per DAAC and Instrument Team requests) • hdfimport – converts float/integer data to SDS/raster • Replaces fp2hdf • hdiff – compares two HDF4 files • Revision of earlier hdfdiff tool • hrepack – makes a copy of an HDF4 file • Optionally rewrites objects with compression, chunking, etc. • h4cc, h4fc, h4redeploy • Helper scripts to facilitate compilation and installation
HDF4.2r0 New Features • Szip compression • Fast compression method • Available on all platforms except Crays • NCSA distributes Szip source and binaries • HDF Library binaries come with SZIP enabled • SZIP Documentation available from http://hdf.ncsa.uiuc.edu/SZIP
HDF4.2r0 New Configuration • Addressing key needs • Porting to new platforms • New versions of JPEG and ZLIB libraries • Optional SZIP compression • Many features were hard coded that could instead be set at configuration time
HDF4.2r0 New Compilers and Platforms • New compilers • Intel C and Fortran • Portland Group Compilers (C only for now) • New OS • Mac OS X • RedHat 8/9 • AIX 5.1 64-bit • OSF1 • Linux 64 (SuSE and RH8) (JPL machines) • Altix (Aura Team)
HDF5 1.6.2 – Feb. 2004 • New functions • better user control over open/close objects • Bug fixes • Parallel improvements • h5pcc, h5pfc helper scripts for parallel compiles • Configure improvements • Improved parallel performance • Speed improvements of data conversion routines • Some SZIP improvements
HDF5 1.6.2 • Support for new compilers and platforms • IBM Fortran on Mac OS X • Support for gcc 3.3.4 • Linux 64 (SuSE and RH) at JPL • Altix (Aura team) including parallel C and Fortran libraries • Investigated SX-6 (NEC) port
HDF5 1.6.3 – Oct. 2004 • Windows • Improvements to the build, test, and installation • New API routines • H5Fget_filesize: returns the size of an open file • H5Fget_name: returns the name of the file that contains a given object ID • Some F90 and C++ routines added
HDF5 1.6.3 • Utilities • h5repack utility (new) • Regenerates an HDF5 file from another HDF5 file • Optionally applies filters and chunking to the new file • h5dump utility improvements • Prints new info, such as dataset filters, storage layout, and fill value info
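The repacking idea on this slide (copy the data, optionally applying chunking and a compression filter) can be illustrated with a toy sketch. It does not use the HDF5 library and is not how h5repack is implemented: `repack_bytes`, `unpack_bytes`, and the chunk size are hypothetical names and values chosen for illustration, and Python's zlib stands in for a compression filter.

```python
import zlib
from typing import List


def repack_bytes(raw: bytes, chunk_size: int, level: int = 6) -> List[bytes]:
    """Split raw data into fixed-size chunks and compress each chunk separately.

    Hypothetical helper: a stand-in for "rewrite with chunking and a filter".
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = [raw[i:i + chunk_size] for i in range(0, len(raw), chunk_size)]
    return [zlib.compress(chunk, level) for chunk in chunks]


def unpack_bytes(chunks: List[bytes]) -> bytes:
    """Decompress every chunk and concatenate them back into the original data."""
    return b"".join(zlib.decompress(chunk) for chunk in chunks)


# Example: 1000 zero bytes split into 256-byte chunks, each compressed on its own.
original = bytes(1000)
repacked = repack_bytes(original, chunk_size=256)
restored = unpack_bytes(repacked)
```

Because each chunk is compressed independently, a reader only needs to decompress the chunks that overlap the region it wants, which is the usual motivation for pairing chunking with compression.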
Szip in HDF5 1.6.3 • HDF5 can now include SZIP compression with or without Szip's encoder • Encoder is required to create SZIP compressed files • Encoder is not required to read SZIP compressed files • Info on Szip and Szip licensing: • http://hdf.ncsa.uiuc.edu/doc_resource/SZIP/
HDF5 1.6.3 New platforms & compilers • PGI Fortran for Linux64 (x86-64) • Absoft F95 for Linux 2.4 (32-bit) • IBM XL Fortran and Absoft F95 for Mac OS X
HDF Java Products 2.0 – March 2004 • Tested with HDF5-1.6.2 • Platforms • Windows (98/NT/2000/XP) • Solaris • Linux • AIX • IRIX 6.5 • Mac OS X • OSF1 • http://hdf.ncsa.uiuc.edu/hdf-java-html/
Modular HDFView – improved HDFView where I/O and GUI components are replaceable modules. • Replaceable modules: • File I/O (file/data format) • Tree view (show file structure) • Table view (spreadsheet-like) • Text view (view/edit text dataset) • Image view (view/process image) • Palette view (view/change palette) • Metadata (attribute) view • http://hdf.ncsa.uiuc.edu/hdf-java-html/hdfview/ [Diagram: the application (HDFView) talks to interfaces (I/O, TreeView, TableView, etc.), which are backed by either the default implementation or a user implementation]
HDFView Web Browser Plug-in • Goal: Click-and-view HDF files remotely and locally from popular web browsers. • See poster.
Parallel HDF5 in 2004 • A few performance improvements • MPICH/MPE instrumentation feature added • Gives users performance analysis tools for their MPI programs • “Flexible parallel HDF5” programming model • More flexible model for parallel HDF5 • Other options currently under investigation
Parallel HDF5 developments • New parallel platforms supported • Solaris 2.8 (32 & 64 bits) • OSF 5.1 • Cray T3E, SV1, T90 • HPUX 11.0 • FreeBSD
DOE/ASCI* “ASCI provides the integrating simulation and modeling capabilities and technologies needed …for future design assessment and certification of nuclear weapons and their components” • Massively parallel computing and I/O • Complex data models and big data • HDF5 a standard format for ASCI apps * “Advanced Simulation and Computing Program”
Boeing: HDF5 for real-time flight test data • Needed for flight test data systems • Must handle raw, real-time data • Implemented API to read/write data • Based on HDF5 “table” API • Challenge: Variable length data • Possible Boeing-wide standard • Potential applications to many domains • See poster
NCASSR*: Indexing & viewing tables • Opportunities arising from Boeing work • Make test-data features widely available • Common data model and API for tabular data in HDF5 • Indexing for post-processing • Viewing capabilities • Tasks • Identify apps to study and gather requirements • Develop data model and API for tabular data • Include general purpose indexing structures and API • Implement prototype API and viewer * National Center for Advanced Secure Systems Research
National Archives and Records Administration (NARA) • Investigate HDF5 as format for records archiving • Focus on geospatial data • Images (e.g. elevation models, aerial photography) • Features (e.g. boundaries, roads, rivers) • Results so far • HDF5 data model handles all data types • Feature (vector) data present access and size challenges • Work is leading to good performance lessons • See poster about study of vector data
SciDAC/PMODEL: Arithmetic Data Transform • Apply algebraic operations to a dataset during read/write. • Initial goal: • Transform individual elements (e.g., x * 1.8 + 32). • During reads, the transform is applied to the result in memory. During writes, the data stored in the file is changed. • Implemented in HDF5 v1.7, to be released in v1.8 • Future • Transformations on attributes or multiple datasets (e.g. (A + B) / 2.0) • http://hdf.ncsa.uiuc.edu/PMODELS/datatransform/
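The read and write behavior described on this slide can be sketched in a few lines. This is a toy in-memory model under stated assumptions, not the HDF5 API: `ToyDataset` and `to_fahrenheit` are hypothetical names, and a Python list stands in for data stored in a file.

```python
from typing import Callable, Iterable, List


def to_fahrenheit(x: float) -> float:
    """The element-wise example from the slide: x * 1.8 + 32."""
    return x * 1.8 + 32


class ToyDataset:
    """Hypothetical stand-in for a dataset; `stored` plays the role of the file."""

    def __init__(self, values: Iterable[float]) -> None:
        self.stored: List[float] = list(values)

    def read(self, transform: Callable[[float], float]) -> List[float]:
        # On read, the transform is applied to the in-memory result only;
        # the stored values are left untouched.
        return [transform(x) for x in self.stored]

    def write(self, values: Iterable[float],
              transform: Callable[[float], float]) -> None:
        # On write, the transformed values are what ends up stored.
        self.stored = [transform(x) for x in values]


celsius = ToyDataset([0.0, 100.0])
as_read = celsius.read(to_fahrenheit)   # transformed copy in memory
unchanged = list(celsius.stored)        # stored data is still in Celsius
```

The asymmetry is the point of the sketch: a read-side transform is non-destructive, while a write-side transform changes what later readers will see.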
Weather Research and Forecasting (WRF) Model • WRF – NCAR community standard model • HDF5 I/O module for NCAR’s WRF • HDF5-WRF parallel I/O studies • Improved performance for computations with large I/O • Sequential HDF5-WRF studies • Compression can save disk space • See the poster • And see http://hdf.ncsa.uiuc.edu/apps/WRF-ROMS
netCDF-HDF Project • Enhanced NetCDF-4 Interface to HDF5 • Combine features of netCDF and HDF5 • Take advantage of their separate strengths • Collaboration between NCSA and Unidata • See poster: “Merging the netCDF and HDF5 libraries to achieve gains in performance and interoperability”
OPeNDAP – netCDF – HDF5 • OPeNDAP • A system for transmitting data across the Internet • Supports selection of data using constraint expressions • Can translate data from one format to another • NetCDF and HDF5 • Formats of major interest to the OPeNDAP community • All three are in heavy use in the earth sciences • So the question is …
Are the planets finally aligned to harmonize OPeNDAP, netCDF, and HDF5? [Diagram: HDF5, netCDF, OPeNDAP]
OPeNDAP/netCDF/HDF5 Harmonization • Opportunity • Unidata is creating netCDF-4 • Existing OPeNDAP work with netCDF and HDF5 • OPeNDAP project working on a new spec (4.0) • John Caron working on new java-netCDF library (2.2) • Creates a "common data model" which is more-or-less a union of the 3 models. • But there are important differences • Different ecological niches • Some very different object types • So a union of all the models is unlikely
OPeNDAP/netCDF/HDF5 Harmonization • Goal: map between the three models, and possibly tweak the models so they harmonize better. • Tackle certain important differences • OPeNDAP Sequences • Hard to represent in the netCDF API • But they seem like they might work in HDF5. • HDF5 attributes • Hard to represent in the DAP. • Also perhaps devise a formal mapping between the three models
Thank you • Acknowledgements • This report is based upon work supported in part by a Cooperative Agreement with NASA under NASA grants NAG 5-2040 and NAG NCCS-599. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration. Other support provided by NCSA and other sponsors and agencies (http://hdf.ncsa.uiuc.edu/acknowledge.html). Made on location in Champaign, Illinois. To the best of our knowledge, no animals were abused in the making of these slides.
Information Sources • HDF website • http://hdf.ncsa.uiuc.edu/ • HDF5 Information Center • http://hdf.ncsa.uiuc.edu/HDF5/ • HDF Helpdesk • hdfhelp@ncsa.uiuc.edu • HDF users mailing list • hdfnews@ncsa.uiuc.edu