Learn about the HDF Product Designer, a tool that facilitates the creation of interoperable and standards-compliant data products in HDF5. Explore features such as conventions, metadata, software integration, and collaborative design workflow.
HDF Product Designer
Aleksandar Jelenak, H. Joe Lee, Ted Habermann, Gerd Heber, John Readey, Joel Plutchak
The HDF Group
HDF Workshop @ 2015 ESIP Summer Meeting
Data Producer’s Conundrum
(Diagram: HDF Product Designer sits between the three areas below.)
• Interoperability
  • Conventions
  • Metadata
  • Software
  • netCDF
• HDF Features
  • Datatypes
  • Groups
  • Attributes
  • Dimension scales
  • Compression
  • Chunking
  • Scale/offset
  • Etc.
• Project Requirements
  • Science objectives
  • Data processing, discovery & distribution
  • Data documentation
  • User engagement, preparedness, feedback
Brief History
• Original idea from Jeffrey Lee, who developed the HDF5 Earth Science Builder/Creator toolset for the ICESat-2 mission
• A similar tool was independently developed for the SMAP mission
• The HDF Group was asked to generalize the concept
• The outcome: HDF Product Designer
Key Goals
• Facilitate creation of interoperable and standards-compliant data products in HDF5 as early as possible in the project development process
• Support multiple computing platforms without requiring the full software stack of development tools and libraries
• Easy and intuitive editing (create, update, move, copy, delete) of HDF5 objects
• Collaborative approach to product design (project, team, organization)
• Incorporation of best practices and standards from targeted data user communities
• Integration of compliance and interoperability tests into the design workflow
• Content import from existing files
• Content export as HDF5 files, HDF5/JSON, or as source code in several programming languages
System Architecture
(Diagram; labels grouped as they appear on the slide.)
• Inputs: HDF5/JSON, HDF4 map XML, NcML
• Components: HDF5 Server, Data Store, RESTful Server, Desktop Client
• Flexible output: HDF5/JSON, MATLAB, IDL, Python, HDF5 file template, CSV (Excel), Fortran
Software Stack
• Desktop Client
  • wxPython
  • CLIPS (C Language Integrated Production System) expert system
  • PyCLIPS
• RESTful Server
  • Python/Tornado
  • h5py
• Data Store
  • PostgreSQL relational database
Features
• Projects
• Designs
• CRUD operations on HDF5 objects
• Conventions support
• Validation services
• Collaborative workflow
Project
• Organizational and collaborative space
• One or more users
• Zero or more designs
• Every user must belong to at least one project
• All members of a project have access to its designs
• User project roles:
  • Manager (not used yet)
  • Designer
  • Value Editor (not used yet)
  • Viewer
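The membership rules on this slide can be sketched as a small data structure. This is only an illustration, not the tool's actual code: the `Role` and `Project` names and the `can_view` and `can_edit` methods are hypothetical, and the rule that only Designers may edit is an assumption, since the slide does not spell out what each role permits.

```python
from enum import Enum


class Role(Enum):
    MANAGER = "Manager"            # listed on the slide as not used yet
    DESIGNER = "Designer"
    VALUE_EDITOR = "Value Editor"  # listed on the slide as not used yet
    VIEWER = "Viewer"


class Project:
    """Hypothetical project: one or more users, zero or more designs."""

    def __init__(self, name, first_user, role=Role.DESIGNER):
        # Requiring a first user keeps the "one or more users" rule.
        self.name = name
        self.members = {first_user: role}
        self.designs = []

    def add_member(self, user, role):
        self.members[user] = role

    def can_view(self, user):
        # All members of a project have access to its designs.
        return user in self.members

    def can_edit(self, user):
        # Assumption: only Designers change designs; Viewers are read-only.
        return self.members.get(user) is Role.DESIGNER


project = Project("demo", "alice")
project.add_member("bob", Role.VIEWER)
```

Here `alice` can view and edit the project's designs, while `bob` can only view them.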
Design
• Represents content to be stored in one HDF5 file
• Not an actual HDF5 file
• Versioned
  • Simple timeline of checkpoints (saved versions)
  • Each version must have a unique label
  • Only the current working version (label: HEAD) can be edited
• Import from: NcML (netCDF XML), HDF4 file content map (XML), HDF5/JSON
• Export as: HDF5 template, HDF5/JSON; source code: Python, MATLAB, IDL, Fortran
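The versioning rules above can be sketched in a few lines of Python. This is a minimal illustration, not HDF Product Designer's implementation: the `Design` class, its `edit` and `checkpoint` methods, and the dictionary that stands in for design content are all hypothetical.

```python
import copy


class Design:
    """Hypothetical design: a timeline of labeled checkpoints plus HEAD."""

    HEAD = "HEAD"

    def __init__(self, name):
        self.name = name
        self.head = {}         # current working version, the only editable one
        self.checkpoints = {}  # label -> frozen copy of the content

    def edit(self, path, properties, version=HEAD):
        # Only the working version may be changed.
        if version != self.HEAD:
            raise PermissionError(f"version {version!r} is read-only")
        self.head[path] = properties

    def checkpoint(self, label):
        # Each saved version needs a unique label; HEAD is reserved.
        if label == self.HEAD or label in self.checkpoints:
            raise ValueError(f"label {label!r} is already in use")
        self.checkpoints[label] = copy.deepcopy(self.head)


design = Design("example_product")
design.edit("/temperature", {"datatype": "float32", "shape": [180, 360]})
design.checkpoint("v0.1")
design.edit("/temperature", {"datatype": "float64", "shape": [180, 360]})
```

After these calls the `v0.1` checkpoint still holds the `float32` description, because each checkpoint stores a deep copy of HEAD rather than a reference to it.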
CRUD Operations
• CRUD = create, read, update, delete (here also copy and move)
• Available on designs and HDF5 objects
• Support for HDF5 dimension scales is improving continuously
• Properties available to edit:
  • Datatype
  • Rank, shape, max/unlimited dimension sizes
  • Storage (compact, contiguous, chunked)
  • Fill value
  • Compression
  • Attribute value
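Some of the editable properties constrain each other: in HDF5, a dataset with resizable or unlimited dimensions must use chunked storage, and so must a compressed dataset. The sketch below checks those two rules on a dataset description. The dictionary keys and the `check_dataset` function are hypothetical and do not reflect the tool's internal format.

```python
def check_dataset(props):
    """Return a list of problems in a (hypothetical) dataset description."""
    problems = []
    shape = props["shape"]
    maxshape = props.get("maxshape", shape)
    storage = props.get("storage", "contiguous")

    if len(maxshape) != len(shape):
        problems.append("maxshape must have the same rank as shape")
        return problems

    # None marks an unlimited dimension.
    for cur, mx in zip(shape, maxshape):
        if mx is not None and mx < cur:
            problems.append("maxshape is smaller than shape")
            break

    resizable = any(mx is None or mx > cur for cur, mx in zip(shape, maxshape))
    if resizable and storage != "chunked":
        problems.append("resizable dimensions require chunked storage")
    if props.get("compression") and storage != "chunked":
        problems.append("compression requires chunked storage")
    return problems


good = {
    "datatype": "float32",
    "shape": [0, 360],
    "maxshape": [None, 360],   # first dimension is unlimited
    "storage": "chunked",
    "fill_value": -9999.0,
    "compression": "gzip",
}
bad = dict(good, storage="contiguous")

good_problems = check_dataset(good)
bad_problems = check_dataset(bad)
```

The `good` description passes, while `bad` is flagged twice because contiguous storage supports neither the unlimited dimension nor the compression filter.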
Conventions
• Supported:
  • NetCDF User Guide Attribute Conventions (NUG)
  • Attribute Convention for Data Discovery (ACDD)
  • Climate and Forecast convention (CF)
  • HDF-EOS (partial)
• Implemented using the CLIPS expert system
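A convention check amounts to applying rules to a design's attributes. The tool does this with CLIPS rules; the sketch below uses plain Python instead, purely to show the idea. It looks for the global attributes that ACDD 1.3 lists as highly recommended. The function name and the sample attributes are hypothetical.

```python
# Global attributes that ACDD 1.3 marks as "highly recommended".
ACDD_HIGHLY_RECOMMENDED = ("title", "summary", "keywords", "Conventions")


def missing_acdd_attributes(root_attrs):
    """List highly recommended ACDD attributes that are absent or empty."""
    return [
        name
        for name in ACDD_HIGHLY_RECOMMENDED
        if not str(root_attrs.get(name, "")).strip()
    ]


# Made-up root-group attributes for illustration.
attrs = {"title": "Example product", "Conventions": "CF-1.6, ACDD-1.3"}
missing = missing_acdd_attributes(attrs)
```

For the sample attributes the check reports `summary` and `keywords` as missing.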
Validation Services
• A set of online services for interoperability testing
• The level of support for conventions varies between software tools, so it is important to verify with an actual file
• Input is an HDF5 template file
• Output is typically displayed in a web browser
Validation Services
• Currently available:
  • netCDF CDL
  • Get as netCDF3 file
  • CF (NCO’s ncdismember)
  • ACDD (THREDDS UDDC service)
  • ISO metadata (THREDDS ISO service)
  • OPeNDAP Data Access Form
  • THREDDS Dataset Access Page
Collaboration
• Individuals
• Teams
• Projects
• Programs
Collaboration
• Design (Desktop)
• Share (Server)
• Publish (Online)
User Resources
• User Guide
• Code is hosted in the NASA Earthdata Code Collaborative
• Mailing list
• Regular monthly meetings
• Us!
Future Work?
• Continue improving the user interface, source code generators, …
• Add data to HDF5 templates for further validation tests
• Generate a Word-friendly product description to help prepare required project documentation
• Whole-file convention compliance checks
• User feedback is always welcome and can influence planning!
Thank you! Questions?
Contact: ajelenak@hdfgroup.org
This work was supported under the NASA Earth Observing System Data and Information Systems (EOSDIS) Evolution and Development (EED) Program under prime contract number NNG10HP02C. Any opinions, findings, or conclusions expressed in this material are those of the author and do not necessarily reflect the views of NASA.
This work was supported by NASA/GSFC under Raytheon Co. contract number NNG10HP02C.