Profile of National Polar-orbiting Operational Environmental Satellite System (NPOESS) HDF5 Files
Chuck Nellis, NPOESS Program, Aurora, Colorado
Presentation Compiled by: Kim Tomashosky, Ken Stone, Pat Purcell, Ron Andrews NPOESS Program Aurora, Colorado
About NPOESS
• The National Polar-orbiting Operational Environmental Satellite System* (NPOESS) is a satellite system that monitors global environmental conditions and collects and disseminates data related to:
  • Weather
  • Atmosphere
  • Oceans
  • Land
  • Near-space environment
• NPOESS will converge existing polar-orbiting satellite systems under a single national program
• Polar-orbiting satellites observe Earth from space; they can monitor the entire planet and provide data for long-range weather and climate forecasts
*http://www.ipo.noaa.gov/
About NPOESS, Continued
• Increases the timeliness and accuracy of severe weather event forecasts
• Will collect over 50 environmental measurements that are crucial to timely, accurate weather forecasts by military and civilian organizations. It will enable:
  • Increased accuracy in severe storm warnings and forecasting
  • Improved drought analysis and flood warnings
• Managed by the tri-agency Integrated Program Office* (IPO), which draws personnel from the Department of Commerce, the Department of Defense, and NASA
*http://www.ipo.noaa.gov/
NPOESS Data Products
• NPOESS Data Products are distributed in HDF5 format
• They are archived and made available to the community via the Comprehensive Large Array-data Stewardship System* (CLASS), an electronic library of NOAA environmental data
• There is no “HDF-NPOESS” library; NPOESS Data Products have been designed using the native HDF5 library
• NPOESS Data Product types:
  • Raw Data Records (RDR)
  • Sensor Data Records (SDR) / Temperature Data Records (TDR)
  • Intermediate Products (IP)
  • Application Related Products (ARP)
  • Environmental Data Records (EDR)
*http://www.class.noaa.gov/
Data Organization
• Data Product Granules
  • A granule is a segment of data whose size is chosen to achieve maximum efficiency for an algorithm class
  • It is associated with an integer number of sensor scans, and its definition varies between sensors and data products
  • Gaps in granules are filled using a pre-defined ‘missing data’ fill value
  • Represented as a set of region reference pointers to sections of the respective dataset arrays
• Data Product Aggregations
  • An aggregation is a grouping of granules of the same kind, packaged in HDF5 and covering a temporal range
  • May contain as few as one granule and as many as an orbit of granules
  • Represented as a set of object reference pointers to the various groupings of data that make up a particular data product (one for each homogeneous dataset included in the granule)
NPOESS Documentation
• Documentation for the NPOESS Data Products
• NPOESS Common Data Format Control Book – External (CDFCB-X)
  • Volume I – Overview
  • Volume II – RDR Formats
  • Volume III – SDR/TDR Formats
  • Volume IV – EDR/IP/ARP Formats
  • Volume V – Metadata
  • Volume VI – Ancillary Data, Auxiliary Data, Messages, and Reports
  • Volume VII – Application Packets
HDF5 XML User Block
• The XML User Block for NPOESS Data Products provides a ‘quick-look’ into the metadata of the associated HDF5 file
• The size of the HDF5 XML User Block will be a multiple of 512 bytes
• The XML User Blocks are defined in the following volumes of the CDFCB-X:
  • Volume V – Metadata
    • Contains the XML User Block formats for:
      • Raw Data Records (RDR)
      • Sensor Data Records (SDR) / Temperature Data Records (TDR)
      • Intermediate Products (IP)
      • Application Related Products (ARP)
      • Environmental Data Records (EDR)
  • Volume VI – Ancillary, Auxiliary, Reports, and Messages
    • Contains the XML User Block formats for the Ancillary and Auxiliary data files that are delivered in HDF5
• Example elements:
  • Mission, Platform, and Instrument Names
  • Number_of_Data_Products
  • CollectionShortName(s)
  • Aggregation Information
  • Timestamps
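Because the user block sits in front of the HDF5 data, it can be read with ordinary file I/O. The sketch below relies on one property of the HDF5 file format: the superblock signature appears at byte offset 0, 512, 1024, 2048, and so on, with everything before it being the user block. The function names, the demo XML element names, and the null-byte padding are assumptions made for illustration; they are not taken from the CDFCB-X.

```python
import os
import tempfile

# The 8-byte signature that starts every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

# Hypothetical XML content; real element names are defined in the CDFCB-X.
DEMO_XML = b"<UserBlock><Example>demo</Example></UserBlock>"


def read_user_block(path):
    """Return the raw bytes that precede the HDF5 superblock (b"" if none)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size:
            f.seek(offset)
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                f.seek(0)
                return f.read(offset)
            # Candidate offsets are 0, 512, 1024, 2048, ...
            offset = 512 if offset == 0 else offset * 2
    raise ValueError("no HDF5 signature found in " + path)


def make_demo_file():
    """Write a stand-in file: 512-byte padded XML followed by the signature.

    This is NOT a valid HDF5 file; it only mimics the leading bytes.
    Padding with null bytes is an assumption of this sketch.
    """
    handle, path = tempfile.mkstemp(suffix=".h5")
    with os.fdopen(handle, "wb") as f:
        f.write(DEMO_XML.ljust(512, b"\x00"))
        f.write(HDF5_SIGNATURE + b"\x00" * 64)
    return path
```

A caller would typically strip the padding and hand the result to an XML parser, for example `read_user_block(path).rstrip(b"\x00")`.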
NPOESS HDF5 Metadata Locations
• The NPOESS HDF5 metadata is organized hierarchically, from the top down, to reduce duplication of information and to take advantage of the hierarchical nature of HDF5
  • Root Group
    • Data Products Group
      • Data Product (indicated by the specific product’s identifier)
        • Product Aggregation Dataset
        • Product Granule Dataset
NPOESS Quality Flags Overview
• The concept is to provide consistently stored, high-density quality information about the delivered data, simplifying usability while maintaining storage efficiency
• Each quality flag occupies one or more consecutive bits within a byte
• Quality flag arrays follow the structure of the data product
  • The size of each array is equal to or less than the size of the data to which the quality information applies (dimensions correspond to the data product arrays)
• Quality flags are stored in the HDF5 files as one or more two- or three-dimensional, 1-byte arrays
  • The number of arrays depends on the quality flag definitions, which are specific to each data product
  • Each byte may contain multiple bit-level flags
• Quality flags are ordered so that each flag is entirely contained within a single byte, occasionally resulting in a byte with reserved or meaningless bits
• Byte alignment is the same for every quality flag array
• First bit (left-most) is the LSB
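As a rough illustration of bit-level flags packed into one byte, the sketch below extracts a flag from a byte given its starting bit and width, counting bit 0 as the least significant bit. The flag names and bit positions in `DEMO_LAYOUT` are hypothetical; the real layout for each product comes from the quality flag definitions in the CDFCB-X.

```python
def extract_flag(byte_value, bit_offset, bit_width):
    """Return the integer held in bit_width consecutive bits of one byte.

    bit_offset counts from the least significant bit (bit 0).
    """
    if not 0 <= byte_value <= 0xFF:
        raise ValueError("byte_value must fit in one byte")
    if bit_width < 1 or bit_offset < 0 or bit_offset + bit_width > 8:
        raise ValueError("flag must lie entirely within a single byte")
    mask = (1 << bit_width) - 1
    return (byte_value >> bit_offset) & mask


# Hypothetical layout: name -> (bit_offset, bit_width). Bits 5-7 are unused,
# mirroring the "reserved or meaningless bits" a real flag byte may contain.
DEMO_LAYOUT = {
    "quality": (0, 2),
    "day_night": (2, 1),
    "cloud": (3, 2),
}


def decode_byte(byte_value, layout=DEMO_LAYOUT):
    """Decode every flag in the layout from a single quality-flag byte."""
    return {
        name: extract_flag(byte_value, offset, width)
        for name, (offset, width) in layout.items()
    }
```

Because every flag stays inside one byte, the same function can be applied element by element to a whole quality flag array without any cross-byte handling.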
NPOESS Sample Data: Reading the NPOESS HDF5 File with the HDF API
VIIRS Ice Surface Temperature (IST) Environmental Data Record (EDR) Example UML Model
The NPOESS Granule – Product Profile: Ice Surface Temperature
• The Product Profile describes the NPOESS granule.
• For Ice Surface Temperature, the fields in the granule are:
  • IST_Array (Shown below)
  • QF1_VIIRSISTEDR (Shown below)
  • QF2_VIIRSISTEDR
  • QF3_VIIRSISTEDR
  • ISTFactors (Scale & Offset – Shown below)
VIIRS Ice Surface Temperature (IST) EDR – HDFView Screenshot
The NPOESS Granule – HDF View
• The granule dataset array “VIIRS-IST-EDR_Gran_1” contains object IDs that “point” (dereference) to the second region of each dataset array under the “VIIRS-IST-EDR_All” group
• For example, the first object ID in the VIIRS-IST-EDR_Gran_1 array dereferences to the middle portion of the IST_Array
• All of these “portions” share the same time effectivity and other granule-level metadata
NPOESS HDF5 Files Summary
• The NPOESS Program delivers its official data products (RDR, SDR/TDR, EDR/ARP/IP), along with dynamic ancillary data and auxiliary data, in HDF5 files
• The HDF5 files have an XML User Block that can be accessed without HDF5 tools; it provides a “quick-look” into the metadata before the HDF5 file is opened
• Metadata within the HDF5 files is stored as attributes
• General UML models for the officially delivered NPOESS data provide a common framework
• Official deliverable data products are organized by reference objects (aggregations), which contain one or more reference regions (granules)
• Although data may be accessed directly through the All Data group, the Data Products group provides integrated access:
  • It allows the user to access both metadata and data through a common HDF5 group
  • Metadata is accessed directly by reading the attribute values
  • Datasets may be accessed by dereferencing the object ID stored in the Data Products group for the aggregation or granule
• NPOESS HDF5 files provide flexibility for a variety of end users
NPOESS Granules – Dereferencing to Datasets: Details
(See the HDF5 User’s Guide release 1.6.5, Chapter 2, “The HDF5 Library and Programming Model”, Section 2, “Dataspace Function Summaries”, for the H5S commands.)
Note that the H5S API commands fall into two broad categories:
• Dataspace Management & Query Functions
  • These functions operate on the entire dataspace
  • The entire dataspace is equivalent to the dataspace of an entire (temporal) aggregated array under the “All_Data” group in an NPOESS HDF5 file
  • Example: H5Sget_simple_extent_npoints
    • Returns the number of elements in the entire array under “All_Data” in an NPOESS HDF5 file
    • For VIIRS-IST-EDR_Gran_1, the first reference in the array (referencing the IST_Array) would return 768 x 3200 = 2,457,600 points
• Dataspace Selection Functions – hyperslabs and points
  • These functions operate on a hyperslab or a point selection
  • For NPOESS HDF5 files, the “selection” is equivalent to the granule (hyperslab) for a particular field (array)
  • The “selection” is the portion of the data array that the reference “points” to
  • Example: H5Sget_select_npoints
    • Determines the number of points in a dataspace selection
    • For NPOESS HDF5 files, this is the number of points in a granule for a particular field
    • For VIIRS-IST-EDR_Gran_1, the first reference in the array (referencing the IST_Array) would return 256 x 3200 = 819,200 points
• Note that the “select” in the API command name is short for “selection”. It is not a redundant term for “get”.
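The two point counts quoted above can be reproduced with plain arithmetic. The sketch below assumes, for illustration only, that the aggregated IST_Array holds three granules of 256 rows each stacked along the first dimension (consistent with 768 / 256 = 3) and that granule indices start at zero; the function names are hypothetical and this is not the HDF5 API itself.

```python
from math import prod


def extent_npoints(shape):
    """Number of elements in the whole array (what the extent query reports)."""
    return prod(shape)


def granule_hyperslab(granule_index, rows_per_granule, shape):
    """Return (start, count) of the hyperslab covering one granule.

    Assumes granules are stacked along the first dimension and span the
    full width of the array.
    """
    start = (granule_index * rows_per_granule, 0)
    count = (rows_per_granule, shape[1])
    if granule_index < 0 or start[0] + count[0] > shape[0]:
        raise IndexError("granule lies outside the aggregated array")
    return start, count


AGGREGATED_SHAPE = (768, 3200)   # entire IST_Array under "All_Data"
start, count = granule_hyperslab(1, 256, AGGREGATED_SHAPE)

print(extent_npoints(AGGREGATED_SHAPE))  # whole dataspace
print(prod(count))                       # selection (one granule) only
```

Under these assumptions, granule index 1 starts at row 256, which corresponds to the middle portion of the array.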
Extract from HDF5 User’s Guide (1.6.5), Section 4.2 – The Programming Model: Reading and Writing a Portion of a Dataset
A “selection” may be:
• A hyperslab (the only kind NPOESS uses)
• A union of hyperslabs
• A list of independent points
• Note: These illustrations show a mapping procedure to another dataspace. The HDF5 API does not do this when you dereference; any such mapping would be user defined.
h5dump Screenshot – VIIRS Sea Surface Temperature HDF5 File
• Another way to view the arrays of references (Aggregation and Granule dataset arrays) is with the h5dump utility:
  • Granule:
  • Aggregation:
• Note: Currently, the only way to match the object ID in the granule/aggregation datasets is to manually list the aggregation as shown above using h5dump, or to look up the order in the NPOESS Common Data Format Control Book – External. The HDF Group will add the ability to obtain the name of the dataset a reference points to in v1.8 beta.
Sample Files & HDF5 Reference API Summary
• NPOESS granules are made up of portions of one or more dataset arrays
• To access a granule, the granule dataset must be read and each object ID dereferenced using the HDF5 Reference API (H5R)
• Use the H5Sget_ ... commands to retrieve information about the entire dataspace of the array containing a reference’s selection (or hyperslab)
• Use the H5Sget_select_ ... commands to retrieve information about the selection only
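The read, dereference, and query steps summarized above can be sketched with the third-party h5py package, which wraps the H5R and H5S calls. This is a minimal sketch, not an NPOESS reader: it writes its own small demo file, and the group paths and array shape are assumptions patterned on the names used in this presentation rather than copied from a delivered product.

```python
import os
import tempfile

import h5py  # third-party package; assumed to be installed

# Assumed paths, modeled on the names used in this presentation.
ALL_PATH = "All_Data/VIIRS-IST-EDR_All/IST_Array"
GRAN_PATH = "Data_Products/VIIRS-IST-EDR/VIIRS-IST-EDR_Gran_1"


def write_demo_file(path):
    """Create a stand-in file with one array and one region reference to it."""
    region_dtype = h5py.special_dtype(ref=h5py.RegionReference)
    with h5py.File(path, "w") as f:
        ist = f.create_dataset(ALL_PATH, shape=(768, 3200), dtype="f4")
        gran = f.create_dataset(GRAN_PATH, shape=(1,), dtype=region_dtype)
        # Point the granule at the middle 256 rows of the aggregated array.
        gran[0] = ist.regionref[256:512, :]


def inspect_granule(path):
    """Dereference the first granule entry and report both point counts."""
    with h5py.File(path, "r") as f:
        ref = f[GRAN_PATH][0]                        # read the stored reference
        target = f[ref]                              # dereference to the dataset
        space = h5py.h5r.get_region(ref, target.id)  # dataspace with selection
        return (
            target.name,
            space.get_simple_extent_npoints(),  # entire array
            space.get_select_npoints(),         # granule selection only
        )


demo_path = os.path.join(tempfile.mkdtemp(), "npoess_demo.h5")
write_demo_file(demo_path)
print(inspect_granule(demo_path))
```

Reading the granule's values themselves would be `target[ref]` inside the `with` block, which returns only the selected region of the array.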