This profile provides an overview of the general file structure of HDF-EOS5 files, including the core metadata, archive metadata, and struct metadata. It also explains the structure and characteristics of different data structures such as Swath, Grid, Point, ZA, and Profile.
Profile of HDF-EOS5 Files Abe Taaheri, Raytheon IIS HDF & HDF-EOS Workshop XII October 2008
General HDF-EOS5 File Structure • HDF-EOS5 file: any valid HDF5 file that contains a family of global attributes called coremetadata.X • Optional data objects: • a family of global attributes called archivemetadata.X • any number of Swath, Grid, Point, ZA, and Profile data structures • another family of global attributes called StructMetadata.X
General HDF-EOS5 File Structure • Global attributes provide: - information on the structure of the HDF-EOS5 file - information on the data granule that the file contains • Other optional user-added global attributes, such as “PGEVersion” and “OrbitNumber”, are written as HDF5 attributes into a group called “FILE_ATTRIBUTES”
General HDF-EOS5 File Structure • coremetadata.X: Used to populate searchable database tables within the ECS archives. Data users use this information to locate particular HDF-EOS5 data granules. • archivemetadata.X: Not searchable. Contains whatever information the file creator considers useful to keep in the file, but which will not be directly accessible by ECS databases. • StructMetadata.X: Describes the contents and structure of the HDF-EOS5 file, e.g. dimensions, compression methods, geolocation, projection information, etc. that are associated with the data itself.
General HDF-EOS5 File Structure • An HDF-EOS5 file • can contain any number of Grid, Point, Swath, Zonal Average, and Profile data structures • has no size limits • A file containing thousands of objects could slow down program execution • can be hybrid, containing plain HDF5 objects for special purposes • Plain HDF5 objects must be accessed through the HDF5 library and not through the HDF-EOS5 extensions • A hybrid file will require more knowledge of the file contents on the part of an applications developer or data user
Swath Structure • Data which is organized by time or another track parameter. • Spacing can be irregular. • Structure • Geolocation information stored explicitly in a Geolocation Field (2-D array) • Data stored in 2-D or 3-D arrays • Time stored in a 1-D or 2-D array • Geolocation and science data connected by structural metadata
Swath Structure • For a typical satellite swath, an instrument takes a series of scans perpendicular to the ground track of the satellite as it moves along that ground track • Or a sensor measures a vertical profile, instead of scanning across the ground track
Swath Structure • Swath_X groups are created when swaths are created. • The parent groups of the data and geolocation fields are created when the fields are defined. • Swath attributes are set as Object Attributes. • Attributes for the Data Fields, Profile Fields, or Geolocation Fields groups are set as Group Attributes. • Dataset-related attributes set for each data field or geolocation field are called Local Attributes. They may contain attributes such as fill value, units, etc. • Each Data Field object can have Attributes and/or Dimension Scales. [Diagram: the “SWATHS” group contains the groups “Swath_1” … “Swath_N”. Each swath group contains the groups Geolocation Fields (datasets Longitude, Latitude, Colatitude, Time), Data Fields (datasets Data Field.1 … Data Field.n), and Profile Fields (datasets Profile Field.1 … Profile Field.n). Attribute naming: Object Attribute <SwathName>: <AttrName>; Group Attribute <DataFields>: <AttrName>; Local Attribute <FieldName>: <AttrName>. Legend: HDF5 Group, HDF5 Dataset, HDF5 Attribute.]
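The group layout above can be turned into HDF5 object paths. The following is a minimal sketch, assuming the “SWATHS” group sits under a top-level “HDFEOS” group in the same way the “GRIDS” group does in the h5dump listing later in this profile. The helper name swath_field_path is hypothetical and is not part of the HDF-EOS5 library.

```python
from posixpath import join

# Assumption: swaths live under /HDFEOS/SWATHS, by analogy with /HDFEOS/GRIDS.
HDFEOS_ROOT = "/HDFEOS"
FIELD_GROUPS = ("Data Fields", "Geolocation Fields", "Profile Fields")


def swath_field_path(swath_name, field_group, field_name):
    """Return the HDF5 path of one field dataset inside a swath group."""
    if field_group not in FIELD_GROUPS:
        raise ValueError(f"unknown field group: {field_group!r}")
    return join(HDFEOS_ROOT, "SWATHS", swath_name, field_group, field_name)


if __name__ == "__main__":
    print(swath_field_path("Swath_1", "Geolocation Fields", "Latitude"))
```

A path built this way could be handed to any plain HDF5 reader, which is one way to inspect a swath without the HDF-EOS5 library.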
Swath Structure • Geolocation Fields • Geolocation fields allow the Swath to be accurately tied to particular points on the Earth’s surface. • A Swath requires at least a time field (“Time”) or a latitude/longitude field pair (“Latitude” and “Longitude”). “Colatitude” may be substituted for “Latitude.” • Other geolocation fields such as “Altitude” can be defined and mapped onto a data dimension. • Fields must be either one- or two-dimensional. • The “Time” field is always in TAI format (International Atomic Time). * DD = Decimal Degree
Swath Structure • Data Fields • Fields may have up to 8 dimensions. • For multi-dimensional fields, the dimension representing the “along track” direction must precede the dimension representing the scan or profile in C order (e.g. “Bands, DataTrack, DataXtrack”).
Swath Structure • Compression is selectable at the field level. • All HDF5-supported compression methods are available through the HDF-EOS5 library. • The compression method is stored within the file. • Subsequent use of the library will uncompress the data. • As in HDF5, the data must be chunked before compression is applied. • Field names: • may be up to 64 characters in length • may use any character except ",", ";", and "/" • are case sensitive • must be unique within a particular Swath structure
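The field-name rules above are simple enough to check before a field is defined. The sketch below is illustrative only: check_field_name is a hypothetical helper, not an HDF-EOS5 call, and it encodes just the four rules listed on this slide.

```python
FORBIDDEN_CHARS = {",", ";", "/"}
MAX_NAME_LENGTH = 64


def check_field_name(name, existing_names):
    """Return a list of rule violations for a proposed field name.

    existing_names holds the names already defined in the same structure.
    The comparison is case sensitive, so "Temp" and "temp" are different names.
    """
    problems = []
    if not name:
        problems.append("name is empty")
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name is longer than {MAX_NAME_LENGTH} characters")
    bad = sorted(set(name) & FORBIDDEN_CHARS)
    if bad:
        problems.append("name contains forbidden characters: " + " ".join(bad))
    if name in existing_names:
        problems.append("name is already used in this structure")
    return problems


if __name__ == "__main__":
    defined = {"Temperature", "Pressure"}
    for candidate in ("temperature", "Temperature", "Wind/Speed"):
        print(candidate, "->", check_field_name(candidate, defined) or "ok")
```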
Compression Codes • For compression, the data storage must be chunked first.
Swath Structure • Dimension maps: • are the glue that holds the Swath together • define the relationship between the dimensions of data fields and geolocation fields • can use normal or indexed mapping [Figures: A “Normal” Dimension Map; A “Backwards” Dimension Map]
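A normal dimension map can be pictured as a linear relation between an index along a geolocation dimension and an index along a data dimension. The sketch below assumes the map is described by an offset and an increment; those two parameters and both function names are assumptions introduced here, since the slide itself does not spell them out. Only the normal case with a positive increment is covered. The “backwards” case from the figure is not modelled.

```python
def geo_to_data_index(geo_index, offset, increment):
    """Data-dimension index that lines up with a geolocation index.

    Assumed convention: the first geolocation element corresponds to data
    element `offset`, and each further one lies `increment` elements later.
    """
    if increment <= 0:
        raise ValueError("this sketch only handles a positive increment")
    return offset + geo_index * increment


def data_to_geo_position(data_index, offset, increment):
    """Fractional position along the geolocation dimension for a data index.

    A non-integer result means the data element falls between two geolocation
    elements, so its location would have to be interpolated.
    """
    if increment <= 0:
        raise ValueError("this sketch only handles a positive increment")
    return (data_index - offset) / increment


# An indexed map replaces the formula with an explicit lookup table:
# entry i is the data index that geolocation element i refers to.
indexed_map = [0, 2, 3, 7]

if __name__ == "__main__":
    print(geo_to_data_index(3, offset=1, increment=2))
    print(data_to_geo_position(2, offset=1, increment=2))
    print(indexed_map[2])
```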
Grid Structure • Usage - Data which is organized by regular geographic spacing, specified by projection parameters. • Structure • Any number of 2-D to 8-D data arrays per structure • Geolocation information contained in projection formula, coupled by structural metadata. • Any number of Grid structures per file allowed.
Grid Structure • A grid contains: - grid corner locations - a set of projection equations (or references to them) along with their relevant parameters • The equations and parameters are used to compute the lon/lat for any point in the grid. • Important features of a Grid data set: - the data fields - the dimensions - the projection [Figures: A Data Field in a Mercator-Projected Grid; A Data Field in an Interrupted Goode’s Homolosine-Projected Grid]
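The first half of that computation, going from a grid cell to projected X-Y coordinates, needs only the corner locations and the grid dimensions. The sketch below assumes evenly spaced cells and reports cell centres; the function name and the sample numbers are hypothetical. Turning the resulting X-Y pair into lon/lat still requires the inverse projection equations, which are not shown here.

```python
def cell_center(col, row, upper_left, lower_right, xdim, ydim):
    """Projected (x, y) of the centre of grid cell (col, row).

    upper_left and lower_right are (x, y) corner coordinates in projection
    units. Column 0 / row 0 is the cell that touches the upper-left corner.
    """
    if not (0 <= col < xdim and 0 <= row < ydim):
        raise IndexError("cell lies outside the grid")
    ulx, uly = upper_left
    lrx, lry = lower_right
    cell_width = (lrx - ulx) / xdim
    cell_height = (lry - uly) / ydim  # negative when y decreases downwards
    return ulx + (col + 0.5) * cell_width, uly + (row + 0.5) * cell_height


if __name__ == "__main__":
    # Hypothetical 5 x 7 grid with 10-unit cells.
    print(cell_center(0, 0, (0.0, 100.0), (50.0, 30.0), xdim=5, ydim=7))
    print(cell_center(4, 6, (0.0, 100.0), (50.0, 30.0), xdim=5, ydim=7))
```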
Grid Structure • Data Field characteristics: • Fields may have up to 8 dimensions. • Dimension order in field definitions: - C: “Band, YDim, XDim” - Fortran: “XDim, YDim, Band” • Compression is selectable at the field level within a Grid. Subsequent use of the library will uncompress the data. Data must be tiled before compression is applied. • Field names must be unique within a particular Grid structure and are case sensitive. They may be up to 64 characters in length. • Any character can be used except ",", ";", and "/".
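The two dimension orders describe the same memory layout seen from a row-major language (C) and a column-major language (Fortran). A short sketch in plain Python, with hypothetical helper names and 0-based indices on both sides for simplicity, shows that element (band, y, x) in C order and element (x, y, band) in Fortran order land at the same linear offset.

```python
from itertools import product


def c_offset(index, shape):
    """Linear offset in row-major (C) order: the last index varies fastest."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def fortran_offset(index, shape):
    """Linear offset in column-major (Fortran) order: the first index varies fastest."""
    offset = 0
    for i, n in zip(reversed(index), reversed(shape)):
        offset = offset * n + i
    return offset


if __name__ == "__main__":
    bands, ydim, xdim = 3, 7, 5
    same = all(
        c_offset((b, y, x), (bands, ydim, xdim))
        == fortran_offset((x, y, b), (xdim, ydim, bands))
        for b, y, x in product(range(bands), range(ydim), range(xdim))
    )
    print("layouts agree:", same)
```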
Grid Structure • Fields are two- to eight-dimensional. Many fields will need no more than three dimensions: the predefined dimensions “XDim” and “YDim” and a third dimension for depth, height, or band. • Dimensions: • Two predefined dimensions for data fields, “XDim” and “YDim”, which are: - defined when the grid is created - stored in the structure metadata - used to relate data fields to each other and to the geolocation information
Grid Structure • Projection: • is the heart of the Grid structure • provides a convenient way to encode geolocation information as a set of mathematical equations, capable of transforming Earth coordinates (lat/long) to X-Y coordinates on a sheet of paper • The General Coordinate Transformation Package (GCTP) library contains all projection-related conversions and calculations. • Supported projections: * Sinusoidal is pseudocylindrical
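As an illustration of what “a set of mathematical equations” means here, the sketch below implements the textbook sinusoidal projection on a sphere, in both directions. It is not GCTP code and does not reproduce GCTP's interface; the sphere radius, the function names, and the degree-based arguments are all assumptions made for this example.

```python
import math

SPHERE_RADIUS_M = 6371000.0  # assumed spherical Earth radius, in metres


def sinusoidal_forward(lat_deg, lon_deg, central_lon_deg=0.0, radius=SPHERE_RADIUS_M):
    """Lat/long in degrees to projected (x, y) in metres."""
    lat = math.radians(lat_deg)
    dlon = math.radians(lon_deg - central_lon_deg)
    return radius * dlon * math.cos(lat), radius * lat


def sinusoidal_inverse(x, y, central_lon_deg=0.0, radius=SPHERE_RADIUS_M):
    """Projected (x, y) in metres back to (lat, lon) in degrees.

    Longitude is undefined exactly at the poles, where cos(lat) is zero.
    """
    lat = y / radius
    lon = math.radians(central_lon_deg) + x / (radius * math.cos(lat))
    return math.degrees(lat), math.degrees(lon)


if __name__ == "__main__":
    x, y = sinusoidal_forward(45.0, -100.0)
    print(x, y)
    print(sinusoidal_inverse(x, y))
```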
Point Structure • Data is specified temporally and/or spatially, but with no particular organization • Structure • Tables used to store science data at a particular Lat/Long/Height • Up to eight levels of data allowed. Structural metadata specifies relationship between levels.
Point Structure • Made up of a series of data records taken at (possibly) irregular time intervals and at scattered geographic locations • The most loosely organized form of geolocated data supported by HDF-EOS • Levels are linked by a common field called the LinkField • Usually shared information is stored in the Parent level, while data values are stored in the Child level • The values of the LinkField in the Parent level must be unique
Point Structure • Point structure groups (“Point_1”, …) are created when the user creates the points. • Data and Linkage groups are created automatically when a level is defined. • The order in which the levels are defined determines the (0-based) level index. • The FWDPOINTER linkage will not be set (actually, the first one is set to (-1,-1)) if the records in the Child level are not monotonic in the LinkField. • A level can contain any number of fields and records. [Diagram: the “POINTS” group contains the groups “Point_1” … “Point_n”. Each point group contains a Data group (datasets Level 1 … Level n) and a Linkage group (FWD POINTER, BCK POINTER). Attribute labels as shown on the slide: Object Attribute <SwathName>: <AttrName>; Group Attribute <SwathName>: <AttrName>; Local Attribute <SwathName>: <AttrName>. Legend: HDF5 Group, Level Data.]
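The linkage rules can be illustrated with a small sketch. It assumes that a forward pointer can be represented as a (start, count) pair giving the run of Child records that belong to one Parent record, and that “monotonic” means the Child records appear in the same order as their Parent records. Both points, and the function name, are assumptions made for illustration. The library's actual on-disk layout is not reproduced.

```python
def forward_pointers(parent_links, child_links):
    """Map each parent LinkField value to (first child index, number of children).

    Returns None when the child records are not ordered like the parent
    records, the case in which the slide says the forward linkage is not set.
    """
    if len(set(parent_links)) != len(parent_links):
        raise ValueError("LinkField values in the Parent level must be unique")
    position = {value: i for i, value in enumerate(parent_links)}
    missing = [value for value in child_links if value not in position]
    if missing:
        raise ValueError(f"child link value {missing[0]!r} has no parent record")
    order = [position[value] for value in child_links]
    if any(a > b for a, b in zip(order, order[1:])):
        return None  # children not monotonic in the LinkField
    pointers = {}
    for child_index, value in enumerate(child_links):
        start, count = pointers.get(value, (child_index, 0))
        pointers[value] = (start, count + 1)
    return pointers


if __name__ == "__main__":
    stations = ["A", "B", "C"]                  # Parent level, unique link values
    readings = ["A", "A", "B", "C", "C", "C"]   # Child level, one value per record
    print(forward_pointers(stations, readings))
    print(forward_pointers(stations, ["A", "B", "A"]))
```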
Zonal Average (ZA) Structure • Generalized array structure with no geolocation linkage (basically a swath-like structure without geolocation). • The interface is designed to support data that is not associated with specific geolocation information. • Data can be organized by time or track parameter. • Data spacing can be irregular. • Structure • Data stored in multidimensional arrays • Time stored in a 1-D or 2-D array [Diagram: the “ZAS” group contains the groups “Za_1” … “Za_n”. Each ZA group contains a Data Fields group (datasets up to Data Field.n). Attribute labels as shown on the slide: Object Attribute <SwathName>: <AttrName>; Group Attribute <DataFields>: <AttrName>; Local Attribute <FieldName>: <AttrName>. Legend: HDF5 Group.]
“h5dump” output of a simple HDF-EOS5 file
HDF5 "Grid.he5" {
GROUP "/" {
   GROUP "HDFEOS" {
      GROUP "ADDITIONAL" {
         GROUP "FILE_ATTRIBUTES" {
         }
      }
      GROUP "GRIDS" {
         GROUP "TMGrid" {
            GROUP "Data Fields" {
               DATASET "Voltage" {
                  DATATYPE H5T_IEEE_F32BE
                  DATASPACE SIMPLE { ( 5, 7 ) / ( 5, 7 ) }
                  DATA {
                  (0,0): -1.11111,-1.11111,-1.11111,-1.11111,-1.11111,
                  (0,5): -1.11111,-1.11111,
                  ………………………………..
                  (4,0): -1.11111,-1.11111,-1.11111,-1.11111,-1.11111,
                  (4,5): -1.11111,-1.11111
                  }
“h5dump” output of a simple HDF-EOS5 file (cont.)
                  ATTRIBUTE "_FillValue" {
                     DATATYPE H5T_IEEE_F32BE
                     DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
                     DATA {
                     (0): -1.11111
                     }
                  }
               }
            }
         }
      }
   }
   GROUP "HDFEOS INFORMATION" {
      ATTRIBUTE "HDFEOSVersion" {
         DATATYPE H5T_STRING {
            STRSIZE 32;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
“h5dump” output of a simple HDF-EOS5 file (cont.)
         DATASPACE SCALAR
         DATA {
         (0): "HDFEOS_5.1.11"
         }
      }
      DATASET "StructMetadata.0" {
         DATATYPE H5T_STRING {
            STRSIZE 32000;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE SCALAR
         DATA {
         (0): "GROUP=SwathStructure
            END_GROUP=SwathStructure
            GROUP=GridStructure
               GROUP=GRID_1
                  GridName="TMGrid"
                  XDim=5
                  YDim=7
“h5dump” output of a simple HDF-EOS5 file (cont.)
                  UpperLeftPointMtrs=(4855670.775390,9458558.924830)
                  LowerRightMtrs=(5201746.439830,-10466077.249420)
                  Projection=HE5_GCTP_TM
                  ProjParams=(0,0,0.999600,0,-75000000,0,5000000,0,0,0,0,0,0)
                  SphereCode=0
                  GROUP=Dimension
                     OBJECT=Dimension_1
                        DimensionName="Time"
                        Size=10
                     END_OBJECT=Dimension_1
                     OBJECT=Dimension_2
                        DimensionName="Unlim"
                        Size=-1
                     END_OBJECT=Dimension_2
                  END_GROUP=Dimension
“h5dump” output of a simple HDF-EOS5 file (cont.)
                  GROUP=DataField
                     OBJECT=DataField_1
                        DataFieldName="Voltage"
                        DataType=H5T_NATIVE_FLOAT
                        DimList=("XDim","YDim")
                        MaxdimList=("XDim","YDim")
                     END_OBJECT=DataField_1
                  END_GROUP=DataField
                  GROUP=MergedFields
                  END_GROUP=MergedFields
               END_GROUP=GRID_1
            END_GROUP=GridStructure
            GROUP=PointStructure
            END_GROUP=PointStructure
            GROUP=ZaStructure
            END_GROUP=ZaStructure
            END
            "
         }
      }
   }
}
}
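The StructMetadata.0 text in the dump is a nested list of GROUP, OBJECT, and key=value entries, so it can be read into a tree with a few lines of code. The parser below is a minimal sketch in plain Python, assuming one entry per line; values that span several lines in a real file are not handled. The function name and the shortened sample string are made up for this example and are not part of the HDF-EOS5 library.

```python
def parse_struct_metadata(text):
    """Parse GROUP/OBJECT/key=value text into nested dictionaries.

    All values are kept as strings; surrounding double quotes are removed.
    """
    root = {}
    stack = [root]
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line == "END":
            break
        key, _, value = line.partition("=")
        if key in ("GROUP", "OBJECT"):
            child = {}
            stack[-1][value] = child
            stack.append(child)
        elif key in ("END_GROUP", "END_OBJECT"):
            stack.pop()
        else:
            stack[-1][key] = value.strip('"')
    return root


SAMPLE = """GROUP=GridStructure
    GROUP=GRID_1
        GridName="TMGrid"
        XDim=5
        YDim=7
        Projection=HE5_GCTP_TM
        GROUP=DataField
            OBJECT=DataField_1
                DataFieldName="Voltage"
                DataType=H5T_NATIVE_FLOAT
                DimList=("XDim","YDim")
            END_OBJECT=DataField_1
        END_GROUP=DataField
    END_GROUP=GRID_1
END_GROUP=GridStructure
END
"""

if __name__ == "__main__":
    grid = parse_struct_metadata(SAMPLE)["GridStructure"]["GRID_1"]
    print(grid["GridName"], grid["XDim"], grid["YDim"])
    print(grid["DataField"]["DataField_1"]["DataFieldName"])
```

Keeping every value as a string avoids guessing at types; a caller that needs XDim as an integer can convert it explicitly.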