Granule Metadata Management ted.habermann@noaa.gov mark.ohrenschall@noaa.gov
[Diagram labels: NCDC (MI3), NWS (MIRS), and others in all NOAA Lines; CLASS: File System Rich Inventory; NESDIS Metadata; CoRIS, NMMR, MERMAid, NMFS; NVDS, CLASS, CoRIS, NMMR, MERMAid; NOS: Data Explorer; NOAA Library; NOAAServer(??)]
Types of Granule Metadata
Granules are the smallest items that can be retrieved from an archive without invoking a subsetting operation. They are usually files. The metadata that describe characteristics of these files are termed Granule Metadata; in NPOESS Land they are called Dynamic Metadata. These metadata tend to be closely related to the data in the granules and often are not consistent with national or international standards (which is fine).
There are several types of granule metadata:
• File Names (Product Type, Date)
• File Headers (format information, statistics, quality information)
• Descriptive Statistics from parameters in the granule
Present Inventory
• Files come to the archive and filename metadata is ingested into the inventory.
• File header metadata is stored but is not available to the data discovery system.
• Descriptive Statistics are not calculated.
• Users need to develop their own data discovery systems.
Rich Inventory
• Files come to the archive.
• Filename and file header metadata are added to the inventory.
• Descriptive Statistics are calculated and added to the inventory.
• All metadata is available to the data discovery system, and users get the data they need without secondary data discovery.
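The three bullets above can be sketched in a few lines of Python. This is only an illustration: the filename convention (`<product>_<YYYYMMDD>.dat`), the record layout, and the function names are assumptions made for the example, not the layout used by the prototype or its Java API.

```python
import re
import statistics
from datetime import datetime

# Hypothetical filename convention, assumed for this sketch only:
# <product>_<YYYYMMDD>.dat
FILENAME_PATTERN = re.compile(r"^(?P<product>[A-Za-z0-9-]+)_(?P<date>\d{8})\.dat$")


def filename_metadata(filename):
    """Extract product type and date from the (assumed) filename convention."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized filename: {filename}")
    return {
        "product": match.group("product"),
        "date": datetime.strptime(match.group("date"), "%Y%m%d").date(),
    }


def descriptive_statistics(values):
    """Summarize one parameter of a granule (population standard deviation)."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
    }


def inventory_record(filename, header, parameters):
    """Combine filename, file header, and statistics metadata in one record."""
    record = {"filename": filename}
    record.update(filename_metadata(filename))
    record["header"] = dict(header)
    record["statistics"] = {
        name: descriptive_statistics(values) for name, values in parameters.items()
    }
    return record
```

A discovery system could then filter on any field of the record, for example on `record["statistics"]["thickness"]["mean"]`, without opening the granule itself.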
Types of Metadata
File headers include constants (format information), systematics (dates), and variables. Parameters include constants (land masks, spatial covariances) and variables (science parameters). In the aerosol case, 30% of the grids presently archived with each granule are constants. All of these metadata types can be described as a series of time segments with constant values. The shortest time segment may be one file interval long.
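To make the "series of time segments with constant values" concrete, here is a minimal sketch that collapses per-file values into segments. The tuple layout `(first_file, last_file, value)` and the sample values are assumptions chosen for illustration.

```python
from itertools import groupby


def to_segments(values):
    """Collapse per-file values into (first_file, last_file, value) segments.

    Consecutive files with the same value share one segment, so a constant
    needs a single segment while a fast-changing variable may need one
    segment per file.
    """
    segments = []
    index = 0
    for value, run in groupby(values):
        length = sum(1 for _ in run)
        segments.append((index, index + length - 1, value))
        index += length
    return segments


# Illustrative (made-up) series over six files.
constant = ["grid-v1"] * 6                       # e.g. format information
variable = [0.11, 0.14, 0.14, 0.09, 0.12, 0.10]  # e.g. a science parameter

constant_segments = to_segments(constant)
variable_segments = to_segments(variable)
```

With these inputs the constant series collapses to one segment covering files 0 through 5, while the variable series needs five segments, the shortest of them one file interval long.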
Examples
[Figures: Systematic Header Parameters (dates); Variable Parameters (aerosol thickness mean and standard deviation)]
Segment Model
[Figure: three cases plotted against Time (File Number): Constant (Static), Slow Variation (Quasi-static), Fast Variation (Dynamic)]
Metadata Ingest
[Flowchart: File → raw values: sum(x), sum(x²), mean, std, count → New value? yes → Create segment; no → Add to last segment]
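The ingest flow can be sketched as two small functions: one that derives mean and standard deviation from the stored sums, and one that applies the "New value?" decision. The dictionary keys, the function names, and the choice of the population standard deviation are assumptions for this sketch; the slides do not say which form the prototype uses.

```python
import math


def summarize(values):
    """Compute count, sum(x), sum(x^2), mean, and std for one file's values.

    Uses the population form std = sqrt(sum(x^2)/n - mean^2), an assumption
    made for this sketch.
    """
    count = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    mean = total / count
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(total_sq / count - mean * mean, 0.0)
    return {
        "count": count,
        "sum": total,
        "sum_sq": total_sq,
        "mean": mean,
        "std": math.sqrt(variance),
    }


def ingest(segments, file_number, value):
    """Apply the 'New value?' decision for one incoming file."""
    if segments and segments[-1]["value"] == value:
        # Same value as before: add this file to the last segment.
        segments[-1]["last_file"] = file_number
    else:
        # New value: create a new segment starting at this file.
        segments.append(
            {"first_file": file_number, "last_file": file_number, "value": value}
        )
    return segments
```

Keeping sum(x), sum(x²), and count rather than only mean and std has the side benefit that statistics from several files can be combined later by adding the sums.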
Current Status - Prototype
Prototype Implementations:
• NESDIS Sea Surface Temperatures and Atmospheric Aerosols
• AMSU-B and ATOVS Vertical Statistics
• NOAA Observing Systems
• Pathfinder High Resolution Sea Surface Temperature
• AVHRR GAC level 1b Orbital Data
Monitoring and Graphics
NOAA Observing Systems: CWOP, ENSO, GOES, GPS, Other …
• Simple Observing System Characteristics (num, min/max lon) recorded after each update, then monitored
• Regular Automatic Updates
Parameter Series
The Parameter Series table holds information about granule metadata values that are constant across multiple granules. Storing each such value once, rather than once per granule, decreases the size of the database significantly (~90% in some cases) and improves performance. Tools now exist to populate the table retroactively or on ingest.
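As a rough illustration of how such a table might be read back, the sketch below looks up the value that applies to a given granule and estimates the row savings. The row layout `(first_granule, last_granule, value)`, the function names, and the sample rows are hypothetical; they are not the schema of the actual Parameter Series table.

```python
from bisect import bisect_right


def value_for_granule(rows, granule):
    """Return the value whose granule range contains `granule`.

    `rows` is assumed to be a list of (first_granule, last_granule, value)
    tuples, sorted by first_granule and non-overlapping.
    """
    starts = [first for first, _, _ in rows]
    i = bisect_right(starts, granule) - 1
    if i < 0 or granule > rows[i][1]:
        raise KeyError(granule)
    return rows[i][2]


def row_reduction(rows):
    """Fraction of rows saved compared with storing one row per granule."""
    granules = sum(last - first + 1 for first, last, _ in rows)
    return 1.0 - len(rows) / granules


# Made-up example: a land mask that changed once in 1000 granules.
rows = [(0, 899, "mask-v1"), (900, 999, "mask-v2")]
```

The savings depend entirely on how often a value changes: the made-up rows above replace 1000 per-granule entries with 2, whereas a parameter that changes with every granule saves nothing.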
Java API Documentation http://gdsg.ngdc.noaa.gov/projects/richInventory/apidocs/
Tiered Rich Inventory
[Diagram labels, in order: ... ...; Level 3; SST Rich Inventory, GVI Rich Inventory, Aerosol Rich Inventory; ... ...; Level 2; L1B Rich Inventory]
Distributed Ingest
• Product generation algorithms write all metadata directly to the inventory instead of to file headers.
• Files are archived somewhere, with pointers from the inventory.
• Users get the data they need from the distributed system without secondary data discovery.