
Key Components of a Successful Earth Science Subsetter Architecture

Key Components of a Successful Earth Science Subsetter Architecture. Jennifer Perez, Walter Baskin, & Peter Piatko NASA Langley Research Center, Hampton, VA.


Presentation Transcript


  1. Key Components of a Successful Earth Science Subsetter Architecture. Jennifer Perez, Walter Baskin, and Peter Piatko, NASA Langley Research Center, Hampton, VA.

The 2013 Strategic Plan of the Atmospheric Science Data Center (ASDC) is a mission-focused plan with six defined goals, each with supporting objectives and implementation tasks that emphasize the vision and support the mission and values of the ASDC.

Goal #4: The ASDC will continue to foster innovation by actively assessing emerging technologies and their applicability to existing and projected customer needs and requirements in order to mitigate gaps in capability.

Since the ASDC and the CALIPSO science team unveiled a new CALIPSO Search and Subset Application at the 2010 A-Train Symposium, atmospheric scientists have responded enthusiastically. Consistent with Goal #4, the template of this subsetter application architecture has since been applied to the distribution of Level 2 satellite data granules from Clouds and the Earth's Radiant Energy System (CERES) SSF swath datasets and Tropospheric Emission Spectrometer (TES) datasets. This lets science data users rapidly locate, subset, and order specific dataset parameters tailored to their requirements.

The ASDC developed dedicated subsetters for the CALIPSO, CERES, and TES missions, leveraging the HDF Group’s Java JNI libraries used in the open source HDFView application. These subsetters are deployed on Univa Grid Engine processing nodes and are managed by the Subsetter Workflow Framework. The subsetters can return subsetted files in NetCDF format.

Types of granules subsetted from each data provider:
• CALIPSO: HDF4
• CERES: HDF4
• TES: HDF-EOS (HDF5 out)

[Figure: Search and Subset Application Interface]

[Figure: Inspection of a CERES ES8 subset result file in the HDFView application]
The CALIPSO Search Subsetter User Interface automatically updates and displays the number of granules meeting the spatial and temporal constraints as the user changes them. This dynamic feedback contributes to a positive user experience. New subset interfaces under development for CERES and TES datasets leverage this functionality.

Details of the resulting data granules are displayed on the ‘Confirm Request’ page. Users can download a list of granules that meet their search criteria, browse profile plots for each resulting granule, or submit an order to subset the granules based on their spatial-temporal inputs.

The CALIPSO Science Team provides browse images for their LIDAR data products. These profiles are accessed through links under each granule result on the ‘Confirm Request’ page.

There are four key components of a successful earth science subsetter architecture:
• An interactive user interface that is tightly integrated with a PostgreSQL-PostGIS metadata database specifically tailored for the science product data granules to be subsetted.
• A scalable workflow framework for scheduling potentially thousands of subset processes across a configurable number of cluster processing nodes.
• An efficient subset application with high-speed access to archived data granules.
• A robust metadata mining capability focused on obtaining high-resolution spatial and temporal metadata.

[Architecture diagram labels: Web User Interface, Metadata Database, Subsetter Workflow Framework, SciFlo-Univa Grid Engine, Processing node running Java HDF Subsetter (Node1, Node2, Node …), Web Server, FTP Site]

The ASDC subsetters leverage the Common Object Package and use specific methods in the Java HDF and HDF5 JNI interfaces to directly access lower-level functions in the C libraries. (Source of diagram: http://www.hdfgroup.org/hdf-java-html/hdf-object/)

The Subsetter Framework is a generic framework for subset processing.
The framework uses SciFlo as its workflow engine to drive the processing and Univa Grid Engine as its resource scheduler, so that subsetting can be scaled across a set of computational nodes.

New HDF subset and file access capabilities, recently developed through the ASDC’s collaboration with data providers, give science data users the ability to quickly subset and mine data from large archived files. They have also set the stage for streaming desired data directly from archived files to a client’s visualization or analysis applications.

Future work for improving the ASDC’s subset and science data access:
• Machine-to-machine subset interfaces
• Very high granularity in spatial/temporal metadata
• Geospatial plots of subsetted dataset query results
• Real-time browse images of dataset query results

[Figure legend: green = metadata used by the new Search and Subset Applications; red = original metadata used in legacy data access applications]

The original CALIPSO Level 1 LIDAR spatial metadata is defined by a LineString consisting of ten points. The Search and Subset Application uses LineString metadata constructed from approximately 50 points, greatly increasing the accuracy of two-dimensional bounding box queries near the poles.

Metadata currently provided for one-hour CERES Level 2 SSF granules assumes full coverage of the Earth within 20 degrees of the poles and steps along the granule footprint boundaries at ten-degree longitude intervals. A newer metadata mining technique directly detects the field-of-view positions of the observations along the edges of the granule footprint and applies a Douglas-Peucker simplification to the resulting polygon. The updated hourly footprint polygon contains the same number of points as the original metadata polygon and is more accurate.

In the ECS archive system, Level 2 Tropospheric Emission Spectrometer (TES) Ozone metadata assumes global coverage for each daily granule.
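The Douglas-Peucker simplification mentioned above for the CERES SSF footprint can be illustrated with a minimal sketch. The poster does not describe the ASDC implementation, so the function names, the tolerance value, and the choice to treat longitude/latitude as planar coordinates are all assumptions made for illustration only.

```python
import math


def _distance_to_line(p, a, b):
    """Planar distance from point p to the line through a and b.

    Assumption: coordinates are treated as flat x/y values, which ignores
    spherical geometry and is only reasonable for a short footprint edge.
    """
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        # a and b coincide (e.g. a closed ring): fall back to point distance
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def douglas_peucker(points, tolerance):
    """Return a simplified copy of `points` (a list of (x, y) tuples).

    The point farthest from the chord joining the endpoints is kept if it
    lies farther than `tolerance`; the two halves are then simplified
    recursively. Otherwise all interior points are dropped.
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    farthest_index, farthest_distance = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _distance_to_line(points[i], first, last)
        if d > farthest_distance:
            farthest_index, farthest_distance = i, d
    if farthest_distance <= tolerance:
        return [first, last]
    left = douglas_peucker(points[: farthest_index + 1], tolerance)
    right = douglas_peucker(points[farthest_index:], tolerance)
    # the split point ends `left` and starts `right`; keep it only once
    return left[:-1] + right


# Hypothetical footprint edge: two nearly straight runs joined by a jump.
edge = [(0, 0), (1, 0.05), (2, -0.05), (3, 0.02), (4, 3), (5, 3.05), (6, 3)]
simplified = douglas_peucker(edge, tolerance=0.1)
```

With this tolerance the seven input points reduce to four: the endpoints plus the two corners of the jump. A larger tolerance removes more points, which is one way a mined footprint could be held to the same point count as the original metadata polygon.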
The ASDC is currently working with the TES Science Team on a prototype search and subset application. The metadata database used in this prototype stores the observation location for every data entry in the granule as an array of points. Bounding box queries for observations over the entire mission consistently return results in less than five seconds. This ability to obtain any observation over the life of the mission within a few seconds is unprecedented.
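The idea of a bounding box query over per-observation points can be sketched in plain Python. This is an in-memory stand-in, not the prototype's database query: the poster does not give the schema or the SQL, so the `Observation` fields, the granule identifiers, and the function names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One observation location (hypothetical record layout)."""
    granule_id: str
    lat: float  # degrees north, -90 to 90
    lon: float  # degrees east, -180 to 180


def in_bbox(obs, south, north, west, east):
    """True if the observation lies inside the box (edges inclusive).

    A box with west > east is interpreted as crossing the antimeridian.
    """
    if not (south <= obs.lat <= north):
        return False
    if west <= east:
        return west <= obs.lon <= east
    return obs.lon >= west or obs.lon <= east


def granules_in_bbox(observations, south, north, west, east):
    """Sorted ids of granules with at least one observation in the box."""
    return sorted(
        {o.granule_id for o in observations
         if in_bbox(o, south, north, west, east)}
    )


# Hypothetical sample data, for illustration only.
observations = [
    Observation("G1", 37.1, -76.4),
    Observation("G1", 40.0, -70.0),
    Observation("G2", -10.0, 179.5),
    Observation("G3", 60.0, 10.0),
]
hits = granules_in_bbox(observations, south=30, north=45, west=-80, east=-65)
```

Because each observation is tested individually, a granule matches only when it actually contains data inside the box, in contrast to footprint metadata that assumes global coverage for every daily granule. A linear scan like this one would not scale to a full mission; a production system would presumably rely on a spatial index in the database for that.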
