Advancing Collaborative Connections for Earth System Science (ACCESS) Program. A Data Quality Screening Service for Remote Sensing Data. Christopher Lynnes, NASA/GSFC (P.I.); Edward Olsen, NASA/JPL; Peter Fox, RPI; Bruce Vollmer, NASA/GSFC; Robert Wolfe, NASA/GSFC. Contributions from R. Strub, T. Hearty, Y-I Won, M. Hegde, S. Zednik, P. West, N. Most, S. Ahmad, C. Praderas, K. Horrocks, I. Tchered, and A. Rezaiyan-Nojani. Project Page: http://tw.rpi.edu/web/project/DQSS
Outline • Why a Data Quality Screening Service? • Making Quality Information Easier to Use via the Data Quality Screening Service (DQSS) • DEMO • DQSS Under the Hood • DQSS Status • DQSS Plans
The quality of data can vary considerably (Figure: Version 5 Level 2 Standard Retrieval Statistics)
Quality schemes can be relatively simple… (Figure: Total Column Precipitable Water, kg/m², alongside its Qual_H2O quality flag: Best, Good, Do Not Use)
…or they can be more complicated. Hurricane Ike, viewed by the Atmospheric Infrared Sounder (AIRS): Air Temperature at 300 mbar. PBest: maximum pressure for which the quality value is “Best” in temperature profiles
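The PBest scheme above can be illustrated with a minimal sketch. It assumes a profile is given as parallel lists of pressure levels (mbar) and temperatures, and that levels with pressure at or below PBest count as “Best”. The function name and the fill value are hypothetical, not taken from the AIRS product or from DQSS.

```python
FILL_VALUE = -9999.0  # assumed fill value; a real product defines its own


def screen_profile_by_pbest(pressures_mbar, temperatures, pbest_mbar, fill=FILL_VALUE):
    """Keep levels whose pressure is <= PBest; replace deeper levels with fill.

    PBest is read here as the maximum pressure at which the quality is still
    "Best", so every level at a higher pressure is screened out.
    """
    return [
        temp if pressure <= pbest_mbar else fill
        for pressure, temp in zip(pressures_mbar, temperatures)
    ]


# Illustrative profile: four levels, PBest = 400 mbar.
screened = screen_profile_by_pbest(
    [100.0, 300.0, 500.0, 850.0], [210.0, 240.0, 260.0, 285.0], 400.0
)
```

With these made-up numbers, the 100 and 300 mbar levels survive and the 500 and 850 mbar levels are replaced by the fill value, which is why a single scalar per profile can be harder for users to apply than a per-pixel flag.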
Quality flags are also sometimes packed together into bytes • Snow/Ice Flag: 0 = Yes, 1 = No • Day/Night Flag: 0 = Night, 1 = Day • Cloud Mask Status Flag: 0 = Undetermined, 1 = Determined • Sunglint Flag: 0 = Yes, 1 = No • Cloud Mask Cloudiness Flag: 0 = Confident cloudy, 1 = Probably cloudy, 2 = Probably clear, 3 = Confident clear • Surface Type Flag: 0 = Ocean, deep lake/river; 1 = Coast, shallow lake/river; 2 = Desert; 3 = Land • (Big-endian arrangement for the Cloud_Mask_SDS variable in atmospheric products from the Moderate Resolution Imaging Spectroradiometer, MODIS)
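As a rough illustration of what a user must do to read such a packed byte, the sketch below unpacks the six flags with shifts and masks. The bit offsets (counted from the least significant bit) are an assumption made for this example and should be checked against the product documentation; the function and key names are hypothetical. The value labels come from the slide above.

```python
CLOUDINESS_LABELS = ["Confident cloudy", "Probably cloudy", "Probably clear", "Confident clear"]
SURFACE_LABELS = ["Ocean, deep lake/river", "Coast, shallow lake/river", "Desert", "Land"]


def unpack_cloud_mask_byte(byte):
    """Split one packed quality byte into its individual flags.

    Assumed layout (bit 0 = least significant bit):
      bit 0    cloud mask status   bits 1-2 cloudiness
      bit 3    day/night           bit 4    sunglint
      bit 5    snow/ice            bits 6-7 surface type
    """
    return {
        "cloud_mask_status": byte & 0b1,
        "cloudiness": (byte >> 1) & 0b11,
        "day_night": (byte >> 3) & 0b1,
        "sunglint": (byte >> 4) & 0b1,
        "snow_ice": (byte >> 5) & 0b1,
        "surface_type": (byte >> 6) & 0b11,
    }


flags = unpack_cloud_mask_byte(0b10111101)
cloudiness = CLOUDINESS_LABELS[flags["cloudiness"]]
surface = SURFACE_LABELS[flags["surface_type"]]
```

Under the assumed layout, the example byte decodes to a determined, probably clear, daytime pixel over desert with no sunglint and no snow or ice.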
Current user scenarios (repeated for each user) • Nominal scenario: search for and download data; locate documentation on handling quality; read and understand documentation on quality; write a custom routine to filter out bad pixels • Equally likely scenario (especially in user communities not familiar with satellite data): search for and download data; assume that quality has a negligible effect
The effect of bad-quality data is often not negligible (Figure: Hurricane Ike, 9/10/2008; Total Column Precipitable Water, kg/m², and its Quality flag: Best, Good, Do Not Use)
Neglecting quality may introduce bias (a more subtle effect). AIRS Relative Humidity Comparison against Dropsonde with and without Applying PBest Quality Flag Filtering. Boxed data points indicate AIRS RH data with dry bias > 20%. From a study by Sun Wong (JPL) on specific humidity in the Atlantic Main Development Region for Tropical Storms
Percent of Biased Data in MODIS Aerosols Over Land Increases as Confidence Flag Decreases. *Compliant data are within ± 0.05 ± 0.2 τ_Aeronet. Statistics derived from Hyer, E., J. Reid, and J. Zhang, 2010, An over-land aerosol optical depth data set for data assimilation by filtering, correction, and aggregation of MODIS Collection 5 optical depth retrievals, Atmos. Meas. Tech. Discuss., 3, 4091–4167.
Making Quality Information Easier to Use via the Data Quality Screening Service (DQSS).
DQSS Team • P.I.: Christopher Lynnes • Software Implementation: Goddard Earth Sciences Data and Information Services Center (Implementation: Richard Strub; Local Domain Experts (AIRS): Thomas Hearty and Bruce Vollmer) • AIRS Domain Expert: Edward Olsen, AIRS/JPL • MODIS Implementation (Implementation: Neal Most, Ali Rezaiyan, Cid Praderas, Karen Horrocks, Ivan Tchered; Domain Experts: Robert Wolfe and Suraiya Ahmad) • Semantic Engineering: Tetherless World Constellation @ RPI (Peter Fox, Stephan Zednik, Patrick West)
The DQSS filters out bad pixels for the user • Default user scenario: search for data; select science team recommendation for quality screening (filtering); download screened data • More advanced scenario: search for data; select custom quality screening parameters; download screened data
DQSS replaces bad-quality pixels with fill values • Original data array (total column precipitable water) • Mask based on user criteria (quality level < 2) • Good-quality data pixels retained • Output file has the same format and structure as the input file (except for extra mask and original_data fields)
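The replacement step described above can be sketched in a few lines. This is an illustration only, not the DQSS implementation: it assumes the data and quality arrays are plain nested lists of the same shape, that lower quality values are better (for example 0 = Best, 1 = Good, 2 = Do Not Use), and that the fill value is -9999.0. The function name is hypothetical; the `mask` and `original_data` keys mirror the extra fields mentioned on the slide.

```python
FILL_VALUE = -9999.0  # assumed fill value; real files carry their own


def screen_pixels(data, quality, max_quality=1, fill=FILL_VALUE):
    """Replace pixels whose quality level exceeds max_quality with fill.

    With max_quality=1 this matches the slide's criterion "quality level < 2".
    Returns the screened array plus the mask and an untouched copy of the input.
    """
    mask = [[q <= max_quality for q in row] for row in quality]
    screened = [
        [value if keep else fill for value, keep in zip(data_row, mask_row)]
        for data_row, mask_row in zip(data, mask)
    ]
    return {
        "screened": screened,
        "mask": mask,
        "original_data": [row[:] for row in data],
    }


result = screen_pixels(
    data=[[31.5, 42.0], [18.2, 55.1]],
    quality=[[0, 2], [1, 2]],
)
```

Keeping the mask and the original values next to the screened array means a user can always see which pixels were removed and recover them if a different criterion is wanted later.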
Visualizations help users see the effect of different quality filters
DQSS can encode the science team recommendations on quality screening • AIRS Level 2 Standard Product: use only Best for data assimilation uses; use Best+Good for climatic studies • MODIS Aerosols: use only VeryGood (highest value) over land; use Marginal+Good+VeryGood over ocean
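One simple way to picture such an encoding is a lookup table keyed by product and context. The sketch below is only an illustration: DQSS itself stores this knowledge in an ontology, and the dictionary layout, key strings, and function name here are hypothetical. The recommended quality levels are the ones listed on the slide above.

```python
# (product, context) -> set of quality labels the science team recommends keeping
RECOMMENDED_LEVELS = {
    ("AIRS L2 Standard", "data assimilation"): {"Best"},
    ("AIRS L2 Standard", "climatic studies"): {"Best", "Good"},
    ("MODIS Aerosols", "land"): {"VeryGood"},
    ("MODIS Aerosols", "ocean"): {"Marginal", "Good", "VeryGood"},
}


def is_retained(product, context, quality_label):
    """Return True if a pixel with this quality label passes the recommendation."""
    try:
        allowed = RECOMMENDED_LEVELS[(product, context)]
    except KeyError:
        raise ValueError(f"no recommendation encoded for {product!r} / {context!r}")
    return quality_label in allowed


keep_over_land = is_retained("MODIS Aerosols", "land", "Good")
keep_over_ocean = is_retained("MODIS Aerosols", "ocean", "Good")
```

The same “Good” retrieval is dropped over land and kept over ocean, which is the kind of context-dependent rule a default screening setting has to capture.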
Or, users can select their own criteria... Initial settings are based on the Science Team recommendation. (Note: “Good” retains retrievals that are Good or better.) You can choose settings for all parameters at once... ...or variable by variable
DEMO • http://mirador.gsfc.nasa.gov • (Search for ‘AIRX2RET’)
DQSS Flow (Diagram components: End User, user selection, Ontology Query, Quality Ontology, data file, ancillary file, Masker, quality mask, data file w/ mask, Screener, screened data file)
Significant Accomplishments • Release 1.0 of DQSS for AIRS Level 2 Standard Retrieval • Technology Readiness Level (TRL) = 9 (for AIRS L2) • Includes Quality Impact views • Disclosure of Invention (NF 1679) filed • Announced to AIRS Registered Data Product User Community (~770) • Papers and Presentations • Managing Data Quality for Collaborative Science workshop (peer-reviewed paper) • Sounder Science meeting • ESDSWG Poster • A-Train User Workshop (part of GES DISC presentation) • AGU: Ambiguity of Data Quality in Remote Sensing Data • Ontology (v. 2.4) to accommodate both AIRS and MODIS L2
Current Status • DQSS is operational for AIRS L2 Standard Products • DQSS is offered through the Mirador data search interface* at the GES DISC • DQSS has been refactored to work in LAADS/MODAPS environment • 2½-month delay due to refactoring and (successful) audit of NPR 7150.2 compliance by NASA’s Office of the Chief Engineer • Schedule no longer has room for client-side screening • But maybe... (to be continued) • * http://mirador.gsfc.nasa.gov
Metrics • DQSS is included in web access logs sent to EOSDIS Metrics System (EMS) • Tagged with “Protocol” DQSS • EMS -> ESDS bridge implemented • Metrics from EMS: • Users: 50 • Downloads: 9381
Refactored DQSS encapsulates ontology and screening parameters (Diagram components: data product, DQSS Ontology Service, LAADS Web GUI, screening options, selected options, screening parameters (XML), MODAPS Database, MODAPS Minions). Encapsulation enables reuse in other data centers with diverse environments.
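To make the idea of passing screening parameters as XML more concrete, here is a small sketch of how a downstream component might read such a document. The XML layout, element and attribute names, and the function name are all hypothetical; the slides do not show the actual DQSS schema, and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical screening-parameter document; NOT the real DQSS schema.
EXAMPLE_XML = """
<screening product="AIRX2RET">
  <variable name="totH2OStd" qualityField="Qual_H2O" maxQuality="1"/>
  <variable name="exampleVar" qualityField="Qual_Example" maxQuality="0"/>
</screening>
"""


def parse_screening_parameters(xml_text):
    """Return (product, {variable: {"quality_field": ..., "max_quality": ...}})."""
    root = ET.fromstring(xml_text.strip())
    variables = {
        var.get("name"): {
            "quality_field": var.get("qualityField"),
            "max_quality": int(var.get("maxQuality")),
        }
        for var in root.findall("variable")
    }
    return root.get("product"), variables


product, parameters = parse_screening_parameters(EXAMPLE_XML)
```

Because the receiving side only needs to understand this small document, it does not have to query the ontology itself, which is the sense in which encapsulation helps reuse across data centers.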
2010 Plans • Adding more products • MODIS L2 Water Vapor – 2/2011 • MODIS L2 Clouds and Aerosols – 5/2011 • Microwave Limb Sounder (MLS) L2 – 6/2011 • Ozone Monitoring Instrument (OMI) L2 – 8/2011 • Integrate with other services • AIRS Variable Subsetting & NetCDF Reformatting – 3/2011 • OPeNDAP – 6/2011 • MODAPS Web Services – 10/2011 • Refactor for reusability & sustainability – 9/2011 • Release – 11/2011 • No longer planned: client-side screening • Except... refactored DQSS may make this possible...
Combine DQSS with Other Services • AIRS Subsetting, quality screening and reformatting to netCDF are currently mutually exclusive • Combining them will also raise awareness of quality screening • DQSS will also be more usable to communities more comfortable with netCDF (e.g., modeling community) • OPeNDAP • OPeNDAP, Inc. is working to support back-end calls to REST services • Gateway is based on ACCESS-05 project (“CEOP Satellite Data Server”) • Will allow access to DQSS via analysis and visualization clients (e.g., Panoply, IDV) • May be reused for other back-end REST services
Refactoring for Release • Proposal schedule included refactoring for long-term sustainability • A key element of successful technology infusion • Current refactoring for MODIS environment... • Gives us a head start • May also give us an avenue to client-side capabilities
New Client-Side Concept (Diagram components: GES DISC, End User, screening criteria, Web Form, XML File, Ontology, Masker, dataset-specific instance info, data files, Screener, Data Provider)
DQSS Recap • Screening satellite data can be difficult and time-consuming for users • The Data Quality Screening Service provides an easy-to-use service • The result should be: • More attention to quality on users’ part • More accurate handling of quality information… • …With less user effort
ESDSWG Activities • Contributions to all 3 TIWG subgroups • Semantic Web: • Contributed Use Case for Semantic Web tutorial @ ESIP • Participation in ESIP Information Quality Cluster • Services Interoperability and Orchestration • Combining ESIP OpenSearch with servicecasting and datacasting standards • Spearheading ESIP Federated OpenSearch cluster • 5 servers (2 more soon); 4+ clients • Now ESIP Discover cluster (see above) • Processes and Strategies: • Developed Use Cases for Decadal Survey Missions: DESDynI-Lidar, SMAP, and CLARREO