Experiences Developing a Semantic Representation of Product Quality, Bias, and Uncertainty for a Satellite Data Product • Patrick West1, Gregory Leptoukh2, Stephan Zednik1, Chris Lynnes2, Suraiya Ahmad3, Jianfu Pan4, Peter Fox1 • 1Tetherless World Constellation • 2NASA Goddard Space Flight Center • 3NASA Goddard Space Flight Center/Innovim • 4NASA Goddard Space Flight Center/Adnet Systems, Inc. • EGU2011-13502-1
Outline of Presentation • Current Issues and Prior Work • Definitions • Our Approach to Resolving These Issues • Our Focus Area • Multi-Sensor Data Synergy Advisor (MDSA) • AeroStat • Applying Our Approach in the Focus Area • Conclusion • Questions
Issue • Climate modeling and various environmental monitoring and protection applications increasingly rely on satellite measurements. • Research users seek good-quality satellite data, with uncertainties and biases provided for each data point. • Remote-sensing quality issues are addressed inconsistently and differently by different communities.
Problem Space • How these issues relate to MDSA, DQSS, and AeroStat.
Definitions • Product Quality: a measure of how well we believe a dataset represents the physical quantity it purports to. As such, it is closely related to (though not identical to) the dataset's level of validation. It often varies within the dataset, depending on factors such as viewing geometry, surface type (land, ocean, desert, etc.), and cloud fraction. Compare: • Data Quality: typically applied to a particular instance of data (pixel, scan, or granule); it describes how well the instrument and retrieval algorithm were able to resolve a result for that instance.
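To make the distinction concrete, here is a minimal sketch (not the MDSA implementation; the QA convention, array values, and metadata fields are illustrative assumptions) contrasting per-instance data quality with dataset-level product quality:

```python
# Sketch: per-instance data quality vs. dataset-level product quality.
import numpy as np

# Per-pixel "data quality": how well the retrieval resolved each instance.
# 0 = no confidence, 3 = high confidence (a MODIS-style QA convention).
qa_confidence = np.array([[3, 3, 1],
                          [0, 2, 3],
                          [3, 1, 0]])

# Dataset-level "product quality": how well the product as a whole is
# believed to represent the physical quantity, e.g. from validation.
product_quality = {
    "validation_stage": "Stage 2",   # hypothetical validation level
    "known_dependencies": ["viewing geometry", "surface type", "cloud fraction"],
}

# Fitness-for-use decisions typically combine both levels:
usable = qa_confidence >= 2          # per-pixel screening
print(f"{usable.sum()} of {qa_confidence.size} pixels pass the QA screen")
```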
Definitions • Uncertainty: has aspects of accuracy (how accurately the real-world situation is assessed; this includes bias) and precision (to how many digits a value is resolved). • Bias: has two aspects: • (1) Systematic error resulting in the distortion of measurement data, caused by prejudice or a faulty measurement technique (modified from the IAIDQ site). • (2) A vested interest, or strongly held paradigm or condition, that may skew the results of sampling, measuring, or reporting the findings of a quality assessment: • Psychological: for example, when data providers audit their own data, they usually have a bias to overstate its quality. • Sampling: sampling procedures that yield a sample that is not truly representative of the population sampled (Larry English).
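A worked toy example (the retrieval and reference values are assumed, not from the presentation) separating the systematic bias from the random scatter in a set of retrievals compared against a reference:

```python
# Sketch: bias (accuracy aspect) vs. scatter (precision aspect) vs. total error.
import numpy as np

truth = np.full(5, 0.30)                           # reference value, e.g. ground truth
retrieved = np.array([0.35, 0.33, 0.36, 0.34, 0.37])

bias = np.mean(retrieved - truth)                  # systematic offset: +0.050
scatter = np.std(retrieved - truth, ddof=1)        # random component: ~0.016
rmse = np.sqrt(np.mean((retrieved - truth) ** 2))  # total uncertainty: ~0.052

print(f"bias={bias:.3f}, scatter={scatter:.3f}, rmse={rmse:.3f}")
```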
Approach • Characterize semantic differences in quality/bias/uncertainty at the pixel, granule, product, and record levels. • Outline the various factors contributing to uncertainty or the error budget: errors introduced by Level 2 to Level 3 and Level 3 to Level 4 processing steps, including gridding, aggregation, merging, and analysis-algorithm errors (e.g., representation, bias correction, and gap interpolation); see the sketch below. • Assess the needs for quality in different communities, e.g., to understand fitness-for-purpose quality or value of data to users vs. quality as provided by data providers.
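As an illustration of one Level 2 to Level 3 error source named above (the numbers and the per-overpass granularity are assumptions for simplicity), the sketch below shows how the aggregation weighting choice alone can shift a Level 3 grid-cell value:

```python
# Sketch: unweighted vs. pixel-weighted gridding of two overpasses
# contributing unevenly to the same Level 3 grid cell.
import numpy as np

overpass_means = np.array([0.20, 0.40])   # per-overpass mean value in the cell
overpass_counts = np.array([90, 10])      # pixels each overpass contributed

unweighted = overpass_means.mean()                              # 0.300
weighted = np.average(overpass_means, weights=overpass_counts)  # 0.220

print(f"unweighted={unweighted:.3f}, pixel-weighted={weighted:.3f}")
# The aggregation choice alone shifts the L3 value, which is exactly
# the kind of error-budget term this approach sets out to catalog.
```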
Approach • Good quality documentation (based on standards and controlled vocabularies) is a necessary step toward enabling semi-autonomous resource assessment; a sketch of what this can look like follows. • Existing standards are ambiguous and are not implemented consistently.
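A minimal sketch, assuming the rdflib library and a hypothetical dq: vocabulary (the namespace, class, and property names are illustrative, not an existing standard), of quality documentation expressed with controlled terms that software can assess:

```python
# Sketch: quality metadata as RDF triples using controlled-vocabulary terms.
from rdflib import Graph, Literal, Namespace, RDF

DQ = Namespace("http://example.org/dataquality#")   # hypothetical vocabulary
g = Graph()
g.bind("dq", DQ)

granule = DQ["granule/MOD04_L2_example"]            # illustrative identifier
g.add((granule, RDF.type, DQ.Granule))
g.add((granule, DQ.validationStage, DQ.Stage2))     # controlled term, not free text
g.add((granule, DQ.knownBias, Literal("high values over bright desert surfaces")))

print(g.serialize(format="turtle"))
```

Because dq:validationStage takes a controlled term rather than free text, a tool can compare granules or filter on validation level without parsing prose.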
IQ (Information Quality) Curator Model • Introduction to the model (cf. the Qurator quality-views work of Missier et al., 2006).
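A minimal sketch of the quality-view idea from Missier et al. (2006): each user community supplies its own quality predicate, and the view exposes only the records that pass it. The record fields below are assumptions for illustration:

```python
# Sketch: user-defined "quality views" over the same underlying data.
records = [
    {"value": 0.12, "qa": 3, "cloud_fraction": 0.1},
    {"value": 0.45, "qa": 1, "cloud_fraction": 0.7},
    {"value": 0.30, "qa": 2, "cloud_fraction": 0.3},
]

def quality_view(data, predicate):
    """Expose only the records a user's quality criterion accepts."""
    return [r for r in data if predicate(r)]

# Two communities, two fitness-for-purpose criteria over the same data.
strict = quality_view(records, lambda r: r["qa"] == 3)
relaxed = quality_view(records, lambda r: r["qa"] >= 2 and r["cloud_fraction"] < 0.5)
print(len(strict), "records pass the strict view;", len(relaxed), "pass the relaxed one")
```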
Conclusion • Quality is very hard to characterize; different groups will focus on different and inconsistent measures of quality. • Products with known quality (whether good or bad) are more valuable than products with unknown quality. • Known quality helps users correctly assess fitness-for-use. • Quality documentation (metadata) is a key factor in determining fitness-for-purpose.
References • Levy, R. C., Leptoukh, G. G., Zubko, V., Gopalan, A., Kahn, R., & Remer, L. A. (2009). A critical look at deriving monthly aerosol optical depth from satellite data. IEEE Transactions on Geoscience and Remote Sensing, 47, 2942-2956. • Zednik, S., Fox, P., & McGuinness, D. (2010). System Transparency, or How I Learned to Worry about Meaning and Love Provenance! 3rd International Provenance and Annotation Workshop (IPAW), Troy, NY. • Missier, P., Embury, S., Greenwood, M., Preece, A., & Jin, B. (2006). Quality views: Capturing and exploiting the user perspective on data quality. Proceedings of VLDB 2006. http://users.cs.cf.ac.uk/A.D.Preece/qurator/resources/qurator_vldb2006.pdf
Thank You • Questions? • Contact Information: • AeroStat Project Pages: http://tw.rpi.edu/web/project/AeroStat • MDSA Project Pages: http://tw.rpi.edu/web/project/MDSA