Inter-American Workshop on Environmental Data Access
Panel discussion on scientific and technical issues
Merilyn Gentry, LBA-ECO Data Coordinator
NASA / LBA-ECO Project Office & University of Tennessee
04-March-2004
NASA Field Experiments
• Over the last 2 decades, NASA has been funding field experiments for ground-truthing satellite observations
  • FIFE – Grasslands/prairie of midwestern U.S. (Kansas)
  • BOREAS – Boreal forests of Canada
  • Safari 2000 – South Africa
  • LBA-ECO – LBA project led by Brazil
• Data from these experiments are archived at ORNL DAAC
Data Policy Considerations
• NASA Data Policy
  • Data should be made available to the public as quickly as reasonably possible, allowing for adequate quality assurance
• Brazilian Law
  • Data collected in Brazil must remain in Brazil
• LBA Data Policy
  • All data resulting from the LBA study will be archived in Brazil
  • All LBA data will be made available to the public
LBA Data Maturity (diagram)
• Maturity stages: Raw, Preliminary (Year 1), Preliminary (Years 1..2..), Final (available to public)
• Process steps: before leaving Brazil, FTP to INPE/CPTEC; register metadata; update metadata; documentation
• Status label shown four times in the diagram: Complete
Goal: Long-term archive & distribution
• The “scientific and technical requirements for long-term preservation and accessibility of environmental data” were a key factor in the system design
• ORNL DAAC is part of a network of NASA data archive and distribution centers forming the Earth Observing System Data and Information System (EOSDIS)
• As a member of this network, ORNL DAAC must conform to certain EOSDIS protocols (interoperability)
• ORNL DAAC has also evolved its own data archive standards and recommendations, such as the “20-year rule”: data must remain accessible, retrievable, and usable
NASA funded ORNL to develop a system:
• Efficient, simple, highly automated
• Computer platform independent
• Use Web browser interfaces / software
• Same system works as data matures from raw, preliminary, to final data
• PI maintains full control of data visibility
• Painless user “learning curve”
• Yet with flexible, comprehensive searching
• Build a prototype -- KISS
The System Today
• 2 browser-based interfaces
  • LME – The LBA Metadata Editor
  • Beija-flor – Metadata Search & Data Retrieval System
    • Uses traditional search engine technology, e.g. Yahoo, AltaVista
    • However, it searches only sources identified as LBA DIS metadata
• Metadata
• Data
Technical interoperability of LBA DIS Hardware & Software
• Metadata
  • Written in XML (ASCII text) with standard metatag conventions
  • Uses FGDC metadata standards + LBA-specific fields
  • Imports from and exports to the Directory Interchange Format (DIF)
• Data
  • Data formats are not dictated by LBA DIS, though proprietary formats are discouraged
  • Data files can reside at LBA DIS nodes, other data centers, or PI web sites
LBA Metadata File – the key to technical interoperability
• Metadata_File.xml
  • ASCII text and contains standard metatags that are accessible to many search engines
  • Also contains URLs to allow users to link to related data, documentation, and ancillary files, regardless of format
• Diagram labels: Search engine, Metadata_File.xml, Data1.txt, Data2.xls, File.jpg, Doc.txt
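The idea of a plain-text XML record that points at files of any format can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names (`metadata`, `title`, `abstract`, `related_files`, `link`), the `role` attribute, and the example.org URLs are hypothetical and are not the FGDC or LBA DIS tag set.

```python
import xml.etree.ElementTree as ET


def build_metadata(title, abstract, links):
    """Return an XML string describing one data set and its related files.

    Element names are hypothetical, chosen only for illustration.
    """
    root = ET.Element("metadata")
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "abstract").text = abstract
    related = ET.SubElement(root, "related_files")
    for role, url in links:
        # Each link records what the file is (role) and where it lives (URL).
        link = ET.SubElement(related, "link", attrib={"role": role})
        link.text = url
    return ET.tostring(root, encoding="unicode")


def extract_links(xml_text):
    """Return (role, url) pairs, as an indexer or search tool might."""
    root = ET.fromstring(xml_text)
    return [(el.get("role"), el.text) for el in root.iter("link")]


# Hypothetical record: the file names echo the slide, the URLs are made up.
record = build_metadata(
    "Example data set",
    "Hypothetical record used only to illustrate the structure.",
    [
        ("data", "http://example.org/Data1.txt"),
        ("data", "http://example.org/Data2.xls"),
        ("image", "http://example.org/File.jpg"),
        ("documentation", "http://example.org/Doc.txt"),
    ],
)

for role, url in extract_links(record):
    print(role, url)
```

Because the record is plain text and the links are ordinary URLs, the linked files can stay in whatever format and location the PI chooses; only the metadata record has to be readable by the search tool.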
Semantic interoperability of environmental data across disciplines and languages
• English is the language standard
• Every LBA-ECO team has a U.S. PI and a Brazilian Co-Investigator
• Minimize space science jargon
• Beija-flor offers multiple search approaches:
  • Fielded searches – pick lists provided for values in the metadata
  • Character string searches accommodate more open-ended queries (and possibly less-expert users)
  • Map-based / spatial searches and temporal range searches
  • Combination searches
  • Browsing
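The search approaches listed above can be pictured as small filters that are combined with a logical AND. The sketch below is not Beija-flor's implementation; the `Record` fields, the helper names, and the three sample records are all hypothetical and exist only to show how fielded, character string, spatial, and temporal criteria might compose.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    """A hypothetical, much-simplified metadata record."""
    title: str
    investigator: str
    abstract: str
    lat: float
    lon: float
    start: date
    end: date


def fielded(field, value):
    """Fielded search: exact match on one metadata field (a pick-list value)."""
    return lambda r: getattr(r, field) == value


def text_contains(term):
    """Character string search: case-insensitive match in title or abstract."""
    term = term.lower()
    return lambda r: term in (r.title + " " + r.abstract).lower()


def within_box(south, north, west, east):
    """Spatial search: record location falls inside a bounding box."""
    return lambda r: south <= r.lat <= north and west <= r.lon <= east


def overlaps(start, end):
    """Temporal range search: record period overlaps the requested period."""
    return lambda r: r.start <= end and r.end >= start


def search(records, *predicates):
    """Combination search: keep records that satisfy every predicate."""
    return [r for r in records if all(p(r) for p in predicates)]


# Made-up sample records, for illustration only.
RECORDS = [
    Record("Tower flux measurements", "Investigator A",
           "Carbon dioxide flux at a forest site",
           -2.6, -60.2, date(1999, 1, 1), date(2001, 12, 31)),
    Record("Soil nutrient survey", "Investigator B",
           "Soil carbon and nitrogen in pasture plots",
           -10.1, -62.4, date(2000, 6, 1), date(2000, 9, 30)),
    Record("River chemistry samples", "Investigator A",
           "Dissolved carbon in stream water",
           -3.0, -55.0, date(2002, 3, 1), date(2003, 3, 1)),
]

hits = search(
    RECORDS,
    fielded("investigator", "Investigator A"),
    text_contains("carbon"),
    within_box(-5, 0, -65, -50),
    overlaps(date(2002, 1, 1), date(2002, 12, 31)),
)
print([r.title for r in hits])
```

Keeping each criterion as an independent predicate means a user can apply one search type alone or several together without the search function changing.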
Facilitating interdisciplinary and international access to environmental data resources
• Both countries have committed to long-term support for the archive and distribution of the LBA data collection:
  • LBA DIS in Brazil
  • ORNL DAAC in the U.S.
• Global Change Master Directory will include a “DIF” for every LBA data set archived in the U.S.
• Links to non-LBA Amazonian-related data are available via Beija-flor
• The LBA metadata will be available for indexing by non-LBA search engines and metadata databases
Human factors affecting data availability
• Scientists want to hold on to their data as long as they can
• Data collected are often part of students’ theses
• Few incentives for scientists to publish their data
• Documentation requirements are often prohibitive
LBA data flow (diagram)
• Data maturity stages: Raw, Preliminary (Year 1), Preliminary (Years 2+), Final (QA’d with documentation)
• 1: PI produces data
• 2: PI transfers data to CPTEC
• 3: PI registers/updates metadata
• 4: The LBA community and the public can access via Beija-flor
• Metadata are compiled in data products, data set descriptions, papers, posters
• Other labels shown: Brazilian counterpart, LBA Metadata Editor (LME), Data archive at CPTEC, Beija-flor search engine for LBA data, Transfer, Receive, Search
What is Needed for Archiving LBA-ECO Data Sets at the ORNL DAAC?
Reference: Best Practices for Archiving Data (www.daac.ornl.gov/DAAC/PI/info.html)
1. Metadata
  • LBA project parameters (i.e., Beija-flor metadata) must comply with the latest GCMD / EOSDIS standards
2. Data files
  • Suggested formats: tabular data in ASCII, gridded data in ASCII Grid, image data in binary or non-proprietary format
  • Self-describing to identify key entries such as parameter names and units of measure
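One way to picture a "self-describing" ASCII table is a file whose first row carries parameter names and whose second row carries units. The layout below is an assumption made for illustration, not a format prescribed by the ORNL DAAC, and the parameter names and values are made up.

```python
import csv
import io


def write_table(names, units, rows):
    """Write a comma-separated table: names row, units row, then data rows."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(names)
    writer.writerow(units)
    writer.writerows(rows)
    return buf.getvalue()


def read_table(text):
    """Read the table back, returning ({name: unit}, [row dicts])."""
    reader = csv.reader(io.StringIO(text))
    names = next(reader)   # first row: parameter names
    units = next(reader)   # second row: units of measure
    rows = [dict(zip(names, row)) for row in reader]
    return dict(zip(names, units)), rows


# Hypothetical parameters and made-up values, for illustration only.
text = write_table(
    ["date", "soil_temp", "soil_moisture"],
    ["YYYY-MM-DD", "degC", "percent"],
    [["2000-06-01", "24.5", "31.2"], ["2000-06-02", "25.1", "29.8"]],
)
units, rows = read_table(text)
print(text)
print(units["soil_temp"], rows[0]["soil_temp"])
```

A reader who has only the file can still tell what each column is and how it is measured, which is the point of the self-describing requirement.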
What is Needed for Archiving LBA-ECO Data Sets at the ORNL DAAC? (continued)
3. Data Set Documentation
Documentation should include what a user would need to know about the data 20+ years from now; i.e., the 20-year rule
• Data collection goals and description
• Description of sample collection sites
• Description of measurement methods (e.g., calibration, calculations, software)
• Known errors and problems
• Description of data file organization
• Description of data reporting conventions (e.g., parameter names, units, codes, flags, example data records)
• Key information from Beija-flor (e.g., investigator(s), abstract, spatial and temporal attributes, data set citation, references, etc.)
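The documentation items above lend themselves to a simple completeness check before a data set is submitted. The sketch below is hypothetical: the section keys are names invented here to mirror the bullets, and nothing in it reflects an actual ORNL DAAC tool.

```python
# Hypothetical keys, one per documentation item listed on the slide.
REQUIRED_SECTIONS = [
    "collection_goals",
    "site_description",
    "measurement_methods",
    "known_problems",
    "file_organization",
    "reporting_conventions",
    "key_metadata",
]


def missing_sections(doc):
    """Return required sections that are absent or contain only whitespace."""
    return [name for name in REQUIRED_SECTIONS
            if not doc.get(name, "").strip()]


# A made-up, partially completed draft.
draft = {
    "collection_goals": "Describe why the data were collected.",
    "site_description": "   ",
    "measurement_methods": "Calibration, calculations, software used.",
}

print(missing_sections(draft))
```

A check like this only confirms that each section exists; whether the text would still make sense to a user 20 years later remains a judgment for the PI and the archive staff.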