CDIAC Data Support for SPRUCE and NGEE Les A. Hook and Ranjeet Devarakonda Environmental Sciences Division Oak Ridge National Laboratory CDIAC User Working Group Meeting September 27-28, 2010
ORNL research was sponsored by the U.S. Department of Energy and performed at Oak Ridge National Laboratory (ORNL). ORNL is managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05-00OR22725.
Overview • Data Support for SPRUCE • Data Management Planning • Goals outlined in the Science Plan • Requirements identified in the Data Policy • Actions and resources needed to meet requirements are in the Data Management Plan • Implementation • SPRUCE web site • Resources and products accessible on the web site • Data Support for NGEE • Data Management Planning • Expect planning to be similar to SPRUCE • NGEE Web Site • Shared Development Effort for Acquisition and Processing of Sensor Data
Science Plan for the Climate Change Response Scientific Focus Area: 3.11 Data and informatics. Goals for Response SFA data management are to ensure the fidelity and accessibility of the SFA data, minimize the amount of time research personnel need to spend on data management activities while achieving high quality data and metadata, and ensure that the data and metadata can be located and used by project personnel (initially) and the broader scientific community. The suite of activities that collectively comprise this component of the SFA will naturally evolve over the life of the SFA, and they will be done in collaboration with data management components of other Climate SFAs. Initial data management work will focus on defining the data collection and distribution requirements, identifying key leverage points across SFAs and other projects, ensuring that site characterization data are maintained, and resolving any critical informatics knowledge gaps identified in the requirements definition. As the experiments begin to collect high resolution data, the data management activities will shift to ensuring that the experimental data are properly archived and distributed according to the SFA’s data access policy. Data from the Response SFA will be a combination of observational data recorded by researchers and data collected by automated equipment. Further details can be found in Annex C. The data management component will leverage the expertise and tools in the Environmental Data Science and Systems (EDSS) group, particularly the Carbon Dioxide Information Analysis Center (CDIAC) and the Atmospheric Radiation Measurement (ARM) program archive, to ensure that both observational and automated data are robustly archived in relational data models with necessary timestamp, spatial, temporal, and provenance metadata.
• Goals for SPRUCE Data Management • Ensure the fidelity and accessibility of SPRUCE data to the participants to facilitate all the pertinent science questions; • Minimize the amount of time research personnel need to spend on data management activities while achieving high quality data and metadata; and • Ensure that the data and metadata can be located and used by project personnel (initially) and by the broader scientific community and public when appropriately quality-checked data are available. • Approach to Data Management Planning • Provide a structured framework to capture the project-defined requirements • Provide data management guidance and best practices • It is the responsibility of the ORNL SPRUCE research group, the Task Leaders in particular, and Forest Service staff to reach a consensus about what needs to be controlled, to provide processing details, and to establish who is responsible for implementation. Accountability is key. • Planning Considerations • The plan supports field sampling, measurements, monitoring, and analyses. • Data management information collected pre-experiment will inform the final experimental data management processes. • SPRUCE tasks are subject to change or modification, and experimental technology will evolve. The data management plan will have to be flexible and updated as needed, with version control.
Data Management Requirements are identified in the Data Policy • SPRUCE Data Policy: Archiving, Sharing, and Fair-Use (Version 1.2, 2010/05/10) • The open sharing of all SPRUCE experiment data among researchers, the broader scientific community, and the public is critical to advancing the mission of DOE’s Program of Terrestrial Ecosystem Science. SPRUCE is implementing an experimental platform for the long-term testing of the mechanisms controlling the vulnerability of organisms, ecosystems, and ecosystem functions to increases in temperature and exposure to elevated CO2 treatments within the northern peatland high-carbon ecosystem. All data collected at the SPRUCE facility, all results of any analysis or synthesis of information, and all model algorithms and codes developed in support of SPRUCE will be submitted to the SPRUCE Data Archive in a timely manner such that data will be available for use by SPRUCE researchers and, following publication, the public. This policy is applicable to all SPRUCE participants including the SPRUCE Research Group at the Oak Ridge National Laboratory (ORNL), the U.S. Forest Service, cooperating independent researchers, and to the users of SPRUCE data products (see the Data Fair-Use Statement). SPRUCE data policies are consistent with the sponsoring U.S. DOE Program for Terrestrial Ecosystem Science Data Policy and with the Memorandum of Understanding between the U.S. Forest Service and UT-Battelle.
Data Policy, continued • Data Archiving and Discovery • Archive at Carbon Dioxide Information Analysis Center (CDIAC) • Two levels of data accessibility. • First is for sharing recently collected, derived, and processed data products among SPRUCE participants. • Second is for access to mature data products by the broader scientific community and public. • Public access will be concurrent with open literature or web site publication of SPRUCE results. • Discovery facilitated through the compilation of descriptive companion metadata records and their inclusion in searchable metadata databases and clearinghouses.
Data Policy, continued • Data Sharing • Timeliness of Data Availability • Researchers will actively process, quality assure, and document environmental measurements, etc. • Task Leaders will define a schedule for submitting data to the Archive for their given measurements. • Suggested guidelines for submitting data to the Archive for sharing among SPRUCE participants: • Environmental measurements (automated instruments) -- 30 days after the completion of a month of measurements • Annual surveys and seasonal measurement efforts -- 120 days from the completion of the survey • Laboratory analyses of vegetation nutrient concentrations -- 60 days from completion of analyses • Suggested guidelines for submitting data to the Archive for public access: • Environmental measurements (automated instruments) -- annual updates • Annual surveys and seasonal measurement efforts -- with publication of papers • Laboratory analyses of vegetation nutrient concentrations -- with publication of papers • Quality Assurance of Data • Task Leaders will define the quality assurance checks to be performed prior to data sharing among SPRUCE participants (Quality Level 1) and prior to public access (Quality Level 2) • Suggested guidelines for defining data Quality Levels: Level 1 and Level 2
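The suggested lead times for sharing data among SPRUCE participants can be sketched as a small date calculation. This is only an illustration of the guideline above, not part of the SPRUCE Data Policy: the dictionary keys and the function name `project_sharing_due` are hypothetical, and Task Leaders set the actual schedules.

```python
from datetime import date, timedelta

# Suggested lead times (in days) for sharing among SPRUCE participants,
# copied from the guidelines above. The keys are hypothetical labels.
PROJECT_SHARING_DAYS = {
    "automated_environmental": 30,     # after a completed month of measurements
    "annual_or_seasonal_survey": 120,  # after completion of the survey
    "lab_nutrient_analysis": 60,       # after completion of analyses
}


def project_sharing_due(data_type: str, completed: date) -> date:
    """Return the suggested date by which data should reach the Archive."""
    return completed + timedelta(days=PROJECT_SHARING_DAYS[data_type])


if __name__ == "__main__":
    # A month of automated measurements that ended on 2010-09-30.
    print(project_sharing_due("automated_environmental", date(2010, 9, 30)))
```

A lookup table keeps the guideline values in one place, so a Task Leader's own schedule could replace them without touching the calculation.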
Data Policy, continued • Data Fair-Use Statement • The SPRUCE data provided on the public archive are freely available and were furnished by the SPRUCE Research Group at ORNL, U.S. Forest Service, and cooperating independent researchers who encourage their use. • Please inform SPRUCE scientist(s) of your use of the archived data and of any publications. • Check the Archive frequently to ensure that you are using the latest version of the data. • Please acknowledge (1) data products as a citation as provided in the data archive documentation, (2) web site information downloads as a bibliographic web citation, or (3) general SPRUCE information as an acknowledgment or personal communication if no other citation form is applicable. • When publishing original analyses and results using these data, please acknowledge the agency or organization that supported the collection of the original data. • Please include these terms as publication keywords as applicable: SPRUCE Experiment, ORNL, U.S. DOE Office of Science, Marcell Experimental Forest, Northern Research Station, U.S. Forest Service. • Please provide an electronic reprint of your independent work to the SPRUCE Project so that all publications can be tracked by CDIAC. • Disclaimer of Liability
Actions and resources needed to meet requirements are in the Data Management Plan • Organization • Data Policy • Data Flow • Project Name Information • Identifying Measurement and Sampling Sites • Data and Metadata Reporting • Reporting Sampling and Measurement Dates and Times • Identifying Descriptive Field Variables, Biological Measurements, Chemical and Physical Variables • Reporting Units for Chemical, Physical, and Descriptive Variables • Reporting Values below Detection Limits • Reporting Missing Data • Reporting Uncertainty Estimates • Reporting Conventions for Meteorological Data, and Temperature and Pressure Conditions • Assigning Project-Specific Data Quality Flags • Data Processing • Data Entry, Transfer, and Transformation • Managing Hardcopy Format Project Records • Managing Electronic Format Project Records • Names and Reporting Formats for Data Files • Scripted Programs for Processing and Analysis • Quality Level of Data • Data Documentation and Archiving • Planning to Archive Data for Public Release • Creating Archive Documentation • Providing Metadata to Searchable Indexes and Clearinghouses • Assigning Descriptive Data Set Titles • Data Systems Management • Day-to-Day Operation of Data Management Systems • Data Management System and Software Configuration Control Guidelines
SPRUCE Data Flow (diagram; column headings: Sources, Processing/QA Frequency, Destination, Access) • Sources • Task: Environmental Measurements • Automated Instruments • 30-60 days after collection • Task R2: Plant growth phenology and NPP • Periodic Observations • 120 days after survey, 60 days after sample analyses • Task R3: Community composition • Periodic Observations • Task R4: Plant Physiology • Periodic Observations • Task R5: Biogeochemical cycling responses • Periodic Observations • Task R6: Modeling of terrestrial ecosystem responses to temperature and CO2 • Inputs and Outputs ? • Supplemental Information • Photos, Videos, Additional ? • Existing/Historical Data • MEF, NADP, Remote Sensing • Ground penetrating radar assessments • Additional links to existing data ? • Selected data uploaded • Periodic updates with new data and products • Destination • SPRUCE Data Archive (CDIAC) • Project Data Sharing • Public Data Sharing • Sharing timing noted on the diagram: 30-60 days after collection; "Timing ?" with publication or per schedule • Access • SPRUCE Web Site • Project and Public Access to Data and Resources • Project Data Access • 100% open for Project Team • Permission needed by others • Project Resources • Common reference sources • Metadata Content Editor • Public Data Archive • 100% open to Public • Data and Metadata Search • Relational Database (e.g., FACE) ? Compiled by Les Hook, 2010/05/10
Shared Development Effort for Acquisition and Processing of Sensor Data • Diagram labels: SPRUCE sensors and data loggers; acquisition and evaluation software; independent processing steps • Next for SPRUCE and NGEE • Number of sensors: 25X • Need advanced automated processing, displays, and alarms • Web accessible • Other needs?
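One simple form of the automated processing and alarms mentioned above is a range check on incoming sensor readings. The sketch below is purely illustrative: the `Reading` and `Limits` types, the sensor identifiers, and the limit values are all hypothetical, and it does not represent Campbell Scientific software or any SPRUCE or NGEE implementation.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Reading:
    sensor_id: str           # hypothetical sensor identifier
    value: Optional[float]   # None means the logger reported no value


@dataclass
class Limits:
    low: float
    high: float


def check_readings(readings: Iterable[Reading],
                   limits: Dict[str, Limits]) -> List[str]:
    """Return one alarm message per missing, unconfigured, or out-of-range reading."""
    alarms = []
    for r in readings:
        lim = limits.get(r.sensor_id)
        if lim is None:
            alarms.append(f"{r.sensor_id}: no limits configured")
        elif r.value is None:
            alarms.append(f"{r.sensor_id}: missing value")
        elif not (lim.low <= r.value <= lim.high):
            alarms.append(f"{r.sensor_id}: {r.value} outside [{lim.low}, {lim.high}]")
    return alarms


if __name__ == "__main__":
    # Hypothetical limits and readings, for illustration only.
    limits = {"air_temp_1": Limits(-40.0, 50.0)}
    readings = [Reading("air_temp_1", 21.5), Reading("air_temp_1", 55.0),
                Reading("air_temp_1", None), Reading("soil_temp_9", 4.0)]
    for message in check_readings(readings, limits):
        print(message)
```

With many more sensors, keeping the limits in a per-sensor table means the check scales without new code per instrument; the resulting messages could feed a web display or a notification step.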
Shared Development Effort for Acquisition and Processing of Sensor Data • Next Steps: • Purchasing Campbell Scientific (CS) software with more capabilities. • Meeting with CS Technical Representative for planning guidance. • Making connections with ORNL CS power users. • Learning from SPRUCE and NGEE prototypes. • Starting to look beyond acquisition and processing to analysis.
Additional Data Flow Diagrams • Overview of Task Inputs and Resources • S1 Bog Vegetation Survey Task
Overview of Task Inputs and Resources (diagram; labels: Data Policy, Data Flow, Resources, Task-Specific Inputs) • Resources • SPRUCE Web Site • Project Access to Data and Resources • Project Resources • Common references: • SPRUCE Task Description template • SPRUCE Variable Name template • SPRUCE Project Names template • Site Information template • Data Collection Guides • Project Data Archive • 100% open for Project Team • Permission needed by others • Task-Specific Inputs • Task Information • Task Description • ID Measurements • Field Sampling & Measurement Description • Laboratory Analysis Description • Data Processing • Archive Schedule • QA Level Defined • Task Metadata • Task Data • Tasks shown: EM, R2, R3, R4, R5, R6 • Supplemental Information • Existing/Historical Data • SPRUCE Data Archive (CDIAC) • Project Data Sharing Compiled by Les Hook, 2010/05/10
S1 Bog Vegetation Survey Task >>> Data Management Planning (diagram) • Task-Specific Inputs • Forest Service • Survey Plot Coordinates • Project Master List of Site Information • Task Metadata • Task Description • Field Sampling & Measurement Description • Laboratory Analysis Description • QA Level Defined • Archive Schedule • DCG references shown on the diagram: See DCG – Site Information; See DCG – Task Plan; See DCG – Hardcopy Forms • Data Management Plan sections • Organization • Data Policy • Data Flow • Project Name Information • Identifying Measurement and Sampling Sites • Data and Metadata Reporting • Reporting Sampling and Measurement Dates and Times • Identifying Descriptive Field Variables, Biological Measurements, Chemical and Physical Variables • Reporting Units for Chemical, Physical, and Descriptive Variables • Reporting Values below Detection Limits • Reporting Missing Data • Reporting Uncertainty Estimates • Reporting Conventions for Meteorological Data, and Temperature and Pressure Conditions • Assigning Project-Specific Data Quality Flags • Data Processing • Data Entry, Transfer, and Transformation • Managing Hardcopy Format Project Records • Managing Electronic Format Project Records • Names and Reporting Formats for Data Files • Scripted Programs for Processing and Analysis • Quality Level of Data • Data Documentation and Archiving • Planning to Archive Data for Public Release • Creating Archive Documentation • Providing Metadata to Searchable Indexes and Clearinghouses • Assigning Descriptive Data Set Titles • Data Systems Management • Day-to-Day Operation of Data Management Systems • Data Management System and Software Configuration Control Guidelines • SPRUCE Web Site • Project and Public Access to Data and Resources • Data and Metadata Compilation • SPRUCE Data Archive (CDIAC) • Project Data Sharing Compiled by Les Hook, 2010/04/30, updated 2010/09/20