Cyberinfrastructure to promote Model-Data Integration
Robert Cook, Yaxing Wei, and Suresh S. Vannan
Oak Ridge National Laboratory
Presented at the Model-Data Fusion Workshop, NACP, Albuquerque, NM, February 5, 2013
Data Management Vision
Researchers spend more time doing science and less time doing data management
• Free data users from the productivity losses associated with
  • lack of a central clearinghouse,
  • incompatible formats, units, and parameter names,
  • unwieldy file sizes and large non-aggregated collections
• For those making observations, a data system that enables
  • planning, collecting, documenting, and quality-assuring data, and producing complete and clear metadata, including QA/QC information and data provenance
• For modelers, a data system that enables
  • discovering, accessing, browsing, and comparing data with standard tools (GIS, visualization/analysis systems) without concern for format, location, or volume
  • grid/distributed/cloud computing
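The cost of incompatible units and parameter names can be illustrated with a minimal sketch of a harmonization step. Everything here is hypothetical: the source variable names (`tair_c`, `TA_K`, `precip_mm_day`), the common names, and the function `harmonize` are invented for illustration and do not describe any ORNL DAAC tool.

```python
# Hypothetical alias table: each source-specific variable name maps to
# (common name, conversion function, common unit). All names are invented.
ALIASES = {
    "tair_c": ("air_temperature", lambda v: v + 273.15, "K"),
    "TA_K": ("air_temperature", lambda v: v, "K"),
    # 1 mm of water per day equals 1 kg m-2 per day, so divide by 86400 s.
    "precip_mm_day": ("precipitation", lambda v: v / 86400.0, "kg m-2 s-1"),
}


def harmonize(record):
    """Rename variables and convert units so records from different
    providers can be compared directly."""
    harmonized = {}
    for name, value in record.items():
        if name not in ALIASES:
            # Fail loudly rather than pass through an unknown variable.
            raise KeyError(f"no harmonization rule for variable {name!r}")
        common_name, convert, unit = ALIASES[name]
        harmonized[common_name] = (convert(value), unit)
    return harmonized


site_a = harmonize({"tair_c": 0.0, "precip_mm_day": 86400.0})
site_b = harmonize({"TA_K": 273.15})
```

After harmonization, both sites report `air_temperature` in kelvin, so a modeler can compare them without knowing how each provider named or scaled the variable.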
ORNL DAAC
• Part of NASA’s Earth Observing System Data & Information System
• Terrestrial Ecology and Biogeochemical Dynamics Data
• Tools for Discovery, Access, Extraction, and Visualization
• Data Holdings (>900 products)
  • NASA Field Campaigns
  • Land Validation (remote sensing)
  • Global and Regional Spatial Data
  • Terrestrial Biogeochem. Model Code
[Image label: Fire]
http://daac.ornl.gov
Data Assimilation using Web services
• MODIS Web Service Script
• Daymet Web Service Script
• Programmatically access MODIS and Daymet data without downloading full data files
[Diagram: MODIS Tiles and Daymet Tiles feed Terrestrial Biosphere Models]
Terrestrial Biosphere Models:
• SiB3
• LoTEC
• Can_IBIS
• ORCHIDEE
• LPJwsl
• TECO
Tristan Quaife, University of Exeter
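As an illustration of programmatic access, the sketch below builds a request URL for a single point rather than downloading whole tiles. The slides do not show the actual scripts, so this is an assumption throughout: the endpoint and the parameter names (`lat`, `lon`, `vars`, `years`) follow what we understand of the Daymet single-pixel web service and should be checked against current ORNL DAAC documentation. The helper `daymet_point_url` is hypothetical, and the code only constructs the URL without contacting any server.

```python
from urllib.parse import urlencode

# Assumed endpoint for the Daymet single-pixel service; verify before use.
DAYMET_ENDPOINT = "https://daymet.ornl.gov/single-pixel/api/data"


def daymet_point_url(lat, lon, variables, years):
    """Build a query URL for daily data at one location.

    `variables` is a list of variable codes (e.g. "tmax"); `years` is a
    list of integer years. Parameter names are assumptions.
    """
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude must be within [-90, 90]")
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude must be within [-180, 180]")
    params = {
        "lat": f"{lat:.4f}",
        "lon": f"{lon:.4f}",
        "vars": ",".join(variables),
        "years": ",".join(str(y) for y in years),
    }
    # Keep commas unescaped so the list-valued parameters stay readable.
    return f"{DAYMET_ENDPOINT}?{urlencode(params, safe=',')}"


url = daymet_point_url(35.96, -84.29, ["tmax", "tmin"], [2010, 2011])
```

The resulting URL could then be fetched with any HTTP client; keeping URL construction separate from the network call makes the request easy to inspect and log before it is sent.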
North American Carbon Program Synthesis Framework
• What are the magnitudes and spatial distribution of carbon sources and sinks, and their uncertainties?
• What is the spatial pattern and magnitude of interannual variation in carbon fluxes?
• Are the various observations and modeling estimates of carbon fluxes consistent with each other, and if not, why?
[Figure: Net Ecosystem Exchange 2000-2006. Hayes et al. 2012]
Goal: Provide data management support for modeling and synthesis activities
• Activities:
  • Coordinate data management activities with NACP modelers and synthesis groups;
  • Prepare and distribute model input data;
  • Provide data management support for model outputs;
  • Provide tools for accessing, subsetting, and visualization;
  • Provide data packages to evaluate model output; and
  • Support synthesis activities, including data support for workshops.
http://nacp.ornl.gov/index.shtml
Pilot Study: Integrate Observations with Models using “Access Broker”
Stefano Nativi et al.
[Diagram labels: Model-Data Comparison Framework; Data Assimilation Framework; Request for Data; Customized Observational Data; MODIS Web Service; SCRIP (regrid); Process; Original MODIS Data; Original Observational Data; Data; FTP/HTTP/…]
Users can access observational data and convert it to their specified format, spatial resolution, spatial extent, and temporal extent.
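The kind of customization the broker performs on a request can be sketched in a few lines: cut the grid to the requested spatial extent, then coarsen it to the requested resolution. This is a deliberately simplified stand-in, not the broker's implementation. It uses plain block averaging on a regular grid, whereas SCRIP performs proper remapping between grids; the functions `subset` and `block_average` and the use of `None` for missing cells are assumptions made for illustration.

```python
def subset(grid, rows, cols):
    """Return the cells inside the half-open index windows rows=(r0, r1)
    and cols=(c0, c1), i.e. the requested spatial extent."""
    (r0, r1), (c0, c1) = rows, cols
    return [row[c0:c1] for row in grid[r0:r1]]


def block_average(grid, factor):
    """Coarsen a regular grid by averaging factor x factor blocks.

    Missing cells (None) are ignored; a block with no valid cells
    yields None.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r in range(0, n_rows, factor):
        coarse_row = []
        for c in range(0, n_cols, factor):
            cells = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            valid = [v for v in cells if v is not None]
            coarse_row.append(sum(valid) / len(valid) if valid else None)
        coarse.append(coarse_row)
    return coarse


# A 4 x 4 fine-resolution grid holding the values 1..16.
fine = [[4 * r + c + 1 for c in range(4)] for r in range(4)]

whole_domain = block_average(fine, 2)                      # 2 x 2 result
corner_only = block_average(subset(fine, (0, 2), (2, 4)), 2)  # 1 x 1 result
```

Subsetting before regridding keeps the work proportional to the requested extent, which matters when the original product is far larger than the region a model needs.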
Data and Information Management for Model-Data Integration
• Goal is to ensure that the data, products, information, and tools required to address science questions are available in harmonized forms when needed
• Develop a data management capability that
  • Reflects the needs of the user community,
  • Is created in a reasonable time frame, and
  • Is universally accepted as a value-added capability by those doing the work
Environmental Observations and Modeling
[Diagram labels: Observations & Experiments; Ecosystem Models; Data Center]
Communication among data managers, those making the measurements (experimentalists), and modelers is critical
Characteristics of the Data System (1)
• Dedicated financial support for data management is essential
• Close coordination between the data group(s) and the producers (experimentalists) and users (modelers) of the data products
• Based on a data management plan and a data policy
• Integrated system that delivers a suite of diverse products
• Establish standards (file, workflow, network) and promote interoperability
• Processes to assure and document data quality to allow proper interpretation and use
Characteristics of the Data System (2)
• Facilitate rapid exchange of data, products, and information, including large-volume data
• Promote the use of best practices to prepare and document data for sharing and archiving
• Make efficient use of existing data management infrastructure and resources
• Ensure that finalized data and associated documentation are transferred to an appropriate archive
• Make numerical models (source code) and descriptions of the models available, along with model parameters and example input and output data (Thornton et al. 2005)