National Research Council
Third Meeting, Board on Research Data and Information
NOAA Data Stewardship
Scott Hausman
Acting Director, NOAA National Climatic Data Center (NCDC), Asheville, North Carolina
June 3, 2010

Overview
• Leader in Environmental Data Management
• Partner with National Research Council
• Transforming NOAA Data Management
  • Strengthening Policies and Directives
  • Investing in Enterprise IT Infrastructure
  • Leveraging Universal Standards
  • Expanding Data Discovery and Access
  • Redefining Scientific Data Stewardship
  • Developing a Data Management Workforce
• The way forward and how BRDI can help
NOAA Data Stewardship, Third NRC/BRDI Meeting

Leader in Environmental Data Management
Improving data stewardship is among NOAA’s top priorities!
Mission: To understand and predict changes in Earth’s environment and conserve and manage coastal and marine resources to meet our Nation’s economic, social, and environmental needs
Vision: An informed society that uses a comprehensive understanding of the role of the oceans, coasts, and atmosphere in the global ecosystem to make the best social and economic decisions
Science – Service – Stewardship

Leader in Environmental Data Management
Broadest Scope of any Agency for Environmental Data Stewardship
• ~150 Research & Operational Observing Systems (http://www.nosa.noaa.gov/observing_systems.html)
• ~4-5 Petabytes of data/year (~15 PB total)
• Data Management Challenges are Changing
  • No longer about data volume
  • Data discovery and integration
  • Data stewardship and information
[Figure panels: Atmospheric Observations, Land Surface Observations, Ocean Observations, Space Observations]

Partner with National Research Council
Principles for Effective Environmental Data Management
• Data should be archived and accessible
• Adequate resources for end-to-end management
• Management activities should involve users
• Interagency and international partnerships
• Metadata are essential
• Expert stewards required for management
• Process to decide what data to archive
• Archive must support discovery, access, and integration
• Effective management requires a formal, ongoing planning process
2007 National Research Council Committee on Archiving and Accessing Environmental and Geospatial Data at NOAA
NOAA is institutionalizing these principles!

Transforming NOAA Data Management
• Strengthening Policies and Directives
• Investing in Enterprise IT Infrastructure
• Leveraging Universal Standards
• Expanding Data Discovery and Access
• Redefining Scientific Data Stewardship
• Developing a Data Management Workforce

Strengthening Policies and Directives
Environmental Data Management Committee (EDMC)
Established in Fall of 2009; Helen Wood is the current Chair
• Coordinates the development of NOAA’s environmental data management strategy and policy, and provides guidance to ensure consistent implementation across NOAA, on behalf of the NOSC and CIO Council
• Environmental data management is an end-to-end process that includes acquisition, quality control, validation, reprocessing, storage, retrieval, dissemination, and long-term preservation activities
• The goal of the EDMC is to enable NOAA to maximize the value of its environmental data assets through sound and coordinated data management practices
• Leadership: Chair and Deputy Chair appointed by NOSC and CIO Council
• Membership
  • Line Office Representatives
  • NOAA Chief Enterprise Architect
  • NOAA Data Management Architect
• Ex-officio or Advisory
  • NOAA National Data Center Directors
  • Designated Mission-Goal & Sub-Goal Team Representatives
  • NOAA liaisons to key Federal and International initiatives concerning environmental data management
[Organization chart labels: NOAA; NOSC (NOAA Observing System Council); CIOC (Chief Information Officer Council); EDMC (Environmental Data Management Committee); DMIT (Data Management Integration Team)]

Strengthening Policies and Directives
NOAA Environmental Data Management Framework
Overarching all Aspects of the Data Management Lifecycle
• Governance, Requirements Management, Architecture Management
• Developing and maintaining rich metadata to accompany the data
• Establishing mechanisms that allow for user requirements and feedback
Stewardship Overarches Observing Operations, Archive, Access, Use
All ongoing, iterative processes that improve: 1) data and metadata content (including reprocessing data) and 2) access and user understanding
• Planning of New Observing or Data Management Systems
  • Requirements definition
  • Analysis of alternatives
  • Systems design
  • Integration with observing systems (NOAA, interagency, state, international)
  • Determining what to archive and associated funding
  • Buy/build
• Observing Operations
  • Actual observation
  • Transmission/processing QA
  • Integration with other data to create products (e.g., models)
  • Dissemination to real-time subscribers
  • Delivery to archive
• Archive
  • Ingest (Receipt)
  • Archival storage
  • Data management (populating catalogs, registries, metadata)
  • Preservation planning (migration to new technologies)
• Access
  • Discovery (catalogs, registries, metadata)
  • Dissemination to users (web services, legacy systems, standard formats)
• Use
  • Integration with other information (NOAA, others)
  • Assimilation into models
  • Product creation
  • Make decisions (policy, emergency, others)
  • Scientific discovery
  • Feedback to NOAA

Strengthening Policies and Directives
“What to Archive”
Revision of NOAA Administrative Order (NAO) 212-15
Establishes a NOAA policy for acquiring, integrating, managing, disseminating, and archiving environmental and geospatial data and information obtained from worldwide sources to support NOAA's mission.
• Maintains NOAA’s policy of “full and open access” to environmental data
• Provides a mechanism for EDMC to develop procedural directives for more detailed guidance (e.g., the NOAA Procedure for Scientific Records Appraisal and Archive Approval)
• Presents an end-to-end lifecycle framework for the management of environmental data
• Signed by NOAA CIO; awaiting clearance from Chief Administrative Officer and General Counsel

Investing in Enterprise IT Infrastructure
Comprehensive Large Array-data Stewardship System (CLASS)
• NOAA’s primary enterprise IT system for archive and access
• Employs OAIS-RM
• Enterprise benefits include:
  • Economy of Scale
  • High Quality of Service
  • A System Evolution Approach
• Location of “Nodes”
  • Operational: Asheville, NC (NCDC); Boulder, CO (NGDC)
  • Development and Test: Suitland, MD; Fairmont, WV
• Environmental Data Holdings
  • Current: POES, DMSP, GOES, CFSR (Model Reanalysis)
  • Development: MetOp, EOS MODIS, NPP

Leveraging Universal Standards
Open Archival Information System Reference Model (OAIS-RM)
• The reference model is intended to apply to all digital archives and to their Producers and Consumers
• Identifies a minimum set of responsibilities an archive must meet to claim it is an OAIS
• Establishes common terms and concepts for comparing implementations, but does not specify an implementation
• Provides detailed models of both archival functions and archival information
• Discusses OAIS information migration and interoperability among OAISs

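The archival information model mentioned above can be sketched in a few lines. OAIS-RM names three kinds of information package: the Submission (SIP) handed over by a Producer, the Archival (AIP) held by the archive, and the Dissemination (DIP) delivered to a Consumer. The following is a minimal sketch under stated assumptions: the class names, fields, and the in-memory `ToyArchive` are hypothetical illustrations, not part of OAIS-RM, CLASS, or any NOAA system, and the SHA-256 digest stands in for the fixity information an archive would keep.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SubmissionPackage:
    """SIP: what a Producer hands to the archive (hypothetical fields)."""
    dataset_id: str
    content: bytes
    description: str


@dataclass(frozen=True)
class ArchivalPackage:
    """AIP: the content plus preservation information added at ingest."""
    dataset_id: str
    content: bytes
    description: str
    sha256: str  # fixity information, used later to detect corruption


@dataclass
class ToyArchive:
    """Hypothetical in-memory archive used only for illustration."""
    _store: dict = field(default_factory=dict)

    def ingest(self, sip: SubmissionPackage) -> ArchivalPackage:
        """Turn a SIP into an AIP by recording a checksum, then store it."""
        digest = hashlib.sha256(sip.content).hexdigest()
        aip = ArchivalPackage(sip.dataset_id, sip.content, sip.description, digest)
        self._store[aip.dataset_id] = aip
        return aip

    def disseminate(self, dataset_id: str) -> dict:
        """Build a DIP for a Consumer after re-verifying the stored checksum."""
        aip = self._store[dataset_id]
        if hashlib.sha256(aip.content).hexdigest() != aip.sha256:
            raise ValueError(f"fixity check failed for {dataset_id}")
        return {
            "dataset_id": aip.dataset_id,
            "description": aip.description,
            "content": aip.content,
        }
```

A Producer would call `ToyArchive().ingest(SubmissionPackage(...))` and a Consumer would later call `disseminate(dataset_id)`. Keeping the three package types separate mirrors the reference model's point that what is submitted, what is preserved, and what is delivered need not be identical.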
Leveraging Universal Standards
Global Earth Observation-Integrated Data Environment (GEO-IDE)
Scope – NOAA-wide architecture development to integrate legacy systems and guide development of future NOAA environmental data management systems
Vision – NOAA’s GEO-IDE is envisioned as a “system of systems” – a framework that provides effective and efficient integration of NOAA’s many quasi-independent systems
Foundation – built upon agreed standards, principles and guidelines
Approach – evolution of existing systems into a service-oriented architecture
Result – a single system of systems (user perspective) to access the data sets needed to address significant societal questions
Unified Access Framework for Gridded Data (UAF Grid)
Integrated Ocean Observing System Data Integration Framework (IOOS DIF)

Expanding Data Discovery and Access
Tiers of Access (Customer Sophistication)
• Web Services
• M2M Interfaces
• Online Requests
• Subscription Services
• Portals
• Cloud Resources
• Search Engines
Source Agnostic Interface / Federated Data Sources for Transparent Access
Metadata Catalogs for Data Discovery
• NOAA National Data Centers
• NOAA Centers of Data
• NOAA Other Data Sources
• External Data Sources

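The idea of a source-agnostic interface over federated data sources can be illustrated with a small sketch. Everything here is an assumption made for illustration: `Record`, `DataSource`, `InMemorySource`, and `FederatedCatalog` are hypothetical names, the sources are in-memory lists rather than real NOAA catalogs, and keyword matching stands in for whatever metadata search a real catalog would offer.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Record:
    """A discovery result: which source holds the data set and how it is tagged."""
    source: str
    title: str
    keywords: tuple


class DataSource(Protocol):
    """Anything that can answer a keyword search (hypothetical interface)."""
    name: str

    def search(self, keyword: str) -> list: ...


class InMemorySource:
    """Stand-in for one data center's metadata catalog."""

    def __init__(self, name, records):
        self.name = name
        self._records = records  # list of (title, keywords) pairs

    def search(self, keyword):
        wanted = keyword.lower()
        return [
            Record(self.name, title, tuple(keywords))
            for title, keywords in self._records
            if wanted in (k.lower() for k in keywords)
        ]


class FederatedCatalog:
    """Single entry point that fans a query out to every registered source."""

    def __init__(self, sources):
        self._sources = list(sources)

    def search(self, keyword):
        results = []
        for source in self._sources:
            results.extend(source.search(keyword))
        return results
```

With two hypothetical sources registered, a caller issues one `FederatedCatalog.search("ocean")` call and receives records from both without knowing where each data set lives, which is the transparency the slide describes.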
Redefining Scientific Data Stewardship
New approach for real time data management and production of climate data records
[Diagram labels: Observing System Monitoring; Climate Data Records; Original Observations & Metadata; Scientific Stewardship Teams; Random & Time Dependent Error Checks; Sentinel; Observing System Operators; Climate Quality Data Records & Products; Reprocessing & Reanalyses; Archives; Operational Product Processing; Feedbacks; Metadata; Intercomparison and Analysis]
• Rapid feedback to observing system
• Scientists/analysts involved with observations early on
• Enable and facilitate future research
• Safeguard interests of future generations
• End-to-end accountability of data
• Spatial and temporal sampling
• Time dependent biases
• Metadata
• Reprocessing for CDRs

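The slide does not say how the random and time-dependent error checks are performed. As one possible illustration only, the sketch below flags isolated outliers with a median-absolute-deviation test and estimates a slow drift as the least-squares slope of sensor-minus-reference differences. Both techniques, the function names, and the default threshold are assumptions chosen for the example, not NOAA's actual procedure.

```python
from statistics import mean, median


def flag_random_errors(values, threshold=5.0):
    """Return indices whose distance from the median exceeds threshold * MAD.

    MAD is the median absolute deviation, a spread estimate that a single
    bad reading cannot inflate the way it inflates a standard deviation.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # All typical values are identical, so anything different is suspect.
        return [i for i, v in enumerate(values) if v != center]
    return [i for i, v in enumerate(values) if abs(v - center) > threshold * mad]


def drift_per_step(differences):
    """Least-squares slope of sensor-minus-reference differences per time step.

    A slope near zero suggests no time-dependent bias; a persistent nonzero
    slope suggests the sensor is drifting relative to the reference.
    """
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two differences to estimate a slope")
    t_mean = (n - 1) / 2
    d_mean = mean(differences)
    numerator = sum((t - t_mean) * (d - d_mean) for t, d in enumerate(differences))
    denominator = sum((t - t_mean) ** 2 for t in range(n))
    return numerator / denominator
```

In this sketch a flagged index or a nonzero drift would be the kind of signal fed back to observing system operators, in line with the "rapid feedback" bullet above.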
Developing a Data Management Workforce
Short Course on Data Stewardship
• NOAA/NESDIS Top Priority
• Partnering with Earth Science Information Partners (ESIP) Federation
• One day with afternoon practicum
• Focus on graduate students and junior scientists
• Target Fall AGU Meeting

Conclusion
• The Way Forward
  • Translate NAO 212-15 into NOAA Directives
  • Finalize a NOAA-wide CONOPS for Archive
  • Prototype federated architecture
• How BRDI can help
  • Defining archival standards for research/small data sets
  • Improving interdisciplinary integration of data
  • Increasing transparency and discovery to enhance data reuse and avoid redundant research

NOAA Data Stewardship
Scott Hausman
Acting Director
NOAA’s National Climatic Data Center (NCDC)
151 Patton Avenue, Room 557
Asheville, NC 28807-5002
828-271-4848
828-271-4246
828-450-9188
Scott.Hausman@noaa.gov
www.ncdc.noaa.gov

Background Material

NOAA Data Management Principles
• Commitment and leadership: Information is a strategic asset and information management must be a key component of every environmental data and information program. This ethic must be reflected in a corporate culture, embraced throughout the organization, that recognizes data as a corporate resource.
• Stewardship: People who take observations or produce data and information are stewards of these data, not owners. These data must be collected, produced, documented, transmitted and maintained with the accuracy, timeliness and reliability needed to meet the needs of all users.
• Long-term preservation: Irreplaceable observations, data products of lasting value, and associated metadata must be preserved. This information must be well-documented and maintained so that it is available to and independently understandable by users, now and in the future.
• Requirements-driven: It is essential that providers and users of data and products play an active role in defining the constantly evolving requirements that drive the development and evolution of data management systems.
• Discovery and access: Freedom of access, mechanisms that facilitate discovery, timely delivery, use and interpretation of data and products (directories, browse capabilities, metadata, mapping, visualization, etc.) are essential, recognizing relevant policies and regulations.
• Standards and practices: Appropriate use of information technologies, widely shared standards, and integration approaches are vital to facilitate collection, management, discovery, dissemination, and access services for environmental data and products. This will ensure interoperability among providers, systems, and users. Effective application of standards and best practices contributes to the development of systems that are interoperable, efficient, reliable, scalable, and adaptable.
• Quality: Data, products and information should be of quality sufficient to meet the requirements of society and to support sound decision making.
• Cooperation and coordination: Environmental and scientific data management is a task of global scope – a whole that should be much bigger than the sum of its parts. It is only by participating in a global community of integrated data management that each organization can realize the potential of its data to the betterment of humankind.
• Security: Data, information, and products must be preserved and protected from unintended or malicious modification, unauthorized use, or inadvertent disclosure.