Regional to Local IOOS Data Management and Interoperability: Perspective from the Trenches ( NANOOS, US Pacific NW). Emilio Mayorga Applied Physics Lab, University of Washington mayorga@apl.washington.edu. David Jones, APL-UW Rick Blair, Boeing. OceanSciences, Portland, 22 Feb. 2010.
Northwest Association of Networked Ocean Observing Systems (NANOOS) – an IOOS RA. This talk: Focus on in-situ data/monitoring and moorings; cruises and other sparse monitoring
Outline • Expanding data-management capabilities within NANOOS (sub-regional) • Options available (schemas, systems, best practice) • Xenia adaptation to NANOOS-APL (UW & WA) needs • Roles, Lessons, Issues • 2010 Plans
Data Management “Data management consists of the system (or network of systems) for assembly, storage, registration, dissemination, and permanent archiving of data collections, and of the enumeration and enforcement of standards and specifications regarding data quality and data handling. National efforts to standardize and integrate data management practices will aid in data dissemination and will ultimately advance research, decision-making, and public awareness of Earth observations. For [this] purposes, data management begins after observations ... have been performed and the results transmitted to their initial storage facility.” de La Beaujardiere et al., 2009, Ocean and Coastal Data Management, OceanObs2009
NANOOS Visualization System (NVS): Need NANOOS’ IOOS RA mandate to provide regional integration, applications Near-real-time coastal/ocean monitoring platforms in Pacific NW http://www.nanoos.org/nvs/
NVS 1.0 http://www.nanoos.org/nvs/
Not settling for Mickey Mouse effort … but we’re new and small! • Spunk and ambition • Collaborative NANOOS-APL Regional Data Management Effort • Focus: local-regional WA datasets with unclear stewardship, low-resource providers, low incentive to interoperate. And APL-UW’s own. • New and small effort • Complement existing NANOOS-CMOP data management effort (Oregon) • Close interaction with and support of local partners is a critical component, and critical to national IOOS success
Helping each other succeed at the local-regional scale, with regional-national-international partnerships NANOOS-APL Regional Data Management Effort • Close interaction with and support of local partners is critical to national IOOS success! • And internationally (e.g., Canada, GEOSS)?
Options for Regional Coastal Data Management? Technical options for a new shop, to serve regional-local needs? Data schema, procedures, QA/QC protocols, data loading? Best practices? • OceanSITES [NetCDF CF] • UNIDATA CDM [~ NetCDF CF] • BCO-DMO [RDBMS] • CUAHSI HIS ODM [RDBMS] • Study and borrow from the big guys (NDBC, etc.) • Make it up from scratch Important gap in IOOS DMAC support, focus (IMHO)
Xenia in-situ Data Schema & System • Jeremy Cothran, SECOORA/Carolinas RCOOS • Fully open, Jeremy and community very supportive • Schema “about right” for an upstart: not too complex, not too simple • Widely documented (if dispersed) • Parts of schema have not been used or documented extensively • Includes components to support visualization applications • Most widely used schema (if ad-hoc) within IOOS RA’s: • RA’s: SECOORA, GLOS (AOOS & GCOOS?) • Other: CarolinasRCOOS, Intelligent River, NWS Marine Weather Portal • Implemented in PostgreSQL and SQLite. Data loaders, etc. • Submitted as IOOS DMAC Standard • Virtual machine implementation (AMI/VMware; see Jeremy, web site) • http://groups.google.com/group/xeniavm
XAN: Xenia APL-NANOOS • Goal: Support robust, long-term management and distribution for data from local-regional providers • Unlike Jeremy’s Xenia usage, NO support for: • Integration of existing robust data streams • Specific user-application needs Addressed by NVS-specific DB • Minimal QA/QC and Methods metadata, initially • PostgreSQL. Python data loaders • Modified schema, conventions (adapted to our needs, datasets) • Changed schema only when necessary (back to Xenia community?) • Extensive depth-profile and cruise data • Extensive but narrower use of “collections” (observation collections) • Conventions: collections, platforms, sensor vs. O&M vs. observation • PostGIS geospatial add-ons • Observation Series Catalog
XAN – Core Schema – Components • Platforms & Sensors • Observations & Measurements Data Dictionary • THE DATA (observations) • Organization & Project Metadata • Observation Collections: depth profiles, stations, cruises (hierarchical)
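As a rough illustration of how platforms, sensors, hierarchical collections, and observations might relate, the Python sketch below builds a toy relational layout in SQLite and loads one depth-profile cast under a cruise. Every table name, column name, and sample value here is hypothetical and chosen only for illustration; this is not the actual Xenia or XAN schema, and the real loaders target PostgreSQL.

```python
import sqlite3

# Hypothetical, simplified tables for illustration; NOT the actual Xenia/XAN schema.
SCHEMA = """
CREATE TABLE platform (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE sensor (
    id INTEGER PRIMARY KEY,
    platform_id INTEGER NOT NULL REFERENCES platform(id),
    obs_type TEXT NOT NULL,
    units TEXT NOT NULL,
    UNIQUE (platform_id, obs_type)
);
CREATE TABLE collection (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES collection(id),  -- e.g. a cast belongs to a cruise
    kind TEXT NOT NULL,
    label TEXT NOT NULL,
    UNIQUE (kind, label)
);
CREATE TABLE observation (
    id INTEGER PRIMARY KEY,
    sensor_id INTEGER NOT NULL REFERENCES sensor(id),
    collection_id INTEGER REFERENCES collection(id),
    obs_time TEXT NOT NULL,
    depth_m REAL,
    value REAL
);
"""


def get_or_create(cur, select_sql, select_args, insert_sql, insert_args):
    """Return the id of an existing row, inserting it first if it is missing."""
    row = cur.execute(select_sql, select_args).fetchone()
    if row is not None:
        return row[0]
    cur.execute(insert_sql, insert_args)
    return cur.lastrowid


def load_cast(conn, platform, obs_type, units, cruise, cast, obs_time, samples):
    """Load one depth-profile cast; samples is a list of (depth_m, value) pairs."""
    cur = conn.cursor()
    platform_id = get_or_create(
        cur, "SELECT id FROM platform WHERE name = ?", (platform,),
        "INSERT INTO platform (name) VALUES (?)", (platform,))
    sensor_id = get_or_create(
        cur, "SELECT id FROM sensor WHERE platform_id = ? AND obs_type = ?",
        (platform_id, obs_type),
        "INSERT INTO sensor (platform_id, obs_type, units) VALUES (?, ?, ?)",
        (platform_id, obs_type, units))
    cruise_id = get_or_create(
        cur, "SELECT id FROM collection WHERE kind = 'cruise' AND label = ?", (cruise,),
        "INSERT INTO collection (parent_id, kind, label) VALUES (NULL, 'cruise', ?)",
        (cruise,))
    cast_id = get_or_create(
        cur, "SELECT id FROM collection WHERE kind = 'cast' AND label = ?", (cast,),
        "INSERT INTO collection (parent_id, kind, label) VALUES (?, 'cast', ?)",
        (cruise_id, cast))
    cur.executemany(
        "INSERT INTO observation (sensor_id, collection_id, obs_time, depth_m, value) "
        "VALUES (?, ?, ?, ?, ?)",
        [(sensor_id, cast_id, obs_time, depth, value) for depth, value in samples])
    conn.commit()
    return cast_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
cast_id = load_cast(
    conn, "hypothetical_vessel", "water_temperature", "celsius",
    "cruise-A", "cast-1", "2010-02-22T00:00:00Z",
    [(1.0, 9.8), (5.0, 9.1), (10.0, 8.7)])
```

Keeping collections in a single self-referencing table is one simple way to express the cruise-to-cast hierarchy; other layouts would work equally well.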
XAN – Auto-generated Geospatial and Catalog • PostGIS-enabled tables: Stations (point), Casts (point), Cruises (line), Manually Defined Transects (matched to stations) • Observation Series Catalog (Adapted from CUAHSI HIS ODM): Summary by unique Platform + Sensor + Location; PostGIS-enabled (point); Convenient for GIS access, querying and live summaries
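The idea of a catalog that summarizes observations by unique platform + sensor + location can be sketched with a single GROUP BY query. The example below uses SQLite and a flat, hypothetical `obs` table with invented sample rows; the table and column names are assumptions for illustration, not the XAN or CUAHSI ODM definitions, and the PostGIS point geometry is stood in for by plain lon/lat columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical flat observation table; real schemas normalize platform and sensor.
conn.execute(
    "CREATE TABLE obs (platform TEXT, sensor TEXT, lon REAL, lat REAL, "
    "obs_time TEXT, value REAL)")
conn.executemany(
    "INSERT INTO obs VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("buoy_a", "water_temperature", -122.5, 47.5, "2010-01-01T00:00:00Z", 9.5),
        ("buoy_a", "water_temperature", -122.5, 47.5, "2010-01-02T00:00:00Z", 9.4),
        ("buoy_a", "salinity", -122.5, 47.5, "2010-01-01T00:00:00Z", 29.1),
    ])


def build_series_catalog(conn):
    """Rebuild a summary table with one row per platform + sensor + location."""
    conn.execute("DROP TABLE IF EXISTS series_catalog")
    # ISO 8601 UTC timestamps sort correctly as text, so MIN/MAX give the time span.
    conn.execute(
        "CREATE TABLE series_catalog AS "
        "SELECT platform, sensor, lon, lat, "
        "       MIN(obs_time) AS begin_time, "
        "       MAX(obs_time) AS end_time, "
        "       COUNT(*) AS value_count "
        "FROM obs GROUP BY platform, sensor, lon, lat")
    return conn.execute(
        "SELECT * FROM series_catalog ORDER BY platform, sensor").fetchall()


catalog = build_series_catalog(conn)
for row in catalog:
    print(row)
```

Regenerating the table after each data load keeps the summary current without touching the observation rows themselves.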
XAN Status: Data Holdings • Washington State, 1998 - present • Platforms: 8 • Collections: 9,680 (soon: 30,000) • Observations: 4,150,000 (soon: > 27,000,000)
XAN Access From GIS [map labels: Hood Canal, Olympic Peninsula, Seattle]
Current Data Access Mechanisms • NVS • PRISM & HCDOP cruises: Pre-generated data files • Near-real-time platforms: 30-day harvested data XAN > NVS DB • SOS • OOSTethys Perl Server code (from Eric Bridger/GoMOOS) • Will be operational soon • ERDDAP • In test phase • Raw SQL queries • Limited access, not for wide use or the faint of heart • NVS cruise data access (PRISM)
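For the SOS access path, a client request might look roughly like the sketch below, which builds a GetObservation URL using key-value parameters commonly seen with SOS 1.0.0 servers. The endpoint, offering name, and observed property are placeholders, and the exact parameter set accepted by any particular server (including the one described here) is an assumption that should be checked against that server's GetCapabilities response.

```python
from urllib.parse import urlencode


def get_observation_url(endpoint, offering, observed_property, start=None, end=None):
    """Build a GetObservation request URL from key-value parameters.

    Parameter names follow common SOS 1.0.0 usage; treat them as assumptions
    and confirm against the target server before relying on them.
    """
    params = {
        "service": "SOS",
        "version": "1.0.0",
        "request": "GetObservation",
        "offering": offering,
        "observedProperty": observed_property,
    }
    if start and end:
        # Time range expressed as an ISO 8601 start/end pair.
        params["eventTime"] = f"{start}/{end}"
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint and identifiers, for illustration only.
url = get_observation_url(
    "http://example.org/sos",
    "hypothetical_buoy",
    "sea_water_temperature",
    "2010-02-01T00:00:00Z",
    "2010-02-22T00:00:00Z",
)
print(url)
```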
Contrasting Case: CUAHSI HIS Cyberinfrastructure Horsburgh et al. / Environmental Modelling & Software 24 (2009) “Cathedral” model? (Build all components, entire system). OOI??
Lessons, Issues • Xenia served our needs, but: • Needs case-study documentation; more precisely defined schema, usage • Useful to separate core data-management and user-application roles • Foster its development as a best-practice schema, system? • Support for local data management efforts is important. Sustainability? • Sharing solutions & best practices among RA’s, local entities • Flexibility and evolving roles with local partners • Sharing national network knowledge & implementation • What role for regional data assembly and management efforts like RA’s, vs. national-global data ecosystem? • Avoiding duplication and redundant burdens • Accepted long-term archives, re-using data interoperability networks? • e.g. EPA funding requirement to submit data to STORET. What if already provided nationally via IOOS?
Variable Responsibilities per Dataset • Defining our role per dataset, per local provider • From full management responsibilities to just ingested copy for wider distribution & interoperability • Raw vs. processed data • Different roles, management schemes • From source file archive to restructured, readily accessible processed data (RDBMS, NetCDF, etc)
2010 Plans, Part 1 • More datasets • More extensive metadata • QA/QC: Learning from QARTOD/Q2O, BCO-DMO, ACWI, etc • Wider data access mechanisms • OPeNDAP/THREDDS • ERDDAP • GeoServer: OGC WMS & WFS, KML, GeoRSS, Google GeoSearch • Local (UW) GIS desktop applications
2010 Plans, Part 2 • Community of savvy local data users • Work more closely with local partners (county, private industry, academic) • Foster community • Link local datasets to national networks (IOOS, …) • Give back to Xenia community, IOOS RA network
Acknowledgments • NANOOS DMAC partners: CMOP, OSU, Boeing • Data providers: PRISM, HCDOP, ORCA, etc. • Jeremy Cothran (SECOORA) – Xenia • Eric Bridger (GoMOOS/NERACOOS) – Perl SOS server • IOOS • OOSTethys project