The Unified Access Framework (UAF)
Philosophy, progress, and plans
Steve Hankin (PMEL), Kevin O’Brien (PMEL/JISAO), and the NOAA UAF team
DAARWG Meeting, Seattle, Nov. 2011
UAF team contacts: Kenneth.Casey@noaa.gov, Julie.Bosch@noaa.gov, Tina.Chang@noaa.gov, Scott.Cross@noaa.gov, Roy.Mendelssohn@noaa.gov, Steven.C.Hankin@noaa.gov, Jordan.Alpert@noaa.gov, Jim.Sargent@noaa.gov, Ted.Habermann@noaa.gov, John.Relph@noaa.gov, Bob.Simons@noaa.gov, David.Neufeld@noaa.gov, Upendra.Dadi@noaa.gov, Rich Signell (rsignell@usgs.gov), Phil.Cogbill@noaa.gov, Glenn.Rutledge@noaa.gov, Mike.Grogan@noaa.gov, Jeff.Budai@noaa.gov
Review… continuing into new material NOAA/UAF
GEO-IDE: a strategy for enterprise-wide integration of NOAA data (i.e. adopting standards and practices to achieve data interoperability). 2005-06, 66 pgs (by “DMIT”). Why is this a hard problem?
NOAA-world
• weather forecast (time critical)
• fisheries management (regulatory concerns)
• nautical charting
• climate, ocean, atmosphere research
• response and restoration
• … the list goes on …
Different disciplines have different concepts of ‘data’. Each develops solutions that make sense to them. Getting people (and organizations) to change habits is difficult! (And data management has often been an afterthought.)
The accepted approach: build a “system of systems”. Wrap existing systems with loosely coupled, standardized services: a Service Oriented Architecture. The GEO-IDE Con-ops outlines such a plan. How to build it with a largely volunteer team?
Tried and true approach …
• Generate use cases
• Define requirements
• Write a Concept of Operations
• … and an Implementation Plan
• Assemble (volunteer) teams to implement
NOT!
An alternative (‘agile’) approach: Don't Solve Problems -- Copy Success
Why ‘agile’ is attractive
Because inevitably:
• funding is much smaller than needed;
• collaborations are more difficult than anticipated;
• infrastructure is being built on a background of rapidly evolving technology.
Why ‘agile’ is attractive
• infrastructure is being built on a background of rapidly evolving technology
Change equals risk. Mitigate risk by following a strategy of incremental change that serves users (increasingly) well at every stage of evolution.
“Visualize a decade away. Build things that work today.”
Agile Principles (condensed from the ‘Agile Manifesto’)
• Working software is the meaningful measure of progress: ensure early, continuous, frequent releases
• Emphasize simplicity: maximize the work not done
• Build projects around motivated individuals. Give them the support they need and entrust them
• The best architectures, requirements, and designs emerge from self-organizing teams
• Welcome changing requirements
• Scientists (‘business people’) and developers must work together on a daily basis
Lemma: “Don’t let perfect be the enemy of good”
What “success” did UAF choose to copy?
• Service stack: netCDF-CF-DAP-THREDDS (WMS)
• Applications: Matlab, ArcGIS, Ferret, GrADS, IDV, Google Earth, LAS, ERDDAP, …
• Data formats: netCDF, GRIB, HDF, …
• Projects: (too many to name)
• Users: (too many to name)
Year 1 focused on gridded datasets.
Who is providing data this way?
• Modelers
  • AR4&5, GFDL, NCAR, …
• Satellite programs
  • GHRSST, PathFinder, CoastWatch, …
• NCEP weather and ocean forecasts
  • GRIB files served via NOMADS
• Coastal (“HF”) radar
• A growing list of observations programs
  • Argo, OceanSites, tide gauges, …
• Adoption by OGC is well underway
How to reach users (without downloading files)? Through their preferred tools.
Desktop access in Matlab
[Figure: Matlab comparison of Model 1 (UMASS-ECOM), Model 2 (UMAINE-POM), and Data (SST, 2008-Sep-08 07:32)]
Access in ArcGIS using the Environmental Data Connector (EDC)
Desktop access in Ferret
Desktop access in GrADS
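The desktop tools above read remote data through the DAP protocol rather than downloading whole files. As a rough illustration of what such a request looks like underneath, the sketch below builds a DAP2 constraint URL for an index-based subset. The server address, dataset path, and variable name are hypothetical, and the function is an illustration, not part of any UAF tool.

```python
def dap_subset_url(dataset_url, variable, index_ranges, response="ascii"):
    """Build a DAP2 URL requesting a hyperslab of one variable.

    index_ranges holds one (start, stride, stop) tuple per dimension;
    indices are zero-based and stop is inclusive, as in DAP2.
    """
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in index_ranges
    )
    return f"{dataset_url}.{response}?{variable}{hyperslab}"


# Hypothetical endpoint and variable name, for illustration only.
url = dap_subset_url(
    "http://example.gov/thredds/dodsC/model/sst.nc",
    "sst",
    [(0, 1, 0), (10, 1, 20), (30, 1, 40)],
)
print(url)
```

Client libraries normally construct these requests on the user's behalf; the point is only that a subset, not the file, travels over the network.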
UAF home page
Instructions for end users: how to access data through their favorite applications.
UAF is experimenting to see how documentation may be shared by other projects …
Under the hood
Projects with data to provide
• make their data available as netCDF-CF (or other TDS-compatible format)
• host a THREDDS / OPeNDAP server
Let's look at the tools we have to link projects together …
UAF “network topology”: a tree defined in THREDDS (XML)
[Figure: catalog tree rooted at GEO-IDE/UAF. Nodes, in the order they appear in the diagram: NOAA, NOAA Affiliated; OAR, NMFS, NWS, NESDIS, IOOS National Partners, IOOS Regional Partners; ESRL, OCO, PFEG, GFDL, PMEL, NDBC, AOML, NGDC, NODC, NAVO, AOOS, NOMADS, GCOOS, SCCOOS, Coastwatch, PACIOOS, SECOORA, NERACOOS, GLOS, NANOOS, CENCOOS, CARICOOS, MACOORA]
Not so fast, kiddo!
• compliance with CF conventions is inconsistent
• files commonly are not aggregated into logical datasets
• metadata are often in need of enhancement
Perspective: this is not an unusual situation.
Standards compliance problems are *the norm*.
Divergent dialects often pile up (e.g. GRIB, BUFR).
UAF tools offer a solution …
‘NcML’ can be used to repair problems (*)
e.g. Improve CF compliance by adding a “standard_name” attribute to GRIB data:
<variable name="vorticity">
  <attribute name="standard_name" value="atmosphere_absolute_vorticity" />
</variable>
The file itself is untouched. The ‘virtual file’ seen through the services conforms to standards.
(*) IOServiceProvider modules are also important … not discussed here
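When many variables need the same kind of repair, the NcML wrapper can be generated by script. The sketch below is one possible way to do that with Python's standard library; the helper name and the file name `sample.grb` are hypothetical, and the output is a minimal wrapper, not a complete production NcML file.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def ncml_add_attribute(location, variable, attr_name, attr_value):
    """Return NcML text that overlays one attribute on a variable of an existing file."""
    ET.register_namespace("", NCML_NS)  # serialize NcML as the default namespace
    root = ET.Element(f"{{{NCML_NS}}}netcdf", {"location": location})
    var = ET.SubElement(root, f"{{{NCML_NS}}}variable", {"name": variable})
    ET.SubElement(
        var, f"{{{NCML_NS}}}attribute", {"name": attr_name, "value": attr_value}
    )
    return ET.tostring(root, encoding="unicode")


# "sample.grb" is a made-up file name; the original file is never modified.
doc = ncml_add_attribute(
    "sample.grb", "vorticity", "standard_name", "atmosphere_absolute_vorticity"
)
print(doc)
```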
NcML for aggregation
e.g. Aggregate three 1-year files of the same (say) model run:
<aggregation type="joinExisting" dimName="TimeAxis">
  <netcdf location="year1.nc" ncoords="365"/>
  <netcdf location="year2.nc" ncoords="365"/>
  <netcdf location="year3.nc" ncoords="365"/>
</aggregation>
A long time series ‘virtual file’ is seen through the services.
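For a long run with many yearly files, the aggregation element can also be produced from a file list. The sketch below assumes the per-file coordinate counts are already known; the helper name is hypothetical, and the file names simply reuse those in the example above.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def ncml_join_existing(dim_name, members):
    """Return NcML text joining files along an existing dimension.

    members is a list of (location, ncoords) pairs in axis order.
    """
    ET.register_namespace("", NCML_NS)  # serialize NcML as the default namespace
    root = ET.Element(f"{{{NCML_NS}}}netcdf")
    agg = ET.SubElement(
        root,
        f"{{{NCML_NS}}}aggregation",
        {"type": "joinExisting", "dimName": dim_name},
    )
    for location, ncoords in members:
        ET.SubElement(
            agg,
            f"{{{NCML_NS}}}netcdf",
            {"location": location, "ncoords": str(ncoords)},
        )
    return ET.tostring(root, encoding="unicode")


members = [("year1.nc", 365), ("year2.nc", 365), ("year3.nc", 365)]
ncml = ncml_join_existing("TimeAxis", members)
# Length of the time axis in the resulting 'virtual file'.
virtual_length = sum(ncoords for _, ncoords in members)
print(ncml)
print(virtual_length)
```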
UAF Team members are helping data managers, person-to-person, to improve the data services from their projects (a gradual but important process). In parallel …
Developing the UAF Catalog Cleaner (a ‘web crawler’)
[Figure: the UAF ‘RAW’ catalog tree is transformed into the UAF ‘CLEAN’ catalog tree. Both trees show the same nodes: NOAA, NOAA Affiliated, OAR, NMFS, NWS, NESDIS, IOOS National Partners, IOOS Regional Partners, ESRL, OCO, PFEG, GFDL, PMEL, NDBC, AOML, NGDC, NODC, NAVO, AOOS, NOMADS, GCOOS, SCCOOS, PACIOOS, Coastwatch, SECOORA, NERACOOS, GLOS, NANOOS, CENCOOS, CARICOOS, MACOORA]
The Catalog Cleaner
• Crawl the raw catalog
• Extract metadata from the files themselves, and from THREDDS, into a relational database
• Process the database to detect aggregations, etc.
• Create new THREDDS XML that is aggregated and metadata-cleaned
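The first three steps above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the actual Catalog Cleaner: the catalog fragment is made up, only `name` and `urlPath` are recorded, remote `catalogRef` links and file-level metadata are not followed, and "several files in one directory" stands in for whatever aggregation heuristics the real tool applies.

```python
import sqlite3
import xml.etree.ElementTree as ET
from collections import defaultdict

THREDDS_NS = "{http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0}"

# Made-up catalog fragment standing in for a fetched THREDDS catalog.
SAMPLE_CATALOG = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" name="raw">
  <dataset name="model_run">
    <dataset name="year1.nc" urlPath="model/run/year1.nc"/>
    <dataset name="year2.nc" urlPath="model/run/year2.nc"/>
    <dataset name="year3.nc" urlPath="model/run/year3.nc"/>
  </dataset>
  <dataset name="climatology.nc" urlPath="obs/climatology.nc"/>
</catalog>"""


def crawl(element, rows):
    """Recursively collect (name, urlPath) for every dataset that has a urlPath."""
    for child in element.findall(THREDDS_NS + "dataset"):
        url_path = child.get("urlPath")
        if url_path is not None:
            rows.append((child.get("name"), url_path))
        crawl(child, rows)
    return rows


def aggregation_candidates(connection):
    """Group files by parent directory; directories with several files are candidates."""
    groups = defaultdict(list)
    query = "SELECT url_path FROM dataset ORDER BY url_path"
    for (url_path,) in connection.execute(query):
        directory = url_path.rsplit("/", 1)[0]
        groups[directory].append(url_path)
    return {d: files for d, files in groups.items() if len(files) > 1}


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE dataset (name TEXT, url_path TEXT)")
rows = crawl(ET.fromstring(SAMPLE_CATALOG), [])
connection.executemany("INSERT INTO dataset VALUES (?, ?)", rows)
candidates = aggregation_candidates(connection)
print(candidates)
```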
‘Raw catalog’ snippet: ‘09 Carbon Tracker files from ESRL
• Services: usually OPeNDAP, often WMS, sometimes more
• Optional documentation (beyond what is inside the file)
• Optional viewers
‘Clean catalog’: same data, but augmented with
• uniform services
• uniform viewers
• improved metadata
Uniform services
• Simple interface to get a subset
• Metadata quality assessment
• ISO-standard metadata
Godiva2 uses the WMS map service (as do other GIS apps)
Segue to desktop tools
• Differencing
• Property-property plots
• Sections and Hovmöllers
• Google Earth
• Vector plots
• Animations
• Line plots
• Analyses
and ERDDAP provides …
REST URL access to data subsets in several formats (.kml, .mat, .nc), accessible through home-grown scripting of many types
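As an illustration of such home-grown scripting, the sketch below assembles an ERDDAP griddap request URL, where the file extension selects the response format and each bracketed pair selects a coordinate range. The server address, dataset ID, variable name, and dimension order are hypothetical; a real dataset's dimensions must be looked up on its ERDDAP page, and some HTTP clients may need the brackets percent-encoded.

```python
def griddap_url(server, dataset_id, file_type, variable, dimension_ranges):
    """Build an ERDDAP griddap URL for one variable.

    dimension_ranges holds one (start, stop) pair of coordinate values per
    dimension; the parentheses in the URL mark them as values, not indices.
    """
    constraints = "".join(
        f"[({start}):({stop})]" for start, stop in dimension_ranges
    )
    return f"{server}/griddap/{dataset_id}.{file_type}?{variable}{constraints}"


# Hypothetical server, dataset ID, and variable, for illustration only.
url = griddap_url(
    "http://example.gov/erddap",
    "hypotheticalSST",
    "mat",
    "sst",
    [
        ("2008-09-08T00:00:00Z", "2008-09-08T00:00:00Z"),  # time
        (35, 45),  # latitude
        (-75, -65),  # longitude
    ],
)
print(url)
```

Changing `"mat"` to `"nc"` or `"kml"` requests the same subset in a different format.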
Data Discovery
Crawl the clean catalog. Create ISO-standard metadata.
Agile principle: “Maximize the work not done”
Who has already built reasonably mature discovery portals (preferably free)?
• Unidata’s RAMADDA
• GI-CAT (able to crawl THREDDS catalog)
• Geoportal (able to harvest ISO metadata)
Data Discovery using ESRI Geoportal
NODC is leading the UAF Geoportal investigations.
Screenshot walk-through:
• Enter search terms
• Search results
• Refined search, new results
• Expand selection
• Available services
• View metadata details about the dataset
• Available services
• Direct link into the THREDDS Data Server
• Available services