OPeNDAP Present and Future: An Overview Encompassing Current Projects & Potential New Directions. Dave Fulker and James Gallagher
Rough Outline • Background • OPULS (an OPeNDAP-Unidata collaboration) • DAP4 (to supersede DAP2) • Experimental extensions (Async access, UGRID subsets) • Hyrax over Amazon/S3 • Elaboration on server functions • Perhaps binning, masking, a functional language? • Relationship to WPS & other Web services • Hyrax (& WCS) in OWS-9 OPeNDAP, Inc.
Origins • Scientists (ocean fluxes & temps) envisaged use of HTTP for remote data access (1993) • Collaboration with the designer of the JGOFS data system… • Led to Distributed Ocean Data System (DODS) • DODS was later renamed OPeNDAP (to be explained momentarily…)
OPeNDAP Now Is: • An acronym • “Open-source Project for a Network Data Access Protocol” • Often a synonym for “DAP” • A not-for-profit corp. developing/supporting • “DAPx” - a web-services protocol for data access • Deployed by hundreds of data providers internationally • Employed in many analysis packages (e.g., MATLAB) • Designated a “Community Standard” by NASA • Server & client implementations* of DAP *Note: there are other implementations
Available Software • Free end-user applications that include DAP support: Panoply, IDV, NCO, … • Commercial: IDL, MATLAB, ArcGIS • SDKs: The netCDF C and Java libraries; OC; libdap; Java OPeNDAP, PyDAP • Each of these provides its own API and they span C, C++, Java and Python • Data servers: PyDAP, Hyrax, TDS, …
Concept: Clients Get Just the Data They Need, as They Need It • Accessing data via URLs (i.e., URL = dataset) • Appending query strings to subset or run server functions • Getting responses of two (general) types: • Metadata - dataset descriptions & catalogs (textual) • Content - values and metadata (binary or textual) • Using responses in diverse ways, e.g. • MATLAB maps responses to its internal math types • netCDF library allows apps to work as though reading a local file
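The "URL = dataset" idea can be sketched with a small URL builder. This is a minimal illustration, assuming the DAP2 conventions of a response suffix (`.dds`, `.das`, `.dods`, `.ascii`) and an inclusive `[start:stride:stop]` hyperslab in the query string; the helper name `dap2_url`, the host `example.org`, and the variable `sst` are hypothetical and do not come from the slides.

```python
from urllib.parse import quote

# DAP2 response suffixes: two textual metadata forms, two content forms.
SUFFIXES = {"dds": ".dds", "das": ".das", "binary": ".dods", "ascii": ".ascii"}


def dap2_url(dataset_url, response, variable=None, slices=()):
    """Build a DAP2 request URL (hypothetical helper).

    Each slice is a (start, stride, stop) tuple; stop is inclusive.
    """
    url = dataset_url + SUFFIXES[response]
    if variable is None:
        return url  # whole-dataset request, no subsetting
    hyperslab = "".join(f"[{a}:{s}:{b}]" for a, s, b in slices)
    # Keep the bracket and colon characters readable in the query string.
    return url + "?" + quote(variable + hyperslab, safe="[]:,.")


if __name__ == "__main__":
    base = "http://example.org/opendap/sst.nc"  # assumed dataset URL
    print(dap2_url(base, "dds"))
    print(dap2_url(base, "binary", "sst", [(0, 1, 0), (10, 1, 20), (30, 2, 50)]))
```

A client would typically fetch the textual metadata first to learn variable names and shapes, then request only the hyperslab it needs.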
NOAA grant for OPeNDAP-Unidata Linked Servers (OPULS) • Goal 1: conformance & linkage between OPeNDAP & Unidata DAP-servers, with short-term outcomes: • New data-model & protocol specs: DAP4 • Consistent behaviors of OPeNDAP & Unidata servers • Data-type richness (NetCDF-4, HDF-5, RDBs) • Extensions (i.e., new server behaviors): • Irregular-mesh subsetting • Asynchronous access • Goal 2: common framework for OPeNDAP & Unidata servers, aiming for an architecture that • Underpins the unique strengths of both • Reduces likelihood of redundant effort
OPULS Progress So Far • Draft of DAP4 data model & protocol specs • Sufficient for the full richness of NetCDF-4 and HDF-5 files (including “Groups,” e.g.) • Progress on rigorous conformance-testing • Successful extensibility experiments • Irregular-mesh (i.e., UGRID) subsetting • Asynchronous access (as may be useful for near-line data storage) • Amazon cloud deployment (more later…)
Other technologies OPULS considered • JSON responses as an alternative to XML • Decided they added too much bulk to the specification and too many requirements for implementers • Could be added in a future version • Can be built using XSLT from DAP4 XML • OpenSearch • Not incorporated into DAP4 for many of the same reasons • The DAP4 metadata response specifically includes support for these
OPULS and Feedback • OPULS is ready for community feedback • Design documents are online • Web site: http://docs.opendap.org/ • The current draft specification is there as well • Many features are already available in C++ and C implementations
Hyrax over Amazon/S3 • Exploits a natural fit between DAP-based services and cloud services • Initial progress already achieved under the OPULS grant • Bears interesting similarities to the challenge of asynchronous data access • May yield a new community of OPeNDAP users
More about clouds… • Hyrax is trivial to run on the Amazon cloud • We are looking at ways to work with data held in S3 • S3 characteristics: • Flat • Modest response times • Simple GET/PUT-type API
Using S3 • Tried S3 file systems – found them wanting • Not interoperable (hardly surprising, but limiting) • Add an extra layer to the software stack • Now working with XML ‘catalogs’ • XML documents create a faux hierarchy • XML + XSLT → HTML (i.e., a ‘free’ web interface) • XML + Hyrax + caching → DAP access • The XML is very similar to THREDDS catalogs
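The "faux hierarchy" idea can be sketched as follows: a flat list of slash-delimited object keys is nested into an XML tree that a stylesheet or server could then walk. This is only an illustration under stated assumptions; the element names `catalog` and `dataset`, the `key` attribute, and the sample keys are hypothetical and are not the actual Hyrax or THREDDS catalog schema.

```python
import xml.etree.ElementTree as ET


def build_catalog(keys, name="root"):
    """Nest flat, slash-delimited object keys into an XML catalog tree.

    Element and attribute names here are hypothetical, for illustration only.
    """
    root = ET.Element("catalog", name=name)
    for key in sorted(keys):
        node = root
        *folders, leaf = key.split("/")
        for folder in folders:
            # Reuse an existing child catalog with this name, else create one.
            child = next(
                (c for c in node.findall("catalog") if c.get("name") == folder),
                None,
            )
            if child is None:
                child = ET.SubElement(node, "catalog", name=folder)
            node = child
        # The leaf keeps the full flat key so the object can still be fetched.
        ET.SubElement(node, "dataset", name=leaf, key=key)
    return root


if __name__ == "__main__":
    flat_keys = ["sst/2011/jan.nc", "sst/2011/feb.nc", "wind/2011/jan.nc"]
    print(ET.tostring(build_catalog(flat_keys), encoding="unicode"))
```

Because the hierarchy lives only in the XML document, the underlying store stays flat and is still reached with plain GET requests on the full key.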
Elaboration on Server Functions • Proposition: the future of OPeNDAP may lie in provision of data-proximate (i.e., server-side) functions that: • Deliver precisely defined subsets • Reduce the number of off-target retrievals • I.e., enable querying of complex dataset properties • Remap/transform data to simplify data use, especially multi-source data integration • Effective caching will be required
Server Functions, DAP4 • DAP2 supports functions and functional composition • Currently, DAP4 treats ‘functions’ and a ‘functional language’ as an extension • DAP4 provides more complete support for functions, including metadata responses (DAP2 does not provide this; a gap in the DAP2 specification) • Support for POST
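Functional composition in a request can be pictured as nested calls in the query string. The sketch below only assembles such a string; it assumes a server that accepts function calls as the constraint expression, and the function names `subset_region` and `rescale`, the variable `SST`, and the host are hypothetical placeholders rather than functions of any particular server.

```python
def call(name, *args):
    """Render one server-function call; arguments may themselves be calls."""
    return f"{name}({','.join(str(a) for a in args)})"


def function_url(dataset_url, expression, suffix=".dods"):
    """Attach a function expression to a dataset URL as its query string."""
    return f"{dataset_url}{suffix}?{expression}"


if __name__ == "__main__":
    # Hypothetical functions: subset a region, then rescale the result.
    inner = call("subset_region", "SST", 40, 45, -82, -78)
    outer = call("rescale", inner, 0.01, 0)
    print(function_url("http://example.org/opendap/sst.nc", outer))
```

The output of the inner call becomes the first argument of the outer one, so the server can run both steps next to the data and return only the final result.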
Server Functions, experimentation • UGRID: Unstructured Grid (irregular mesh) subsetting • We have implemented a clone of the GDS server’s syntax for functions • Enables current netCDF-based DAP clients (e.g., ECMF) to use the UGRID function • Other projects: Multi-instrument inter-calibration
Some Server-Function Ideas • Binning: returns a distribution (as a raster of boolean values on a user-specified grid) of data values satisfying some criteria • Masking: accepts a raster of zero/nonzero values as a query argument, perhaps as a geospatial selection criterion, e.g. • Perhaps some (limited?) form of functional language for very rich capabilities • WPS, et al.
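The binning and masking ideas can be made concrete with a small sketch. It is one possible reading of the two bullets, not a proposed server implementation: the grid is assumed to be a regular lat/lon grid described by a hypothetical tuple `(lat0, lon0, step, rows, cols)`, points are `(lat, lon, value)` tuples, and all function names are illustrative.

```python
import math


def cell_of(lat, lon, grid):
    """Return (row, col) of the grid cell containing a point, or None if outside."""
    lat0, lon0, step, rows, cols = grid
    row = math.floor((lat - lat0) / step)
    col = math.floor((lon - lon0) / step)
    if 0 <= row < rows and 0 <= col < cols:
        return row, col
    return None


def bin_matches(points, grid, predicate):
    """Binning: boolean raster, True where some point in the cell meets the criterion."""
    _, _, _, rows, cols = grid
    raster = [[False] * cols for _ in range(rows)]
    for lat, lon, value in points:
        cell = cell_of(lat, lon, grid)
        if cell is not None and predicate(value):
            raster[cell[0]][cell[1]] = True
    return raster


def apply_mask(points, grid, mask):
    """Masking: keep only points that fall in cells with a nonzero mask value."""
    kept = []
    for point in points:
        cell = cell_of(point[0], point[1], grid)
        if cell is not None and mask[cell[0]][cell[1]] != 0:
            kept.append(point)
    return kept


if __name__ == "__main__":
    grid = (0.0, 0.0, 10.0, 2, 2)  # 2 x 2 cells of 10 degrees each
    points = [(5, 5, 30.0), (5, 15, 10.0), (15, 15, 28.0), (25, 5, 40.0)]
    print(bin_matches(points, grid, lambda v: v > 25))
    print(apply_mask(points, grid, [[1, 0], [0, 0]]))
```

The two operations pair naturally: a client could request a binning raster first to see where matching data exist, then send a raster back as a mask to retrieve only those regions.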
Summary • DAP is based on a domain-neutral data model and an expression-based constraint language • While not ‘RESTful’ in the strictest sense, it is a REST design in spirit (DAP predates the term by several years) • OPULS is a collaborative project between OPeNDAP and Unidata that intends to update DAP • We are also running several experimental mini-projects within its context: • Asynchronous access, Unstructured Grid access, cloud computing and an expanded, function-based, server-side processing system • DAP servers provide a good platform on which to build OGC web services, as described in the following presentation.