OPeNDAP-Unidata Development of DAP4 (a Data Access Protocol). Describing Progress and Seeking Input at the ESIP Summer Meeting 2012 by Dave Fulker (OPeNDAP President). Overarching Concept of OPeNDAP’s Data Access Protocol (DAP): Clients Get Only Needed Data, When They Need Them.
OPeNDAP-Unidata Development of DAP4 (a Data Access Protocol) • Describing Progress and Seeking Input at the ESIP Summer Meeting 2012 • by Dave Fulker (OPeNDAP President)
Overarching Concept of OPeNDAP’s Data Access Protocol (DAP): Clients Get Only Needed Data, When They Need Them • Accessing data through web services (i.e., URL ≈ dataset) • Appending query strings to invoke server functions, esp. subsetting • Getting responses of 2 major types: • Metadata - dataset descriptions & catalogs (textual) • Content - values and metadata (binary or textual) • Using responses in diverse client contexts, e.g., • MATLAB maps DAP responses directly to its internal math types • DAP libraries (netCDF, e.g.) simplify the programming of apps
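As a rough illustration of the "URL ≈ dataset, query string ≈ subsetting" pattern, the sketch below assembles DAP2-style request URLs. The suffixes `.dds`, `.das`, `.dods` and `.ascii` are DAP2 response types; the server address, dataset path, variable name, and the helper `dap_request_url` are assumptions made up for this example, not part of any OPeNDAP library.

```python
from urllib.parse import quote

# DAP2 response suffixes, grouped as on the slide
METADATA_RESPONSES = {"dds", "das"}     # dataset structure / attributes (textual)
CONTENT_RESPONSES = {"dods", "ascii"}   # data values (binary / textual)


def dap_request_url(dataset_url, response, constraint=""):
    """Return dataset URL + response suffix + optional constraint expression."""
    if response not in METADATA_RESPONSES | CONTENT_RESPONSES:
        raise ValueError(f"unknown DAP2 response type: {response}")
    url = f"{dataset_url}.{response}"
    if constraint:
        # Percent-encode the query, but leave the bracket/colon/comma
        # characters of constraint-expression syntax readable.
        url += "?" + quote(constraint, safe="[]:,&=")
    return url


# Hypothetical dataset URL (not a real server)
BASE = "http://example.org/opendap/sst.nc"

print(dap_request_url(BASE, "dds"))                            # metadata request
print(dap_request_url(BASE, "ascii", "sst[0][10:20][30:40]"))  # subset of the content
```

The client asks for metadata first, then requests only the slice it needs; nothing here contacts a server.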
Some of DAP Users’ Distinguishing Needs • Data often depict (scientific) phenomena where • Geospatial maps are among the useful views • But other views are important as well • Coordinates often are 2-, 3-, 4- & even 5-dimensional • These may include (time-dependent) coordinate-proxies • Users often wish to use data whose source files • Are in a variety of inconvenient formats • With insufficient or obsolete metadata
Present State of DAP • The DAP2 specification (after nearly 2 decades!) has multiple contemporary realizations on servers and clients • Clients include: MATLAB, GrADS, IDL, IDV... • Python apps that employ the PyDAP library • Fortran, C, C++ & Java apps that employ the netCDF library • Servers include: PyDAP, ERDDAP... (often with augmented services) • Most widely deployed: TDS (Unidata) & Hyrax (OPeNDAP) • Widely used by data providers and users, including cases where DAP servers provide translations of inconveniently formatted source files
Branching: Hyrax & THREDDS • Multiple implementations of a protocol are often considered a good thing (per IETF, e.g.) • This can be a problem, however, if the implementations embody excessive redundancy or confuse users • Our view: co-existence of TDS (Unidata) & Hyrax (OPeNDAP) reflects some redundancy & creates some inconsistencies for users • Need #1: achieve conformance ⇒ consistency for users • Need #2: more software reuse ⇒ more advancement
NOAA/BAA grant for OPeNDAP-Unidata Linked Servers (OPULS) • Goal 1: OPeNDAP/Unidata conformance & linkage • New data-model/protocol specs (DAP4), with conformance tests & extensibility demos: • Modes of asynchronous access (to near-line data, e.g.) • Server-side subsetting of data on irregular meshes • Goal 2: common software for OPeNDAP & Unidata servers • Work yet to begin...
OPeNDAP Data-Type Philosophy (reflected in DAP2 & now DAP4) • Data model has few data types • For simplified programming & lowered risk of errors • Data types are deliberately domain-neutral • For better trans-domain utility & programmer uptake • But they allow both syntactic & semantic structures/metadata • These types do in fact support domain needs • NetCDF-like (can represent functions on 4-D domains, e.g.) • Sequences & selections match DBMS sensibilities
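The DBMS analogy can be sketched as follows: a DAP2 constraint expression lists projected fields (much like SELECT columns), followed by `&`-prefixed selection clauses (much like WHERE conditions). The sequence name `cast`, its fields, and the helper `sequence_constraint` are hypothetical; this is only a string builder, and a real client would also percent-encode the result before placing it in a URL.

```python
def sequence_constraint(fields, selections=()):
    """Build a DAP2-style constraint: projected fields, then & selection clauses."""
    projection = ",".join(fields)
    clauses = "".join(f"&{name}{op}{value}" for name, op, value in selections)
    return projection + clauses


# Roughly: SELECT time, depth, temp FROM cast WHERE depth < 100 AND temp >= 15
ce = sequence_constraint(
    ["cast.time", "cast.depth", "cast.temp"],
    [("cast.depth", "<", 100), ("cast.temp", ">=", 15)],
)
print(ce)
```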
DAP4 Data Model (simplified) • Attributes are like variables but with a semantic purpose, making a variable or a group more meaningful • E.g., variables often have an attribute (of type string) named “units”
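To make the relationship between groups, variables and attributes concrete, here is a deliberately reduced sketch using Python dataclasses. It models only what the slide describes (attributes attached to variables or groups) and is an assumption-laden illustration, not the DAP4 specification's data model; the class layout, the variable `sst`, and the values are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    type: str       # e.g. "String"
    value: object


@dataclass
class Variable:
    name: str
    type: str                                   # e.g. "Float32"
    dimensions: tuple = ()
    attributes: list = field(default_factory=list)


@dataclass
class Group:
    name: str
    variables: list = field(default_factory=list)
    attributes: list = field(default_factory=list)
    groups: list = field(default_factory=list)  # nested groups


def units_of(variable):
    """Return the value of the variable's "units" attribute, or None if absent."""
    for attr in variable.attributes:
        if attr.name == "units":
            return attr.value
    return None


# Hypothetical variable carrying the string-typed "units" attribute
sst = Variable("sst", "Float32", ("time", "lat", "lon"),
               [Attribute("units", "String", "degC")])
root = Group("/", variables=[sst],
             attributes=[Attribute("title", "String", "Example dataset")])

print(units_of(sst))
```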
OPeNDAP Projection Operators Like netCDF, but as a Web service, users may • Skip indices • Limit index ranges • Reduce dimensionality
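The three operators correspond to the bracket syntax of DAP2 array projections, where a dimension is written as `[index]`, `[start:stop]` or `[start:stride:stop]` and the stop index is inclusive. The sketch below builds such an expression and lists the indices each form picks out. The function names and the variable `sst` are assumptions for illustration. Fixing a dimension to a single index is shown here as the way to request a lower-dimensional slice; whether the resulting length-1 dimension is dropped depends on the client.

```python
def hyperslab(variable, *dims):
    """Build a DAP2 array projection such as sst[0][10:20][0:4:40].

    Each entry in dims is one of:
      int                    -> a single index (reduce dimensionality)
      (start, stop)          -> an inclusive index range (limit range)
      (start, stride, stop)  -> every stride-th index in the range (skip indices)
    """
    parts = []
    for d in dims:
        if isinstance(d, int):
            parts.append(f"[{d}]")
        elif len(d) in (2, 3):
            parts.append("[" + ":".join(str(i) for i in d) + "]")
        else:
            raise ValueError(f"bad dimension spec: {d!r}")
    return variable + "".join(parts)


def selected_indices(d):
    """List the indices one dimension spec picks out (stop is inclusive)."""
    if isinstance(d, int):
        return [d]
    start, stride, stop = (d[0], 1, d[1]) if len(d) == 2 else d
    return list(range(start, stop + 1, stride))


print(hyperslab("sst", 0, (10, 20), (0, 4, 40)))
print(selected_indices((0, 4, 12)))
```

Note that the DAP2 order is start:stride:stop, unlike Python's start:stop:step slices.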
Other DAP-Related Services • Note: these were not part of the DAP2 specification... • Many DAP-based servers (from Unidata & OPeNDAP, e.g.) • Accept multiple types of data as inputs • Offer several views of them over the web • Native DAP web services: for DAP-enabled clients • Source format (lossless): netCDF-to-netCDF or HDF4-to-HDF4, e.g. • Alternative web services: HTML (browser views), XML, WCS, etc. • Town-Hall: what other services should be offered?
Other OPULS Accomplishments • Irregular mesh subsetting • Progress with U WA (Bill Howe) • To be released soon... • Asynchronous access • Preliminary trials... • Cloud-based service provision (with parallelism) • MODIS reprojection (related, but not OPULS funding)
OPULS Process • Transparency • Public documentation updated weekly (just Google OPULS!) • Advisory committee • Jeff de La Beaujardiere, James Frew, Mike Folk, Steve Hankin, Eric Kihn, Rich Signell • Welcoming input (per this town hall)
Town-Hall Questions • What server functions ought to be specified in the DAP4 protocol? • Simple point-wise mathematics • Mathematics on sampled functions • Truly domain-specific functions (involving the datum, e.g.) • Which (other) web-service protocols should be leveraged by DAP servers, & what are the pertinent use cases? • To facilitate open search (exploiting ATOM), e.g. • To facilitate semantic analysis (providing RDF output, e.g.) • Others?
Thank you • OPeNDAP, Inc • http://opendap.org • increasing data’s visibility