The Open source Project for a Network Data Access Protocol (OPeNDAP): A Discipline Neutral Approach to Interoperability on the Internet Presented at the Fall AGU Meeting in San Francisco, CA 9 December 2003 Peter Cornillon and Dan Holloway Graduate School of Oceanography University of Rhode
The Open source Project for a Network Data Access Protocol (OPeNDAP): A Discipline Neutral Approach to Interoperability on the Internet. Presented at the Fall AGU Meeting in San Francisco, CA, 9 December 2003. Peter Cornillon and Dan Holloway, Graduate School of Oceanography, University of Rhode, and James Gallagher, The Open source Project for a Network Data Access Protocol (OPeNDAP)
Outline • Trends in data system development • Data system elements • OPeNDAP • A data access element • In NVODS – an example • Status • Conclusions
Trend in Data System Development: away from centrally designed, implemented and maintained systems, toward the integration of independently designed, implemented and maintained system elements.
Data systems generally involve some combination of the following elements: Discovery, Analysis/Visualization, Archive, Access/Delivery.
Historically these elements or subsets thereof have been developed and managed by the group assembling the data system.
[Diagram: Data System A, comprising Discovery, Analysis/Visualization, Archive, and Access/Delivery]
[Diagram: Data System B, comprising Discovery, Archive, Access/Delivery, and several Analysis/Visualization elements]
With the advent of the Internet, data system elements are increasingly being designed, implemented and managed independently by different groups.
With a Plethora of System Elements … [Diagram: multiple independent Discovery, Archive, Access/Delivery, and Application/Visualization elements]
And “complete” data systems are being created from different combinations of the elements.
An example: NVODS – The National Virtual Ocean Data System [Diagram labels: GFDL netCDF, URI HDF, GSFC Binary; OPeNDAP; Matlab, IDL, GrADS, Ferret, IDV, Excel, VisAD, Access, ncBrowse]
More NVODS – Data Discovery [Diagram labels: GFDL netCDF, URI HDF, GSFC Binary; OPeNDAP; IDL, IDV, Access, ncBrowse, Matlab, GrADS, Ferret, VisAD, Excel; GCMD, NVODS]
NVODS – Defining the Data System [Diagram labels: GFDL netCDF, URI HDF, GSFC Binary; OPeNDAP; Matlab, Access, IDL, ncBrowse, IDV, VisAD, Ferret, GrADS, Excel, ODC; GCMD, NVODS]
Responsibility Note that in distributed systems responsibility is distributed. For NVODS, responsibility for: • The data lies with the data providers. • The data access protocol lies with OPeNDAP. • Application packages (Matlab, Ferret, Excel…) lies with the developers of these packages. • Data location lies with the GCMD and NVODS.
Interoperability And Metadata
The Ultimate Objective of a Data System • To provide requested data to the user’s analysis/visualization package in a consistent, readily usable form. For example: A user might want all ocean temperature values (with associated times and locations) that lie between 90 and 110 m and have uncertainties less than 1 °C.
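The example request above can be sketched as a simple filter over observation records. This is only an illustration of the selection itself, not OPeNDAP code: the record layout, the field names (`depth_m`, `temp_c`, `uncertainty_c`) and the sample values are all hypothetical, and the depth bounds are assumed to be inclusive.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    # Hypothetical record layout, chosen for illustration only.
    time: str
    lat: float
    lon: float
    depth_m: float
    temp_c: float
    uncertainty_c: float


def select(observations, min_depth=90.0, max_depth=110.0, max_uncertainty=1.0):
    """Keep observations inside the depth range (bounds assumed inclusive)
    whose uncertainty is strictly below the limit."""
    return [
        obs
        for obs in observations
        if min_depth <= obs.depth_m <= max_depth
        and obs.uncertainty_c < max_uncertainty
    ]


# Made-up sample values, not real measurements.
sample = [
    Observation("2003-12-09T00:00Z", 41.5, -71.4, 95.0, 12.3, 0.5),
    Observation("2003-12-09T01:00Z", 41.5, -71.4, 120.0, 11.8, 0.4),  # too deep
    Observation("2003-12-09T02:00Z", 41.5, -71.4, 100.0, 12.1, 1.5),  # too uncertain
]
selected = select(sample)
```

Each selected record still carries its time and location, which is what the user in the example needs alongside the temperature value.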
Interoperability To achieve this objective, system elements must interoperate; i.e., the system must: • Be capable of finding all data of interest. • Know the format of these data objects. • Be capable of transforming from any of these formats to that required by the application software. • Understand the semantics of the data.
Metadata These interoperability requirements in turn call for descriptions of the data; i.e., metadata. The degree of system interoperability is determined by the associated metadata.
Syntactic and Semantic Metadata The required metadata falls into two classes: • Syntactic metadata – Information about the data types and structures at the computer level, the syntax of the data; e.g., variable T is a 20x40 element floating point array. • Semantic metadata – Information about the contents of the data set; e.g., variable T is sea surface temperature with units of °C. [RBH Note: This syntax is actually data structure. Syntax is the format of request.]
Syntactic and Semantic Metadata • Syntactic metadata provides the information needed to read and plot the data, but in general not to label the axes. [Plot: T with unlabeled axes]
Syntactic and Semantic Metadata • Semantic metadata provides the information needed to label the axes in a plot. [Plot: Temperature (°C) against Time (Hours)]
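The distinction between the two kinds of metadata can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical `Variable` record and hypothetical attribute names (`long_name`, `units`); it is not the OPeNDAP data model. Syntactic metadata (name, type, shape) is always present and is enough to read the values; semantic metadata sits in an optional container and is what allows an axis to be labeled.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    # Syntactic metadata: enough to read and plot the values.
    name: str
    dtype: str
    shape: tuple
    # Semantic metadata: an optional container of free-form attributes.
    # The attribute names used below are assumptions for this sketch.
    attributes: dict = field(default_factory=dict)


def axis_label(var):
    """Build a plot label from semantic metadata, falling back to the bare name."""
    long_name = var.attributes.get("long_name")
    units = var.attributes.get("units")
    if long_name and units:
        return f"{long_name} ({units})"
    if long_name:
        return long_name
    return var.name


# Same syntactic description, with and without semantic metadata.
bare = Variable("T", "float32", (20, 40))
described = Variable(
    "T", "float32", (20, 40),
    {"long_name": "Sea surface temperature", "units": "°C"},
)
```

With only the syntactic description, the best available label is the variable name `T`; with the semantic attributes filled in, the label can say what the values mean.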
Interoperability and OPeNDAP The two types of metadata suggest two levels of interoperability: • Syntactic interoperability – Consistent format representation across data sets. • Semantic interoperability – Consistent semantic interpretations of the data. OPeNDAP mandates syntactic interoperability via a strict syntactic description of all data available via the system.
Interoperability and OPeNDAP OPeNDAP does not mandate semantic interoperability although it does allow for it. • The OPeNDAP data access protocol supports containers for semantic metadata, but places no requirements on the contents of these containers. • Some data sets are well described, others are not.
OPeNDAP in NVODS [Diagram labels: GFDL netCDF, URI HDF, GSFC Binary; OPeNDAP; Matlab, Access, IDL, ncBrowse, IDV, VisAD, Ferret, GrADS, Excel, ODC; GCMD, NVODS]
Data Selection • The OPeNDAP data access protocol allows for data subsetting at the server. • All data sets may be subsetted by variable, by the value of a variable for sequences, and by array index for arrays. • If the data set's semantics are known, the data may be subsetted based on these semantics.
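The three kinds of server-side subsetting listed above can be sketched as request strings. This is a rough illustration that assumes DAP2-style constraint expressions (comma-separated variables, `&`-joined value comparisons, and `[start:stride:stop]` array indices); the dataset URL, the variable names and the helper functions are hypothetical, and percent-encoding of special characters is left out.

```python
def hyperslab(name, *ranges):
    """Array subset by index: name[start:stride:stop] for each dimension."""
    return name + "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in ranges
    )


def constraint(projections, selections=()):
    """Join projected variables and value selections into one expression."""
    expr = ",".join(projections)
    for clause in selections:
        expr += "&" + clause
    return expr


BASE = "http://example.org/opendap/ocean_profiles"  # hypothetical dataset URL

# Subset by variable and by the value of a variable (sequence data).
seq_url = BASE + ".dods?" + constraint(
    ["time", "lat", "lon", "depth", "temp"],
    ["depth>=90", "depth<=110"],
)

# Subset by array index: indices 0 to 9 and 0 to 19 of T, stride 1.
array_expr = constraint([hyperslab("T", (0, 1, 9), (0, 1, 19))])
```

Because the constraint travels in the request, the server can apply it before sending anything, so only the selected values cross the network.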
Discipline Neutrality • The OPeNDAP data access protocol is discipline neutral. • It provides a consistent syntactic representation of the data regardless of the semantics. • Discipline-specific semantics may be layered on the data.
Discipline Neutrality • Because of its discipline neutrality, OPeNDAP is being used in a variety of data systems: • Oceanography • Meteorology • Solar-terrestrial physics • Hydrology
OPeNDAP Client and Server Status [Diagram: data stores (Flat Binary, netCDF, HDF4, Matlab, DSP, Tables, SQL, FITS, CDF, CEDAR); servers (netCDF, Matlab, JGOFS, FITS, FreeForm, HDF4, DSP, JDBC, CDF, CEDAR); clients (OPeNDAP Data Connector, IDL Client, Matlab Client, netCDF Java, netCDF C, Web Browser); applications (Ferret, GrADS, IDV, VisAD, ncBrowse, Matlab, IDL, Excel)]
Center for Ocean-Land-Atmosphere Studies Statistics
Interesting OPeNDAP Access Statistics [Charts: LDEO data accesses for 1st quarter of 2002; LDEO data accesses for 4th quarter of 2002]
Conclusions • Responsibility in distributed data systems of the future will be distributed among the developers of the elements comprising the system. • To maximize the flexibility in assembling systems, system elements should remain as general as possible. • The OPeNDAP data access protocol has been designed to be discipline neutral in anticipation of the shift in data system development.
Conclusions Data systems based on the integration of independently developed system elements offer many more opportunities than more traditional centrally developed ones.
Conclusions “When systems evolve separately it is like sexual reproduction, as opposed to centrally developed systems, which evolve asexually.” J. Caron, 2003