
OPeNDAP: Accessing Data in a Distributed, Heterogeneous Environment

This presentation discusses OPeNDAP and NVODS, two entities formed to separate the discipline-independent and discipline-specific parts of a distributed data access system. It covers topics such as metadata, interoperability, and lessons learned from the OPeNDAP/NVODS effort.


Presentation Transcript


  1. OPeNDAP: Accessing Data in a Distributed, Heterogeneous Environment Peter Cornillon Graduate School of Oceanography University of Rhode Island Presented at the NSF-Sponsored Cyberinfrastructure Meeting, 31 October 2002

  2. Distributed Oceanographic Data System DODS consisted of two fundamental parts: • a discipline-independent core infrastructure for moving data on the net, • a discipline-specific portion related to the data: population, location, specialized clients, etc.

  3. DODS → OPeNDAP & NVODS To isolate the discipline-independent part of the system from the discipline-specific part, two entities have been formed: • Open Source Project for a Network Data Access Protocol (OPeNDAP) • National Virtual Ocean Data System (NVODS)

  4. The Core Infrastructure: Interoperability

  5. Interoperability - Metadata The degree to which machine-to-machine interoperability is achieved depends on the metadata associated with the data.

  6. OPeNDAP and Metadata

  7. Metadata Types We define two classes of metadata: • Search metadata – used to locate data sets of interest in a distributed data system. • Use metadata – needed to actually use the data.
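To make the two classes concrete, here is a minimal sketch in Python of the kinds of fields each class might carry for a hypothetical sea surface temperature data set; the field names and values are illustrative assumptions, not part of any OPeNDAP specification.

```python
# Illustrative sketch only: the two metadata classes attached to a
# hypothetical SST data set (names and values are assumptions).

search_metadata = {
    # Search metadata: used to locate the data set in a distributed system
    "keywords": ["sea surface temperature", "AVHRR"],
    "spatial_coverage": {"lat": (-90.0, 90.0), "lon": (-180.0, 180.0)},
    "temporal_coverage": ("1985-01-01", "2002-10-31"),
}

use_metadata = {
    # Use metadata: needed to actually read and interpret the data
    "T": {"type": "Float32", "shape": (20, 40), "units": "degC"},
}
```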

  8. Use Metadata We divide use metadata into two classes: • Syntactic use metadata • Semantic use metadata

  9. Syntactic Use Metadata Information about the data types and structures at the computer level - the syntax of the data; • e.g., variable T represents a 20x40 element floating point array.
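As an illustration, here is a minimal sketch using the pydap client (one of several OPeNDAP client libraries) to inspect exactly this kind of syntactic metadata; the server URL and variable name are hypothetical.

```python
# A minimal sketch assuming the pydap client; the URL is hypothetical.
from pydap.client import open_url

dataset = open_url("http://example.org/opendap/sst.nc")  # hypothetical server

# Syntactic use metadata: the machine-level type and structure of variable T.
T = dataset["T"]
print(T.dtype)  # e.g. float32
print(T.shape)  # e.g. (20, 40)
```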

  10. Semantic Use Metadata Information about the contents of the data set; e.g., variable T represents • sea surface temperature • with units of ºC
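In OPeNDAP, this kind of semantic information travels as variable attributes (the DAS). A minimal sketch, again assuming the pydap client; attribute names such as "long_name" and "units" follow common conventions and are assumptions about a hypothetical data set.

```python
# A minimal sketch assuming the pydap client; the URL and attribute
# names are assumptions, not guaranteed by the protocol.
from pydap.client import open_url

dataset = open_url("http://example.org/opendap/sst.nc")  # hypothetical server
T = dataset["T"]

# Semantic use metadata arrives as the variable's attributes.
print(T.attributes.get("long_name"))  # e.g. "sea surface temperature"
print(T.attributes.get("units"))      # e.g. "degC"
```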

  11. Semantic Use Metadata We divide semantic use metadata into two classes: • Translational Semantic Use Metadata • Descriptive Semantic Use Metadata

  12. Translational Semantic Use Metadata • Metadata required to make use of the data; e.g., to properly label a plot of the data • Define the translation from received values to semantically meaningful values • Examples • Variable names in the data set: t → SST • Units of the data: 0.125C + 4 → ºC • Missing value flags: -99 → missing value
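A minimal sketch of applying the translations on this slide with numpy: stored counts become ºC via 0.125 × value + 4, and -99 marks missing values. The raw values are made up for illustration; real data sets carry these translations under convention-specific attribute names.

```python
import numpy as np

# Hypothetical raw values as received from a server.
raw = np.array([[100, -99], [120, 160]], dtype=np.int16)

missing = raw == -99                        # -99 -> missing value
sst = 0.125 * raw.astype(np.float32) + 4.0  # counts -> degrees C
sst = np.where(missing, np.nan, sst)        # mask the flagged values

print(sst)  # 16.5, missing, 19.0, 24.0
```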

  13. OPeNDAP and Metadata

  14. OPeNDAP - NVODS Status

  15. OPeNDAP/NVODS Server Sites

  16. OPeNDAP Client and Server Status

  17. Special Servers

  18. Lessons (Re)Learned

  19. Lessons (Re)Learned 1. Modularity provides for flexibility The more modular the underlying infrastructure, the more flexible the system. This is particularly important for network-based systems, for which the technology, both software and hardware, is changing rapidly.

  20. Lessons (Re)Learned 2. Data of interest will be stored in a variety of formats. Regardless of how much one might want to define the format to be used by system participants, in the end the data will be stored in a variety of formats. 2a. The same is true of translational use metadata!

  21. Lessons Learned 3. Structural representation of sequence data sets is a major obstacle to interoperability Care must be given to the organizational structure (as opposed to the format) of the data. This is the single largest constraint on the use of profile data in NVODS.

  22. Lessons (Re)Learned 4. “Not invented here” Avoid the “not invented here” trap. The basic concepts of a data system are relatively straightforward to define. Implementing these concepts ALWAYS involves substantially more work than originally anticipated. The “Devil’s in the details”. Take advantage of existing software wherever possible.

  23. Lessons (Re)Learned 5. Work with those who adopt the system for their own needs. Take advantage of those who are interested in contributing to the system because it addresses their needs, as opposed to those who are simply doing the work for the associated funding. => Open source.

  24. Lessons Learned 6. There is no well-defined funding structure for community-based operational systems. It is much easier to obtain funding to develop a system than to obtain funding to maintain and evolve one. This is a major obstacle to the development of a stable cyberinfrastructure that meets the needs of the research community.

  25. Lessons Learned 7. It is relatively more difficult to obtain funding for applied system development than for research related to data systems. This is another obstacle to the development of cyberinfrastructure that meets the needs of the research community.

  26. Lessons (Re)Learned 8. “Tough to teach old dogs new tricks” Introducing new technology often requires a cultural change in usage that is difficult to effect. This can negatively affect system development.

  27. Lesser Lessons Learned 9. Some surprises encountered in the NVODS/OPeNDAP effort • Heavy within-organization usage. • The metadata focus of the past is appropriate for interoperability at the data level. • The number of variables increases almost linearly with the number of data sets. • Users will take advantage of all of the flexibility offered by a system, sometimes to the disadvantage of all. • Incredible variability in the structural organization of data.

  28. Lessons Learned 10. Metrics suggest • Increasing use of scripted requests • Large volume transfers As data systems offering machine-to-machine interoperability with semantic meaning take hold, we could well see an explosive growth in the use of the web.
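For context, a scripted request in this setting is just an HTTP GET carrying an OPeNDAP constraint expression; the sketch below asks a hypothetical server for a 10x10 subset of T in ASCII form. The ".ascii" suffix and the bracketed hyperslab syntax are part of the OPeNDAP (DAP2) protocol; the server URL is an assumption.

```python
from urllib.request import urlopen

# Hypothetical server; ".ascii" requests an ASCII response, and the
# constraint expression "?T[0:9][0:9]" selects a 10x10 hyperslab of T.
url = "http://example.org/opendap/sst.nc.ascii?T[0:9][0:9]"

with urlopen(url) as response:
    print(response.read().decode("ascii"))
```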

  29. Lessons Learned 11. Time to maturity is of order 10 years, not 3 Developing new infrastructure takes time, both to iron out all of the %^*% little details and for the infrastructure to be adopted.

  30. Peter’s Law The more metadata required, the less data delivered. Of course, the less metadata, the harder it is to use the data.

  31. http://unidata.ucar.edu/packages/dods • http://nvods.org
