Has Data Management Gone Mainstream?       

Presentation Transcript


  1. Has Data Management Gone Mainstream?
     Presented at the BEER Workshop, Coconut Grove (Miami), Florida, November 9, 2008
     Robert C. Groman
     BEER Workshop November 9, 2008

  2. Talk Overview
     • Has data management gone mainstream?
     • “Data” is a plural noun = facts, statistics, or items of information.
     • Metadata = motherhood and apple pie
     • Accessing data: Is a picture worth a thousand bytes?
     • Data Interoperability

  3. Purpose
     • Raise level of awareness (and appreciation) for data management
     • “Lighter and informative”
     • Want to use some formulas
     • Difference between an engineer and a mathematician

  4. Venn Diagram: Data and Metadata
     • D: All data and information necessary to use the data
     • Data (d): Facts, statistics, or items of information
     • Metadata (m)
     • D ≠ m + d (Set Theory)

  5. Probability of having all the data and information necessary to reuse someone else's data.
     • Second order effects:
        • Length of cruise
        • Success of cruise
        • Participants
        • Immediate activity following the cruise

  6. Theorems†
     • Theorem 1: The probability that all the necessary data and information are collected and preserved to allow another researcher to properly use your data is inversely proportional to the time since the data were collected.
     • Corollary: Unless data and information are collected and preserved during the experiment (cruise), subsequent researchers will have a difficult time using your data.
     • Theorem 2: The longer the time since the data were collected, the less likely the data will ever be considered “final”.
     †Proofs are left to the reader as an exercise.
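Theorem 1 can be written as a formula. This is only an illustrative sketch: the symbols P, t, and k are assumptions introduced here and do not appear on the slide, and the lower bound on t is added solely so that P stays a valid probability.

```latex
% Assumed notation (not from the slide):
%   P(t) : probability that all data and information needed for reuse
%          have been collected and preserved
%   t    : time since the data were collected
%   k    : an unspecified positive constant
P(t) \propto \frac{1}{t}
\quad\Longleftrightarrow\quad
P(t) = \frac{k}{t}, \qquad t \ge k > 0
```

Read this way, doubling the elapsed time halves the probability, which is why the corollary argues for capturing data and information during the cruise itself.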

  7. Seeing Versus Using Someone’s Data
     • Maybe you don’t want others to use your data. Hard to believe, but this does happen. For example:
        • I’m not done publishing my papers based on my data
        • My graduate student is almost done analyzing the data
        • It’s not final yet – no, but they still may be useful
        • My dog ate it (no, I haven’t heard this one yet.)
     • Old policies and practices about data archiving
     • New policies about data sharing, data publishing and data archiving
     • Web accessible
     • NSF mandate (It is for real this time.)
     • The sum is greater than its parts

  8. The more people use your data the better they get.
     • Heisenberg Uncertainty Principle (HUP) does NOT seem to apply
     • If Δx and Δp are the uncertainties in the measurements of the position and momentum, then the product ΔxΔp is at least on the order of Planck's constant.
     • When measuring conjugate quantities, the product of their standard deviations must be at least h / 4π
     • Not to be confused with the term observer effect (OE), which refers to changes that the act of observing will make on the phenomenon being observed.

  9. Biological and Chemical Oceanography Data Management Office (BCO-DMO)
     • NSF-funded three-year project to provide short- and medium-term data management, including web-based access, to all NSF-funded projects from their biological and chemical oceanographic programs
     • Large NSF projects are expected to have their own data management offices
     • Web site: http://www.bco-dmo.org/

  10. Data Stewardship
     • “a concern for creation and preservation of data and all intermediate phases - focuses …on the management of data over the long term” [Baker and Chandler, 2008];
     • Data quality control;
     • Treatment of all information as data fosters data re-use;
     • Data that lack sufficient metadata have limited value beyond the research program for which they were collected; and
     • Metadata should include sufficient information to support discovery, value assessment, and accurate re-use of the data.
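The last point, that metadata should support discovery, value assessment, and accurate re-use, can be sketched as a simple completeness check. The slide names the three purposes but not the fields, so every field name and the sample record below are hypothetical placeholders, not a real metadata schema.

```python
# Hypothetical field groups: the three purposes come from the slide,
# the field names under each purpose are assumptions for illustration.
REQUIRED_FIELDS = {
    "discovery": ["title", "investigator", "cruise_id"],
    "value_assessment": ["instrument", "quality_control_notes"],
    "reuse": ["units", "processing_steps"],
}


def missing_metadata(record):
    """Return {purpose: [field names]} for fields that are absent or empty."""
    gaps = {}
    for purpose, fields in REQUIRED_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[purpose] = missing
    return gaps


# Made-up sample record, used only to exercise the check.
record = {
    "title": "Example profile data",
    "investigator": "A. Researcher",
    "cruise_id": "EXAMPLE01",
    "instrument": "CTD",
    "units": {"depth": "m", "temperature": "degC"},
}

print(missing_metadata(record))
```

A record that passes such a check is not guaranteed to be reusable; the check only flags the gaps that are cheapest to fix while the cruise is still fresh in everyone's mind.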

  11. MapServer interface and interoperability enhancements
     • Provides access to geo-referenced scientific data and metadata
     • Presents distributed data sets in a unified way
     • Uses MapServer as the visualization application
     • Visualize data with graphics generated on-the-fly
     • Request custom subsets of measurements in a variety of file formats
     • Compare data from different sources

  12. Interoperability
     • Ability to get someone else's data and use it on your system. (How easy is this really?)
     • True interoperability: get someone else's data and use it directly in your application. Do the units match? Do the data acquisition and processing steps match yours, or have the differences (including instrumentation differences) been accounted for?
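The "do the units match" question can be made concrete with a small sketch that converts depths to a common unit before two data sets are combined. The record layout (`depth`, `depth_unit`, `salinity`) and the sample values are assumptions made up for illustration; only the conversion factors are standard.

```python
# Conversion factors to metres (exact by definition of the foot and fathom).
TO_METRES = {"m": 1.0, "ft": 0.3048, "fathom": 1.8288}


def depth_in_metres(value, unit):
    """Convert a depth to metres, refusing units we do not recognise."""
    try:
        return value * TO_METRES[unit]
    except KeyError:
        raise ValueError(f"unknown depth unit: {unit!r}") from None


def harmonize(records):
    """Return copies of the records with depth expressed in metres."""
    return [
        {
            **rec,
            "depth": depth_in_metres(rec["depth"], rec["depth_unit"]),
            "depth_unit": "m",
        }
        for rec in records
    ]


# Made-up sample records from two hypothetical sources.
mine = [{"depth": 10.0, "depth_unit": "m", "salinity": 35.1}]
theirs = [{"depth": 100.0, "depth_unit": "ft", "salinity": 34.9}]

combined = harmonize(mine + theirs)
for rec in combined:
    print(rec)
```

Raising an error on an unknown unit, instead of passing the value through, is a deliberate choice in this sketch: a silent mismatch is exactly the failure the slide warns about. Differences in acquisition, processing, and instrumentation cannot be fixed by a lookup table and still depend on the metadata.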

  13. JGOFS/GLOBEC Data Management System

  14. http://globec.whoi.edu/map

  15. Cruise Tracks

  16. Select 5 Cruises

  17. Click on “Show Data” Button

  18. Select CD data in EN307

  19. Shows stations and optional grid lines

  20. EN307 graph it options

  21. Depth versus salinity and versus temperature

  22. Select another cruise: AL9906

  23. Select MOC1 data set

  24. Map it options for abundances

  25. Interoperability features (for free)

  26. MapServer Supports Interoperability Features
     • Open Geospatial Consortium standards
        • Web Map Service (WMS), and
           • Show me the data
        • Web Feature Service (WFS)
           • Get me the data
     • Retains the functionality of the JGOFS/GLOBEC Data Management System
     • Download data as ASCII, CSV, Matlab, NetCDF
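The "show me the data" versus "get me the data" distinction can be sketched as two request URLs: a WMS GetMap request returns a rendered image, while a WFS GetFeature request returns the features themselves. The query parameters are the standard OGC key-value names for WMS 1.1.1 and WFS 1.0.0; the endpoint, layer name, and bounding box are hypothetical and do not describe the server shown in the talk.

```python
from urllib.parse import urlencode

BASE_URL = "http://example.org/cgi-bin/mapserv"  # hypothetical endpoint
LAYER = "cruise_tracks"                          # hypothetical layer name


def wms_getmap_url(bbox, width=600, height=400):
    """Build a WMS 1.1.1 GetMap URL ("show me the data" as an image)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": LAYER,
        "STYLES": "",
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return BASE_URL + "?" + urlencode(params)


def wfs_getfeature_url(max_features=100):
    """Build a WFS 1.0.0 GetFeature URL ("get me the data" as features)."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "1.0.0",
        "REQUEST": "GetFeature",
        "TYPENAME": LAYER,
        "MAXFEATURES": max_features,
    }
    return BASE_URL + "?" + urlencode(params)


# Illustrative bounding box in longitude/latitude degrees.
print(wms_getmap_url((-71.0, 40.0, -65.0, 45.0)))
print(wfs_getfeature_url())
```

A real MapServer deployment usually also needs to be told which mapfile to use (for example through a `map` parameter or server configuration); that detail is omitted here because the talk does not describe the server setup.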

  27. Related Activities
     • MMI – Marine Metadata Interoperability
        • “Promoting the exchange, integration and use of marine data through enhanced data publishing, discovery, documentation and accessibility.”
     • UNOLS Subcommittee to Report on Best Practices for the Collection of Data and Metadata at Sea to Promote Public Dissemination
        • Too new to even have its own web site
     • The Working Group on Zooplankton Ecology (WGZE), with guidance from the Working Group on Marine Data Management (WGMDM), is providing these general metadata guidelines for plankton data collected and submitted to ICES. (2003)
     • Sensor Interoperability Metadata Workshop (2006)
     • ICES ASC 2006 and 2008 theme sessions on data management, data sharing and related topics
     • NOAA Coastal Services Center Data Transport Laboratory (DTL)
     • Integrated Ocean Observing System (IOOS)
     • Ocean.US data management and communications (DMAC) strategy
     • Gulf of Maine Ocean Data Partnership
     • Many, many more ….

  28. Metadata Schema
     The print size is small to protect the innocent and guilty.

  29. What is the difference between an engineer and a mathematician?

  31. References
     • Karen S. Baker and Cynthia L. Chandler, Enabling long-term oceanographic research: Changing data practices, information management strategies and informatics, Deep-Sea Research II, 55 (2008), 2132-2142.
