In Situ Data Access Some reasons for success or failure. Nancy N. Soreide, Donald W. Denbo NOAA Pacific Marine Environmental Laboratory. IIPS Session 3B. Challenges in Data Access American Meteorological Society January 14-18, 2007, San Antonio TX.
This paper is an overview of the state of the art in providing access to in-situ data from multiple observing systems over the internet.
• Describes reasons for success or failure of technologies, in the spirit of Dr. Richard Feynman:
"It's a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty--a kind of leaning over backwards. For example, if you're doing an experiment, you should report everything that you think might make it invalid--not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you've eliminated by some other experiment, and how they worked--to make sure the other fellow can tell they have been eliminated."
-- Richard Feynman, from a Caltech commencement address given in 1974. Also in Surely You're Joking, Mr. Feynman!
PMEL projects have tried many ways to access data, and PMEL has participated in efforts to find a paradigm for delivering NOAA data
• Internet (pre-Web)
• Web pages
• Static graphics and dynamically created graphics and products
• Web services
• CORBA
• OPeNDAP
• Dapper (OPeNDAP implementation for in situ data)
• This presentation uses examples drawn from PMEL experience with local and NOAA projects
How and why projects failed: topics to be considered
• Problem too heterogeneous for a single solution
• Generalized vs specific project objectives
• Large project teams vs small project teams
• Over-hyped claims for new technologies
• Why good ideas sometimes fail
• Failure of funding
• Some characteristics of successful projects
Trying to find a paradigm for delivering NOAA data
• Problem too heterogeneous for a single solution
• Recent example: XML standard for ocean data
• Lots of interest over the past several years
• Still no universally accepted solution
• Problem seems to be too big to generalize
Generalized vs Specialized
Example: OPeNDAP as a universal solution
• Worked for gridded data, though handicapped by non-uniformity in netCDF implementation
• Much harder to make it work for in situ data (i.e., station data)
• Still handicapped by non-uniformity in netCDF implementation
• No one wants to see a list of several thousand station data file names
• Led to development of the OPeNDAP Dapper server, which is customized for, and becoming the standard for, in situ data
• Aided by development of clients which made accessing, graphing and using the data easier (e.g., web client, Matlab, GrADS, etc.)
OPeNDAP - http://opendap.org/
Dapper - http://www.epic.noaa.gov/epic/software/dapper/
DChart - http://dapper.pmel.noaa.gov/dchart/
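The point about thousands of station file names can be made concrete with a small sketch. An OPeNDAP (DAP2) request carries a constraint expression, a comma-separated projection followed by `&`-prefixed selection clauses, so a client can ask the server for only the stations it wants rather than browsing a file listing. The dataset URL and the variable names below are hypothetical and are not taken from any real Dapper dataset; the helper name `build_dap_url` is likewise an illustration only.

```python
# Relational operators permitted in a DAP2 selection clause.
_OPERATORS = {"<", "<=", ">", ">=", "=", "!=", "=~"}


def build_dap_url(dataset_url, variables, selections=(), response="ascii"):
    """Build a DAP2-style request URL.

    dataset_url -- base URL of the dataset (hypothetical in this sketch)
    variables   -- names to return (the projection)
    selections  -- (name, operator, value) triples that filter records
    response    -- response suffix, e.g. "dds", "das", "dods" or "ascii"
    """
    for name, op, _ in selections:
        if op not in _OPERATORS:
            raise ValueError(f"unsupported operator {op!r} for {name!r}")
    projection = ",".join(variables)
    selection = "".join(f"&{name}{op}{value}" for name, op, value in selections)
    return f"{dataset_url}.{response}?{projection}{selection}"


# Hypothetical request: temperature profiles from stations near the equator.
url = build_dap_url(
    "http://example.org/dap/stations",  # assumed URL, for illustration only
    ["time", "depth", "temp"],          # assumed variable names
    [("lat", ">=", -5), ("lat", "<=", 5)],
)
print(url)
```

A production client would also percent-encode the query string and would first read the dataset's DDS and DAS responses to discover the actual variable names.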
Big software projects which fail
• In comp.risks, big failed projects are often those which are outsourced and which try to solve the whole problem.
• Problem is too big
• Requirements are changed during development
• Features are added during development
Big vs Small
• Small projects seem far more likely to succeed than large ones
• The most successful web pages delivering ocean data have always been those which are focused on a specific data type or dataset
• Examples: TAO El Nino buoy data, Argo data, LAS for large gridded datasets, EPIC Web and DChart for in situ data
TAO - http://www.pmel.noaa.gov/tao/jsdisplay/
Argo - http://floats.pmel.noaa.gov/floats/
LAS - http://ferret.pmel.noaa.gov/NVODS/servlets/dataset
EPIC Web - http://www.epic.noaa.gov/epic/ewb/
DChart - http://dapper.pmel.noaa.gov/dchart/
New technology over-hyped as a solution for everything
• Technology may be quite sound, but the problem may be too big to generalize easily
• Example #1:
• XML as a formatting standard is powerful but has limitations
• XML wrapping does help with parsing and presentation of information.
• XML wrapping does not solve the problem of differing vocabularies and practices across disciplines and sub-disciplines.
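A minimal sketch of the XML limitation, assuming two invented record layouts: both fragments below parse cleanly with a standard XML parser, yet they name and scale temperature differently, so a hand-written mapping for each source is still required. The element names, attribute names and values are hypothetical and do not come from any real ocean data standard.

```python
import xml.etree.ElementTree as ET

# Two hypothetical records from different communities. Both are well-formed XML.
BUOY_XML = '<obs><sst units="degC">28.4</sst></obs>'
FLOAT_XML = '<profile><TEMP unit="K">301.55</TEMP></profile>'

# Per-source knowledge that the XML wrapping does not supply:
# root tag -> (temperature element name, units attribute name).
MAPPINGS = {
    "obs": ("sst", "units"),
    "profile": ("TEMP", "unit"),
}


def temperature_celsius(xml_text):
    """Parse one record and return its temperature in degrees Celsius."""
    root = ET.fromstring(xml_text)       # parsing is the easy, generic part
    tag, unit_attr = MAPPINGS[root.tag]  # the hard part: knowing where to look
    element = root.find(tag)
    value = float(element.text)
    unit = element.get(unit_attr)
    if unit == "degC":
        return value
    if unit == "K":
        return value - 273.15
    raise ValueError(f"unknown temperature unit: {unit!r}")


for record in (BUOY_XML, FLOAT_XML):
    print(round(temperature_celsius(record), 2))
```

Every new data source adds another entry to `MAPPINGS` and possibly another unit branch, which is the per-discipline work that a shared wrapper format leaves undone.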
New technology over-hyped as a solution for everything
• Technology may be quite sound, but the problem may be too big to generalize easily
• Example #2:
• netCDF is an incredibly useful data format
• There have been many attempts to standardize the implementation of netCDF for different types of ocean data and model output.
• There are still numerous problems
• Although there are some de facto standards for implementation of the netCDF format for some classes of data, there is still no widely accepted standard for in situ data
Good Ideas
• Many instances of good ideas which did not work out, through no fault of the developers
• Reasons:
• Users didn't understand the technology
• Technology was perceived as too difficult
• Off-the-shelf products became available and overtook the home-grown product
• Funding failed or was not consistently sustained, so the project could not move forward
Failure of funding
• A successful project which failed due to failure of funding (historical example):
• NOAAServer project brought together all five NOAA Line Offices
• First phase created a database of metadata with pointers to web pages delivering data
• Second phase used CORBA for an on-line system for networked access to centralized services for locating, selecting, graphing and downloading distributed in situ data
• Despite successful, working prototypes, funding was redirected and the project stopped
• The Phase One database appears to be no longer updated, nor does it seem to have been replaced.
• There is still no NOAA data portal like the successful Phase Two prototype
Successful Solutions
• Success has often been achieved by breaking the problem into smaller problems whose solutions can then be integrated at a larger scale.
In summary, successful projects
• Often start with a prototype
• Grow incrementally and have constant interaction with a user base
• Require consistent, sustained funding over many years or even decades
• Some well known, well established examples: Ferret, GrADS, EPIC, netCDF, OPeNDAP
Ferret - http://ferret.wrc.noaa.gov/Ferret/
GrADS - http://www.iges.org/grads/
EPIC - http://www.epic.noaa.gov/epic/ewb/
netCDF - http://www.unidata.ucar.edu/software/netcdf/
OPeNDAP - http://www.opendap.org/