THREDDS and Digital Library Searching
IDD Data Push
[Diagram labels: IDD Data; Local Data Manager (LDM); Decoders; Files; McIDAS Application; GEMPAK Application; Other Applications]
[Diagram labels: IDD Data; Local Data Manager (LDM); Decoders; Files; Pull Servers; ADDE Server (SSEC); OPeNDAP (UnivRI); Archival data; IDV Application; McIDAS Application; NetCDF Application]
THREDDS Goals
• Funded through NSDL initiative
• Make real scientific data “available” in Digital Libraries
• Work closely with NSDL, DLESE
  • have them provide the search interface
• NASA Global Change Master Directory (GCMD) furthest along
Challenges
• No description of what data is available from the servers
• Digital Libraries want text records, not binary datasets
• What are users searching for?
• What can users do once they “find what they are looking for”?
Collections vs. Inventory
• Too many inventory files, e.g., in IDD:
  • NCEP models: 28 collections, 6000 files
  • NEXRAD level 3 files: ~8M files
• Inventories of real-time datasets are never current
• DLs (GCMD, DLESE) don’t want them
THREDDS Catalogs
• XML documents over HTTP
• Hierarchical listing of online resources (datasets)
• Inventory files are contained in collection datasets
• Collection search in DL, browse inventory on server
• Container for arbitrary search metadata
• Standard set maps to DC, GCMD, ADN
• Metadata can be inherited
• Design goal: Make it easy for data providers
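The catalog idea above (a hierarchy of datasets in XML, with metadata that child datasets inherit) can be illustrated with a minimal Python sketch. The XML below is a simplified, hypothetical catalog: the element and attribute names are modeled loosely on the description in this slide, not copied from the THREDDS catalog schema, and the dataset names and paths are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified catalog: a collection dataset containing inventory files.
CATALOG_XML = """
<catalog name="Example catalog">
  <dataset name="NCEP model data">
    <metadata inherited="true">
      <creator>Example Center</creator>
    </metadata>
    <dataset name="Model A collection">
      <metadata inherited="true">
        <dataType>Grid</dataType>
      </metadata>
      <dataset name="run_0000.grib" urlPath="modelA/run_0000.grib"/>
      <dataset name="run_1200.grib" urlPath="modelA/run_1200.grib"/>
    </dataset>
  </dataset>
</catalog>
"""


def walk(element, inherited=None, depth=0):
    """Return (depth, name, urlPath, metadata) for every dataset under element.

    Metadata marked inherited="true" applies to the dataset that declares it
    and to all of its descendants; closer declarations override outer ones.
    """
    inherited = dict(inherited or {})  # copy so siblings do not share state
    for md in element.findall("metadata"):
        if md.get("inherited") == "true":
            for child in md:
                inherited[child.tag] = (child.text or "").strip()
    rows = []
    if element.tag == "dataset":
        rows.append((depth, element.get("name"), element.get("urlPath"), inherited))
    for child in element.findall("dataset"):
        rows.extend(walk(child, inherited, depth + 1))
    return rows
```

Calling `walk(ET.fromstring(CATALOG_XML))` yields one row per dataset; the two inventory files carry both the creator declared on the top-level dataset and the data type declared on their collection, which is what lets a digital library index the collection while the server exposes the inventory.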
Further Development
• Serve the data, too
• Tackle granularity problem by creating logical datasets
• Aggregations of files
• Harvest DL records directly from catalogs
• Put links in DL records back to the catalog
• User can start up “helper app” (IDV) to visualize data
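As a rough illustration of the "logical dataset" idea, the sketch below groups many inventory files into a few time-ordered collections. The file-naming pattern (`COLLECTION_YYYYMMDD_HHMM.grib`) and the file names are assumptions made for this example only; the slides do not say how the server identifies the members of an aggregation.

```python
import re
from collections import defaultdict

# Assumed naming convention for this sketch: <collection>_<YYYYMMDD>_<HHMM>.grib
FILENAME = re.compile(r"^(?P<collection>[A-Za-z0-9]+)_(?P<stamp>\d{8}_\d{4})\.grib$")


def aggregate(filenames):
    """Group inventory files into logical datasets, ordered by time stamp."""
    groups = defaultdict(list)
    for name in filenames:
        match = FILENAME.match(name)
        if match is None:
            continue  # not part of any collection under the assumed convention
        groups[match.group("collection")].append((match.group("stamp"), name))
    # Sorting by the stamp string sorts chronologically for this fixed-width format.
    return {coll: [name for _, name in sorted(members)]
            for coll, members in groups.items()}
```

A digital library would then hold one record per key of the returned dictionary instead of one record per file.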
THREDDS Data Server
[Diagram labels: Application; HTTP; Tomcat Server; catalog.xml; THREDDS Server with services WCS, OPeNDAP, HTTPServer, NetcdfSubset; NetCDF-Java library; Datasets; IDD Data; motherlode.ucar.edu]
THREDDS OPeNDAP Server
• Current version 2.0; NASA ESE standard
• Working on new 4.0 protocol spec
• Based on Java-OPeNDAP library
  • shared development by Unidata/opendap.org
• Any CDM dataset can be served
• Server4 (Hyrax):
  • latest version of opendap.org C++ library
  • uses THREDDS catalog generation code
  • THREDDS Catalogs replace dods_dir
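To make the client side of an OPeNDAP 2.0 server concrete, here is a small sketch that builds a request URL with an array-subset constraint expression of the form `variable[start:stride:stop]`. The host, path, and variable name are placeholders, and the function itself is a hypothetical helper, not part of any OPeNDAP library.

```python
def dap2_url(base_url, variable, slices, suffix="ascii"):
    """Build a DAP 2.0 request URL selecting a hyperslab of one variable.

    slices is a sequence of (start, stride, stop) index triples, one per
    dimension, with stop inclusive. suffix selects the response type,
    for example "ascii", "dods", "dds", or "das".
    """
    hyperslab = "".join(f"[{start}:{stride}:{stop}]" for start, stride, stop in slices)
    return f"{base_url}.{suffix}?{variable}{hyperslab}"
```

For example, `dap2_url("http://hostname.edu/thredds/dodsC/model/run.nc", "temperature", [(0, 1, 0), (10, 2, 20)])` asks for the first index of the first dimension and every second index from 10 to 20 of the second; only that subset crosses the network.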
THREDDS WCS Server
• Open Geospatial Consortium (OGC) specification for GIS clients
• Ongoing merging with ISO
• For gridded data only
• Return formats
  • GeoTIFF: floating point, greyscale
  • NetCDF / CF-1.0 (same as NetcdfSubset Service)
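A GIS client typically reaches a WCS server with a key-value `GetCoverage` request. The sketch below builds such a URL, assuming the WCS 1.0.0 key-value encoding (the slides do not name a version); the endpoint, coverage name, and format identifier are placeholders, and a real request may need further parameters such as output grid size or resolution.

```python
from urllib.parse import urlencode


def wcs_getcoverage_url(endpoint, coverage, bbox, fmt="GeoTIFF", time=None):
    """Build a GetCoverage URL for one gridded coverage.

    bbox is (min_lon, min_lat, max_lon, max_lat) in the assumed EPSG:4326 CRS.
    fmt is the requested return format identifier (placeholder values here).
    """
    params = [
        ("service", "WCS"),
        ("version", "1.0.0"),
        ("request", "GetCoverage"),
        ("coverage", coverage),
        ("crs", "EPSG:4326"),
        ("bbox", ",".join(str(v) for v in bbox)),
        ("format", fmt),
    ]
    if time is not None:
        params.append(("time", time))
    # urlencode percent-encodes reserved characters such as "," and ":".
    return endpoint + "?" + urlencode(params)
```

Switching `fmt` between the GeoTIFF and NetCDF identifiers is all a client would change to move between the two return formats listed above.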
Common Data Model
[Diagram labels: Application; HTTP; Tomcat Server; catalog.xml; THREDDS Server with services WCS, OPeNDAP, HTTPServer, NetcdfSubset; “Then a miracle happens”; NetCDF-Java library; Datasets; IDD Data; hostname.edu]
NetCDF-Java version 2.2 architecture
[Diagram labels: Application; THREDDS; Catalog.xml; Scientific Datatypes; Datatype Adapter; NetcdfDataset; CoordSystem Builder; NcML; NetcdfFile; ADDE; OPeNDAP; I/O service provider; NetCDF-3; NetCDF-4; HDF5; GRIB; GINI; NIDS; Nexrad; DMSP; …]
I/O Service Provider Implementations
• General: NetCDF, HDF5, OPeNDAP
• Gridded: GRIB-1, GRIB-2
• Radar: NEXRAD level 2 and 3, DORADE, Chinese NEXRAD
• Point: BUFR, ASCII
• Satellite: DMSP, GINI, McIDAS AREA
• In development / tentative
  • NOAA CLASS legacy files
  • Barrowdale DataBlade
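The service-provider pattern behind this list can be sketched in a few lines: each provider recognizes one file format, and a registry picks the first provider that claims a file. This is a Python analogy with hypothetical class and function names, not the NetCDF-Java API; only the magic-number prefixes (`CDF` for classic NetCDF, `\x89HDF` for HDF5, `GRIB` for GRIB) are facts about the formats themselves.

```python
class Provider:
    """Hypothetical provider: recognizes one file format from its leading bytes."""

    def __init__(self, name, magics):
        self.name = name
        self.magics = magics

    def is_valid(self, header: bytes) -> bool:
        return any(header.startswith(magic) for magic in self.magics)


# Registry of providers, tried in order.
PROVIDERS = [
    Provider("NetCDF-3", [b"CDF\x01", b"CDF\x02"]),
    Provider("HDF5", [b"\x89HDF\r\n\x1a\n"]),
    Provider("GRIB", [b"GRIB"]),
]


def find_provider(header: bytes, providers=PROVIDERS):
    """Return the name of the first provider that accepts the header, else None."""
    for provider in providers:
        if provider.is_valid(header):
            return provider.name
    return None
```

Supporting a new format then means appending one provider to the registry; nothing that consumes the opened dataset has to change.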
CDM Payoff
• N + M instead of N * M things on your TODO List!
[Diagram labels: File Format #1; File Format #2; File Format #N; Visualization & Analysis; NetCDF file; OPeNDAP Server; WCS Service; Web Service]
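The N + M claim is simple arithmetic: with a common data model, each of N file formats needs one reader into the model and each of M services needs one adapter out of it, instead of one piece of code per (format, service) pair. A minimal sketch, with the counts in the usage note chosen purely for illustration:

```python
def adapters_needed(n_formats, m_services, common_model):
    """Count the pieces of code to write for N formats and M services."""
    if common_model:
        # One reader per format plus one adapter per service.
        return n_formats + m_services
    # Without a shared model, every service must understand every format.
    return n_formats * m_services
```

With, say, 12 formats and 4 services, `adapters_needed(12, 4, True)` gives 16 while `adapters_needed(12, 4, False)` gives 48, and the gap widens with every format or service added.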
Digital Library (OAI) Harvesting
[Diagram labels: OAI Harvester; DL Records; HTTP; Tomcat Server; OAI Provider; Catalog.xml; THREDDS Server with services OPeNDAP, HTTPServer, WCS; Application; NetCDF-Java (CDM) library; Datasets; hostname.edu; OPeNDAP Server; otherhost.gov]
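The harvesting step turns catalog metadata into text records a digital library can index. The sketch below maps a dataset's metadata to a Dublin Core record using the standard `oai_dc` and `dc` namespaces. The input keys and the field mapping are assumptions for illustration; the slides say only that a standard metadata set maps to DC, GCMD, and ADN.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Hypothetical mapping from catalog metadata keys to Dublin Core elements.
FIELD_MAP = {
    "name": "title",
    "summary": "description",
    "creator": "creator",
    "catalog_url": "identifier",  # link from the DL record back to the catalog
}


def to_dublin_core(dataset):
    """Build an oai_dc record element from a dict of dataset metadata."""
    ET.register_namespace("dc", DC)
    ET.register_namespace("oai_dc", OAI_DC)
    record = ET.Element(f"{{{OAI_DC}}}dc")
    for key, dc_name in FIELD_MAP.items():
        value = dataset.get(key)
        if value:  # skip metadata the provider did not supply
            ET.SubElement(record, f"{{{DC}}}{dc_name}").text = value
    return record
```

`ET.tostring(to_dublin_core(...), encoding="unicode")` serializes the record for an OAI provider to return; the `identifier` element carries the link back to the catalog so a user who finds the record can browse the inventory on the server.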
Current Status
• THREDDS Data Server
  • A good framework for pull data services
  • Focus on dataset aggregation
• Small number of datasets currently available on motherlode (30)
• Other TDS early adopters have a few hundred more
• Metadata quality uneven
How to deliver the data?
• What data protocols will users need?
  • OPeNDAP has a fair amount of current use in atmospheric/ocean community
  • OGC still immature, but EU insisting on formal standards
  • ADDE use restricted to McIDAS community
• What data formats do users need?
  • Strategy is to promote netCDF/CF-1.0
What to do next?
• Community annotation of datasets
  • “Datapedia”
• Unidata Data Czar
  • Domain expert
  • Add metadata to known datasets
• We will need a search strategy
  • DLESE, NSDL probably won’t solve this
  • GCMD probably useful
• Working with international partners
  • NCAR Community Data Portal
  • British Atmospheric Data Centre