THREDDS Data Server (TDS) and Data Discovery. John Caron Unidata/UCAR May 15, 2006. THREDDS Data Server. OAI Harvester. HTTP Tomcat Server. OAI Provider. DL Records. Catalog.xml. THREDDS Server. Application. NetCDF-Java (CDM) library. OPeNDAP. HTTPServer. WCS. Datasets.
THREDDS Data Server (TDS) and Data Discovery John Caron Unidata/UCAR May 15, 2006
THREDDS Data Server • [Architecture diagram; labels:] HTTP Tomcat Server (hostname.edu) • THREDDS Server Application • Catalog.xml • NetCDF-Java (CDM) library • Datasets • Services: OPeNDAP, HTTPServer, WCS • OAI Provider, DL Records, OAI Harvester • OPeNDAP Server (otherhost.gov)
Collection vs Inventory Datasets • [Diagram] Catalog with a DatasetScan over /models/ncep/NAM/ (File1.grib, File2.grib, File3.grib) and a tree of Dataset nodes • Example dataset URL: http://motherlode.ucar.edu:8080/thredds/dodsC/model/NCEP/DGEX/CONUS_12km/file.grib2
DL Harvesting • [Diagram] Catalog with a DatasetScan over /models/ncep/NAM/ (File1.grib, File2.grib, File3.grib) and a tree of Dataset nodes; two Dataset nodes carry a Metadata Record • isHarvest = true • inherit = true
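The harvesting idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog is a hypothetical nested dict, and the keys `isHarvest` and `inherit` simply mirror the slide's labels rather than the actual THREDDS catalog schema. A record is emitted only for datasets flagged for harvesting, and metadata flows down to children only when it is marked as inherited.

```python
# Hypothetical in-memory catalog; key names mirror the slide's labels,
# not the real THREDDS catalog XML schema. The publisher value is made up.
catalog = {
    "name": "NCEP NAM collection",
    "isHarvest": True,
    "metadata": {"publisher": "Example Publisher", "inherit": True},
    "children": [
        {"name": "File1.grib", "children": []},
        {"name": "File2.grib", "children": []},
    ],
}


def harvest(dataset, inherited=None):
    """Yield one metadata record per dataset flagged with isHarvest."""
    inherited = dict(inherited or {})
    metadata = dataset.get("metadata", {})
    own = {k: v for k, v in metadata.items() if k != "inherit"}
    effective = {**inherited, **own}
    if dataset.get("isHarvest"):
        yield {"name": dataset["name"], **effective}
    # Metadata is passed to children only when it is marked inheritable.
    passed_down = effective if metadata.get("inherit") else inherited
    for child in dataset.get("children", []):
        yield from harvest(child, passed_down)


records = list(harvest(catalog))
for record in records:
    print(record)
```

Only the collection-level dataset produces a record here; the individual files are skipped because they are not flagged for harvesting.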
Metadata Information • Title / Summary • Publisher / Creator / Rights • Lat/Lon bounding box • Time range • Relative time: “latest 7 days” • Variable names • DLESE: no (not dataset oriented) • GCMD: controlled list, required • Unique ID / Resource URL
Why not harvest Inventory? • Too many of them, e.g. in the IDD: • NCEP models: 28 collections, 6000 files • NEXRAD level 3 files: ~8M files • Harvested records for real-time datasets are never current • DLs (GCMD, DLESE) don’t want them • Search for collections in the DL, browse inventory on the server.
Current Work: Aggregation • Make many files into single logical dataset: Make Collection Dataset = Inventory • Uses NcML to read into CDM, works at the “syntactic” level. • Replaces older “Aggregation Server” • Union • Join on existing dimension • Join on new dimension
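The three aggregation types named above can be pictured with plain Python lists. This is a conceptual sketch only, not how NcML or the CDM implements aggregation; each "file" is modelled as a hypothetical dict mapping a variable name to its values along the time dimension.

```python
# Each "file" is a hypothetical dict: variable name -> values along time.
file1 = {"T": [1.0, 2.0]}
file2 = {"T": [3.0, 4.0]}
other = {"P": [10.0, 20.0]}


def union(datasets):
    """Union: combine the variables of several files into one dataset."""
    merged = {}
    for ds in datasets:
        for name, values in ds.items():
            merged.setdefault(name, values)
    return merged


def join_existing(datasets, var):
    """Join on existing dimension: concatenate along the time axis."""
    joined = []
    for ds in datasets:
        joined.extend(ds[var])
    return joined


def join_new(datasets, var):
    """Join on new dimension: stack the files along a new outer axis."""
    return [list(ds[var]) for ds in datasets]


print(union([file1, other]))
print(join_existing([file1, file2], "T"))
print(join_new([file1, file2], "T"))
```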
TDS/NcML Aggregation
<dataset name="WEST-CONUS_4km Aggregation" urlPath="satellite/3.9/WEST-CONUS_4km">
  <netcdf xmlns="http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2">
    <aggregation dimName="time" type="joinExisting">
      <scan location="C:/data/goes/" suffix=".gini"/>
    </aggregation>
  </netcdf>
</dataset>
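As a rough picture of what the scan element selects, the sketch below filters a list of file names by suffix and sorts them by name. The file names are hypothetical, and ordering by name is an assumption made for illustration; the slide does not say how the server orders the scanned files.

```python
def scan(filenames, suffix):
    """Select files with the given suffix, ordered by name (assumed order)."""
    return sorted(name for name in filenames if name.endswith(suffix))


# Hypothetical directory listing for the location in the NcML above.
listing = [
    "WEST-CONUS_4km_3.9_20060515_1300.gini",
    "readme.txt",
    "WEST-CONUS_4km_3.9_20060515_1200.gini",
]

members = scan(listing, ".gini")
print(members)
```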
Next: DataType Aggregation • Work at the CDM DataType level, know (some) data semantics • Forecast Model Collection • Combine multiple model forecasts into single dataset with two time dimensions • With NOAA/IOOS (Steve Hankin) • Point/Station/Trajectory/Profile Data • Allow space/time queries, return nested sequences • Start from / standardize “Dapper conventions”
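The "two time dimensions" of a forecast model collection can be illustrated as a grid indexed by run time and forecast offset, where each cell has a valid time. The run times and offsets below are made up for the example, and the function name is hypothetical; this is a sketch of the idea rather than the CDM's data model.

```python
from datetime import datetime, timedelta

# Hypothetical model runs: run time -> forecast offsets in hours.
runs = {
    datetime(2006, 5, 15, 0): [0, 6, 12],
    datetime(2006, 5, 15, 6): [0, 6, 12],
}


def two_time_grid(model_runs):
    """Map (run time, forecast offset) to the valid time of that forecast."""
    grid = {}
    for run_time, offsets in model_runs.items():
        for hours in offsets:
            grid[(run_time, hours)] = run_time + timedelta(hours=hours)
    return grid


grid = two_time_grid(runs)
for (run_time, hours), valid_time in sorted(grid.items()):
    print(run_time.isoformat(), f"+{hours}h", "->", valid_time.isoformat())
```

Note that the same valid time can appear in more than one run (for example 12:00 is the 12-hour forecast of the 00:00 run and the 6-hour forecast of the 06:00 run), which is why a single time axis is not enough.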
Forecast Model Collections
Web services for discovery • “Latest dataset” Resolver service • Dataset Query Capability (DQC) : accept query, return results as a collection of datasets in a catalog • Future: Dynamic dataset creation based on user query ??
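A "latest dataset" resolver can be sketched as picking the newest entry from a collection. The file-name pattern below is hypothetical and only serves the example; an actual resolver may rely on other information, such as modification times or catalog metadata.

```python
from datetime import datetime


def resolve_latest(names, fmt="NAM_%Y%m%d_%H%M.grib"):
    """Return the name whose embedded timestamp is the most recent.

    The naming pattern is an assumption made for this sketch.
    """
    return max(names, key=lambda name: datetime.strptime(name, fmt))


inventory = [
    "NAM_20060514_1200.grib",
    "NAM_20060515_0000.grib",
    "NAM_20060514_1800.grib",
]

latest = resolve_latest(inventory)
print(latest)
```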
Summary • Expect discovery to be two-phased: • Search for collections in a DL with a browser • Use an application like the IDV (OPeNDAP) or a GIS client (WCS) to drill down to the actual data. • Expect that aggregation / query will (eventually) tame the “inventory problem”
Dataset Query Capability Document • XML document that describes the set of valid queries for a dataset. Queries are URLs: http://www/dqc/radar?stn=ABR&product=NOR&time=1hour • Selectors: • List of choices • List of stations • Numeric range (point or subrange) • DateRange • Latitude/Longitude Bounding Box • Orthogonal selections (except Lists can be nested) • Returns a catalog containing inventory datasets.
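Since queries are plain URLs of the form param=value, a client can assemble one from the user's selections. The sketch below reproduces the slide's example query using the standard library; the helper name `build_query` is made up for the illustration.

```python
from urllib.parse import urlencode


def build_query(base, **selections):
    """Append the selections to the base URL as param=value pairs."""
    return base + "?" + urlencode(selections)


url = build_query("http://www/dqc/radar", stn="ABR", product="NOR", time="1hour")
print(url)
```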
Example DQC
<queryCapability>
  <query base="http://www/dqc/radar"/>
  <selectList id="prod" title="Parameters">
    <choice name="reflect" value="N0R">
      <description>.5u reflectivity</description>
    </choice>
    <choice name="velocity" value="N0S">
      <description>.5u storm rel. velocity</description>
      <selectList id="time" title="Times">
        <choice name="Latest" value="latest"/>
        <choice name="LastHour" value="1hour"/>
      </selectList>
    </choice>
  </selectList>
  <selectStation id="station" title="Stations">
    <station name="AK" value="ABC">
      <location latitude="60" longitude="161"/>
    </station>
    <station name="SD" value="ABR">
      <location latitude="45" longitude="-98.4"/>
    </station>
  </selectStation>
  <selectFromDateRange id="datePnt" title="Date" selectType="point" start="2004-04-01T00:00" end="2004-04-15T12:00"/>
</queryCapability>
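A client could read such a document to learn which values each selector accepts before sending a query. The sketch below parses a trimmed copy of the example (nested lists, descriptions and the date range are left out) with the standard library; the function names are hypothetical and the validation rule is a simplification, not part of the DQC design itself.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example DQC, kept small for the illustration.
DQC = """<queryCapability>
  <query base="http://www/dqc/radar"/>
  <selectList id="prod" title="Parameters">
    <choice name="reflect" value="N0R"/>
    <choice name="velocity" value="N0S"/>
  </selectList>
  <selectStation id="station" title="Stations">
    <station name="AK" value="ABC"/>
    <station name="SD" value="ABR"/>
  </selectStation>
</queryCapability>"""


def allowed_values(dqc_xml):
    """Collect the accepted values of each list-like selector by its id."""
    root = ET.fromstring(dqc_xml)
    allowed = {}
    for selector in root:
        if selector.tag in ("selectList", "selectStation"):
            allowed[selector.get("id")] = {
                child.get("value") for child in selector if child.get("value")
            }
    return allowed


def is_valid(query, allowed):
    """A query is valid if every parameter names a known selector value."""
    return all(k in allowed and v in allowed[k] for k, v in query.items())


allowed = allowed_values(DQC)
print(allowed)
print(is_valid({"prod": "N0R", "station": "ABR"}, allowed))
```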
Issues • DQC itself doesn’t deal with the query: http://www/dqc/radar?stn=ABR&product=NOR&time=1hour • Queries are expressible as param=value • Extend to arbitrary URLs (token substitution), e.g. dods • SOAP RPC? • Returns a catalog; might return the data itself. • Prototype/non-standard; needs buy-in from clients to be worth continuing.
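The token-substitution idea mentioned above could look like the following sketch, which fills placeholders in a URL template with the user's selections. The template URL and token names are hypothetical; the slide does not define a template syntax, so Python's `string.Template` stands in for one here.

```python
from string import Template


def expand(template, selections):
    """Replace ${token} placeholders; raises KeyError if one is missing."""
    return Template(template).substitute(selections)


# Hypothetical URL template, made up for the illustration.
template = "http://www/dods/radar/${stn}/${product}"
expanded = expand(template, {"stn": "ABR", "product": "NOR"})
print(expanded)
```

Unlike the param=value form, the selections end up in the URL path, which is what would allow queries to map onto arbitrary existing URL layouts.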