Recent Work in Progress
John Caron, June 3, 2003
• THREDDS development
  • Dynamic Catalogs: DQC, Resolvers
  • IDD Data Server
  • ADDE Cataloger
• NetCDF development
  • NetCDF Markup Language (NcML)
  • More efficient Java I/O (NIO)
  • NetCDF/DODS/HDF5 Data Models
THREDDS Catalogs
[Diagram labels: Client Application; HTTP Server; Catalog.xml; CatalogRef.xml; Catalog Generator; Data Server (DODS, ADDE, FTP, HTTP); Datasets; hostname.edu]
Dynamic Catalogs = Services
[Diagram labels: Client Application; HTTP Tomcat Server; Catalog Service; Catalog Generator; Catalog.xml; CatalogRef.xml; Query Resolver Service; DQC.xml; Resolver Service; URI; URL; Data Server (DODS, ADDE, FTP, HTTP); Datasets; hostname.edu]
Dataset Query Capability (DQC)
• XML document.
• Describes what the user can ask for as a set of orthogonal “selections”.
• On the client, a “query URL” is formed based on the user’s choices, and sent to the server.
• The “query resolver” server finds which datasets satisfy the query and returns a list of real dataset URLs.
• The DQC describes the queries that the server is capable of responding to.
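The client-side step above (turning the user's choices into a query URL) can be sketched roughly as follows. This is an illustration only, not the DQC specification: the `key=value` query-string layout, the class and method names, the `/queryResolver` path, and the `station` and `product` selection names are all assumptions made for the example. Only `hostname.edu` comes from the slides.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from the DQC specification.
final class DqcQuerySketch {

    /** Appends the user's selections to the resolver's base URL as key=value pairs. */
    static String buildQueryUrl(String baseUrl, Map<String, String> selections) {
        StringBuilder url = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> choice : selections.entrySet()) {
            url.append(separator)
               .append(encode(choice.getKey()))
               .append('=')
               .append(encode(choice.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    /** Encodes one query component as form data (spaces become '+'). */
    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Each entry is one independent ("orthogonal") selection made by the user.
        Map<String, String> selections = new LinkedHashMap<>();
        selections.put("station", "KDEN");
        selections.put("product", "N0R");
        System.out.println(buildQueryUrl("http://hostname.edu/queryResolver", selections));
    }
}
```

Because the selections are orthogonal, a plain map with one entry per selection is enough here; the server side would then match the resulting query against its holdings and return real dataset URLs.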
Resolver Services
• Logical Dataset, e.g. “latest ETA model run”
• Dataset with Service type “Resolver”
• On the client, the URI of the logical dataset is sent to the server.
• The server finds what is available and returns a list of real dataset URLs.
ADDE Cataloger
[Diagram labels: Client Application; HTTP Tomcat Server; Query Resolver Service; DQC.xml; Catalog Service; Catalog.xml; ADDE Cataloger; CatalogRef.xml; ADDE Data Server; Datasets; IDD; hostname.edu]
Summary: IDD Data Server
Make as many of the IDD data feeds as possible available via THREDDS.
• NCEP model data (catgen) (DODS)
• Level 3 NEXRAD (custom server/DQC) (ADDE)
• SSEC/Unidata Satellite data (ADDE Cataloger) (ADDE)
• Text Data: METARs, Surface Obs, etc. (DQC/custom server), returns text or XML.
• Profiler Data (custom server/DQC) (ADDE)
NetCDF 3
[Diagram labels: Client Application; NetCDF-3 library API; NetCDF File (local file, HTTP protocol); OpenDAP Dataset (OpenDAP protocol); NcML Dataset XML (virtual dataset)]
NetCDF Markup Language
XML representation of netCDF metadata, uses XML Schema
• Core: existing netCDF data model
• Coordinate System: general and georeferencing coordinate systems
• Dataset: redefine, aggregate, subset
• Luca Cinquini (NCAR/SCD/ESG), John Caron, Ethan Davis, Bob Drach (LLNL), Stefano Nativi (Florence), Russ Rew
NcML Coordinate Systems
• Convention Parser
• ATDRadar
• AWIPS
• COARDS
• CF
• CSM
• GDV
• NUWG
• WRF
• Zebra
[Diagram labels: NetCDF File; OpenDAP Dataset; Netcdf Dataset; NcML Dataset XML]
GeoGrids, GeoTiffs, Geowhiz!
[Diagram labels: NetCDF File; OpenDAP Dataset; Convention Parser; Netcdf Dataset; GeoGrid factory; GeoGrid Dataset; VisAD / IDV; WCS Server; GeoTiff Writer; "Strange land of GIS"; OpenGIS WCS; GeoTiff File]
NcML Dataset: “virtual view”
[Diagram labels: NetCDF File; OpenDAP Dataset; NcML Dataset XML; Dataset XML Parser; NetCDF Dataset; Java-netCDF 2.1; Client Application]
NcML Dataset
• Use NcML like CDL, to declare the contents of a netCDF file.
• Add, delete, or rename Variables, Attributes, and Dimensions.
• Subset Variables.
• Reorder a Variable’s dimensions.
• Aggregate multiple netCDF files, à la DODS Aggregation Server.
• An NcML Dataset is a “virtual view”, or it can be copied to a local netCDF file.
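One way to picture the “virtual view” idea is a thin name-mapping layer over a source that is never modified. The sketch below is purely illustrative and is not the Java-netCDF API: the class `VirtualViewSketch`, its methods, and the use of a `Map<String, double[]>` to stand in for a file's variables are all assumptions made for the example. It covers only rename and delete, not subsetting, dimension reordering, or aggregation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a "virtual view"; this is not the Java-netCDF API.
final class VirtualViewSketch {
    // Stands in for the variables of an existing file; never modified by the view.
    private final Map<String, double[]> source;
    // Maps each name visible in the view to the name used in the source.
    private final Map<String, String> viewToSource = new LinkedHashMap<>();

    VirtualViewSketch(Map<String, double[]> source) {
        this.source = source;
        for (String name : source.keySet()) {
            viewToSource.put(name, name); // initially everything is exposed unchanged
        }
    }

    /** Renames a variable in the view only. */
    void rename(String oldName, String newName) {
        if (viewToSource.containsKey(newName)) {
            throw new IllegalArgumentException("Name already in use: " + newName);
        }
        String sourceName = viewToSource.remove(oldName);
        if (sourceName == null) {
            throw new IllegalArgumentException("No variable named " + oldName);
        }
        viewToSource.put(newName, sourceName);
    }

    /** Hides a variable from the view; the source keeps it. */
    void delete(String name) {
        if (viewToSource.remove(name) == null) {
            throw new IllegalArgumentException("No variable named " + name);
        }
    }

    List<String> variableNames() {
        return new ArrayList<>(viewToSource.keySet());
    }

    /** Reads through the view by translating the view name back to the source name. */
    double[] read(String name) {
        String sourceName = viewToSource.get(name);
        if (sourceName == null) {
            throw new IllegalArgumentException("No variable named " + name);
        }
        return source.get(sourceName);
    }

    public static void main(String[] args) {
        Map<String, double[]> file = new LinkedHashMap<>();
        file.put("T", new double[] {280.0, 281.5});
        file.put("P", new double[] {1000.0, 990.0});

        VirtualViewSketch view = new VirtualViewSketch(file);
        view.rename("T", "temperature");
        view.delete("P");
        System.out.println(view.variableNames() + " over source " + file.keySet());
    }
}
```

Writing the view out as a copy would amount to iterating over `variableNames()` and storing each `read(...)` result under its view name.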
2: NcML Datasets on a Server
[Diagram labels: Client Application; Dataset XML Parser; Catalog.xml; DODS Agg/Netcdf Server (DODS, ADDE, FTP, HTTP); NcML Dataset XML; Datasets; hostname.edu]
3: NcML Datasets via Catalogs
[Diagram labels: Catalog.xml; NcML Dataset XML; NetCDF File; OpenDAP Dataset; Catalog/Dataset XML Parser; Java-netCDF v 2.1.1; Client Application]
NIO
• Rewrite ucar.nc2 I/O layer using java.nio package (currently using ucar.netcdf)
• Uses memory mapping, bulk I/O transfer
• Prototype has 7x speedup on large files.
• Requires JDK 1.4+
• HTTP access must be rewritten
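The two java.nio techniques named above, memory mapping and bulk transfer, can be illustrated with a small self-contained sketch. It is not the ucar.nc2 prototype: the class and method names are hypothetical, the file is a scratch file of raw big-endian floats rather than a netCDF file, and for brevity it uses `FileChannel.open` and `java.nio.file`, which are newer than the JDK 1.4 API mentioned on the slide.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical sketch; not code from the ucar.nc2 NIO prototype.
final class MappedReadSketch {

    /** Maps part of a file into memory and copies `count` floats out in one bulk get. */
    static float[] readFloats(Path file, long offset, int count) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                channel.map(FileChannel.MapMode.READ_ONLY, offset, (long) count * Float.BYTES);
            float[] values = new float[count];
            // Byte buffers are big-endian by default; the float view inherits that order.
            mapped.asFloatBuffer().get(values);
            return values;
        }
    }

    /** Writes four floats to a scratch file, then reads the middle two back via the mapping. */
    static float[] roundTripDemo() {
        try {
            Path file = Files.createTempFile("mapped-read", ".bin");
            file.toFile().deleteOnExit();
            ByteBuffer bytes = ByteBuffer.allocate(4 * Float.BYTES);
            bytes.putFloat(1.5f).putFloat(2.5f).putFloat(3.5f).putFloat(4.5f);
            Files.write(file, bytes.array());
            return readFloats(file, Float.BYTES, 2); // skip the first float
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(roundTripDemo()));
    }
}
```

The bulk `get(float[])` replaces a per-value read loop, and the mapping lets the operating system page file contents in directly instead of copying them through an intermediate stream buffer.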
NIO vs current Java
                                NIO    Current    old/new
First access
  small (3.9 Mb)                281        671        2.4
  large (240 Mb)               3334      28221        8.5
Average next 5 accesses
  small                          54        290        5.4
  large                        2239      16367        7.3
• Time in millisecs to sequentially read entire file
• Wintel 2GHz, 1 GB main memory
• Java 1.4.2 -client
NIO vs optimized C
                                NIO          C      C/NIO
First access
  small (3.9 Mb)                281        370        1.3
  large (240 Mb)               3334      19348        5.8
Average next 5 accesses
  small                          54         24        .44
  large                        2239        953        .43
• Java 1.4.2 -client vs. VC 6.0 /O2
NetCDF Data Model
[Diagram labels: NetcdfFile; Dimension; Variable; Attribute]
• DataType: byte, char, short, int, float, double
OpenDAP Data Model
[Diagram labels: Dataset; BaseType; Attribute; array; Dimension; structure / sequence]
• BaseType: primitive (8), string, array, grid, structure, sequence
HDF5 Data Model
[Diagram labels: Dataset; Datatype; DataSpace; Attribute]
• Datatype: Fixed point, floating point, date/time, string, bit field, Opaque, Compound, Reference, Enumeration, Variable length, Array
Groups: file directory structure inside HDF file.
• Data storage
• Compact
• External
• Layout
• Indexed
• Striped
Possible Extensions to netCDF data model
• Add new data types:
  • Strings: variable-length arrays of bytes, plus an encoding attribute.
  • Structures: collections of any other element types, allow nested structures.
  • Vector: a variable-length 1D array of any type.
• Allow reusable structure definition = user-defined data type.
• Allow unnamed, undeclared dimensions = anonymous dimensions.
• Allow multiple unlimited dimensions (outer dimension only).
• Compression. Push scale/offset into library, allow variable bit sizes.
• Explicit support for coordinate variables/axes.
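The scale/offset item refers to the packing arithmetic that netCDF users commonly express through the `scale_factor` and `add_offset` attributes, where unpacked = packed * scale_factor + add_offset. A minimal sketch of what a library would do if it took over that step is shown below. The class and method names are hypothetical, and the sketch fixes the packed type at 16 bits, whereas the proposal mentions variable bit sizes.

```java
// Hypothetical sketch of scale/offset packing; not a netCDF library API.
final class ScaleOffsetSketch {

    /** Packs a value into 16 bits: packed = round((value - addOffset) / scaleFactor). */
    static short pack(double value, double scaleFactor, double addOffset) {
        long packed = Math.round((value - addOffset) / scaleFactor);
        if (packed < Short.MIN_VALUE || packed > Short.MAX_VALUE) {
            throw new IllegalArgumentException("Value " + value + " does not fit in 16 bits");
        }
        return (short) packed;
    }

    /** Unpacks a stored value: unpacked = packed * scaleFactor + addOffset. */
    static double unpack(short packed, double scaleFactor, double addOffset) {
        return packed * scaleFactor + addOffset;
    }

    public static void main(String[] args) {
        double scaleFactor = 0.5;
        double addOffset = 100.0;
        short stored = pack(112.7, scaleFactor, addOffset);
        // The round trip loses at most half a scale step of precision.
        System.out.println(stored + " -> " + unpack(stored, scaleFactor, addOffset));
    }
}
```

Storing a 16-bit integer in place of a 64-bit double is where the space saving comes from; the cost is precision limited to half of `scaleFactor`.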
New NetCDF Data Model
[Diagram labels: NetcdfFile; Variable; Structure; Dimension; DataType; Attribute]
• DataType: byte, short, int, long, float, double, String, Structure, Vector
• Vector
• Length
NetCDF 4
[Diagram labels: Client Application; NetCDF 4 library API; NetCDF V.1 and 2 File; HDF5 File; Local file or HTTP protocol; OpenDAP Dataset (OpenDAP 4.0 protocol); NcML Dataset XML (virtual dataset)]