
Unidata’s Common Data Model and the THREDDS Data Server

John Caron, Unidata/UCAR, Boulder CO. Jan 6, 2006, ESIP Winter 2006.


Presentation Transcript


  1. Unidata’s Common Data Model and the THREDDS Data Server. John Caron, Unidata/UCAR, Boulder CO. Jan 6, 2006, ESIP Winter 2006

  2. Outline • Definitions • Creating a Common Data (Access) Model from NetCDF, HDF5, OPeNDAP • CDM Coordinate Systems, Data Types • CDM implementation • NetCDF Markup Language (NcML) • The THREDDS Data Server

  3. NetCDF-3 • Machine and OS independent file format for “self-describing” scientific data • C library (Fortran, C++, Perl, IDL, MatLab, Python, Ruby), Java library • Efficient subsetting of multidimensional arrays. • > 20,000 downloads last year

  4. HDF5 • Machine and OS independent file format for “self-describing” scientific data • C library (Fortran, Java, PyTables) • Evolution from HDF4, but different. • HDF-EOS, HDF5-EOS, standard formats for EOSDIS, ASCI, NPOESS • Parallel-IO, chunked storage, compression filters, many data types. • Developed at NCSA, now independent

  5. NetCDF-4 • Project funded by NASA to create new version of netCDF using the HDF5 file format. • “Extend and merge” netCDF and HDF5 • Widespread use and simplicity of netCDF • Generality and performance of HDF5

  6. NetCDF-Java 2.2 (nj22) • 100% Java library • Prototype implementation of CDM • File formats: • General: NetCDF, HDF5, OPeNDAP • Grids: GRIB1, GRIB2 • Radar: NEXRAD, NIDS, DORADE • Satellite: DMSP, GINI • Access to THREDDS catalogs

  7. OPeNDAP • Client-server protocol for scientific data access • C++ client and server, Java client and server libraries. • Current version 2.0; NASA ESE standard • Working on new 4.0 protocol spec

  8. THREDDS • Originally funded by NSDL • “discovery and use of scientific data” • Middleware between data providers and users • Dataset Inventory Catalogs (XML) • Now part of Unidata core funding • Data Serving (pull)

  9. What’s a Data Model? • It’s about scientific data: storing and accessing it • It’s an abstraction • Equivalent to an abstract object model in OOP • An Abstract Data Model describes data objects and what methods you can use on them

  10. What’s a Data Model? • An API is the interface to the Data Model for a specific programming language • A file format is a way to persist the objects in the Data Model • A data access protocol plays the role of a file format • The Abstract Data Model removes the details of any particular API and the persistence format
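The distinction on this slide can be sketched in Java. This is a minimal illustration and not part of any Unidata library: `VariableModel`, `InMemoryVariable`, `CsvTextVariable`, and `DataModelDemo` are hypothetical names. The interface plays the role of the abstract data model, and the two classes stand in for two persistence formats behind it.

```java
// Hypothetical abstract data model: a named variable holding 1-D numeric data.
interface VariableModel {
    String name();
    double[] read();
}

// One persistence choice: the values already live in memory.
final class InMemoryVariable implements VariableModel {
    private final String name;
    private final double[] values;

    InMemoryVariable(String name, double[] values) {
        this.name = name;
        this.values = values.clone();
    }

    public String name() { return name; }
    public double[] read() { return values.clone(); }
}

// Another persistence choice: the values are stored as comma-separated text.
final class CsvTextVariable implements VariableModel {
    private final String name;
    private final String text;

    CsvTextVariable(String name, String text) {
        this.name = name;
        this.text = text;
    }

    public String name() { return name; }

    public double[] read() {
        String[] parts = text.split(",");
        double[] out = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Double.parseDouble(parts[i].trim());
        }
        return out;
    }
}

final class DataModelDemo {
    // Client code depends only on the model, not on how the data is persisted.
    static double sum(VariableModel v) {
        double total = 0.0;
        for (double x : v.read()) {
            total += x;
        }
        return total;
    }
}
```

`DataModelDemo.sum` gives the same result for either backing store, which is the sense in which the model hides the persistence format.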

  11. Creating a Common Data Access Model from NetCDF, HDF5, OPeNDAP

  12. NetCDF-3 Data Model

  13. OPeNDAP Data Model (DAP-2)

  14. HDF5 Data Model

  15. Common Data (Access) Model

  16. Coordinate Systems and Scientific Data Types

  17. Common Data Model Layers • Scientific Datatypes: Point, Trajectory, Station, Radial, Grid, Swath • Coordinate Systems • Data Access

  18. Coordinate Systems needed • NetCDF, OPeNDAP, HDF data models do not have integrated coordinate systems • so georeferencing is not part of the API • Need conventions to specify them (e.g. CF-1, COARDS) • Contrast GRIB, HDF-EOS, other specialized formats • Must be done in a general way

  19. Coordinate Systems • Same underlying mathematics as VisAD, ASCII

  20. Scientific Data Types • Based on datasets Unidata is familiar with • APIs are evolving • How are data points connected? • Intended to scale to large, multifile collections • Intended to support “specialized queries” • Space, Time • Corresponding “standard” NetCDF file conventions

  21. Point Observation Data

  22. PointObsDataset Methods • Collection getData(LatLonRect boundingBox, Date start, Date end); returns a Collection of StructureData
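A space and time query of this shape can be sketched as a simple filter. This is an illustration under stated assumptions, not the netCDF-Java implementation: `PointObs` and `BoundingBox` are hypothetical stand-ins for `StructureData` and `LatLonRect`, the time window is treated as inclusive at both ends, and boxes that cross the date line are not handled.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical stand-in for one observation record (StructureData on the slide).
final class PointObs {
    final double lat;
    final double lon;
    final Date time;
    final double value;

    PointObs(double lat, double lon, Date time, double value) {
        this.lat = lat;
        this.lon = lon;
        this.time = time;
        this.value = value;
    }
}

// Hypothetical stand-in for LatLonRect; does not handle date-line crossing.
final class BoundingBox {
    final double latMin;
    final double latMax;
    final double lonMin;
    final double lonMax;

    BoundingBox(double latMin, double latMax, double lonMin, double lonMax) {
        this.latMin = latMin;
        this.latMax = latMax;
        this.lonMin = lonMin;
        this.lonMax = lonMax;
    }

    boolean contains(double lat, double lon) {
        return lat >= latMin && lat <= latMax && lon >= lonMin && lon <= lonMax;
    }
}

final class PointObsSketch {
    // Linear scan; the time window is inclusive at both ends (an assumption).
    static List<PointObs> getData(List<PointObs> all, BoundingBox box, Date start, Date end) {
        List<PointObs> hits = new ArrayList<>();
        for (PointObs o : all) {
            boolean inTime = !o.time.before(start) && !o.time.after(end);
            if (inTime && box.contains(o.lat, o.lon)) {
                hits.add(o);
            }
        }
        return hits;
    }
}
```

A linear scan is enough to show the query semantics; a collection meant to scale to large, multifile datasets would presumably need an index.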

  23. Trajectory Data

  24. TrajectoryObs Methods int getNumPoints(); StructureData getData(int point);

  25. Station Data

  26. StationObs Methods • List getStations(); returns a List of Station • List getData(Station s, Date start, Date end); returns a List of StructureData

  27. Radial Data

  28. Radial methods interface Radial { int getNumGates(); float getData(int gate); float getStartingGate(); float getGateSize(); float getElevation(); float getAzimuth(); double getTime(); }
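The `Radial` interface above can be exercised with a small array-backed implementation. This is a sketch only: `ArrayRadial` and `RadialSketch` are hypothetical, and the range formula assumes that `getStartingGate()` is the range to gate 0 and that gates are evenly spaced by `getGateSize()`, which the slide does not state.

```java
// Interface as listed on the slide.
interface Radial {
    int getNumGates();
    float getData(int gate);
    float getStartingGate();
    float getGateSize();
    float getElevation();
    float getAzimuth();
    double getTime();
}

// Hypothetical array-backed implementation, for illustration only.
final class ArrayRadial implements Radial {
    private final float[] data;
    private final float startingGate;
    private final float gateSize;
    private final float elevation;
    private final float azimuth;
    private final double time;

    ArrayRadial(float[] data, float startingGate, float gateSize,
                float elevation, float azimuth, double time) {
        this.data = data.clone();
        this.startingGate = startingGate;
        this.gateSize = gateSize;
        this.elevation = elevation;
        this.azimuth = azimuth;
        this.time = time;
    }

    public int getNumGates() { return data.length; }
    public float getData(int gate) { return data[gate]; }
    public float getStartingGate() { return startingGate; }
    public float getGateSize() { return gateSize; }
    public float getElevation() { return elevation; }
    public float getAzimuth() { return azimuth; }
    public double getTime() { return time; }
}

final class RadialSketch {
    // Assumes getStartingGate() is the range to gate 0 and gates are evenly spaced.
    static float rangeToGate(Radial r, int gate) {
        return r.getStartingGate() + gate * r.getGateSize();
    }

    // Index of the gate with the largest value, or -1 if the radial has no gates.
    static int strongestGate(Radial r) {
        int best = -1;
        for (int g = 0; g < r.getNumGates(); g++) {
            if (best < 0 || r.getData(g) > r.getData(best)) {
                best = g;
            }
        }
        return best;
    }
}
```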

  29. Gridded Data

  30. Grid methods interface GridCoordSys { CoordinateAxis getTaxis(); CoordinateAxis getXaxis(); CoordinateAxis getYaxis(); CoordinateAxis getZaxis(); Projection getProjection(); } Array getDataCube(Range time, Range z, Range y, Range x);
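The index-space subsetting implied by `getDataCube(Range time, Range z, Range y, Range x)` can be sketched on a plain 4-D Java array. This is an assumption-laden illustration: `IndexRange` is a hypothetical stand-in for `Range` with inclusive bounds and a stride of 1, and the data is laid out as `[time][z][y][x]`.

```java
// Hypothetical stand-in for Range: inclusive first and last index, stride 1.
final class IndexRange {
    final int first;
    final int last;

    IndexRange(int first, int last) {
        if (first < 0 || last < first) {
            throw new IllegalArgumentException("invalid range " + first + ".." + last);
        }
        this.first = first;
        this.last = last;
    }

    int length() {
        return last - first + 1;
    }
}

final class GridSketch {
    // Copies the requested hyperslab out of data[time][z][y][x].
    // A range that extends past the array raises ArrayIndexOutOfBoundsException.
    static double[][][][] getDataCube(double[][][][] data,
                                      IndexRange t, IndexRange z,
                                      IndexRange y, IndexRange x) {
        double[][][][] out = new double[t.length()][z.length()][y.length()][x.length()];
        for (int it = 0; it < t.length(); it++) {
            for (int iz = 0; iz < z.length(); iz++) {
                for (int iy = 0; iy < y.length(); iy++) {
                    for (int ix = 0; ix < x.length(); ix++) {
                        out[it][iz][iy][ix] =
                            data[t.first + it][z.first + iz][y.first + iy][x.first + ix];
                    }
                }
            }
        }
        return out;
    }
}
```

The ranges here are array indices. Turning coordinate values (latitude, time) into indices would be the job of the coordinate axes returned by `GridCoordSys`.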

  31. Image/Swath

  32. Standardizing NetCDF Formats • Grid: CF-1 Convention • Need improvements for regional models (WRF), GIS info • Radar: “Radar Exchange Format” • With radar community (led by NCAR ATD) • Point Observations • Unidata Observation Dataset Conventions

  33. CDM implementations: NetCDF-4 and NetCDF-Java 2.2

  34. NetCDF-4 C Library [diagram layers: netCDF-3 Interface, netCDF-4 Library, HDF5 Library]

  35. NetCDF-4 Status • 4.0 Beta implements CDM access layer • complete, but waiting for HDF5 release 1.8 to finalize file format • 4.1: adding Coordinate Systems • 4.?: merge OPeNDAP access (pending funding)

  36. NetCDF-Java 2.2 (nj22) • Prototype implementation of CDM • File formats: • General: NetCDF, HDF5, OPeNDAP • Grids: GRIB1, GRIB2 • Radar: NEXRAD, NIDS, DORADE • Satellite: DMSP, GINI • Access to THREDDS catalogs • Implements NcML

  37. Common Data Model • Scientific Datatypes: Point, Trajectory, Station, Radial, Grid, Swath • Coordinate Systems • Data Access

  38. NetCDF-Java version 2.2 architecture [diagram labels: Application, Scientific Datatypes, Datatype Adapter, NetcdfDataset, CoordSystem Builder, NetcdfFile, THREDDS, Catalog.xml, ADDE, I/O service provider, OPeNDAP, NetCDF-3, NIDS, NetCDF-4, GRIB, HDF5, GINI, Nexrad, DMSP, …]

  39. NetCDF-Java 2.2 Status • Data Access layer: Beta quality • also waiting for HDF5 release to finish NetCDF-4, commit to API • Coordinate Systems: early Beta • Finishing docs, runtime pluggability • Data Types: Alpha, still experimenting with APIs

  40. NetCDF Markup Language (NcML) • XML representation of netCDF metadata (like ncdump -h) • Create new netCDF files (like ncgen) • Modify existing datasets • Add/delete/rename • Create logical sections of existing variables. • Create unions and aggregations of multiple existing datasets.

  41. NcML example <?xml version="1.0" encoding="UTF-8"?> <netcdf xmlns="http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2" location="/data/nids/N0R_20041119_2147"> <attribute name="DataType" value="Radar" /> <remove type="attribute" name="password" /> <variable name="Reflectivity" orgName="R34768"> <attribute name="units" value="dBZ" /> </variable> </netcdf>
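To make the modifications in this example concrete, the sketch below reads the same NcML (with straight quotes) using only the JDK's DOM parser and reports which variables are renamed and which names are removed. It does not use or imitate the netCDF-Java NcML reader, and it applies nothing to a dataset; `NcmlSketch` and its method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NcmlSketch {
    // The slide's example, written with straight quotes.
    static final String EXAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<netcdf xmlns=\"http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2\"\n"
        + "        location=\"/data/nids/N0R_20041119_2147\">\n"
        + "  <attribute name=\"DataType\" value=\"Radar\" />\n"
        + "  <remove type=\"attribute\" name=\"password\" />\n"
        + "  <variable name=\"Reflectivity\" orgName=\"R34768\">\n"
        + "    <attribute name=\"units\" value=\"dBZ\" />\n"
        + "  </variable>\n"
        + "</netcdf>\n";

    private static Document parse(String ncml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(ncml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse NcML", e);
        }
    }

    // Maps new variable name -> original name for each <variable> carrying orgName.
    static Map<String, String> renamedVariables(String ncml) {
        NodeList vars = parse(ncml).getElementsByTagNameNS("*", "variable");
        Map<String, String> renames = new LinkedHashMap<>();
        for (int i = 0; i < vars.getLength(); i++) {
            Element v = (Element) vars.item(i);
            if (v.hasAttribute("orgName")) {
                renames.put(v.getAttribute("name"), v.getAttribute("orgName"));
            }
        }
        return renames;
    }

    // Names listed in <remove> elements, in document order.
    static List<String> removedNames(String ncml) {
        NodeList removes = parse(ncml).getElementsByTagNameNS("*", "remove");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < removes.getLength(); i++) {
            names.add(((Element) removes.item(i)).getAttribute("name"));
        }
        return names;
    }
}
```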

  42. NcML Aggregation • Union • Join Existing • Join New • Forecast Model Run

  43. NcML Aggregation Example <netcdf xmlns="http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2"> <aggregation dimName="time" type="joinNew"> <variableAgg name="Temperature"/> <variableAgg name="Pressure"/> <scan location="C:/data/goes/" suffix=".gini"/> </aggregation> </netcdf>
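The difference between the Join New and Join Existing aggregation types can be sketched on plain arrays. This is a conceptual illustration, not how netCDF-Java implements aggregation: `AggregationSketch` is hypothetical, each input array stands in for one variable read from one file, and the result shares rows with its inputs rather than copying them.

```java
import java.util.List;

final class AggregationSketch {
    // joinNew: each dataset contributes one 2-D slice, and the result gains a
    // new outer dimension (such as "time") whose length is the number of datasets.
    static double[][][] joinNew(List<double[][]> slices) {
        double[][][] out = new double[slices.size()][][];
        for (int i = 0; i < slices.size(); i++) {
            out[i] = slices.get(i);
        }
        return out;
    }

    // joinExisting: datasets are concatenated along an outer dimension they
    // already have, so the rank of the result is unchanged.
    static double[][] joinExisting(List<double[][]> parts) {
        int total = 0;
        for (double[][] part : parts) {
            total += part.length;
        }
        double[][] out = new double[total][];
        int next = 0;
        for (double[][] part : parts) {
            for (double[] row : part) {
                out[next++] = row;
            }
        }
        return out;
    }
}
```

With two 2x2 inputs, `joinNew` yields a 2x2x2 result while `joinExisting` yields a 4x2 result.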

  44. THREDDS Data Server • Integrates data access with THREDDS catalogs and services • Tomcat/Servlet, 100% Java, single war file • Data input is netCDF Java 2.2 library • Data output: • OPeNDAP • HTTP Server • OGC Web Coverage Server (gridded)

  45. THREDDS Data Server HTTP Tomcat Server Catalog.xml Application THREDDS Server • OPeNDAP • HTTPServer • WCS NetCDF-Java library hostname.edu Datasets IDD Data

  46. TDS as WCS Gateway hostname.edu HTTP Tomcat Server Catalog.xml Application THREDDS Server • OPeNDAP • HTTPServer • WCS NetCDF-Java library OPeNDAP Server anotherHost.org

  47. TDS and NcML [diagram labels: NcML, hostname.edu, HTTP Tomcat Server, Application, THREDDS Server (OPeNDAP, WCS), Netcdf-Java, Catalog.xml, Datasets]

  48. TDS and NcML • Server serves the dataset “wrapped” by the NcML • Client sees OPeNDAP or WCS, not NcML • Can “fix” metadata problems • Can augment metadata • Use NcML aggregation on the TDS • replaces the old “Aggregation Server”
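The idea that the server serves a dataset "wrapped" by NcML, so metadata can be fixed or augmented without touching the file, can be sketched as a derived view over the original attributes. This is an illustration only and does not reflect TDS internals: `WrappedMetadata` and its `apply` method are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class WrappedMetadata {
    // Builds the attribute set a client would see: the original attributes,
    // minus the removed names, plus the added or overriding ones.
    // The original map is left unmodified.
    static Map<String, String> apply(Map<String, String> original,
                                     Map<String, String> added,
                                     Set<String> removed) {
        Map<String, String> view = new LinkedHashMap<>(original);
        view.keySet().removeAll(removed);
        view.putAll(added);
        return Collections.unmodifiableMap(view);
    }
}
```

The client only ever sees the resulting view, which matches the slide's point that clients see OPeNDAP or WCS, not the NcML itself.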

  49. TDS and Digital Libraries [diagram labels: OAI Harvester, DL Records, HTTP Tomcat Server, Catalog.xml, Application, THREDDS Server (OPeNDAP, HTTPServer, WCS), NetCDF-Java library, Datasets, hostname.edu, otherhost.gov, OPeNDAP Server]

  50. TDS and Digital Libraries • Framework to add metadata • By hand (collection level) • Automatic extraction from datasets • Send records to existing DLs • No search • Both collection and inventory level
