Best Practices Writing (Read-only archives of) netCDF (version 3)
John Caron, Unidata
June 28, 2007
Overview
• NetCDF solves file syntax; writers and readers need to agree on semantics
• Goal 1: intelligible by humans
• Goal 2: readable by standard tools
• Write a Conventions document
• Types of metadata:
  • Structural metadata: ncdump -h
  • Use metadata: units, coordinates
  • Search metadata: bounding boxes, time ranges, standard variable names, keywords
It's on the web
• http://www.unidata.ucar.edu/software/netcdf/docs/BestPractices.html
• http://www.unidata.ucar.edu/software/netcdf/docs/workshop/bestpractices/index.html
Attributes
• Use standard attribute names if possible
  • netCDF Users Guide, CF-1.0
• Use numeric values when appropriate
  • :calibration = "23.7"; // string
  • :calibration = 23.7f; // float
• Can be multivalued
  • :special = 23.7f, 10.6f;
Global Attributes
• :Conventions = "NCAR-RAF/nimbus";
  • Put document on the web, send us a link
• Search metadata:
  • bounding boxes, time ranges, keywords
  • NetCDF Attribute Convention for Dataset Discovery
  • Many others
• Sources:
  • CF-1.0, FGDC, ISO, Dublin Core
Variable Attributes
• long_name: human-readable plot title
• units: udunits compatible
  • sps vs s-1
  • display_units = "NO3 ppm";
• Missing values:
  • _FillValue "never written"
  • missing_value
  • valid_min, valid_max, valid_range
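A reader has to combine these attributes to decide whether a value is usable. The following is a minimal sketch of one way to do that in plain Python, assuming the attributes have already been read into a dictionary and that `missing_value` is a single scalar. The helper name `is_missing` and the sample readings are hypothetical; the attribute values mirror the LONC example in this deck.

```python
import math


def is_missing(value, attrs):
    """Return True if `value` should be treated as missing.

    `attrs` is a plain dict of variable attributes (an assumption of this
    sketch; a real reader would get them from the file).
    """
    # _FillValue marks elements that were never written; missing_value is a
    # sentinel chosen by the writer. Either attribute may be absent.
    for name in ("_FillValue", "missing_value"):
        if name in attrs and value == attrs[name]:
            return True
    # Use valid_range when present, otherwise fall back to valid_min/valid_max.
    if "valid_range" in attrs:
        lo, hi = attrs["valid_range"]
    else:
        lo = attrs.get("valid_min", -math.inf)
        hi = attrs.get("valid_max", math.inf)
    # NaN fails both comparisons, so it is also reported as missing.
    return not (lo <= value <= hi)


lonc_attrs = {"_FillValue": -32767.0, "valid_range": (-180.0, 180.0)}
readings = [-105.2, -32767.0, 190.0, 12.5]
clean = [v for v in readings if not is_missing(v, lonc_attrs)]
print(clean)
```

Writing both `_FillValue` and a valid range gives generic tools two independent ways to reject bad values, which is why the sketch checks them separately.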
Dimensions
• Name: make it meaningful
  • "vector16" vs "bins16", "wind_vector";
  • char date(vector16) vs date(date_strlen)
• Shared dimensions imply shared coordinates
    char date(dim16);
    float P(time,dim16); // BAD DOG!
  versus
    char date(date_strlen);
    float P(time,bins16);
    float T(time,bins16); // GOOD BOY!!
Example Conventions:
• http://www.unidata.ucar.edu/software/netcdf/conventions.html
Debugging Tool:
• http://www.unidata.ucar.edu/software/netcdf-java/v2.2/webstart-dev/index.html
Nimbus: Options (1)
• Data Types
  • Allow any datatype
  • Use scale/offset to save space
• Units
  • Change units to be udunits compatible
  • Add display_units (?) attributes
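The scale/offset idea can be sketched in a few lines. The netCDF Users Guide names the attributes `scale_factor` and `add_offset`, and a reader recovers the value as `packed * scale_factor + add_offset`. The sketch below is an illustration only: the function names, the choice of a symmetric 16-bit range, and the sample temperatures are assumptions, and it requires `vmax > vmin`.

```python
def packing_params(vmin, vmax, nbits=16):
    """Choose scale_factor and add_offset so [vmin, vmax] fits in nbits."""
    # Use the symmetric range -(2**(nbits-1) - 1) .. 2**(nbits-1) - 1, which
    # leaves the most negative integer free to serve as a fill value.
    max_packed = 2 ** (nbits - 1) - 1
    scale_factor = (vmax - vmin) / (2 * max_packed)
    add_offset = (vmax + vmin) / 2
    return scale_factor, add_offset


def pack(value, scale_factor, add_offset):
    """Convert a physical value to the integer that is stored in the file."""
    return round((value - add_offset) / scale_factor)


def unpack(packed, scale_factor, add_offset):
    """Recover the physical value from the stored integer."""
    return packed * scale_factor + add_offset


temps = [250.0, 273.15, 310.0]  # hypothetical data, in kelvin
scale, offset = packing_params(min(temps), max(temps))
packed = [pack(t, scale, offset) for t in temps]
restored = [unpack(p, scale, offset) for p in packed]
print(packed, restored)
```

Storing a 2-byte short instead of a 4-byte float halves the size of the variable, at the cost of a rounding error of at most half of `scale_factor`.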
Coordinate Variables
    dimensions:
      time = 1761;
      lat = 180;
      lon = 360;
      z = 42;
    variables:
      int time(time);
        :units = "seconds since 1970-1-1 0:00:00 0:00";
      double lat(lat);
        :units = "degrees_north";
      double lon(lon);
        :units = "degrees_east";
      double z(z);
        :units = "m";
        :positive = "up";
      float data(time,z,lat,lon);
Coordinate Variables
• Variable name same as dimension name
• Strictly monotonic values
• No missing values
• Simple case:
  • All coordinates are 1D coordinate variables
  • Data variables have one dimension for each coordinate:
    • data(time,z,lat,lon);
• These rules are correct only for gridded (model) data
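The three rules for a coordinate variable are easy to check mechanically before a file is archived. Below is a rough sketch in plain Python, assuming the values have already been read into a list; the function names and the sample latitudes are hypothetical.

```python
def is_strictly_monotonic(values):
    """True if values are strictly increasing or strictly decreasing."""
    pairs = list(zip(values, values[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing


def check_coordinate_variable(var_name, dim_names, values, fill_value=None):
    """Return a list of rule violations (an empty list means it passes)."""
    problems = []
    # Rule 1: a coordinate variable is 1D and named after its dimension.
    if tuple(dim_names) != (var_name,):
        problems.append("name must match its single dimension")
    # Rule 2: no missing values (fill value or NaN).
    has_nan = any(v != v for v in values)
    has_fill = fill_value is not None and fill_value in values
    if has_nan or has_fill:
        problems.append("contains missing values")
    # Rule 3: strictly monotonic, so repeated values are not allowed.
    if not is_strictly_monotonic(values):
        problems.append("values are not strictly monotonic")
    return problems


print(check_coordinate_variable("lat", ("lat",), [-89.5, -88.5, -87.5]))
print(check_coordinate_variable("lat", ("y", "x"), [1.0, 1.0, 2.0]))
```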
Stationary Buoy (first attempt)
    dimensions:
      time = unlimited;
      lat = 1;
      lon = 1;
    float data(time, lat, lon);
    int time(time);
    double lat(lat);
    double lon(lon);
• Only works when lat=1, lon=1 (single buoy per file)
Multiple Buoys per file
    float data(buoy,time);
    int time(time);
    int buoy(buoy);
      :long_name = "buoy id";
    double lat(buoy);
    double lon(buoy);
Aircraft (Trajectory) Coordinates
    float data(pt);
    int time(pt);
    double altitude(pt);
    double lat(pt);
    double lon(pt);
2D Coordinates
    float data(time,z,y,x);
    int time(time);
    double z(z);
    double y(y);
    double x(x);
    double lat(y,x);
    double lon(y,x);
Generalize Coordinate Variable to Coordinate Axis
• Can be multidimensional
• Name can be different from the dimension
• A set of axes for a variable is called a Coordinate System
• How to associate a Coordinate System with a variable?
    float data(pt);
      data:coordinates = "lat lon altitude time";
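A reader resolves the `coordinates` attribute by splitting it on whitespace and looking up each named variable. The sketch below assumes a simple in-memory description of the file (a dict of variable name to dimensions and attributes), which is hypothetical, as is the function name. The dimension check reflects the CF rule that a coordinate axis may only use dimensions of the data variable it describes.

```python
def coordinate_axes(var_name, variables):
    """Return the axis names listed in var_name's `coordinates` attribute.

    `variables` maps a name to {"dims": tuple, "attrs": dict}; this layout
    is an assumption of the sketch, not a netCDF library structure.
    """
    var = variables[var_name]
    axes = var["attrs"].get("coordinates", "").split()
    for axis in axes:
        if axis not in variables:
            raise KeyError(f"{var_name}:coordinates names unknown variable {axis!r}")
        # An axis must not use dimensions the data variable lacks.
        extra = set(variables[axis]["dims"]) - set(var["dims"])
        if extra:
            raise ValueError(f"axis {axis!r} uses extra dimensions {sorted(extra)}")
    return axes


trajectory = {
    "data": {"dims": ("pt",), "attrs": {"coordinates": "lat lon altitude time"}},
    "time": {"dims": ("pt",), "attrs": {}},
    "altitude": {"dims": ("pt",), "attrs": {}},
    "lat": {"dims": ("pt",), "attrs": {}},
    "lon": {"dims": ("pt",), "attrs": {}},
}
axes = coordinate_axes("data", trajectory)
print(axes)
```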
Nimbus Coordinates
    :coordinates = "LONC LATC GGALT Time";
    float LONC(Time=7741);
      :_FillValue = -32767.0f; // float
      :units = "degree_E";
      :long_name = "GPS-Corrected Inertial Longitude";
      :valid_range = -180.0f, 180.0f; // float
      :Category = "Position";
      :standard_name = "longitude";
Nimbus: Recommend (2)
• Document Coordinates:
  • All variables have the same coordinate system, described by the coordinates global attribute
  • Coordinate variables have a standard_name attribute describing the coordinate type: latitude, longitude, altitude, time
  • Are missing values possible?
• CF-1.0 units for lat/lon: degrees_east, degrees_north (decimal degrees)
Bin Coordinates
    float AS200_RWO(Time=7741, sps1=1, Vector31=31);
      :FillValue = -32767.0f;
      :units = "count";
      :long_name = "SPP-200 (PCASP) Raw Accumulation (per cell) - DMT";
      :Category = "PMS Probe";
      :missing_value = -32767.0f;
      :SampledRate = 10;
      :DataQuality = "Preliminary";
      :SerialNumber = "PCAS108";
      :FirstBin = 6; // int
      :LastBin = 30; // int
      :CellSizes = 0.05f, 0.065f, 0.08f, 0.095f, 0.11f, 0.125f, 0.14f, 0.155f,
                   0.17f, 0.185f, 0.2f, 0.215f, 0.23f, 0.3f, 0.43f, 0.56f,
                   0.69f, 0.82f, 0.95f, 1.1f, 1.25f, 1.4f, 1.55f, 1.7f,
                   1.85f, 2.0f, 2.15f, 2.3f, 2.45f, 2.6f, 2.75f;
      :CellSizeUnits = "micrometers";
Bin Coordinates (alt)
    float AS200_RWO(Time=7741, sps1=1, AS200_RWO_BINS=31);
      :long_name = "SPP-200 (PCASP) Raw Accumulation";
      :FillValue = -32767.0f;
      :units = "";
      :display_units = "count";
      :coordinates = "LONC LATC GGALT Time AS200_RWO_BINS";
    float AS200_RWO_BINS(AS200_RWO_BINS=31);
      :FirstBin = 6;
      :LastBin = 30;
    data:
      AS200_RWO_BINS = 0.05f, 0.065f, 0.08f, 0.095f, 0.11f, 0.125f, 0.14f, 0.155f,
                       0.17f, 0.185f, 0.2f, 0.215f, 0.23f, 0.3f, 0.43f, 0.56f,
                       0.69f, 0.82f, 0.95f, 1.1f, 1.25f, 1.4f, 1.55f, 1.7f,
                       1.85f, 2.0f, 2.15f, 2.3f, 2.45f, 2.6f, 2.75f;
Bin Coordinates (alt)
• Advantages:
  • Can be written outside of define mode
  • More likely to be interpreted by standard tools
Station data (same number of pts at each station)
    float data(station, time);
    int time(time);
    double altitude(station);
    double lat(station);
    double lon(station);
Station data (different number of pts at each station)
    dimensions:
      record = unlimited;
    char station_name(station, strlen);
    double altitude(station);
    double lat(station);
    double lon(station);
    int firstChild(station); // record index
    int numChildren(station);
    float data1(record);
    float data2(record);
    float data3(record);
    float time(record);
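The `firstChild` and `numChildren` variables act as an index into the record dimension. The sketch below shows how a reader might pull out one station's observations, assuming each station's records are stored contiguously and that the variables have been read into Python lists. The function name and the sample values are hypothetical.

```python
def records_for_station(station_index, first_child, num_children, record_var):
    """Return the slice of a record variable that belongs to one station."""
    start = first_child[station_index]   # index of the station's first record
    count = num_children[station_index]  # how many records the station has
    return record_var[start:start + count]


# Three stations: the first has 3 observations, the second none, the third 2.
first_child = [0, 3, 3]
num_children = [3, 0, 2]
data1 = [10.1, 10.4, 10.9, 21.0, 21.5]

for station in range(len(first_child)):
    print(station, records_for_station(station, first_child, num_children, data1))
```

The same start and count apply to every record variable (`data1`, `data2`, `data3`, `time`), so one lookup per station is enough.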
NetCDF-3 file layout
[Figure: the file starts with a header, followed by the non-record variables, followed by the record variables written one record after another. Labels: Non-record variables, Record variables, Obs for one station.]
Unidata Obs Data Conventions
• Different number of groups of observations
• Nested groups
• Linked list, contiguous list
• Additional complexity
• Performance implication
• http://www.unidata.ucar.edu/software/netcdf-java/formats/UnidataObsConvention.html
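To illustrate the linked-list option, here is a rough sketch of how a reader could walk one station's observations when they are not stored contiguously. The variable name `next_child`, the use of -1 as the end-of-list marker, and the sample data are assumptions made for this illustration; check the convention document linked above for the actual names and rules.

```python
def linked_records(station_index, first_child, next_child):
    """Follow the chain of record indices for one station."""
    indices = []
    seen = set()
    i = first_child[station_index]
    while i != -1:  # -1 ends the list (assumed terminator)
        if i in seen:
            raise ValueError("cycle in linked list of records")
        seen.add(i)
        indices.append(i)
        i = next_child[i]
    return indices


# Two stations whose observations are interleaved in the record dimension.
first_child = [0, 1]
next_child = [2, 3, 4, -1, -1]

print(linked_records(0, first_child, next_child))
print(linked_records(1, first_child, next_child))
```

A linked list lets records be appended in arrival order, but reading one station means visiting scattered records one by one, whereas a contiguous list can be read as a single slice. That trade-off is the kind of performance implication the slide refers to.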
Conclusions
• NCAR-RAF/nimbus Conventions are quite good
• Unidata is interested in helping out with future revisions, new formats
• NetCDF-4 will offer new options
• Standards are evolving – please help!
• CF could be the standards umbrella