Introduction to NetCDF4


Presentation Transcript


  1. Introduction to NetCDF4
     MuQun Yang, The HDF Group
     HDF and HDF-EOS Workshop XI, Landover, MD

  2. Notes
     • Requires basic knowledge of HDF5 and netCDF3
     • Covers general NetCDF4 concepts: several new features and their performance
     • Covers some NetCDF4 APIs but does not review all new APIs
     • Is not a netCDF3 tutorial

  3. Contents
     • History review
     • Overview of NetCDF4 features, builds, etc.
     • Performance issues
     • Suggestions for users

  4. History Review
     • Funded by NASA ESTO AIST Program
     • Joint project between Unidata and The HDF Group
     • Uses HDF5 as the storage layer of NetCDF

  5. NetCDF-4/HDF5 Goals
     • Combine desirable characteristics of netCDF and HDF5, while taking advantage of their separate strengths:
       - Widespread use and simplicity of netCDF
       - Generality and performance of HDF5
     • Preserve format and API compatibility for netCDF users
     • Demonstrate benefits of the combination in advanced Earth science modeling efforts
     (From Russ Rew et al.'s talk at the HDF and HDF-EOS Workshop VII)

  6. NetCDF-4 Architecture
     [Diagram; labels: netCDF-3 applications, netCDF-3 interface, netCDF-4 applications, HDF5 applications, netCDF-4 library, HDF5 library, netCDF files, netCDF-4 HDF5 files, HDF5 files]
     (From Russ Rew et al.'s talk at the HDF and HDF-EOS Workshop VII)

  8. Contents
     • History review
     • Overview of NetCDF4 features, builds, etc.
     • Performance issues
     • Suggestions for users

  9. Current Status
     • http://www.unidata.ucar.edu/software/netcdf/netcdf-4/
     • 4.0 beta 1, based on HDF5 1.8 beta 1, was released in April 2007
     • The 4.0 beta 2 release is coming soon

  10. Compilers, platforms, and language support
      • Platforms: Linux, IBM AIX, Sun OS, HP-UX, OSF1, IRIX, Cygwin
      • Programming languages: C/C++ and Fortran
      • Compilers: vendor compilers on the supported platforms
      • Watch for snapshots: http://www.unidata.ucar.edu/software/netcdf/builds/snapshot/netcdf-4

  11. Configuration
      • Only NetCDF3 is built if you just type ./configure
      • Before building NetCDF4, you must
        - install HDF5 1.8 beta 1 or later (note: parallel HDF5 needs a separate build)
        - install the zlib library if using data compression
      • To build the sequential version:
          ./configure --enable-netcdf-4 --with-hdf5=/HDF5path --with-zlib=/zlibpath
      • To build the parallel version:
          ./configure --enable-netcdf-4 --enable-parallel --disable-shared --with-hdf5=/parallelHDF5path --with-zlib=/zlibpath
      • Parallel NetCDF4 needs more work. It has been tested on IBM AIX.

  12. API Changes
      • Existing APIs: essentially no differences, but with new flags
          NetCDF3: nc_create(FILE_NAME, NC_NOCLOBBER, &ncid);
          NetCDF4: nc_create(FILE_NAME, NC_NETCDF4, &ncid);
      • New APIs are added for new features, such as:
          nc_def_var_deflate(ncid, varid, shuffle, deflate, deflate_level)
      In the slides that follow, blue text in an API signature marks an output parameter.

  13. Overview of NetCDF4 new features
      • Data type: compound data type
      • Variable-length type
      • Groups
      • Multiple unlimited dimensions
      • Compression
      • Parallel IO

  14. A compound datatype example
        types:
          compound wind_vector_t {
            float eastward ;
            float northward ;
          }
        dimensions:
          lat = 18 ;
          lon = 36 ;
          pres = 15 ;
          time = 4 ;
        variables:
          wind_vector_t gwind(time, pres, lat, lon) ;
            gwind:long_name = "geostrophic wind vector" ;
            gwind:standard_name = "geostrophic_wind_vector" ;
        data:
          gwind = {1, -2.5}, {-1, 2}, {20, 10}, {1.5, 1.5}, ...;
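
In the C API, a compound type like wind_vector_t is described by a C struct together with the byte offset of each field. The following is a minimal sketch, assuming the struct mirrors the CDL definition above; the helper function names are hypothetical and are not part of netCDF. It computes the values one would pass to nc_def_compound and nc_insert_compound, using the standard offsetof macro that NC_COMPOUND_OFFSET wraps.

```c
#include <stddef.h>

/* C mirror of the CDL compound type wind_vector_t (assumed layout). */
typedef struct {
    float eastward;
    float northward;
} wind_vector_t;

/* Hypothetical helpers: byte offset of each field inside the struct.
 * These are the offsets a caller would hand to nc_insert_compound. */
size_t wind_offset_eastward(void)
{
    return offsetof(wind_vector_t, eastward);
}

size_t wind_offset_northward(void)
{
    return offsetof(wind_vector_t, northward);
}

/* Total size of one element, as a caller would hand to nc_def_compound. */
size_t wind_vector_size(void)
{
    return sizeof(wind_vector_t);
}
```

Taking offsets from the compiler rather than hard-coding them keeps the type definition correct if the struct ever gains padding or new fields.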

  15. Variable length type
      Simple example: ragged array
        types:
          float(*) row_of_floats;
        dimensions:
          m = 50;
        variables:
          row_of_floats ragged_array(m);
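
On the C side, each element of a variable-length variable is a (length, pointer) pair; netCDF's nc_vlen_t has exactly these two fields. The sketch below builds the in-memory form of a ragged array like the one above. It uses a local stand-in struct so that it compiles without netcdf.h, and the row lengths (row i holds i + 1 floats) and function names are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in with the same two fields as netCDF's nc_vlen_t. */
typedef struct {
    size_t len;  /* number of base-type elements in this row */
    void  *p;    /* pointer to the row's data */
} vlen_row_t;

/* Release the data of the first m rows. */
void ragged_free(vlen_row_t *rows, size_t m)
{
    for (size_t i = 0; i < m; i++) {
        free(rows[i].p);
        rows[i].p = NULL;
        rows[i].len = 0;
    }
}

/* Fill m rows; row i holds i + 1 floats (0, 1, ..., i).
 * Returns 0 on success, -1 if an allocation fails. */
int ragged_fill(vlen_row_t *rows, size_t m)
{
    for (size_t i = 0; i < m; i++) {
        float *vals = malloc((i + 1) * sizeof *vals);
        if (vals == NULL) {
            ragged_free(rows, i);  /* undo the rows already built */
            return -1;
        }
        for (size_t j = 0; j <= i; j++)
            vals[j] = (float)j;
        rows[i].len = i + 1;
        rows[i].p = vals;
    }
    return 0;
}

/* Total number of floats stored across all rows. */
size_t ragged_total(const vlen_row_t *rows, size_t m)
{
    size_t total = 0;
    for (size_t i = 0; i < m; i++)
        total += rows[i].len;
    return total;
}
```

With the real library, an array of nc_vlen_t built this way would be passed to a write call such as nc_put_var, and freed by the caller afterwards.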

  16. An example: variable length and compound datatype
        struct sea_sounding {
            int sounding_no;
            nc_vlen_t temp_vl;
        } data[DIM_LEN];

        /* 1. Create a netcdf-4 file. */
        nc_create(FILE_NAME, NC_NETCDF4, &ncid);

        /* 2. Create the vlen type, with a float base type. */
        nc_def_vlen(ncid, "temp_vlen", NC_FLOAT, &temp_typeid);

        /* 3. Create the compound type to hold a sea sounding. */
        nc_def_compound(ncid, sizeof(struct sea_sounding), "sea_sounding", &sounding_typeid);
        nc_insert_compound(ncid, sounding_typeid, "sounding_no",
                           NC_COMPOUND_OFFSET(struct sea_sounding, sounding_no), NC_INT);
        nc_insert_compound(ncid, sounding_typeid, "temp_vl",
                           NC_COMPOUND_OFFSET(struct sea_sounding, temp_vl), temp_typeid);

        /* 4. Define a dimension, and a 1D var of sea sounding compound type. */
        nc_def_dim(ncid, DIM_NAME, DIM_LEN, &dimid);
        nc_def_var(ncid, "fun_soundings", sounding_typeid, 1, &dimid, &varid);

        /* 5. Write our array of sounding data to the file, all at once. */
        nc_put_var(ncid, varid, data);

        /* 6. Close the file. */
        nc_close(ncid);

  17. Group
      • Use of Groups is optional, with backward compatibility maintained by putting everything in the top-level unnamed Group.
      • Unlike HDF5, netCDF-4 requires that Groups form a strict hierarchy.
      • Potential uses for Groups include
        - Factoring out common information
        - Containers for data within regions, ensembles
        - Organizing a large number of variables
        - Providing name spaces for multiple uses of same names for dimensions, variables, attributes
        - Modeling large hierarchies

  18. Group APIs
      • APIs for creating groups (define APIs)
          nc_def_grp(parent_group_id, group_name, &group_id)
        Examples:
          nc_def_grp(ncid, HENRY_VII, &henry_vii_id)
          nc_def_grp(henry_vii_id, MARGARET, &margaret_id)
      • APIs for inquiring information about a group (inquiry APIs)
          Number of child groups: nc_inq_grps(group_id, &num_grps, NULL);
          Child group ID list:    nc_inq_grps(group_id, NULL, group_id_list);
          Child group name:       nc_inq_grpname(group_id_list[0], children_group_name);

  19. Multiple Unlimited Dimension APIs
      • APIs for defining multiple unlimited dimensions
          Old API with the same flag: nc_def_dim(ncid, dimension_name, NC_UNLIMITED, int *idp)
        Examples:
          nc_def_dim(ncid, dimname_1, NC_UNLIMITED, &dimid[0])
          nc_def_dim(ncid, dimname_2, NC_UNLIMITED, &dimid[1])
      • APIs for inquiring multiple unlimited dimensions
          Old API: nc_inq_unlimdim(ncid, int *idp)
          New API: nc_inq_unlimdims(ncid, int *nunlimdimsp, int unlimdimid[])
      • How to use the new API
          1) First obtain the number of unlimited dimensions:
               nc_inq_unlimdims(ncid, &nunlimdims, NULL)
          2) Then obtain the list of unlimited dimension IDs:
               nc_inq_unlimdims(ncid, &nunlimdims, unlimdimid)
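
The two-step use of nc_inq_unlimdims (ask for the count, allocate, then ask for the list) can be sketched as follows. To stay compilable without the netCDF library, the sketch calls mock_inq_unlimdims, a hypothetical stand-in that follows the same calling convention and pretends the file has two unlimited dimensions with IDs 0 and 3. Both that function and list_unlimdims are illustrative names, not netCDF APIs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for nc_inq_unlimdims: reports two unlimited
 * dimensions with IDs 0 and 3. Either output pointer may be NULL. */
int mock_inq_unlimdims(int ncid, int *nunlimdimsp, int *unlimdimidsp)
{
    static const int ids[] = {0, 3};
    (void)ncid;  /* unused in the mock */
    if (nunlimdimsp != NULL)
        *nunlimdimsp = 2;
    if (unlimdimidsp != NULL) {
        unlimdimidsp[0] = ids[0];
        unlimdimidsp[1] = ids[1];
    }
    return 0;  /* 0 plays the role of NC_NOERR */
}

/* Two-call pattern: first query the count, then allocate and query the IDs.
 * Returns a malloc'd array (caller frees) and stores its length in *n;
 * returns NULL with *n == 0 on error or if there are no such dimensions. */
int *list_unlimdims(int ncid, int *n)
{
    int count = 0;
    int *ids;

    *n = 0;
    if (mock_inq_unlimdims(ncid, &count, NULL) != 0 || count <= 0)
        return NULL;

    ids = malloc((size_t)count * sizeof *ids);
    if (ids == NULL)
        return NULL;

    if (mock_inq_unlimdims(ncid, &count, ids) != 0) {
        free(ids);
        return NULL;
    }
    *n = count;
    return ids;
}
```

Sizing the array from the first call avoids guessing an upper bound on the number of unlimited dimensions.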

  20. Compression
      • Deflate now
      • Scaleoffset, N-bit, and maybe szip in the future
      • Only one routine needs to be added:
          nc_def_var_deflate(int ncid, int varid, int shuffle, int deflate, int deflate_level);

  21. Compression example code
      Data writing:
        /* 1. Define the variable. */
        nc_def_var(ncid, VAR_BYTE_NAME, NC_BYTE, 2, dimids, &byte_varid);
        /* 2. Set deflate compression. */
        nc_def_var_deflate(ncid, byte_varid, 0, 1, DEFLATE_LEVEL_3);
        /* 3. Write the data. */
        nc_put_var_schar(ncid, byte_varid, (signed char *)byte_out);
      Data reading:
        nc_get_var_schar(ncid, byte_varid, (signed char *)byte_in);
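
The third argument of nc_def_var_deflate switches on a byte-shuffle step that runs before deflate. The example above passes 0, which is reasonable for single-byte data, where shuffling changes nothing. For multi-byte types, shuffling groups the first bytes of all elements together, then the second bytes, and so on, which often makes the stream more compressible. The sketch below illustrates that idea in plain C. It is not the HDF5 filter implementation, and the function names are hypothetical.

```c
#include <stddef.h>

/* Byte shuffle: out holds byte 0 of every element, then byte 1 of every
 * element, and so on. in and out must not overlap and must each hold
 * nelems * elem_size bytes. */
void byte_shuffle(const unsigned char *in, unsigned char *out,
                  size_t nelems, size_t elem_size)
{
    for (size_t b = 0; b < elem_size; b++)
        for (size_t e = 0; e < nelems; e++)
            out[b * nelems + e] = in[e * elem_size + b];
}

/* Inverse of byte_shuffle: restores the original element-by-element order. */
void byte_unshuffle(const unsigned char *in, unsigned char *out,
                    size_t nelems, size_t elem_size)
{
    for (size_t b = 0; b < elem_size; b++)
        for (size_t e = 0; e < nelems; e++)
            out[e * elem_size + b] = in[b * nelems + e];
}
```

For slowly varying integers or floats, the high-order bytes of neighboring elements tend to be identical, so the shuffled stream contains long runs that deflate handles well.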

  22. Parallel IO
      • Supports either collective or independent access
      • Supports MPI-IO or MPI-POSIX IO via parallel HDF5
      • Special functions are used to create or open a netCDF file in parallel

  23. New APIs to do parallel IO
      • nc_create_par
          nc_create_par(const char *path, int mode, MPI_Comm comm, MPI_Info info, int *ncidp)
          "mode" must be NC_NETCDF4|NC_MPIIO or NC_NETCDF4|NC_MPIPOSIX
      • nc_var_par_access
          nc_var_par_access(int ncid, int var_id, int data_access)
          "data_access" can be either NC_COLLECTIVE or NC_INDEPENDENT
      • nc_open_par
          nc_open_par(const char *path, int mode, MPI_Comm comm, MPI_Info info, int *ncidp)
          "mode" must be either NC_MPIIO or NC_MPIPOSIX

  24. Parallel IO Programming Model
      Data writing:
        /* 1. Initialize MPI. */
        MPI_Init(&argc, &argv);
        /* 2. Create a parallel netcdf-4 file. */
        nc_create_par(FILE, NC_NETCDF4|NC_MPIIO, comm, info, &ncid);
        nc_var_par_access(ncid, v1id, NC_COLLECTIVE);
        /* 3. Write data. */
        nc_put_vara_int(ncid, v1id, start, count, data);
        /* 4. Close the file. */
        nc_close(ncid);
        /* 5. Shut down MPI. */
        MPI_Finalize();
      Data reading: use nc_open_par instead of nc_create_par
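
In the writing step, the start and count arguments of nc_put_vara_int select the part of the variable that each process writes. The slide does not say how they are chosen. One common choice, assumed here purely for illustration, is a contiguous block decomposition along one dimension. The helper below is hypothetical and needs neither MPI nor netCDF; the rank and process count would normally come from MPI_Comm_rank and MPI_Comm_size.

```c
#include <stddef.h>

/* Split a dimension of length n into contiguous blocks, one per rank.
 * The first (n % nprocs) ranks receive one extra element so the blocks
 * cover the dimension exactly. Requires nprocs > 0 and 0 <= rank < nprocs. */
void block_decompose(size_t n, int nprocs, int rank,
                     size_t *start, size_t *count)
{
    size_t np    = (size_t)nprocs;
    size_t r     = (size_t)rank;
    size_t base  = n / np;   /* elements every rank gets */
    size_t extra = n % np;   /* leftover elements to spread out */

    *count = base + (r < extra ? 1 : 0);
    *start = r * base + (r < extra ? r : extra);
}
```

Each process would place the resulting values in the entry of start and count that corresponds to the decomposed dimension, and use the full extent for the remaining dimensions.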

  25. Other features
      • Datatypes: more atomic datatypes, such as unsigned integers (1, 2, 4, and 8 bytes)
      • Strings: replace character arrays
      • Enums, opaque types
      • User-defined datatypes
      • Fletcher32 checksum filter
      • UTF-8 support
      • Reader-makes-right conversion
      • Uses HDF5 dimension scales

  26. Contents
      • History review
      • Overview of NetCDF4 features, builds, etc.
      • Performance issues
      • Suggestions for users

  27. NetCDF4 Data Compression: Size <2 %

  28. NetCDF4 Data Compression: Data Write Time

  29. NetCDF4 Data Compression: Data Read Time

  30. WRF Output in HDF5: File Size

  31. WRF Output in HDF5: Data Writing Time

  32. EUMETNET OPERA Report in 2006
      They evaluated the following data formats:
      • FM 92 GRIB, NORDRAD, Universal Format,
      • netCDF, HDF4, HDF5,
      • XML and Scalable Vector Graphics (SVG), and GeoTIFF
      Their recommendation:
      • Based on the results of the detailed evaluation, HDF5 is recommended for consideration as an official European standard format for weather radar data and products.
      Why?
      • Compared to other formats, HDF5’s compression algorithm (ZLIB) is more efficient…
      • A file format with efficient compression and platform independence is essential
      PyTables:
      • "One of the beauties of PyTables is that it supports compression on tables and arrays"

  33. Evaluation of Parallel NetCDF4 Performance
      • Regional Ocean Modeling System (ROMS)
      • History file writer in parallel NetCDF4 (PnetCDF4)
      • History file writer in parallel NetCDF from Argonne (PnetCDF)
      • Data: 60 1D-4D double-precision float and integer arrays

  34. PnetCDF4 and PnetCDF performance comparison
      [Chart: bandwidth (MB/s) versus number of processors, for PnetCDF collective and NetCDF4 collective]
      • Fixed problem size = 995 MB
      • Performance of PnetCDF4 is close to PnetCDF

  35. ROMS Output with Parallel NetCDF4
      • I/O performance improves as the file size increases.
      • It can provide decent I/O performance for large problem sizes.

  36. Chunking
      • Use chunking wisely
      • Review chunking tips for HDF5
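
One way to reason about a chunk layout is to count how many chunks a typical read or write touches, since with chunked storage (and especially with compression) whole chunks are the unit of I/O. The sketch below is a back-of-the-envelope aid under that assumption, not a netCDF or HDF5 API, and the function names are hypothetical. It counts the chunks intersected by a hyperslab described by start and count.

```c
#include <stddef.h>

/* Number of chunks of length chunk_len that the index range
 * [start, start + count) intersects along one dimension. */
size_t chunks_touched_1d(size_t start, size_t count, size_t chunk_len)
{
    size_t first, last;

    if (count == 0 || chunk_len == 0)
        return 0;
    first = start / chunk_len;                /* chunk holding the first index */
    last  = (start + count - 1) / chunk_len;  /* chunk holding the last index */
    return last - first + 1;
}

/* Number of chunks a hyperslab touches in ndims dimensions: the product
 * of the per-dimension counts. */
size_t chunks_touched(const size_t *start, const size_t *count,
                      const size_t *chunk, size_t ndims)
{
    size_t total = 1;
    for (size_t i = 0; i < ndims; i++)
        total *= chunks_touched_1d(start[i], count[i], chunk[i]);
    return total;
}
```

For example, reading one row of 100 values from a 2D variable touches 100 chunks when the chunk shape is 100 x 1, but only one chunk when the chunk shape is 1 x 100, so the chunk shape should follow the dominant access pattern.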

  37. Contents
      • History review
      • Overview of NetCDF4 features, builds, etc.
      • Performance issues
      • Suggestions for users

  38. NetCDF Classic Model

  39. Using the NetCDF Classic Model
      • NetCDF-4 files can be created with the CLASSIC_MODEL flag. This enforces the rules of the classic netCDF data model on this file.
          nc_create(FILE_NAME, NC_NETCDF4|NC_CLASSIC_MODEL, &ncid)
      • Once a classic model file, always a classic model file. This sticks with the file and there is no way to change it within the netCDF API.
      • Classic model files don't use any elements of the expansion of the data model in netCDF-4. They don't have groups, user-defined types, multiple unlimited dimensions, or the new atomic types.
      • Since they conform to the classic model, they can be read and understood by any existing netCDF software (as soon as that software upgrades to netCDF-4 and HDF5 1.8.0).
      • NetCDF-4 features which don't affect the data model are still available: compression, parallel I/O.

  40. HDF5 Features not in current NetCDF 4.0
      • No Scaleoffset, N-bit, or szip filters (planned for the 4.1 release)
      • No support for user-defined filters
      • Can only read HDF5 files that have dimension scales
      • Can only write data in chunked storage
      • No Fortran 90 APIs
      • No corresponding APIs for optimizations: cache, MPI-IO

  41. NetCDF 4.1 Plan
      • http://www.unidata.ucar.edu/software/netcdf/netcdf-4/req_4_1.html

  42. NetCDF4 or HDF5: which one should I use?
      Evaluate the following:
      • Familiarity
      • Features
      • Performance
      • Compatibility
      • Release/feature lags

  43. Based on stability of NetCDF4 Priority Recommendation

  44. More NetCDF4 information
      • Release and snapshot: http://www.unidata.ucar.edu/software/netcdf/netcdf-4/
      • Tutorial in 2007 NetCDF workshop: http://www.unidata.ucar.edu/software/netcdf/workshops/2007/
      • Paper in 2006 AMS annual meeting: http://www.unidata.ucar.edu/software/netcdf/papers/2006-ams.pdf

  45. Acknowledgements
      • Thanks to Russ Rew and Ed Hartnett from Unidata for generously allowing me to use their slides and for sharing their compression performance results in this workshop
      • Some content describing new features of NetCDF4 is copied from the 2007 Unidata NetCDF workshop
      • The radar NetCDF data compression performance results were provided by Ed Hartnett at Unidata
