Learn about new tools and techniques that facilitate web-based collaboration with large data sets, such as those produced by the ocean model ROMS. Discover how OpenDAP and CF Conventions can simplify data extraction, analysis, and visualization, allowing multiple users to interactively manipulate the data.
Collaboration Tools and Techniques for ROMS • Rich Signell, USGS, Woods Hole, MA
Abstract • Collaboration Tools and Techniques for Large Model Data Sets • Rich Signell, U.S. Geological Survey, Woods Hole, MA, USA • New tools and standards are emerging that facilitate web-based collaboration with large data sets such as those produced by the ocean model ROMS. Using OpenDAP (a.k.a. DODS), ROMS NetCDF output files can be placed on a web server, and users can extract just the data they need (say, the surface temperature from a particular day) from the file without any extra effort by the modeller. This allows a collaborator, for example, to issue a simple command in Matlab that loads just the desired model output from the remote web site into a local Matlab session, avoiding both file format conversion and wasted network bandwidth. If, in addition, the ROMS NetCDF files are modified to follow the CF Conventions, a set of conventions specifically designed for complex model output (including handling of the ROMS s-coordinate), then public domain software such as Unidata’s Integrated Data Viewer (IDV) will recognize the ROMS output files and can be used to interactively browse, analyze and visualize the results in 3D. Multiple web users can visualize and manipulate the data interactively through the collaboration facility built into IDV. The conversion to CF-compliant NetCDF can be achieved easily using the NetCDF operator tools (NCO). The NCO tools can also be used to automatically reduce the size of the ROMS output files by a factor of 2 by converting floats to short integers, which have sufficient dynamic range for most variables. This also doubles the speed at which Internet users can obtain their requested data. If the model data provider takes the small additional step of creating a THREDDS catalog (a straightforward XML file) of the CF-compliant ROMS output files, then the model results appear as just another data source to an IDV user. This allows users to browse and create visualizations using model results without knowing that they are using NetCDF.
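The factor-of-2 claim for float-to-short packing can be checked with a little arithmetic. The sketch below assumes the usual NetCDF packing attributes `scale_factor` and `add_offset`, and uses a hypothetical temperature field spanning 0 to 30 °C; the exact constants a given tool chooses may differ slightly.

```latex
\begin{align*}
x_{\text{unpacked}} &= x_{\text{packed}} \times \texttt{scale\_factor} + \texttt{add\_offset} \\
\texttt{scale\_factor} &\approx \frac{x_{\max} - x_{\min}}{2^{16} - 1} \\
\text{(hypothetical) } \frac{30\,^{\circ}\mathrm{C} - 0\,^{\circ}\mathrm{C}}{65535} &\approx 4.6 \times 10^{-4}\,^{\circ}\mathrm{C} \text{ per count} \\
\frac{\text{size of float}}{\text{size of short}} &= \frac{4\ \text{bytes}}{2\ \text{bytes}} = 2
\end{align*}
```

Under these assumptions, a 16-bit integer resolves such a temperature field to better than a thousandth of a degree, which is what "sufficient dynamic range for most variables" amounts to in practice.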
What’s the Problem? • Typical model outputs range from 100 MB up to several GB. • Traditional collaboration method: users grab the whole NetCDF file from your web/ftp site, or you e-mail them a few images. • There has to be a better way…
DODS/OpenDAP • Putting the “Net” in NetCDF! • DODS allows efficient slicing of data over the web, just as NetCDF does for local files. • DODS serves not just NetCDF, but also Matlab and HDF files
DODS/OpenDAP • Serving DODS data requires almost no effort on the part of the data provider: • Download DODS server binaries to the cgi-bin directory on the web server • Put your NetCDF files on the web server • Go have a coffee to celebrate! (Note: most people don’t know that getting a DODS server going is this easy!)
Accessing DODS data • DODS APIs (C++, Java) • Any NetCDF tool, relinked with the DODS NetCDF library instead of the standard one • ncdump => dncdump • ncview => dncview
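Underneath these tools, a DODS request is just a URL with a constraint expression appended. As a rough sketch of how such a URL is assembled, the helper below (`dap_url`, a hypothetical name) builds a DAP2-style request for one hyperslab of one variable. The server address, the `nph-dods` path, the variable dimensions, and the grid sizes are all assumptions for illustration, not details from the slides.

```shell
#!/bin/bash
# Build an OPeNDAP (DAP2) request URL asking the server for one
# hyperslab of one variable. All names below are illustrative.

# dap_url BASE_URL SUFFIX VARIABLE RANGE...
#   SUFFIX: dds (structure), das (attributes), ascii (text), dods (binary)
#   RANGE : "start:stop" or "start:stride:stop", zero-based, inclusive
dap_url() {
    local base=$1 suffix=$2 var=$3
    shift 3
    local constraint=$var range
    for range in "$@"; do
        constraint+="[${range}]"
    done
    printf '%s.%s?%s\n' "$base" "$suffix" "$constraint"
}

# Hypothetical server and file. Assumes temp is dimensioned
# (ocean_time, s_rho, eta_rho, xi_rho) with 20 levels on a 100 x 100 grid,
# and that the last level (index 19) is the surface.
BASE='http://example.org/cgi-bin/nph-dods/adria03_avg.nc'

# First time record, surface level, full horizontal extent
dap_url "$BASE" ascii temp 0:0 19:19 0:99 0:99
```

The printed URL can be pasted into a browser or fetched with a command-line client; with curl, the `-g` option is needed so the square brackets are not treated as glob patterns.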
DODS & Matlab • DODS GUI and command line tools • Relinked mexcdf53.dll, which can enable all Matlab tools that read NetCDF (e.g., the NetCDF/Matlab toolbox) • >> url='http://longpath/myfile.nc' • >> nc=netcdf(url); • >> lon=nc{'lon'}(:);
DODS Success Story • DODS at sea: in a limited-bandwidth situation, grabbed only the 200 KB OBC region instead of the whole 18 MB NetCDF file. • 30 second download instead of 45 minutes!
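The two download times are consistent with a single slow link, as a quick check shows. This assumes the slide's "k" and "Mb" mean kilobytes and megabytes, and uses decimal units.

```latex
\begin{align*}
\text{whole file: } \frac{18\,000\ \text{KB}}{45 \times 60\ \text{s}} &\approx 6.7\ \text{KB/s} \\
\text{subset: } \frac{200\ \text{KB}}{30\ \text{s}} &\approx 6.7\ \text{KB/s} \\
\text{savings: } \frac{18\,000\ \text{KB}}{200\ \text{KB}} &= \frac{2700\ \text{s}}{30\ \text{s}} = 90
\end{align*}
```

Under that reading, the subset request moved 90 times less data over the same link, which accounts for the full difference in download time.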
Need for Conventions • One of the greatest things about NetCDF is that it places few demands on the data provider - they are free to specify whatever attributes they want, or none at all • This is also one of the worst things, making it hard to develop flexible software • Software for ROMS won’t work for POM, NCOM, HOPS, ECOM, etc (and vice versa)
Making ROMS CF-compliant • Store all information about the grid (lon_u, lat_u, angle) in the .his and .avg files (not just the grid file) • Add “coordinates” attributes to curvilinear variables (e.g. zeta:coordinates=“lat_rho lon_rho”) • Add “standard_name=ocean_s_coordinate” to the vertical coordinate variable (sc_r) • Make sure dimension names match coordinate variable names (ocean_time, sc_r) • Units need to be recognized by UDUNITS
ROMS2CF script
CF checker: http://titania.badc.rl.ac.uk/cgi-bin/cf-checker.pl

#!/bin/bash
GFILE='../adria02_grid2.nc'
FFILE='adria03_avg.nc'
ncks -F -d ocean_time,1 $FFILE ${FFILE}_CF
# Specify horizontal coordinate variables associated with "RHO fields"
ncatted -O -h -a "coordinates","temp",c,c,"lat_rho lon_rho" ${FFILE}_CF
ncatted -O -h -a "coordinates","salt",c,c,"lat_rho lon_rho" ${FFILE}_CF
# Specify horizontal coordinate variables associated with "U fields"
ncatted -O -h -a "coordinates","u",c,c,"lat_u lon_u" ${FFILE}_CF
ncatted -O -h -a "coordinates","ubar",c,c,"lat_u lon_u" ${FFILE}_CF
# Merge the ROMS grid file into the CF file so we
# have all the coordinate variables we need
ncks -O -v lon_rho,lat_rho,lon_u,lat_u,lon_v,lat_v,mask_rho,mask_u,mask_v,angle $GFILE $GFILE.tmp
ncks -A $GFILE.tmp ${FFILE}_CF
rm $GFILE.tmp
# Add vertical coordinate info
ncatted -O -h -a "standard_name","sc_r",c,c,"ocean_s_coordinate" ${FFILE}_CF
ncatted -O -h -a "positive","sc_r",c,c,"up" ${FFILE}_CF
ncatted -O -h -a "formula_terms","sc_r",c,c,"s: sc_r eta: zeta depth: h a: theta_s b: theta_b depth_c: hc" ${FFILE}_CF
# Add data from field file to template
ncks -A $FFILE ${FFILE}_CF
# rename the dimension
ncrename -O -h -d s_rho,sc_r ${FFILE}_CF
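The script tags only temp, salt, u and ubar. One possible way to extend the same pattern to more variables is a small loop that generates the ncatted commands. The sketch below is a dry run: it only prints the commands, with the same ncatted arguments the script uses. The variable lists (including zeta, v and vbar) and the helper name `emit_coords` are assumptions and would need to match the contents of the actual file.

```shell
#!/bin/bash
# Dry run: print (do not execute) one ncatted command per variable,
# using the same arguments as the ROMS2CF script.
FFILE='adria03_avg.nc'

# Assumed variable lists for each grid type; edit to match the file.
RHO_VARS='zeta temp salt'
U_VARS='u ubar'
V_VARS='v vbar'

# emit_coords "COORDINATES" VAR...
emit_coords() {
    local coords=$1 var
    shift
    for var in "$@"; do
        printf 'ncatted -O -h -a "coordinates","%s",c,c,"%s" %s_CF\n' \
            "$var" "$coords" "$FFILE"
    done
}

# Variable lists are left unquoted on purpose so they split into words.
emit_coords 'lat_rho lon_rho' $RHO_VARS
emit_coords 'lat_u lon_u'     $U_VARS
emit_coords 'lat_v lon_v'     $V_VARS
```

Printing first makes it easy to review the generated commands before running them against a file.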
Integrated Data Viewer (IDV) • Works on local CF-compliant NetCDF files • Works on THREDDS catalog data • THREDDS is just XML that tells IDV what type of server is being used… • …so if you make a THREDDS catalog for your DODS data, IDV can access it through the web.