GHRSST Aggregations using NcML
Upendra Dadi
Aggregation Process
[Diagram: the ghrsst/ directories (L4/, /L2P_Gridded, L2/) under http://data.nodc.noaa.gov/opendap/ and http://dods.jpl.nasa.gov/opendap/ are combined by ghrsst_combined.xml into a single L4/, /L2P_Gridded, L2/ tree (L3 will be added in GDS v2). Label: Time.]
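The slide does not show the contents of ghrsst_combined.xml. As a rough sketch of what such a file could look like, the Python below generates an NcML document that joins granules along an existing time dimension. The `joinExisting` aggregation type is an assumption (the actual file may use a different aggregation), and the granule file names are hypothetical placeholders under the two servers named on the slide.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"

def build_join_existing(locations, dim_name="time"):
    """Return an NcML document (as a string) joining the granules along dim_name."""
    ET.register_namespace("", NCML_NS)
    root = ET.Element(f"{{{NCML_NS}}}netcdf")
    # Assumption: granules share a "time" dimension and are concatenated along it.
    agg = ET.SubElement(
        root,
        f"{{{NCML_NS}}}aggregation",
        {"type": "joinExisting", "dimName": dim_name},
    )
    for loc in locations:
        ET.SubElement(agg, f"{{{NCML_NS}}}netcdf", {"location": loc})
    return ET.tostring(root, encoding="unicode")

# Hypothetical granule names; only the server roots come from the slide.
granules = [
    "http://data.nodc.noaa.gov/opendap/ghrsst/L4/example_day1.nc",
    "http://dods.jpl.nasa.gov/opendap/ghrsst/L4/example_day2.nc",
]
ncml = build_join_existing(granules)
print(ncml)
```

Generating the NcML from a list of locations keeps the aggregation easy to rebuild when granules are added on either server.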
Lessons Learned
The time granularity of the originator's data is not necessarily the same as the granularity required for a data analysis task. NcML aggregations could help here; they are ideal for climate-related studies.
[Figure: time scales, not to any scale: hourly, daily, weekly, monthly, seasonal, annual, decadal, centennial, millennial]
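To make the granularity point concrete, here is a minimal sketch of coarsening a daily series to monthly means once an aggregation has exposed it as one continuous time axis. The averaging step is assumed to run on the client side, and the values are synthetic, not real SST.

```python
from collections import defaultdict
from datetime import date, timedelta

def monthly_means(daily):
    """Average (date, value) pairs into one mean per (year, month)."""
    buckets = defaultdict(list)
    for day, value in daily:
        buckets[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}

# Synthetic daily values: 10.0 for each day of January 2010, 12.0 for February 2010.
start = date(2010, 1, 1)
daily = [(start + timedelta(days=i), 10.0 if i < 31 else 12.0) for i in range(59)]
means = monthly_means(daily)
print(means)
```

The same grouping key could be swapped for a week, season, or year to move along the time scales listed above.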
Anyone with access to the web can create aggregations; one doesn't have to be inside NODC. Aggregations created by one user can be used by others. A shared repository of NcML files could be useful.
Performance is the biggest shortcoming. A large amount of time is spent decompressing the data. NetCDF-4 could help. Tools like nccopy are useful to the end user. Tools that update the local physical copy of the dataset when the NcML changes would be useful. Retrieving a time series at a point over a two-month period for an L4 product took 90 sec. Repeating the same query for another point over the same time period took 2 sec.
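The slide does not say why the repeated query was so much faster. One plausible reading, given the remark about decompression, is that the decompressed granules were cached after the first request. The sketch below illustrates that idea only; it uses synthetic zlib-compressed grids and `functools.lru_cache` as stand-ins, and does not reflect how any particular server implements its cache.

```python
import zlib
from functools import lru_cache

WIDTH = 4  # synthetic 4x4 grid per day, stored as 16 bytes

# Stand-in for compressed daily granules covering roughly two months.
GRANULES = {
    day: zlib.compress(bytes((day + i) % 256 for i in range(WIDTH * WIDTH)))
    for day in range(60)
}
DECOMPRESS_CALLS = 0

@lru_cache(maxsize=None)
def load_grid(day):
    """Decompress one granule; repeated calls for the same day hit the cache."""
    global DECOMPRESS_CALLS
    DECOMPRESS_CALLS += 1
    return zlib.decompress(GRANULES[day])

def time_series(row, col, days):
    """Extract the value at (row, col) from each day's grid."""
    return [load_grid(day)[row * WIDTH + col] for day in days]

first = time_series(0, 0, range(60))   # decompresses all 60 granules
calls_after_first = DECOMPRESS_CALLS
second = time_series(1, 2, range(60))  # different point, same period: cache hits only
calls_after_second = DECOMPRESS_CALLS
print(calls_after_first, calls_after_second)
```

Under this assumption the second point costs no further decompression, which is consistent with the drop from 90 sec to 2 sec reported above.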
Issues with caching. It would be useful to have elements in the NcML that periodically update individual NcMLs in the cache, instead of the entire cache.
Several interesting possibilities. NcML allows integration of data from heterogeneous sources over the web to create virtual datasets. Datasets from different disciplines could be integrated. The ability to represent vector data using netCDF would make such integration more attractive to mainstream GIS users.
NODC has a lot of in-situ (observational) data. The ability to aggregate not just 2D arrays but also individual profiles and trajectories into multi-profiles and multi-trajectories would be very useful.
[Figure labels: time, time]
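As an illustration of what combining individual profiles into a multi-profile might involve, the sketch below pads variable-length profiles to a common number of levels and stacks them into one rectangular array. The fill value and the sample numbers are hypothetical, and padding is only one possible layout, not a description of an existing NcML feature.

```python
FILL_VALUE = -9999.0  # hypothetical missing-value marker for this sketch

def to_multi_profile(profiles):
    """Pad variable-length profiles to a common level count and stack them."""
    max_levels = max(len(p) for p in profiles)
    return [list(p) + [FILL_VALUE] * (max_levels - len(p)) for p in profiles]

# Synthetic temperature profiles with different numbers of depth levels.
profiles = [[20.1, 18.4, 15.0], [19.8, 17.9], [21.0]]
multi = to_multi_profile(profiles)
print(multi)
```

Each row of the result is one profile, so the outer index can play the role of the profile (or time) dimension.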
Similar to ETL tools used in data warehousing. There are equivalents in the relational world, but the data is more complex than most relational databases can handle.