
Report to ESDSWG MEaSUREs: Best Practices 10/22/10

Report to ESDSWG MEaSUREs: Best Practices, 10/22/10. Goddard Earth Sciences Data and Information Services Center (GES DISC). Bruce Vollmer, NASA GSFC; Dana Ostrenga, Adnet Systems Inc; Steve Kempler, NASA GSFC. MEaSUREs Datasets at the GES DISC.


Presentation Transcript


  1. Report to ESDSWG MEaSUREs: Best Practices, 10/22/10. Goddard Earth Sciences Data and Information Services Center (GES DISC). Bruce Vollmer, NASA GSFC; Dana Ostrenga, Adnet Systems Inc; Steve Kempler, NASA GSFC

  2. MEaSUREs Datasets at the GES DISC

  3. http://disc.sci.gsfc.nasa.gov/measures (still under construction)

  4. How many months into the MEaSUREs project were first contacts with Data Centers made, and was it good enough?
     • MEaSUREs projects started in early to mid 2008. The GES DISC first made contact in mid 2009 as part of the GES DISC MEaSUREs Working Group, in conjunction with the GES DISC User Working Group Meeting.
     • Work with MEaSUREs Principal Investigators (PIs) began in earnest in early 2010, when resources were made available.
     • Positive: This timeline is adequate for gathering information and preparing to archive the products from the 7 projects assigned to the GES DISC. The projects are at different phases, so they do not all require the same intensive work at the same time.
     • Negative: Getting started when the projects began their data definitions would have allowed the GES DISC to provide effective recommendations on file formats and metadata (items that contribute to the interoperability of the products) before some projects had already begun generating products. We are 'catching up'! In addition, some rework may be required to implement the recommendations, possibly involving reprocessing by the PI.

  5. What products currently handled by the Data Center are similar to products from MEaSUREs?
     • Chung-Lin Shie's products (surface flux) are an updated version of products currently available from the GES DISC PDISC and the TOVAS visualization instance. The MEaSUREs products are available in HDF-EOS5 format rather than the previous binary format, which makes them usable by a broader community.
     • Lucien Froidevaux is working with AURA MLS data and GEOS-5 data available from GMAO. Both data sets are archived at the GES DISC, which has staff who frequently work with the data. The data will be available in netCDF because the modeling community is the predominant user community.
     • Richard McPeters is expanding the total ozone merged data record by integrating data from 3 new instruments and from various instruments such as OMI and AURA MLS for an ozone vertical distribution set. The GES DISC is familiar with the original core products (TOMS/SBUV and SeaWiFS) and also archives the OMI and MLS data sets.
     • Christina Hsu's project applies the Deep Blue algorithm to measurements from SeaWiFS and runs in the MODIS processing. Some of the current GES DISC staff are part of Christina's team, archiving the data and providing it in a Giovanni visualization instance. This data is already in a recommended format and includes most of the recommended metadata.
     • Jay Herman is developing an expanded UV reflectivity data set using data from TOMS, SBUV, SBUV-2, OMI and SeaWiFS. Some of this data is or has been archived at the GES DISC, which has staff who work with it.
     • Eric Fetzer will focus on water vapor and temperature data from the A-Train satellite constellation. This data is either archived at the GES DISC or provided through the Giovanni A-Train Data Depot.
     • Eric Wood is working with model and satellite data, including TRMM and GPCP, which are provided to the public through PDISC and Giovanni at the GES DISC. The GES DISC also has other projects, such as HDISC and YOTC, that have provided familiarity with the model, CMORPH and AMSR-E data.

  6. What products currently handled by the Data Center are likely to be used along with those from MEaSUREs? What is the implication on interoperability from a user's point of view?
     • The GES DISC resident products discussed on the previous slide are the ones most likely to be used with the products generated by MEaSUREs.
     • The data center goal is to provide datasets that can be used together and/or combined for multi-dataset studies. Some ways this is achieved include:
        • Providing guidelines to PIs upfront that facilitate interoperability
        • Encouraging PIs to work with their respective data center (the response to this has been positive)
        • Providing specific recommendations for utilizing common formats and metadata
        • Providing recommendations that facilitate the use of web services and visualization services
        • Providing support to standardize as appropriate
        • Providing data read tools
        • Providing tools that translate/convert between 'standards' (i.e. data formats)

  7. Have data formats been chosen for MEaSUREs products? What approach was (or is being) used to make the selection?
     • The recommended data formats are HDF-EOS5 and NetCDF4.
     • GES DISC staff have performed extensive research on the usability and interoperability of the data formats with various software readers, visualizers and archive standards.
     • These formats are also known and accepted among the various communities.
     • The majority of the ozone, aerosol, model, ocean and precipitation data handled by the GES DISC is in the HDF format, as opposed to initial PI formats such as GRIB, binary, ASCII, etc.
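The distinction between the recommended formats and the initial PI formats can be checked mechanically, because most of these formats begin with a fixed byte signature. The following is a minimal sketch, not a GES DISC tool: the function name `sniff_format` and the labels are illustrative assumptions, and it assumes the signature sits at byte offset 0 (an HDF5 file with a user block would not be recognized). HDF-EOS5 and NetCDF4 files are both HDF5 containers, so the signature alone cannot tell them apart.

```python
import io

# Leading-byte signatures defined by the respective format specifications.
# Labels are illustrative; HDF-EOS5 and NetCDF4 both use the HDF5 container.
SIGNATURES = [
    (b"\x89HDF\r\n\x1a\n", "HDF5 container (HDF-EOS5 or NetCDF4)"),
    (b"\x0e\x03\x13\x01", "HDF4"),
    (b"CDF\x01", "netCDF classic"),
    (b"CDF\x02", "netCDF 64-bit offset"),
    (b"GRIB", "GRIB"),
]


def sniff_format(stream):
    """Return a format label based on the first 8 bytes of a binary stream."""
    head = stream.read(8)
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    # Raw binary and ASCII files carry no signature, so they end up here.
    return "unknown (possibly raw binary or ASCII)"


# Usage with in-memory bytes standing in for real files.
hdf5_like = io.BytesIO(b"\x89HDF\r\n\x1a\n" + b"\x00" * 8)
grib_like = io.BytesIO(b"GRIB" + b"\x00" * 12)
print(sniff_format(hdf5_like))
print(sniff_format(grib_like))
```

A check like this only identifies the container; confirming that a file actually follows HDF-EOS5 or NetCDF4 conventions requires opening it with the corresponding library.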

  8. What metadata standards have been agreed upon? How are search, access and utilization of data being facilitated by metadata?
     • A working formal metadata and format recommendations document is being drafted and distributed to the MEaSUREs PIs. The document is based on the Climate and Forecast (CF) conventions v1.4, with analysis of ISO 19115 underway to facilitate long-term usage.
     • Staff are currently looking into how the CF metadata recommendations can be fully implemented in the HDF-EOS5 format.
     • Consistent and thorough metadata will create a more accurate catalog of the data for the search engines behind the archive to use.
     • Extensive metadata that is consistent with standards also makes the data more accessible and readable by various software and web services, which facilitates increased usage by the user community and promotes inter-disciplinary research.
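As an illustration of what a metadata consistency check might look like, the sketch below tests a set of file-level attributes against the global attributes that the CF conventions recommend for describing file contents (`title`, `institution`, `source`, `history`, `references`, `comment`) plus the `Conventions` attribute. It is an assumption-laden example, not the GES DISC recommendations document: the function name `missing_cf_globals` and the attribute values are placeholders, and the attributes are held in a plain dictionary rather than read from an HDF-EOS5 or NetCDF4 file.

```python
# Global attributes recommended by the CF conventions, plus "Conventions".
CF_RECOMMENDED_GLOBALS = (
    "Conventions",
    "title",
    "institution",
    "source",
    "history",
    "references",
    "comment",
)


def missing_cf_globals(attrs):
    """Return the recommended global attributes that are absent or blank."""
    return [
        name
        for name in CF_RECOMMENDED_GLOBALS
        if not str(attrs.get(name, "")).strip()
    ]


# Placeholder attribute values, for illustration only.
example_attrs = {
    "Conventions": "CF-1.4",
    "title": "Example product title (placeholder)",
    "institution": "NASA GSFC",
}

print(missing_cf_globals(example_attrs))
# → ['source', 'history', 'references', 'comment']
```

A report of this kind could be returned to a PI before delivery, so that gaps are filled while the product is still being defined rather than after it has been generated.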

  9. What are approaches to data provenance?
     • PIs are being asked to provide full data lineage for their products. If full data lineage cannot be provided, PIs are asked to add minimal upper-level information to the metadata at the collection level, to give the user as much information as possible. These requests draw on past attempts at data provenance and on published recommendations, and are provided to the PI in the recommended metadata and format documentation drafted by the GES DISC.
     • Some PIs (e.g., Fetzer) will be providing requirements for data provenance capture.
     • The GES DISC and some PIs are looking to the ESIP Federation and ESDSWG Working Groups for recommendations on common approach(es) to implementing this information in metadata and documentation.
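The two tiers described above (full lineage where available, minimal collection-level information otherwise) can be sketched as a small data structure. Everything here is hypothetical: the class names, field names and the `lineage_level` key are illustrative assumptions rather than a GES DISC or ESDSWG schema, and the collection name is a placeholder.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class InputSource:
    """One upstream data source; the version may be unknown."""
    name: str
    version: str = "unknown"


@dataclass
class CollectionProvenance:
    """Collection-level provenance record (hypothetical field names)."""
    collection: str
    algorithm: str
    inputs: List[InputSource] = field(default_factory=list)
    # Left empty when the PI cannot supply the full processing lineage.
    processing_steps: List[str] = field(default_factory=list)

    def level(self):
        """Report 'full' if processing steps were supplied, else 'minimal'."""
        return "full" if self.processing_steps else "minimal"

    def to_metadata(self):
        """Serialize the record, including its lineage level, as JSON text."""
        record = asdict(self)
        record["lineage_level"] = self.level()
        return json.dumps(record, indent=2, sort_keys=True)


# Minimal record: algorithm and input are known, detailed steps are not.
prov = CollectionProvenance(
    collection="Example aerosol collection (placeholder)",
    algorithm="Deep Blue",
    inputs=[InputSource("SeaWiFS")],
)
print(prov.level())
print(prov.to_metadata())
```

Recording the level explicitly lets a catalog distinguish collections with complete lineage from those carrying only upper-level information, without inspecting each record's contents.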

  10. Are there formal agreements between MEaSUREs projects and the respective Data Centers?
     • A template in the form of an Interface Control Document (ICD) was drafted to formalize the exchange of data and information between each MEaSUREs PI assigned to the GES DISC and the GES DISC Point of Contact assigned to that project.
     • The ICD is to be filled out as a collaborative effort between both parties and is considered a working document, since requirements may change as the data is developed.
