Database interoperability Aspen Lodge, September 26, 2007
The problem
• Most collections use different data formats and search functions, and everybody likes their own system
• Given the value of cruise samples, their acquisition and characteristics need to be reported early
• NSF also needs a way to track data and publications derived from samples
• Standard sample IDs (IGSNs) and data formats need complete registration
A proposal: Step 1
• Establish a system of sea-going curators to accompany coring cost centers
• This provides uniformity of sample description
• Positions are filled by sea-going core technicians and dredge-handling staff, or by people trained by curatorial staff
• Use a standard template with required fields to write preliminary metadata during sampling
• There are relatively few sea-going curators, so they can be trained in proper data entry and sample description
• Write a script to harvest cruise metadata into this template (lat/longs, cruise IDs…)
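The template and harvesting script in Step 1 could be sketched roughly as follows. This is a minimal illustration, not the proposal's actual template: the field names, the `nav` dictionary keys, and the functions `harvest_cruise_metadata` and `missing_fields` are all assumptions made for the example. The source names only lat/longs and cruise IDs as harvested values.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SampleRecord:
    """Hypothetical preliminary-metadata template; every field is required."""
    cruise_id: Optional[str] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    sample_type: Optional[str] = None   # filled in by the sea-going curator
    description: Optional[str] = None   # filled in by the sea-going curator


def harvest_cruise_metadata(nav: dict) -> SampleRecord:
    """Pre-fill the fields that the ship's systems already know.

    `nav` is an assumed dictionary with keys "cruise_id", "lat" and "lon".
    """
    return SampleRecord(
        cruise_id=nav.get("cruise_id"),
        latitude=nav.get("lat"),
        longitude=nav.get("lon"),
    )


def missing_fields(record: SampleRecord) -> list:
    """Return the names of template fields that are still empty."""
    return [f.name for f in fields(record)
            if getattr(record, f.name) in (None, "")]
```

A curator would then complete only the fields that `missing_fields` reports, which keeps hand entry at sea to a minimum.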
Step 2: Sample registration
• Automate sample IGSN registration
• Use a pre-assigned block of numbers or a “Trusted Author” setup
• If each repository has its own code, the repositories should assign their own numbers and send the registered metadata to SESAR
• If ship samples are taken, they get their own “daughter number”
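One way to picture the pre-assigned block and the daughter numbers is the sketch below. The identifier layout (a repository code followed by a zero-padded counter, with a numeric suffix for daughters) is purely hypothetical and is not the real IGSN syntax. The class `IgsnBlock` and the function `daughter_id` are illustrative names.

```python
class IgsnBlock:
    """A pre-assigned block of sample numbers for one repository (hypothetical layout)."""

    def __init__(self, repo_code: str, start: int, end: int):
        self.repo_code = repo_code
        self.end = end
        self._next = start

    def next_id(self) -> str:
        """Hand out the next unused identifier, or fail when the block is exhausted."""
        if self._next > self.end:
            raise RuntimeError("pre-assigned block exhausted; request a new block")
        number = self._next
        self._next += 1
        # Assumed format: repository code + 6-digit counter.
        return f"{self.repo_code}{number:06d}"


def daughter_id(parent_id: str, index: int) -> str:
    """Derive an identifier for a sample split off from a parent (assumed suffix scheme)."""
    return f"{parent_id}-{index:02d}"
```

Because the block is assigned in advance, identifiers can be issued at sea without a network connection and the registered metadata sent on afterwards.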
Step 3: Once an internet link is available…
• Automatically register basic sample information (location, sample type, cruise ID, IGSN) with NGDC from the ship or shortly after the cruise
• The community can immediately see each sampling event, although an embargo may exist on other data for ~1 year (or an agreed period)
• If additional data are not forthcoming from the repository after the embargo, NGDC could send the repository and PI a query asking for these data
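The split between immediately public information and embargoed data can be sketched as below. The field names in `PUBLIC_FIELDS`, the 365-day default, and the functions `basic_registration` and `query_due` are assumptions for illustration. The source specifies only the kinds of information registered and an embargo of roughly one year or an agreed period.

```python
from datetime import date, timedelta

# Fields assumed to be public at registration time (location, sample type, cruise ID, IGSN).
PUBLIC_FIELDS = ("igsn", "cruise_id", "sample_type", "latitude", "longitude")


def basic_registration(sample: dict) -> dict:
    """Keep only the basic fields that are published right away."""
    return {key: sample[key] for key in PUBLIC_FIELDS}


def query_due(registered_on: date, today: date, has_full_data: bool,
              embargo_days: int = 365) -> bool:
    """True when the embargo has passed and the full data are still missing."""
    embargo_ends = registered_on + timedelta(days=embargo_days)
    return (not has_full_data) and today > embargo_ends
```

A periodic job could run `query_due` over all registered samples and generate the reminder to the repository and PI for each one that returns true.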
Step 3: At the repository…
• Collections staff produce a complete sample description
• Associate data objects such as photographs and physical property data sets with a sample object
• Publish metadata and data objects to the local repository in the local format
• Ideally, use a web form that can write a consolidated PDF or other format for easy download by users
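The association of data objects with a sample object might look like the following sketch. The classes `Sample` and `DataObject`, their fields, and the choice of JSON as the "local format" are all assumptions. Each repository would use its own schema and output format.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class DataObject:
    kind: str  # e.g. "photograph" or "physical_properties"
    uri: str   # where the file lives in the local repository


@dataclass
class Sample:
    igsn: str
    description: str
    data_objects: List[DataObject] = field(default_factory=list)

    def attach(self, kind: str, uri: str) -> None:
        """Associate one more data object with this sample."""
        self.data_objects.append(DataObject(kind, uri))

    def to_local_json(self) -> str:
        """Serialize the sample and its data objects (JSON stands in for the local format)."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Keeping the data objects inside the sample record means that a single export carries both the description and pointers to every associated file.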
Example of a metadata template for cores (left) and dredges (right)
Step 4: Publish final metadata to NGDC
• Develop translation tools for each US repository to write information on metadata and data objects to the NGDC format
• When data are published locally, NGDC is notified so that it can harvest the local data
• NGDC is also notified of updates (for example, sample history or new geochemical data) that add sample description information for samples that already have IGSNs
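A per-repository translation tool could be as simple as a field-name mapping, as in this sketch. Both the local field names and the target names in `LOCAL_TO_TARGET` are invented for illustration and do not reflect the actual NGDC format. The function `translate` is likewise hypothetical.

```python
# Hypothetical mapping from one repository's local field names to target field names.
LOCAL_TO_TARGET = {
    "IGSN": "igsn",
    "CruiseID": "cruise",
    "Lat": "lat",
    "Lon": "lon",
    "Device": "device",
}


def translate(local_record: dict, field_map: dict):
    """Rename mapped fields and report local fields that have no mapping yet.

    Returns a (translated_record, unmapped_field_names) pair so that gaps in
    the mapping are visible instead of being silently dropped.
    """
    translated = {field_map[key]: value
                  for key, value in local_record.items() if key in field_map}
    unmapped = sorted(key for key in local_record if key not in field_map)
    return translated, unmapped
```

Each repository would maintain only its own mapping table, and the list of unmapped fields shows where the mapping needs extending before the record is sent on.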