Metadata Issues Underlying the Development of a Data Repository for Evolutionary Biology. Sarah Carrier, SILS, Master’s Student Jackson Dube, Visiting Scholar, SILS/MRC Jane Greenberg, Associate Professor, Director SILS/Metadata Research Center <MRC>, UNC-CH
Metadata Issues Underlying the Development of a Data Repository for Evolutionary Biology Sarah Carrier, SILS, Master’s Student Jackson Dube, Visiting Scholar, SILS/MRC Jane Greenberg, Associate Professor, Director SILS/Metadata Research Center <MRC>, UNC-CH Ruth Monnig, Doctoral Research Assistant, SILS/MRC
Overview • Metadata defined • Role of metadata in a repository • Range of metadata standards • Issues • Discussion
The Knowledge Network for Biocomplexity (KNB) http://knb.ecoinformatics.org//data.html
Metadata example for a specimen • Family: Pinaceae • Species: Pinus serotina • Date identified: 1958-05-10 • County: Pasquotank County • Location collected: Woodland Border, 2.3 miles north east of Nisonton • Collected by: Harry E. Ahles • XML encoding: <Species>Pinus serotina</Species> <Date.ID scheme="SPEC.W3CDTF">1958-05-10</Date.ID>
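A minimal sketch of how the specimen record above could be serialized with Python's standard library. Only the Species and Date.ID elements and the SPEC.W3CDTF scheme come from the slide; the Specimen wrapper, the remaining element names, and the dictionary keys are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def specimen_to_xml(record):
    """Build an XML element for one specimen record.

    'Species' and 'Date.ID' follow the slide; 'Specimen', 'Family',
    'County', 'Location' and 'Collector' are hypothetical names.
    """
    root = ET.Element("Specimen")  # hypothetical wrapper element
    ET.SubElement(root, "Family").text = record["family"]
    ET.SubElement(root, "Species").text = record["species"]
    # The scheme attribute tells a consumer how to read the date value.
    date_el = ET.SubElement(root, "Date.ID", scheme="SPEC.W3CDTF")
    date_el.text = record["date_identified"]
    ET.SubElement(root, "County").text = record["county"]
    ET.SubElement(root, "Location").text = record["location"]
    ET.SubElement(root, "Collector").text = record["collector"]
    return root


record = {
    "family": "Pinaceae",
    "species": "Pinus serotina",
    "date_identified": "1958-05-10",
    "county": "Pasquotank County",
    "location": "Woodland Border, 2.3 miles north east of Nisonton",
    "collector": "Harry E. Ahles",
}

root = specimen_to_xml(record)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Keeping the date scheme as an attribute rather than inside the value leaves the value itself machine-readable.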
Metadata for a Water Quality Study <fileName ID='File1'>Jordan Lake Study</fileName> <varQnty>15</varQnty> <caseQnty>2000</caseQnty> <varGrp><labl>Study Procedure Information</labl></varGrp> <varGrp type="subject"><txt>The following 15 variables were used to measure water quality over a two-year period.</txt></varGrp> <varGrp><defntn>Salinity is described as XXXXX</defntn></varGrp>
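To show how a repository might read such a record, here is a rough sketch that parses a shortened copy of the fragment above and pulls out the file title and the variable and case counts. The studyMetadata wrapper element and the summarize helper are assumptions added so the fragment parses as a single document; they are not part of the slide.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's fragment, wrapped in a hypothetical
# <studyMetadata> root so that it is a well-formed XML document.
FRAGMENT = """<studyMetadata>
  <fileName ID="File1">Jordan Lake Study</fileName>
  <varQnty>15</varQnty>
  <caseQnty>2000</caseQnty>
  <varGrp><labl>Study Procedure Information</labl></varGrp>
</studyMetadata>"""


def summarize(xml_text):
    """Return the discovery-level facts a search interface would index."""
    root = ET.fromstring(xml_text)
    file_name = root.find("fileName")
    return {
        "file_id": file_name.get("ID"),
        "title": file_name.text,
        "variables": int(root.findtext("varQnty")),
        "cases": int(root.findtext("caseQnty")),
        # Collect every variable-group label, at any depth.
        "group_labels": [label.text for label in root.iter("labl")],
    }


summary = summarize(FRAGMENT)
print(summary)
```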
Metadata • Data about the content, quality, condition, and other characteristics of data (FGDC Glossary, 1992) • Additional information necessary for data to be useful (Musik, 1997) Resource = data = object = entity = document = data object
Why metadata? • Facilitate discovery • Permit use – intellectual and technical • Manage and preserve • Secure • Help advance the field of evolutionary biology
Range of published data objects • Table, graph • Dataset • Research methods / procedures • Bayesian inference of phylogeny • Meta-analysis • Computational biology
Metadata continuum • Dublin Core • FGDC/CSDGM • EML • DDI
The Knowledge Network for Biocomplexity (KNB) ontologies Data structures *http://knb.ecoinformatics.org//data.html
Issues • Cost • More metadata, more cost to produce • Less metadata, cost to users • Metadata creation • Who, when, how? • Incentivizing • Preservation, sustainability • Data object and associated metadata • Open access (“a loaded word”) • What levels of access/rights should be supported
Discussion Topics • Range of data objects • Granularity (metadata) • Users: Needs, greater use • Additional issues….
Metadata types and properties *Resource = data = object = entity = document = data object
Range of metadata standards • Schemes (just a few…): LSID; TEI Header; MARC bibliographic format; Dublin Core; EAD; FGDC/CSDGM; NBII; EML; DDI; ODRL (Creative Commons Profile); A Core; PREMIS • Characteristics: Objectives and principles; Domains; Environment; Object type/format; Architectural layout; Extent; Level of complexity; Flat, hierarchical; Granularity
Range of metadata standards • Data structure standards • Data communication standards • Data value standards • Content representation, ontologies, authority files • Data syntax standards • Data models, architectures/packaging
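As one illustration of a data value standard, the sketch below checks whether a string is a date in the W3CDTF style used for the specimen's identification date (for example 1958-05-10). It covers only the date-only forms (YYYY, YYYY-MM, YYYY-MM-DD) and not the time-of-day forms; the function name is hypothetical.

```python
import re
from datetime import date

# Date-only W3CDTF forms: YYYY, YYYY-MM, YYYY-MM-DD.
_W3CDTF_DATE = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def is_w3cdtf_date(value):
    """Return True if value is a well-formed, real calendar date.

    Simplification: time-of-day forms are not handled, and year 0000
    is rejected because datetime.date does not support it.
    """
    match = _W3CDTF_DATE.match(value)
    if match is None:
        return False
    year, month, day = match.groups()
    try:
        # Missing month or day defaults to 1 for the calendar check.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


for candidate in ["1958-05-10", "1958-05", "1958", "10/05/1958", "1958-02-30"]:
    print(candidate, is_w3cdtf_date(candidate))
```

Validating values at deposit time is one way to reduce the downstream cost to users noted under Issues.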