The challenge of biodiversity: Plot, organism, and taxonomic databases • Robert K. Peet, University of North Carolina • The National Plots Database Committee • John Harris, NCEAS
A case study: VegBank - The ESA Vegetation Plot Archive Project organized and directed by: • Robert K. Peet, University of North Carolina • Marilyn Walker, USDA Forest Service & U. Alaska • Dennis Grossman, The Nature Conservancy / ABI • Michael Jennings, USGS-BRD & UCSB Project supported by: • National Center for Ecological Analysis & Synthesis • U.S. National Science Foundation • USGS-BRD Gap Analysis Program • ABI / The Nature Conservancy
Biodiversity data structure • Locality, Observation/Collection Event: Plot/Inventory databases • Object or specimen: Specimen databases • Taxon: Taxonomic databases
Information flow in the US National Vegetation Classification [Diagram; box labels: Web-interface, Veg Classification Database, Proposal, Taxonomic Database, VegBank, Raw Plot Data, Proposal, Vegetation/Biodiversity]
Taxonomic database challenge The problem: Integration of data potentially representing different times, places, investigators and taxonomic standards The traditional solution: A standard list of kinds of organisms.
There exist numerous compilations of organism names. For example: • Species 2000 http://www.sp2000.org/default.html (Composed of 18 participant databases) • All Species http://www.all-species.org • ITIS http://www.itis.usda.gov/ (The US government standard list, plus Canada & Mexico) • Index to organism names http://www.biosis.org.uk/triton/indexfm.htm
Taxon-specific standard lists are available. Representative examples for higher plants include: North America / US • USDA Plants http://plants.usda.gov/ • ITIS http://www.itis.usda.gov/ • NatureServe http://www.natureserve.org World • IPNI International Plant Names Index http://www.ipni.org/ • IOPI Global Plant Checklist http://www.bgbm.fu-berlin.de/IOPI/GPC/
Most standardized taxon lists fail to allow effective integration of datasets. • The reasons include: • The user cannot reconstruct the database as viewed at an arbitrary time in the past, • Taxonomic concepts are not defined (just lists), • Multiple party perspectives on taxonomic concepts and names cannot be supported or reconciled.
Current standards • Biological organisms are named following international rules of nomenclature. • Database standards are being developed by TDWG, GBIF, IOPI, etc. • Metadata standards have been developed. For example, the Darwin Core is a profile describing the minimum set of standards for search and retrieval of natural history collections and observation databases. (http://tsadev.speciesanalyst.net/DarwinCore/)
Three concepts of shagbark hickory Splitting one species into two illustrates the ambiguity often associated with scientific names. If you encounter the name “Carya ovata (Miller) K. Koch” in a database, you cannot be sure which of two meanings applies. • Carya ovata (Miller) K. Koch sec. Gleason 1952 • Carya carolinae-sept. (Ashe) Engler & Graebner sec. Radford et al. 1968 • Carya ovata (Miller) K. Koch sec. Radford et al. 1968
Multiple concepts of Rhynchospora plumosa s.l. [Diagram comparing treatments by Elliott 1816, Gray 1834, Chapman 1860, Kral 1998 and Peet 2002?, with three numbered concepts (1, 2, 3). Names shown: R. plumosa; R. plumosa v. plumosa; R. plumosa v. intermedia; R. plumosa v. interrupta; R. plumosa v. pineticola; R. intermedia; R. pineticola; R. sp. 1]
An assertion represents a unique combination of a name and a reference. “Assertion” is equivalent to “Potential taxon” & “taxonomic concept”. [Diagram: Name, Assertion, Reference]
Six shagbark hickory assertions Possible taxonomic synonyms are listed together. Assertions: • (One shagbark) C. ovata sec Gleason ’52; C. ovata (sl) sec FNA ’97 • (Southern shagbark) C. carolinae-s. sec Radford ’68; C. ovata v. australis sec FNA ’97 • (Northern shagbark) C. ovata sec Radford ’68; C. ovata (v. ovata) sec FNA ’97 Names: • Carya ovata • Carya carolinae-septentrionalis • Carya ovata v. australis References: • Gleason 1952, Britton & Brown • Radford et al. 1968, Flora Carolinas • Stone 1997, Flora North America
A usage represents a unique combination of an assertion and a name. Usages can be used to track nomenclatural synonyms. [Diagram: Name, Usage, Assertion]
ITIS
Names: 1. Carya ovata 2. C. carolinae 3. C. ovata var. australis
Assertions: A. ovata sec. Gleason B. ovata sl sec. FNA C. carolinae sec. Radford D. ovata australis sec. FNA E. ovata sec. Radford F. ovata ovata sec. FNA
Usages (name-assertion, status): 1-F OK; 2-D OK; 3-D Syn
ITIS likely views the linkage of the assertion “Carya ovata var. australis sec. FNA 1997” with the name “Carya ovata var. australis” as a nomenclatural synonym.
A usage (name assignment) and assertion (taxon concept) can be combined in a single model. [Diagram: Name, Usage, Assertion, Reference]
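The combined model can be sketched in a few lines of Python. This is only an illustration of the name / usage / assertion / reference idea from the slides, not the VegBank or ITIS schema; the class names, the `label` method, and the `concepts_for` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    text: str


@dataclass(frozen=True)
class Reference:
    citation: str


@dataclass(frozen=True)
class Assertion:
    """A taxon concept: one name as used in one reference."""
    name: Name
    reference: Reference

    def label(self) -> str:
        return f"{self.name.text} sec. {self.reference.citation}"


@dataclass(frozen=True)
class Usage:
    """A name applied to an assertion, with a status such as 'OK' or 'Syn'."""
    name: Name
    assertion: Assertion
    status: str


ovata = Name("Carya ovata")
carolinae = Name("Carya carolinae-septentrionalis")
gleason = Reference("Gleason 1952")
radford = Reference("Radford et al. 1968")

assertions = [
    Assertion(ovata, gleason),
    Assertion(ovata, radford),
    Assertion(carolinae, radford),
]


def concepts_for(name, pool):
    """All assertions that share a name; more than one means the name alone is ambiguous."""
    return [a for a in pool if a.name == name]


ambiguous = concepts_for(ovata, assertions)
labels = [a.label() for a in ambiguous]

# A usage that records a name as a nomenclatural synonym for a concept.
australis = Name("Carya ovata var. australis")
synonym_usage = Usage(australis, Assertion(australis, Reference("FNA 1997")), "Syn")
```

Looking up "Carya ovata" returns two assertions, which is the ambiguity the shagbark hickory slides describe: the bare name does not say whether the Gleason or the Radford concept is meant.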
Party Perspective • The Party Perspective on an Assertion includes: • Status – Standard, Nonstandard, Undetermined • Correlation with other assertions – Equal, Greater, Lesser, Overlap, Undetermined. • Lineage – Predecessor and Successor assertions. • Start & Stop dates.
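As a rough illustration of how correlations between assertions could support data integration, the sketch below checks whether records identified under one assertion can be safely relabelled under another. The correlation values are assumptions inferred from the shagbark hickory slides (one species split into two), not entries from any real database, and `can_relabel` is a hypothetical helper.

```python
from enum import Enum


class Correlation(Enum):
    EQUAL = "equal"
    GREATER = "greater"
    LESSER = "lesser"
    OVERLAP = "overlap"
    UNDETERMINED = "undetermined"


# (source, target) -> how the source concept relates to the target concept.
# Illustrative values only.
CORRELATIONS = {
    ("C. ovata sec. Radford 1968", "C. ovata sec. Gleason 1952"): Correlation.LESSER,
    ("C. carolinae-s. sec. Radford 1968", "C. ovata sec. Gleason 1952"): Correlation.LESSER,
    ("C. ovata sec. Gleason 1952", "C. ovata (sl) sec. FNA 1997"): Correlation.EQUAL,
}


def can_relabel(source: str, target: str) -> bool:
    """True when every record under `source` also falls within `target`."""
    if source == target:
        return True
    relation = CORRELATIONS.get((source, target), Correlation.UNDETERMINED)
    return relation in (Correlation.EQUAL, Correlation.LESSER)


narrow_to_broad = can_relabel("C. ovata sec. Radford 1968", "C. ovata sec. Gleason 1952")
broad_to_narrow = can_relabel("C. ovata sec. Gleason 1952", "C. ovata sec. Radford 1968")
```

Records under the narrower Radford concept can be merged into the broader Gleason concept, but not the reverse, because a Gleason record might be either of the two segregate species.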
Party and Assertion
Parties: ITIS, FNA Committee, NatureServe, USDA Plants
Assertions: Carya ovata sec Gleason 1952; Carya ovata (sl) sec FNA 1997; Carya ovata sec Radford 1968; Carya carolinae sec Radford 1968; Carya ovata (ovata) sec FNA 1997; Carya ovata australis sec FNA 1997
Status
Party   Assertion           Status   Start   Name
ITIS    ovata – G52         NS       1996
ITIS    ovata – R68         St       1996    ovata
ITIS    carolinae – R68     St       1996    carolinae
ITIS    carolinae – R68     NS       2000
ITIS    ovata aust – FNA    St       2000    carolinae
ITIS    ovata – R68         NS       2000
ITIS    ovata ovata – FNA   St       2000    ovata
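One way to reconstruct a party's view at an arbitrary past date is sketched below, using the ITIS status rows from this slide. The slide shows only start years, so the sketch assumes that a later row for the same assertion supersedes the earlier one; explicit stop dates would replace that rule in a real system. The `Perspective` class and both functions are hypothetical, and the status codes are kept as abbreviated on the slide (presumably Standard and Nonstandard).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Perspective:
    party: str
    assertion: str
    status: str            # "St" or "NS", as abbreviated on the slide
    start: int             # start year
    name: Optional[str] = None


ROWS = [
    Perspective("ITIS", "ovata - G52", "NS", 1996),
    Perspective("ITIS", "ovata - R68", "St", 1996, "ovata"),
    Perspective("ITIS", "carolinae - R68", "St", 1996, "carolinae"),
    Perspective("ITIS", "carolinae - R68", "NS", 2000),
    Perspective("ITIS", "ovata aust - FNA", "St", 2000, "carolinae"),
    Perspective("ITIS", "ovata - R68", "NS", 2000),
    Perspective("ITIS", "ovata ovata - FNA", "St", 2000, "ovata"),
]


def view_at(rows, party, year):
    """Latest perspective per assertion that had started by `year`."""
    current = {}
    for row in sorted(rows, key=lambda r: r.start):
        if row.party == party and row.start <= year:
            current[row.assertion] = row  # later starts overwrite earlier ones
    return current


def standard_assertions(rows, party, year):
    """Assertions the party treated as standard in the given year."""
    view = view_at(rows, party, year)
    return sorted(a for a, row in view.items() if row.status == "St")


standards_1998 = standard_assertions(ROWS, "ITIS", 1998)
standards_2001 = standard_assertions(ROWS, "ITIS", 2001)
```

Under these assumptions the 1998 view treats the two Radford concepts as standard, while the 2001 view has switched to the two FNA concepts, without any earlier row having been deleted.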
Concept-based taxonomy is coming! • All organisms/specimens in databases should be identified by linkage to an assertion = name and reference! • Various standards are being developed by FGDC, TDWG, IOPI, GBIF, etc. • Most major databases are working toward inclusion of assertions (e.g. ITIS, IOPI, HDMS). • Until standard assertion lists are available, databases that track organisms should include couplets containing both a scientific name and a reference.
(Inter)National Taxonomic Database? • Concept-based • Party-neutral • Synonymy and lineage tracking • Perfectly archived An upgrade for ITIS & Species 2000?
Specimen/object/occurrence databases • Information on specimens/objects/occurrence-observations should be tracked by reference to • Place (place or collection) • Unique identifier (accession number) • Time • A museum is a place • Annotation should be by assertion (concept)!
Database systems for tracking specimens • The following are a few of the many available • BioLink http://www.ento.csiro.au/biolink/index.html • Specify http://usobi.org/specify/default.htm • Biota http://viceroy.eeb.uconn.edu/Biota • Taxis http://taxis.virtualave.net/ • TDWG maintains links to multiple software systems • http://www.bgbm.fu-berlin.de/TDWG/acc/Software.htm
Plots Database Systems • Several plot database systems are available. Among the best known and most widely used are: • TurboVeg (over 1,000,000 plots stored) http://www.alterra.nl/onderzoek/producten/websites/turboveg/ • Plots (NatureServe NPS Mapping Project)
A vegetation plot archive? • There is currently no standard repository for plot data. • A repository is needed for: • Plot storage • Plot access and identification • Plot documentation in literature/databases • This would be equivalent to GenBank for vegetation.
Core elements of VegBank • Project • Plot • Plot Observation • Taxon Observation • Taxon Interpretation • Plot Interpretation
Interface tools • Desktop client for data preparation and local use. • Flexible data import, including XML. • Tools for linking taxonomic and community concepts. • Standard query, flexible query, SQL query. • Flexible data export, including XML. • Local data refresh. • Easy web access to central archive.
Conclusions for database designers • Records of organisms should always contain (or point to) couplets consisting of a scientific name and a reference where the name was used. • Design for future annotation of organism and community concepts. • Track specimens/objects by unique identifier with metadata including location, annotation & time. • Design for reobservation. Separate permanent from transient attributes. • Archival databases should support time-specific views.
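The first and fourth recommendations can be illustrated with a minimal sketch: permanent plot attributes live on one record, each visit is a separate observation, and every organism record carries a name-plus-reference couplet. All field names and the example values (plot identifier, coordinates, cover figures, dates) are hypothetical and are not taken from the VegBank schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass(frozen=True)
class TaxonCouplet:
    """Scientific name plus the reference in which the name was used."""
    scientific_name: str
    reference: str


@dataclass
class TaxonObservation:
    couplet: TaxonCouplet
    cover_percent: float       # transient: may change between visits


@dataclass
class PlotObservation:
    observed_on: date          # transient: one record per visit
    taxa: List[TaxonObservation] = field(default_factory=list)


@dataclass
class Plot:
    plot_id: str               # permanent attributes stay on the plot
    latitude: float
    longitude: float
    observations: List[PlotObservation] = field(default_factory=list)

    def observe(self, observed_on, taxa):
        """Record a new visit without touching earlier observations."""
        obs = PlotObservation(observed_on, list(taxa))
        self.observations.append(obs)
        return obs


shagbark = TaxonCouplet("Carya ovata", "Radford et al. 1968")
plot = Plot("EXAMPLE-001", 35.9, -79.0)
plot.observe(date(1990, 6, 1), [TaxonObservation(shagbark, 20.0)])
plot.observe(date(2000, 6, 1), [TaxonObservation(shagbark, 35.0)])
```

Because the second visit is appended rather than written over the first, both cover values remain available, and the couplet makes clear which concept of Carya ovata the observer intended.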
Infrastructural needs • A national or international database of taxon concepts with support for at least one (ITIS?) party perspective. • Software tools for development and documentation of taxonomic concepts (including irregular concepts) and party perspectives. • Completion and long-term support for a national or international archive for vegetation plot data (VegBank) and similar community observations. • A national or international database of community concepts with support for at least the FGDC party perspective.