Building the European Register of Marine Species. Richard White Biodiversity & Ecology Research Division, School of Biological Sciences, University of Southampton, UK Mark Costello & Chris Emblow Ecological Consultancy Services Ltd (EcoServe), Dublin, Ireland.
European Register of Marine Species (ERMS) • Funded as a Concerted Action project by the MAST (Marine Science and Technology) programme of the European Union • Managed by EcoServe in Dublin • Team of participants including “list editors” • http://erms.biol.soton.ac.uk
ERMS versus URMO • URMO is creating global lists of marine organisms but is not taxonomically complete • ERMS is creating a regional list (for European waters) but it is (almost) taxonomically complete • With the Fauna Europaea and Euro+Med PlantBase projects, Europe will have a complete list of its species (almost)
How many species? • About 29,500
Incoming data • Approximately 100 separate lists for different taxonomic groups • Mostly compiled as spreadsheets • Scientific names, synonyms, geography (at least Atlantic or Mediterranean) • Some optional fields
List conversion is carried out in several stages: • Excel spreadsheets are exported to text files • Tab-delimited text files are converted to a “holding format” (originally XDF, now a client-server MySQL database) • Database query results are passed through templates to generate either RTF (for the printed publication) or HTML (for the Web site)
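The last stage above, one record passed through two templates, can be sketched as follows. This is a minimal illustration in Python rather than the Perl used on the slides; the template strings, the field names and the `render` function are assumptions made for the example, not the project's actual templates.

```python
from html import escape
from string import Template

# Hypothetical one-line templates; the real ERMS templates are not shown in the slides.
HTML_ROW = Template("<li><i>$genus $species</i> $authority</li>")
RTF_ROW = Template(r"{\i $genus $species} $authority\par")


def render(record, fmt):
    """Render one species record (a dict of strings) as HTML or RTF."""
    if fmt == "html":
        # Escape characters such as "&" that are special in HTML.
        values = {key: escape(value) for key, value in record.items()}
        return HTML_ROW.substitute(values)
    if fmt == "rtf":
        return RTF_ROW.substitute(record)
    raise ValueError("unknown format: " + fmt)


record = {
    "genus": "Anthelura",
    "species": "elongata",
    "authority": "Norman & Stebbing, 1886",
}
print(render(record, "html"))
print(render(record, "rtf"))
```

Keeping the record separate from the templates is what lets the same query result feed both the book and the Web site.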
Variations on a theme
• Fields may be combined or separated, e.g. genus, species, authority, date
• Higher taxa may be:
  – repeated in fields of the species record
  – given once in separate preceding records, in various formats
• Synonyms may be:
  – in a separate field of the species record, or mixed with other remarks, with various delimiters and separators
  – in separate records, linked by code or by name, or even abbreviated
  – implied, e.g. Genus1 specname (Smith as Genus2)
• Geographical information is often free text
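As an illustration of the first variation, a combined "genus species authority date" string can be separated with a regular expression. This is a sketch in Python, not the project's Perl code; the pattern and the `split_name` helper are assumptions and cover only the simple "Genus species Author, year" form, with or without parentheses.

```python
import re

# Hypothetical pattern: Genus species Author(s), year, optionally in parentheses.
NAME_RE = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)\s+"
    r"(?P<species>[a-z-]+)\s+"
    r"\(?(?P<authority>.+?),\s*(?P<date>\d{4})\)?$"
)


def split_name(text):
    """Split a combined name string into its parts, or return None if it does not fit."""
    text = text.strip()
    match = NAME_RE.match(text)
    if match is None:
        return None
    parts = match.groupdict()
    # Parentheses around the authority are meaningful, so record whether they were present.
    parts["bracketed"] = text.endswith(")")
    return parts


print(split_name("Caretta caretta (Linnaeus, 1758)"))
print(split_name("Anthelura elongata Norman & Stebbing, 1886"))
```

Strings that do not fit the pattern return `None`, so they can be set aside for manual checking instead of being converted wrongly.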
Conversion: simple case
#!/usr/bin/perl -w
# Porifera.pl: convert an ERMS list text file to an XDF file
use PerlStart;
use ERMS;
&speciesList();
__END__
list code PF
list version 1
list rank phylum
record 1 fields
field 1 genus
field 2 species
field 3 species authority and date
field 4 used
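The description that follows `__END__` in the script reads as a small declarative specification of the input columns. How the `ERMS` module interprets it is not shown in the slides; the Python sketch below is one guess, assuming each directive sits on its own line and that `field N name` gives the name of tab-delimited column N. The functions `parse_spec` and `name_columns` and the sample row are hypothetical.

```python
SPEC = """\
list code PF
list version 1
list rank phylum
record 1 fields
field 1 genus
field 2 species
field 3 species authority and date
field 4 used
"""


def parse_spec(spec_text):
    """Return (list-level settings, {column number: field name})."""
    settings, fields = {}, {}
    for line in spec_text.splitlines():
        words = line.split()
        if len(words) < 3:
            continue
        if words[0] == "list":
            settings[words[1]] = " ".join(words[2:])
        elif words[0] == "field":
            fields[int(words[1])] = " ".join(words[2:])
        # "record N ..." lines are ignored in this sketch.
    return settings, fields


def name_columns(row, fields):
    """Map one tab-delimited row onto the field names from the specification."""
    cells = row.rstrip("\n").split("\t")
    return {name: cells[i - 1] for i, name in fields.items() if i - 1 < len(cells)}


settings, fields = parse_spec(SPEC)
# Placeholder row, not real ERMS data.
print(name_columns("Genus1\tspecname\tSmith, 1900\tx", fields))
```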
More complicated case
#!/usr/bin/perl -w
# Tardigrada.pl: convert an ERMS list text file to an XDF file
use PerlStart;
use ERMS;
&speciesList ( sub { &extractSynonyms(10, "syn.:"); } );
__END__
list code TG
list version 2
list rank phylum
record 1 title
record 2 fields
field 1 order
field 2 family
field 3 genus
field 4 species
field 5 subspecies
field 6 species authority
field 7 species date
field 8 geography
field 9 reference
field 10 remarks
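The call `&extractSynonyms(10, "syn.:")` suggests that synonyms are pulled out of field 10 (remarks) wherever the marker `syn.:` appears. The actual Perl routine is not shown, so the Python sketch below is only a guess at that behaviour: the function name, the semicolon separator and the sample remarks text are all assumptions.

```python
def extract_synonyms(record, field_name="remarks", marker="syn.:", sep=";"):
    """Split synonyms out of a free-text field.

    Returns a copy of the record with the synonym text removed from the field,
    plus the list of synonyms found after the marker.
    """
    text = record.get(field_name, "")
    before, found, after = text.partition(marker)
    if not found:
        return dict(record), []
    synonyms = [item.strip() for item in after.split(sep) if item.strip()]
    cleaned = dict(record)
    cleaned[field_name] = before.strip()
    return cleaned, synonyms


# Placeholder record, not real ERMS data.
record = {"genus": "Genus1", "remarks": "deep water syn.: Genus2 specname; Genus3 specname"}
print(extract_synonyms(record))
```

Passing the field and the marker as parameters mirrors the slide's approach, where each list script supplies only the details that differ for that list.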
“Holding format” XDF file
(HIGHER:informal:Tetrapoda)
(HIGHER:order:Testudines)
(HIGHER:family:Cheloniidae)
TP00001:Caretta:caretta:(Linnaeus, 1758):species::::::::Cosmopolitan warm to temperate waters::::Loggerhead turtle::::
TP00002:Chelonia:mydas:(Linnaeus, 1758):species::::::::Cosmopolitan warm water::::Green turtle::::
TP00003:Eretmochelys:imbricata:(Linnaeus, 1766):species::::::::Cosmopolitan warm water::::Hawksbill turtle::::
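A reader for this holding format might look like the Python sketch below. It assumes, from the sample alone, that `(HIGHER:rank:name)` lines set the classification for the records that follow and that the first five colon-separated fields are code, genus, species, authority and rank; the meaning of the remaining fields is not stated in the slides, so they are kept as an unnamed list. The `parse_xdf` function is hypothetical.

```python
def parse_xdf(lines):
    """Parse holding-format lines into records that carry their higher taxa."""
    higher = []   # current classification as (rank, name) pairs, highest first
    records = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("(HIGHER:") and line.endswith(")"):
            _, rank, name = line[1:-1].split(":", 2)
            ranks = [r for r, _ in higher]
            if rank in ranks:
                # A repeated rank starts a new branch: drop it and everything below it.
                del higher[ranks.index(rank):]
            higher.append((rank, name))
            continue
        cells = line.split(":")
        if len(cells) < 5:
            continue  # not a species record in the assumed layout
        records.append({
            "code": cells[0],
            "genus": cells[1],
            "species": cells[2],
            "authority": cells[3],
            "rank": cells[4],
            "other": [cell for cell in cells[5:] if cell],  # unnamed non-empty fields
            "higher": list(higher),
        })
    return records
```

An authority or remark that itself contains a colon would break this simple split, which is one reason a database is a sturdier holding format.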
Example RTF file for the book
Order Isopoda
Suborder Anthuridea
Family Antheluridae
Ananthura
  abyssorum (Norman & Stebbing, 1886) A
Anthelura
  elongata Norman & Stebbing, 1886 A
  ovalis (Barnard, 1925) M
    = Ananthura ovalis
  sulcaticauda (Barnard, 1925) A
    = Ananthura sulcaticauda
  truncata (Hansen, 1916)
Static versus dynamic web pages • Initial web pages were generated statically (in advance) from the XDF “holding format” (without synonyms) • RTF files were generated from the database (with synonyms) • Future web pages will be generated dynamically (on demand) from the database (with synonyms)
Database schema (simplified)
Taxon file: taxon ID (PK), geography, etc.
Name table: name ID (PK), taxon ID (I, FK), genus (I, FK), species (I), author, etc.
Hierarchy table: taxon (PK), rank, parent (I, FK), etc.
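The simplified schema can be sketched as SQL. The example below uses Python's built-in `sqlite3` module instead of MySQL so that it runs on its own; the column names and types, the target of the genus foreign key and the `classification` helper are assumptions, since the slide gives only the outline.

```python
import sqlite3

SCHEMA = """
CREATE TABLE hierarchy (
    taxon      TEXT PRIMARY KEY,
    taxon_rank TEXT,
    parent     TEXT REFERENCES hierarchy(taxon)
);
CREATE INDEX hierarchy_parent ON hierarchy(parent);

CREATE TABLE taxon (
    taxon_id  TEXT PRIMARY KEY,
    geography TEXT
);

-- Several names (accepted name and synonyms) can point at one taxon.
CREATE TABLE name (
    name_id  INTEGER PRIMARY KEY,
    taxon_id TEXT REFERENCES taxon(taxon_id),
    genus    TEXT REFERENCES hierarchy(taxon),  -- assumed target of the genus FK
    species  TEXT,
    author   TEXT
);
CREATE INDEX name_taxon ON name(taxon_id);
CREATE INDEX name_genus ON name(genus);
CREATE INDEX name_species ON name(species);
"""


def classification(conn, taxon):
    """Follow parent links upward and return (rank, taxon) pairs, highest rank first."""
    chain = []
    while taxon is not None:
        row = conn.execute(
            "SELECT taxon_rank, parent FROM hierarchy WHERE taxon = ?", (taxon,)
        ).fetchone()
        if row is None:
            break
        chain.append((row[0], taxon))
        taxon = row[1]
    return list(reversed(chain))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO hierarchy VALUES (?, ?, ?)",
    [("Testudines", "order", None),
     ("Cheloniidae", "family", "Testudines"),
     ("Caretta", "genus", "Cheloniidae")],
)
print(classification(conn, "Caretta"))
```

Storing each higher taxon once with a parent link avoids repeating the classification in every species record, which was one of the variations found in the incoming lists.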