
The Paleobiology Database

The Paleobiology Database. A Hands-on Tutorial on Estimating Fossil Diversity Patterns. Wolfgang Kiessling, 25 September 2012. Program. 09:00 – 09:20 Computer-Hookup, Intro 09:20 – 10:00 Background and Rationale 10:00 – 10:45 Basic Features 10:45 – 11:00 Break


Presentation Transcript


  1. The Paleobiology Database A Hands-on Tutorial on Estimating Fossil Diversity Patterns Wolfgang Kiessling, 25 September 2012

  2. Program • 09:00 – 09:20 Computer-Hookup, Intro • 09:20 – 10:00 Background and Rationale • 10:00 – 10:45 Basic Features • 10:45 – 11:00 Break • 11:00 – 11:30 Advanced Features • 11:30 – 12:30 Diversity Through Time • 12:30 – 13:30 Lunch • 13:30 – 14:00 Sampling-Standardized Diversity Curves with the PBDB • 14:00 – 15:00 Data Entry Trial

  3. Important Resources • Course Materials • http://download.naturkundemuseum-berlin.de/wolfgang.kiessling/Workshop • Database Servers • http://paleodb.org • http://paleodb.geology.wisc.edu/

  4. Background and Rationale • The Age of Biodiversity Informatics • Scope of modern biodiversity facilities • A brief history of the PaleoDB • The scientific question it sought to answer • The evolution since then

  5. The Age of Biodiversity Informatics • Biodiversity Informatics: An emerging discipline in the broader field of Bioinformatics aiming at information capture, storage, retrieval, and analysis of biodiversity data • The Age: Biodiversity research with increasing worldwide attention and funding especially for large-scale approaches

  6. Biodiversity Initiatives • National biodiversity centers being established worldwide, usually highly interdisciplinary • Science driven • Discovery/outreach driven • Policy driven • International consortia • Infrastructure: GBIF, OBIS • Policy: Intergovernmental Platform on Biodiversity & Ecosystem Services (http://www.ipbes.net) • Where is Paleo?

  7. GBIF and Allies • The Global Biodiversity Information Facility (GBIF) was founded in 2001 • Mission: facilitate free and open access to biodiversity data worldwide, via the Internet, to underpin sustainable development • Priorities: • Mobilising biodiversity data • Developing protocols and standards • Building an informatics architecture • www.gbif.org

  8. 271 × 10^6 georeferenced records available • GBIF promotes data-sharing with countries of origin.

  9. Use of GBIF data • Predict biotic effects of climate change • Analyse and predict spread of pests and diseases of humans, crops, livestock, wildlife, etc. • Predict best places to set up new protected areas • Analyse invasive species and predict invasion pathways • Provide policy-maker-relevant data of all kinds • Be a resource for biodiversity science communities

  10. Paleo to be Integrated at Multiple Scales • Short time scales: Natural baselines, ecological consequences of climate change → Conservation Palaeobiology • Long time scales: General principles of biodiversity regulation, response to extreme events → Analytical Palaeobiology

  11. The Paleobiology Database: A Core Infrastructure for the Biogeosciences • Founded in 2000, funded by NSF (2000-2005, 2010-) and other sources • Driven by a scientific question • Was the rise of marine biodiversity in the last 200 myr as dramatic as suggested by compendia of stratigraphic ranges? • Collect occurrence data, apply sampling standardization and use fossil data only http://paleodb.org

  12. Phanerozoic Marine Animal Diversity • Exponential post-Paleozoic rise? • Data from Sepkoski (2002, Bull. Am. Pal.)

  13. What is wrong with Sepkoski? • Data are just times of first and last appearances in the record (genera and families) • No way to standardize for sampling • Extreme effect of the Pull of the Recent

  14. Marine Biodiversity Through Time • Old: Exponential post-Paleozoic rise. Data from Sepkoski (2002, Bull. Am. Pal.) • New: Logistic post-Triassic rise. Alroy et al. (2008, Science)

  15. Structure of Compendia • Corals and bivalves from Sepkoski's compendium of marine animal genera (2002)

  16. Evolution of the PaleoDB: New Horizons • Biogeographic Questions • Implementation of Scotese’s plate tectonic reconstructions • Extending taxonomic/environmental scope • Vertebrate, paleobotany, and micropaleontology research groups • Link to Neptune Database (Ocean Drilling) • Beyond Diversity • Communities over time • Environmental preferences • Geodisparity • Body-size distributions • Geological Drivers

  17. Basic Features of the PBDB • Organization • Structure • Finding data • Drawing maps • Downloading data

  18. Organization • Database Coordinator: John Alroy (Macquarie University) • Informal core group running mirror servers (3 persons) • Data Management Committee (10) • Data Contributors: Professional scientists (usually with PhD) (132) • Data Enterers: Contributors and students (310)

  19. The Structure • Basic information is the occurrence of a particular taxon (species, genus or higher) in a particular collection (i.e. sample or outcrop …) • References linked to occurrences and collections • Geographic and geologic context stored with each collection • Taxa classified according to multiple opinions (synonymies, re-identifications)

  20. Finding Data • Generate data summary tables • Menu: Analyze • Task: Marine Invertebrate Collections by Geological Period • Find collections • Menu: Full search – Fossil collection records • Task: Find all collections containing lithistid sponges (Lithistida) in Germany • Find taxa • Menu: Full search – Fossil organisms • Task: Get the full synonymy list of Brachiosaurus brancai

  21. Drawing Maps • Draw fossil collections on a plate tectonic reconstruction of the appropriate age • Menu: Analyze • Tasks: 1. Get a map of Jurassic reefs in a Mollweide projection. 2. Identify the westernmost reef and get a list of fossils

  22. Downloading Data • The most important step for further analyses • Virtually all data in the PaleoDB are open access • Downloads in csv format can be read by almost any program • Menu: Download • Task: Download all occurrences of Triassic sponges with coordinates/paleocoordinates, stage-level resolution and full taxonomic information

  23. Playtime + Break

  24. Advanced Features • Ecological metrics of collections • Diversity and others • Confidence intervals of stratigraphic ranges • Within sections and global • Diversity curve generator • Raw and sampling standardized

  25. Ecological Metrics • Get alpha diversity and ecological data from a collection • Menu: Analyze abundance data • Task: Get the metrics of a Triassic community from China (e.g. collection #31618) and look at the feeding modes

  26. Background of Diversity Metrics Which community is more diverse?

  27. Measuring Alpha Diversity • Shannon-Wiener Information Index (H) • $H = -\sum_{i=1}^{S} p_i \ln(p_i)$ • $p_i$ = proportion of the $i$th species in the community • Mixed signal of richness and evenness • Evenness (J) • $J = H / H_{\max}$ • $H_{\max} = \ln(S)$, where $S$ is the number of species
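The two indices on this slide can be checked by hand on small abundance lists. The following is a minimal Python sketch, not PBDB code; the two communities and their specimen counts are hypothetical and chosen only to show that equal richness can hide very different evenness.

```python
import math

def shannon_index(counts):
    """Shannon-Wiener index H = -sum(p_i * ln(p_i)) over species with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def evenness(counts):
    """Evenness J = H / H_max with H_max = ln(S); returned as 0.0 here when S < 2."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return 0.0
    return shannon_index(counts) / math.log(richness)

# Hypothetical specimen counts per species; both communities have S = 4.
community_a = [25, 25, 25, 25]  # perfectly even
community_b = [85, 5, 5, 5]     # dominated by one species

for name, counts in (("A", community_a), ("B", community_b)):
    print(name, round(shannon_index(counts), 3), round(evenness(counts), 3))
```

Both communities have the same richness, but the even community reaches H = ln(4) and J = 1, while the dominated one scores lower on both, which is the "mixed signal" the slide refers to.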

  28. Rarefaction • Which species richness would I observe if my sample A were smaller than it is (e.g., as small as sample B)? • Mathematical solution: • Empirical solution: • Let the computer draw specimens at random and get diversity for a given sample size
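Both routes to a rarefied richness can be sketched in a few lines of Python. The formula on the original slide is not preserved in this transcript, so the analytical function below uses the standard hypergeometric rarefaction expectation as an assumption about what was shown; the specimen counts are hypothetical.

```python
import math
import random

def rarefied_richness(counts, n):
    """Expected richness in a random subsample of n specimens:
    E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)), with N the total sample size."""
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the sample size")
    denom = math.comb(total, n)
    return sum(1 - math.comb(total - c, n) / denom for c in counts)

def simulated_richness(counts, n, trials=2000, seed=1):
    """Empirical counterpart: draw n specimens without replacement, many times."""
    rng = random.Random(seed)
    specimens = [sp for sp, c in enumerate(counts) for _ in range(c)]
    draws = (len(set(rng.sample(specimens, n))) for _ in range(trials))
    return sum(draws) / trials

# Hypothetical sample A (100 specimens, 7 species), rarefied to the size of a
# smaller sample B with 20 specimens.
sample_a = [50, 30, 10, 5, 3, 1, 1]
print(rarefied_richness(sample_a, 20), simulated_richness(sample_a, 20))
```

The two numbers should agree closely; the simulated value converges on the analytical expectation as the number of trials grows.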

  29. Confidence Intervals of Stratigraphic Ranges • The first and last observations of a taxon in the fossil record must be younger and older than its times of origination and extinction, respectively • By how much? • Quantifying uncertainties within sections and globally
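One classical answer to "by how much?" is the range extension of Strauss and Sadler (1989), which assumes that fossil horizons are distributed randomly and uniformly within the true range. The transcript does not say which method the PBDB tools use, so the Python sketch below illustrates that one approach under that assumption only; the function name and example numbers are hypothetical.

```python
def range_extension(observed_range, n_horizons, confidence=0.95):
    """One-sided range extension under uniform random fossil recovery:
    r = R * ((1 - C) ** (-1 / (H - 1)) - 1),
    with R the observed range, H the number of fossil horizons, C the confidence level."""
    if n_horizons < 2:
        raise ValueError("at least two fossil horizons are needed")
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = (1 - confidence) ** (-1 / (n_horizons - 1)) - 1
    return alpha * observed_range

# Hypothetical taxon: observed range of 10 myr, known from 10 horizons.
print(range_extension(10.0, 10))          # extension beyond the last occurrence, in myr
print(range_extension(10.0, 30))          # more horizons give a shorter extension
```

The extension shrinks as the number of horizons grows, so densely sampled taxa get tight bounds and taxa known from a handful of horizons get wide ones.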

  30. Draw a Stratigraphic Section • Menu: Analyze stratigraphic sections • Task: Try the Bangtoupo section in China

  31. Using the fossil record for molecular clocks • Calibration: Estimate the branching points of two sister groups • Menu: Analyze – Calculate a first appearance • Task: Branching point between Acropora and Montipora

  32. Diversity Through Time • Theoretical Background • Counting methods • Sampling issues • Sampling standardization • Hands on with R

  33. Counting Diversity Through Time • Figure: five taxa (A to E) in one time interval, classified as through-ranging (two taxa), extinct, originating, and singleton

  34. Measuring Diversity • Figure: taxa A to E, of which two are through-ranging, one goes extinct, one originates, and one is a singleton • Boundary crossers: 3 • Range through: 5 • Range through minus singletons: 4
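The three counts on this slide (boundary crossers 3, range through 5, range through minus singletons 4) can be reproduced from first and last appearance bins. The Python sketch below is an illustration only: the assignment of letters to range types is an assumption, because the figure is not preserved, and boundary crossers are counted at the base of the focal bin, which is one of several conventions.

```python
def diversity_counts(ranges, bin_index):
    """Count diversity in one time bin from taxon ranges.

    ranges maps taxon -> (first_bin, last_bin), bins numbered from oldest to youngest.
    """
    present = {t: (fa, la) for t, (fa, la) in ranges.items() if fa <= bin_index <= la}
    singletons = [t for t, (fa, la) in present.items() if fa == la]
    # Boundary crossers: taxa that already existed before the bin and cross its base.
    crossers = [t for t, (fa, la) in present.items() if fa < bin_index]
    return {
        "range_through": len(present),
        "range_through_minus_singletons": len(present) - len(singletons),
        "boundary_crossers": len(crossers),
    }

# Hypothetical ranges around focal bin 2 (letter-to-type mapping is assumed).
ranges = {
    "A": (1, 3),  # through-ranging
    "B": (1, 3),  # through-ranging
    "C": (1, 2),  # goes extinct in the focal bin
    "D": (2, 3),  # originates in the focal bin
    "E": (2, 2),  # singleton, confined to the focal bin
}
print(diversity_counts(ranges, 2))
```

Sampled-in-bin (SIB) diversity is not covered here, because it needs the actual occurrences per bin rather than ranges alone.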

  35. Measuring Diversity Through Time Draw Diversity Curves: SIB, range through, range through minus singletons, boundary crossers

  36. Sampling Standardization of Time Series Data • Figure labels: "2 or 3", "Perhaps 2", "Sure 2" • Rarefaction (3): 2.23, 1.89, 2 • This is sufficient for sampled-in-bin (SIB) diversity, but silent on extinctions

  37. Diversity Over Time • Omit Singletons • Figure values, in transcript order: S = 2, Ext = 0; S = 2, Ext = 0; S = 2, Ext = 2; S = 1, Ext = 0; S = 1, Ext = 0; S = 1, Ext = 1; S = 2, Ext = 0; S = 3, Ext = 1; S = 2, Ext = 2; S = 1.67, Ext = 0; S = 1.67, Ext = 0.33; S = 1.67, Ext = 1.67

  38. Subsampling Methods • Classical Rarefaction • Pool all occurrence data • Randomly draw data until quota is reached • Occurrences-weighted by-list subsampling (OW) • Pool occurrences by collections • Randomly draw collections until quota of occurrences is reached • Unweighted by-list subsampling (UW) • Pool collections • Randomly draw collections until quota of collections is reached • Occurrences-exponentiated weighted by-list subsampling (OexpW) • Pool occurrences by collections • Randomly draw collections until weighted quota of occurrences is reached • Shareholder Quorum Method • Sampling until a particular proportion (quorum) of the rank-abundance distribution has been sampled
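The first three methods differ only in what is drawn and what the quota counts, which a short Python sketch makes concrete. This is a simplified illustration and not the PBDB implementation: collections are represented as hypothetical taxon lists for a single time bin, the OW variant simply keeps the collection that carries it over the quota, and the OexpW and Shareholder Quorum methods are left out.

```python
import random

def classical_rarefaction(collections, quota, rng):
    """Pool all occurrences, draw `quota` of them without replacement, count taxa."""
    pooled = [taxon for coll in collections for taxon in coll]
    return len(set(rng.sample(pooled, quota)))

def by_list_unweighted(collections, quota, rng):
    """UW: draw `quota` whole collections (taxonomic lists), count taxa."""
    drawn = rng.sample(collections, quota)
    return len({taxon for coll in drawn for taxon in coll})

def by_list_occurrence_weighted(collections, quota, rng):
    """OW: add whole collections in random order until `quota` occurrences are reached."""
    order = rng.sample(collections, len(collections))
    taxa, n_occurrences = set(), 0
    for coll in order:
        if n_occurrences >= quota:
            break
        taxa.update(coll)
        n_occurrences += len(coll)
    return len(taxa)

def mean_subsampled_richness(method, collections, quota, trials=500, seed=1):
    """Average richness over repeated subsampling trials."""
    rng = random.Random(seed)
    return sum(method(collections, quota, rng) for _ in range(trials)) / trials

# Hypothetical collections from one time bin (12 occurrences, 7 taxa).
collections = [["a", "b", "c"], ["a", "b"], ["a", "d"], ["a"], ["b", "e", "f", "g"]]
print(mean_subsampled_richness(classical_rarefaction, collections, 6))
print(mean_subsampled_richness(by_list_unweighted, collections, 3))
print(mean_subsampled_richness(by_list_occurrence_weighted, collections, 6))
```

Running the same quota logic in every time bin and averaging over trials gives one point per bin on a sampling-standardized curve.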

  39. Why so many? • Rarefaction assumes that differences in diversity are due to sampling • We might lose biological signal by attempting to sampling-standardize if we don’t consider evenness • If evenness of communities is different, then rarefaction will mostly reflect these differences • The best subsampling method has to consider several biases

  40. Lunch

  41. Hands-On with the PaleoDB • Create a subsampled diversity curve with the online scripts • Download a dataset and use the function: Generate diversity curve data

  42. Analyze Downloaded Data with R • Open R • Run the script PBDB_analyze.R

  43. Occurrence Data Now and Then

  44. We Want You! • The Paleobiology Database is from the community for the community • Data quantity and quality need to be improved to increase rigor and scope of analyses • Many important questions are yet to be addressed • http://paleodb.org

  45. How to Enter Data • Give it a try • testpaleodb.geology.wisc.edu • Login as Contributor: • Authorizer: User60x, T. • Enterer: User60x, T. • Password: Berlin

  46. The Paleobiology Database (PaleoDB, www.paleodb.org) has been rapidly developing into a core infrastructure for palaeontology. The participation of 289 contributors from 22 countries has made it possible for the PaleoDB to hold taxonomic and distributional information on 217,000 taxa and more than one million fossil occurrences. With 150 official publications, the scientific output is impressive, but it could be improved if more colleagues learned how to use the database for their own research. The purpose of this course is thus to familiarize paleontologists with the structure and scope of the PaleoDB and to introduce them to the analytical tools that are available online. Examples will be provided for paleo-community analysis, confidence intervals on stratigraphic ranges, and global and regional diversity patterns. Basic statistical concepts will be explained briefly, but the focus is on practicing with the database.
