Data Models for Ecological Databases

Explore various DBMS types such as File System-Based, Hierarchical, Network, Relational, and Object-Oriented for ecological databases. Learn about project datasets, network database projects, and object data structures. Understand the importance of data modeling and normalization in optimizing database efficiency. Discover the challenges faced in creating a perfect data model for ecological data.

Presentation Transcript


  1. Data Models for Ecological Databases John Porter Department of Environmental Sciences University of Virginia

  2. DBMS Types: File system-based, Hierarchical, Network, Relational, Object-oriented. You’ve seen these before; now let’s go into more detail.

  3. File-System Based [diagram: a directory containing files] • very simple and easy to set up • inefficient • few capabilities

  4. Hierarchical [diagram nodes: Project, Datasets, Investigators, Variables, Locations, Codes, Methods] • efficient • not very general • e.g. phylogenetic structures • geographical images

  5. Network Database [diagram nodes: Projects, Datasets, Locations] Links are hard-coded into the database; they are not a property of the data. • very flexible • unwieldy to modify • not widely used

  6. Relational Database [diagram: Projects, Datasets, and Locations tables linked by Location_id and Data_id fields] Linkages are through the properties of the data itself, not hard-coded. • widely used, mature • table-oriented • restricted range of structures
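
As a minimal sketch of "linkages through the properties of the data itself", the example below uses Python's built-in sqlite3 module. The table layout, the site names, and the dataset titles are made up for illustration; only the idea of a shared Location_id field comes from the slide.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two tables that share location_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE locations (
            location_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE datasets (
            data_id     INTEGER PRIMARY KEY,
            title       TEXT NOT NULL,
            location_id INTEGER REFERENCES locations(location_id)
        );
        -- Hypothetical rows, not taken from any real project.
        INSERT INTO locations VALUES (1, 'Marsh site'), (2, 'Dune site');
        INSERT INTO datasets VALUES
            (10, 'Bird counts', 1),
            (11, 'Soil cores', 2),
            (12, 'Tide levels', 1);
        """
    )
    return conn


def datasets_at(conn, location_name):
    """Return dataset titles for a location by matching location_id values."""
    rows = conn.execute(
        """
        SELECT d.title
        FROM datasets AS d
        JOIN locations AS l ON d.location_id = l.location_id
        WHERE l.name = ?
        ORDER BY d.data_id
        """,
        (location_name,),
    ).fetchall()
    return [title for (title,) in rows]
```

No pointer from a dataset to its location is stored anywhere; the link exists only because the two rows carry the same location_id value, so a new link needs only a new row.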

  7. Object-Oriented [diagram: an object combining a data structure with methods] Complex data structures, along with the methods to use the data, are in the database. • developing; few commercial implementations • diverse structures • extensible
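
The idea of keeping a data structure together with the methods that use it can be sketched with a plain Python class. This is only an in-memory illustration, not an object-oriented DBMS; the class name, field names, and readings are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TemperatureSeries:
    """A data structure bundled with the methods that operate on it."""

    site: str
    readings_c: list = field(default_factory=list)

    def add(self, value_c):
        """Append one temperature reading in degrees Celsius."""
        self.readings_c.append(float(value_c))

    def mean_c(self):
        """Return the mean reading, or None if the series is empty."""
        if not self.readings_c:
            return None
        return sum(self.readings_c) / len(self.readings_c)
```

A caller asks the object for its mean rather than knowing how the readings are stored, which is the property the slide attributes to object-oriented systems.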

  8. Data Modeling • Database management systems are highly flexible • Good: they can do a lot! • Bad: they have to be told how to do it! • A Database Management System is the CANVAS; the DATA MODEL is the painting.

  9. Data Modeling • Data modeling is used to develop the structures used in a database • Your data model affects • the reliability of the data • the efficiency and speed of queries • the complexity of the database • Data modeling is an art, not a science!

  10. Some Terminology: Tables contain attributes or fields (columns) and multiple observations or tuples (rows)

  11. Flat-file [diagram: one table, Species Observation, with attributes Genus, Species, CommonName, Observer, Date] Notation: tables in boxes, attributes in ovals

  12. Normalization • One widely-used approach for reducing errors within a database is to normalize your data structures • Normalization is the process of eliminating duplicate or redundant information

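  13. Two-table Relational Database [diagram: table Observation with attributes Spec_code, Date, Observer; table Species with attributes Spec_code, Genus, Species, CommonName; the tables are linked through Spec_code]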
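
The step from the flat file to the two-table layout can be sketched in plain Python. The attribute names follow the slides; the sample rows, the observer initials, and the choice of sequential integers for Spec_code are assumptions made for illustration.

```python
# Hypothetical flat-file rows: species details are repeated on every observation.
flat_rows = [
    {"Genus": "Spartina", "Species": "alterniflora",
     "CommonName": "smooth cordgrass", "Observer": "AB", "Date": "1999-06-01"},
    {"Genus": "Uca", "Species": "pugnax",
     "CommonName": "marsh fiddler crab", "Observer": "AB", "Date": "1999-06-01"},
    {"Genus": "Spartina", "Species": "alterniflora",
     "CommonName": "smooth cordgrass", "Observer": "CD", "Date": "1999-06-08"},
]


def normalize(rows):
    """Split flat rows into a Species table and an Observation table."""
    codes = {}            # (Genus, Species, CommonName) -> Spec_code
    species_table = []
    observation_table = []
    for row in rows:
        key = (row["Genus"], row["Species"], row["CommonName"])
        if key not in codes:
            # First time this species is seen: store its details once.
            codes[key] = len(codes) + 1
            species_table.append({
                "Spec_code": codes[key],
                "Genus": row["Genus"],
                "Species": row["Species"],
                "CommonName": row["CommonName"],
            })
        # Each observation keeps only the code that points at the species.
        observation_table.append({
            "Spec_code": codes[key],
            "Date": row["Date"],
            "Observer": row["Observer"],
        })
    return species_table, observation_table
```

After the split, a misspelled common name needs correcting in one Species row rather than in every observation that mentions it, which is the error reduction that normalization aims for.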

  14. Complex Data Model [diagram tables: Species, Observations, Specimens, Images, Locations, Observers, Internet Links] Notation: one-to-one, one-to-many

  15. Data Model for Metadata at the VCR/LTER [diagram tables: Personnel, Projects, Mailing Lists, Dataset, Dataset Locations, Variable, Variable Codes] Notation: optional linkage, mandatory linkage

  16. “Beanstalk” & “String of Pearls” • Metadata • methods • units • Location Table • Lat/Lon

  17. Beanstalk / String of Pearls • Highly normalized • Extremely flexible - capable of handling many different kinds of data • Inefficient • Queries can be very slow • Can require large amounts of space

  18. Why is there no perfect data model for ecological data? • One of the reasons data modeling is an ART not a SCIENCE is that ecologists use data in many different ways • Data that is perfectly formed for one kind of analysis may be unusable for another • Different analytical software may be used

  19. Why No Perfect Model? • Generally, ecologists want to use data in “flat file” formats that combine all the tables containing data into a single, denormalized “spreadsheet”-type format, but even that format can vary between researchers • ClimDB needed to support single-parameter and multiple-parameter formats to meet researcher needs
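
One way to picture the difference between a single-parameter layout (one measured value per row) and a multiple-parameter layout (one row per date, one column per parameter) is the small conversion below. It is a sketch under assumptions: the parameter names and values are invented, and it does not reproduce ClimDB's actual file formats.

```python
def to_wide(long_rows):
    """Convert (date, parameter, value) rows into one dict per date."""
    wide = {}
    for date, parameter, value in long_rows:
        # Create the row for this date on first sight, then add the column.
        wide.setdefault(date, {"date": date})[parameter] = value
    # Return rows in date order; ISO dates sort correctly as strings.
    return [wide[date] for date in sorted(wide)]


# Hypothetical single-parameter rows.
long_rows = [
    ("2001-01-01", "air_temp", 3.5),
    ("2001-01-01", "precip", 0.0),
    ("2001-01-02", "air_temp", 4.1),
]
wide_rows = to_wide(long_rows)
```

Dates with a missing parameter simply lack that key in the wide form, so a consumer expecting a full spreadsheet would still need to decide how to fill the gap.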
