
Introduction to Ecoinformatics: Past, Present & Future

William Michener, LTER Network Office, University of New Mexico, January 2007


Presentation Transcript


  1. Introduction to Ecoinformatics: Past, Present & Future William Michener LTER Network Office, University of New Mexico January 2007

  2. Outline • Ecoinformatics: a definition • A science vision • Information challenges • Ecoinformatics “solutions”

  3. Outline • Ecoinformatics: a definition • A science vision • Information challenges • Ecoinformatics “solutions”

  4. Ecoinformatics A broad S&T discipline that incorporates both concepts and practical tools for the understanding, generation, processing, and propagation of ecological data, information and knowledge.

  5. Outline • Ecoinformatics: a definition • A science vision • Information challenges • Ecoinformatics “solutions”

  6. Many studies employ a restricted scale of observation, commonly 1 m². The literature is biased toward single- and small-scale results.

  7. Thinking Outside the “Box” [Figure labels: LTER; Time; Biocomplexity; Parameters; Space; NEON, WATERS, OOI, ….] Increase in breadth and depth of understanding

  8. Grand environmental challenges 2000 1998 2001 2003 2004 2004

  9. More and more of the ecological questions that confront society are national, continental and global in scope. [Images: Source: CDC; Drought, Source: Drought Monitor]

  10. 26 NSF LTER Sites in the U.S. and the Antarctic: > 1,600 Scientists; 6,000+ Data Sets—different themes, methods, units, structure, …. LTER

  11. NEON Climate Domains [Map: NEON climate domains, numbered 1 to 20]

  12. BioMesonet Tower and Sensor Arrays Aquatic Arrays

  13. Micron-scale nitrate ISE Soil Sensor Arrays

  14. Small-Organism Tracking: Mobile animals as bio-sentinels for environmental change, forecasting biological invasions, emerging disease spread

  15. Outline • Ecoinformatics: a definition • A science vision • Information challenges • Ecoinformatics “solutions”

  16. Characteristics of Ecological Data [Figure: data volume (per dataset), low to high, plotted against complexity/metadata requirements, low to high. Labels: Satellite Images; GIS; Weather Stations; Business Data; Biodiversity Surveys; Primary Productivity; Population Data; Gene Sequences; Soil Cores; Most Ecological Data; Most Software]

  17. Data Entropy [Figure: information content plotted against time. Labels: Time of publication; Specific details; General details; Retirement or career change; Accident; Death] (Michener et al. 1997)

  18. Semantics • Schema transform • Coding transform • Taxon Lookup • Semantic transform • Imagine scaling!! [Diagram labels: A, B, C]

  19. Semantics: Linking Taxonomic Semantics to Ecological Data • Taxon concepts change over time (and space) • Multiple competing concepts coexist • Names are re-used for multiple concepts [Figure, from R. Peet: concepts of Rhynchospora plumosa s.l. according to Elliot 1816, Gray 1834, Chapman 1860, Kral 1998 and Peet 2002?. Names shown: R. plumosa; R. plumosa v. plumosa; R. plumosa v. intermedia; R. plumosa v. interrupta; R. plumosa v. pineticola; R. intermedia; R. pineticola; R. sp. 1. Labels: A, B, C]

  20. What Users Really Want…

  21. Outline • Ecoinformatics: a definition • A science vision • Information challenges • Ecoinformatics “solutions”

  22. [Flowchart labels: Research Program; Investigators; Studies; Experimental Design; Methods; Data Design; Data Forms; Data Entry; Field Computer Entry; Electronically Interfaced Field Equipment; Electronically Interfaced Lab Equipment; Raw Data File; Quality Assurance Checks; Data verified? (yes / no); Data Contamination; Data Validated; Quality Control; Summary Analyses; Metadata; Archive Data File; Archival Mass Storage; Off-site Storage; Magnetic Tape / Optical Disk / Printouts; Access Interface; Investigators; Secondary Users; Publication; Synthesis] • Standard Operating Procedures • Policies • Data sharing • Computer use • Archive storage

  23. Ecoinformatics solutions • Data design • Data acquisition • QA/QC • Data documentation (metadata) • Data archival

  24. Ecoinformatics solutions • Data design • Data acquisition • QA/QC • Data documentation (metadata) • Data archival

  25. Data Design • Conceptualize and implement a logical structure within and among data sets that will facilitate data acquisition, entry, storage, retrieval and manipulation.

  26. Database Types • File-system based • Hierarchical • Relational • Object-oriented • Hybrid (e.g., combination of relational and object-oriented schema) Porter 2000

  27. Data Design: 7 Best Practices • Assign descriptive file names • Use consistent and stable file formats • Define the parameters • Use consistent data organization • Perform basic quality assurance • Assign descriptive data set titles • Provide documentation (metadata) from Cook et al. 2000

  28. 1. Assign descriptive file names • File names should be unique and reflect the file contents • Bad file names • Mydata • 2001_data • A better file name • Sevilleta_LTER_NM_2001_NPP.asc • Sevilleta_LTER is the project name • NM is the state abbreviation • 2001 is the calendar year • NPP represents Net Primary Productivity data • asc stands for the file type--ASCII
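The naming pattern on this slide can be sketched in code. This is a minimal illustration, not part of the original presentation: the helper names `build_file_name` and `parse_file_name` are hypothetical, and the sketch assumes the last three underscore-separated parts are always state, year and parameter.

```python
from typing import NamedTuple


class DataFileName(NamedTuple):
    project: str
    state: str
    year: int
    parameter: str
    extension: str


def build_file_name(project, state, year, parameter, extension="asc"):
    """Join the descriptive parts into project_state_year_parameter.extension."""
    return f"{project}_{state}_{year}_{parameter}.{extension}"


def parse_file_name(name):
    """Split a file name built with build_file_name back into its parts.

    Assumes the last three underscore-separated parts are state, year and
    parameter; everything before them is the project name.
    """
    stem, _, extension = name.rpartition(".")
    *project_parts, state, year, parameter = stem.split("_")
    return DataFileName("_".join(project_parts), state, int(year), parameter, extension)


name = build_file_name("Sevilleta_LTER", "NM", 2001, "NPP")
print(name)
print(parse_file_name(name))
```

Building names from parts, rather than typing them by hand, keeps every file in a project on the same pattern and makes the parts recoverable later.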

  29. 2. Use consistent and stable file formats • Use ASCII file formats – avoid proprietary formats • Be consistent in formatting • don’t change or re-arrange columns • include header rows (first row should contain file name, data set title, author, date, and companion file names) • column headings should describe content of each column, including one row for parameter names and one for parameter units • within the ASCII file, delimit fields using commas, pipes (|), tabs, or semicolons (in order of preference)
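One possible way to produce a file laid out as this slide describes is sketched below with Python's standard `csv` module. The function name `write_data_file`, the author name, the companion file name, the column names and the data values are all made up for illustration; only the layout (file header row, parameter-name row, units row, comma delimiter) comes from the slide.

```python
import csv
import io


def write_data_file(handle, file_header, names, units, rows):
    """Write one file header row, one parameter-name row, one units row, then data."""
    if len(names) != len(units):
        raise ValueError("every parameter needs exactly one unit entry")
    writer = csv.writer(handle, delimiter=",", lineterminator="\n")
    writer.writerow(file_header)
    writer.writerow(names)
    writer.writerow(units)
    for row in rows:
        # Reject rows whose field count differs, so columns never shift.
        if len(row) != len(names):
            raise ValueError(f"row has {len(row)} fields, expected {len(names)}")
        writer.writerow(row)


# Hypothetical example values; -9999 is the missing value code.
buffer = io.StringIO()
write_data_file(
    buffer,
    file_header=[
        "Sevilleta_LTER_NM_2001_NPP.asc",
        "Shrub Net Primary Productivity at the Sevilleta LTER, New Mexico, 2000-2001",
        "A. Researcher",
        "20020115",
        "Sevilleta_LTER_NM_2001_NPP_metadata.txt",
    ],
    names=["site", "date", "npp"],
    units=["none", "yyyymmdd", "g/m2/yr"],
    rows=[["S1", "20010615", 112.4], ["S2", "20010615", -9999]],
)
print(buffer.getvalue())
```

The `csv` writer quotes the data set title automatically because it contains commas, which is one reason to use a library rather than joining fields by hand.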

  30. 3. Define the parameters • Use commonly accepted parameter names that describe the contents • e.g., precip for precipitation • Use consistent capitalization • e.g., not temp, Temp, and TEMP in same file • Explicitly state units of reported parameters in the data file and the metadata • SI units are recommended • Choose a format for each parameter, explain the format in the metadata, and use that format throughout the file • e.g., use yyyymmdd; January 2, 1999 is 19990102
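Two of the points on this slide lend themselves to a small check in code: holding dates to the yyyymmdd format and catching parameter names that differ only in capitalization. The sketch below is illustrative only; the function names are hypothetical and not from the presentation.

```python
from datetime import date, datetime


def to_yyyymmdd(d: date) -> str:
    """Format a date as yyyymmdd, e.g. January 2, 1999 becomes 19990102."""
    return d.strftime("%Y%m%d")


def from_yyyymmdd(text: str) -> date:
    """Parse a yyyymmdd string, rejecting anything that is not 8 digits."""
    if len(text) != 8 or not text.isdigit():
        raise ValueError(f"expected yyyymmdd, got {text!r}")
    return datetime.strptime(text, "%Y%m%d").date()


def inconsistent_capitalization(names):
    """Return parameter names that appear with more than one capitalization."""
    seen = {}
    for name in names:
        seen.setdefault(name.lower(), set()).add(name)
    return {key: sorted(variants) for key, variants in seen.items() if len(variants) > 1}


print(to_yyyymmdd(date(1999, 1, 2)))
print(inconsistent_capitalization(["temp", "Temp", "TEMP", "precip"]))
```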

  31. 4. Use consistent data organization (one good approach) Note: -9999 is a missing value code

  32. 4. Use consistent data organization (a second good approach)

  33. 5. Perform basic quality assurance • Assure that data are delimited and line up in proper columns • Check that there are no missing values for key parameters • Scan for impossible and anomalous values • Perform and review statistical summaries • Map location data (lat/long) and assess errors • Verify automated data transfers • e.g. check-sum techniques • For manual data transfers, consider double keying data and comparing the two data sets
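Several of these checks are easy to automate. The sketch below is a rough illustration under stated assumptions: rows are held as dictionaries, -9999 is the missing value code (as on the data organization slide), SHA-256 stands in for whichever check-sum technique is actually used, and all function names and example values are hypothetical.

```python
import hashlib

MISSING = -9999  # missing value code


def find_missing(rows, key_columns):
    """Return (row index, column) pairs where a key parameter is missing."""
    return [
        (i, column)
        for i, row in enumerate(rows)
        for column in key_columns
        if row[column] == MISSING
    ]


def find_out_of_range(rows, column, low, high):
    """Return indices of rows whose non-missing value lies outside [low, high]."""
    return [
        i
        for i, row in enumerate(rows)
        if row[column] != MISSING and not (low <= row[column] <= high)
    ]


def sha256_of(data: bytes) -> str:
    """Checksum to compare before and after an automated transfer."""
    return hashlib.sha256(data).hexdigest()


def compare_double_keyed(first, second):
    """Return positions where two independently keyed copies disagree."""
    mismatches = [i for i, (a, b) in enumerate(zip(first, second)) if a != b]
    shorter, longer = sorted((len(first), len(second)))
    mismatches.extend(range(shorter, longer))  # entries present in only one copy
    return mismatches


# Made-up example rows for illustration.
rows = [
    {"site": "A", "temp_c": 21.5},
    {"site": "B", "temp_c": MISSING},
    {"site": "C", "temp_c": 150.0},
]
print(find_missing(rows, ["temp_c"]))
print(find_out_of_range(rows, "temp_c", -60, 60))
print(compare_double_keyed(["1", "2", "3"], ["1", "5", "3", "4"]))
```

The acceptable range passed to `find_out_of_range` is a judgment the investigator has to make per parameter; the code only flags rows for review and does not correct them.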

  34. 6. Assign descriptive data set titles • Data set titles should ideally describe the type of data, time period, location, and instruments used (e.g., Landsat 7). • Data set title should be similar to names of data files • Good: “Shrub Net Primary Productivity at the Sevilleta LTER, New Mexico, 2000-2001” • Bad: “Productivity Data”

  35. 7. Provide documentation (metadata)

  36. Ecoinformatics solutions • Data design • Data acquisition • QA/QC • Data documentation (metadata) • Data archival

  37. High-quality data depend on: • Proficiency of the data collector(s) • Instrument precision and accuracy • Consistency (e.g., standard methods and approaches) • Design and ease of data entry • Sound QA/QC • Comprehensive metadata (e.g., documentation of anomalies, etc.)

  38. What’s wrong with this data sheet? [Data sheet: two columns of blank lines, headed Plant and Life Stage]

  39. Important questions • How well does the data sheet reflect the data set design? • How well does the data entry screen (if available) reflect the data sheet?

  40. PHENOLOGY DATA SHEET Rio Salado - Transect 1 Collectors:_________________________________ Date:___________________ Time:_________ Notes:_________________________________________ _______________________________________________ Columns: Plant; Life Stage. Rows: ardi, arpu, atca, bamu, zigr and two blank rows, each followed by the codes P/G V B FL FR M S D NP. Key: P/G = perennating or germinating; V = vegetating; B = budding; FL = flowering; FR = fruiting; M = dispersing; S = senescing; D = dead; NP = not present

  41. PHENOLOGY DATA SHEET Rio Salado - Transect 1 Collectors: Troy Maddux Date: 16 May 1991 Time: 13:12 Notes: Cloudy day, 3 gopher burrows on transect [Data entry form: rows ardi, asbr, deob, arpu, each listing the life stage codes P/G V B FL FR M S D NP with Y N options]

  42. Ecoinformatics solutions • Data design • Data acquisition • QA/QC • Data documentation (metadata) • Data archival

  43. Generic Data Processing [Flowchart labels: Research Program; Investigators; Studies; Experimental Design; Methods; Data Design; Data Forms; Data Entry; Field Computer Entry; Electronically Interfaced Field Equipment; Electronically Interfaced Lab Equipment; Raw Data File; Quality Assurance Checks; Data verified? (yes / no); Data Contamination; Data Validated; Quality Control; Summary Analyses; Metadata; Archive Data File; Archival Mass Storage; Off-site Storage; Magnetic Tape / Optical Disk / Printouts; Access Interface; Investigators; Secondary Users; Publication; Synthesis] Brunt 2000

  44. Ecoinformatics solutions • Project / experimental design • Data design • Data acquisition • QA/QC • Data documentation (metadata) – to be addressed • Data archival

  45. Ecoinformatics solutions • Project / experimental design • Data design • Data acquisition • QA/QC • Data documentation (metadata) • Data archival

  46. Cycles of Research “A Conventional View” [Cycle diagram labels: Publications; Data; Analysis and modeling; Problem; Collection; Planning]

  47. Cycles of Research “A New View” [Cycle diagram labels: Archive of Data; Publications; Analysis and modeling; Selection and extraction; Collection; Original Observations; Secondary Observations; Problem Definition (Research Objectives); Planning (shown twice)]

  48. Data Archive • A collection of data sets, usually electronic, stored in such a way that a variety of users can locate, acquire, understand and use the data. • Examples: • ESA’s Ecological Archive • NASA’s DAACs (Distributed Active Archive Centers)
