
Presentation by Randall Schuh, American Museum of Natural History



Presentation Transcript


  1. NSF ADBC Digitization TCN-TTD: Plants, Herbivores, and Parasitoids: A Model System for the Study of Tri-Trophic Associations. Ten months later… Presentation by Randall Schuh, American Museum of Natural History; Rob Naczi, New York Botanical Garden; Christiane Weirauch, University of California Riverside; Katja Seltmann, American Museum of Natural History. http://tcn.amnh.org

  2. The Tri-Trophic Approach: Capturing Data for the Nearctic Biota • 85% of 11,000 Hemiptera from the Nearctic are herbivorous with high host specificity • Bias in plant groups attacked, e.g., Pinaceae, Poaceae, Asteraceae, Chenopodiaceae, Rosaceae • Some serious agricultural pests (armored scales, mealybugs, potato leafhoppers, Lygus bugs) • Vectors of viral and bacterial diseases (green peach aphid is a vector of over 100 plant viruses) • Parasitic Hymenoptera are beneficial as biological control agents

  3. Botanical Institutions MAINE MIN MICH NYBG WIS EMC ISC ILL ILLS MU COLO MO KANU TEX

  4. Botanical Institutions Botanical Data Providers CPNH MAINE MIN MICH NYBG WIS EMC ISC ILL ILLS MU COLO CCH MO KANU SEINET TEX

  5. Botanical Institutions Botanical Data Providers Entomological Collections CPNH MAINE OSAC UMEC MIN CUIC MICH NYBG WIS EMC ISC AMNH CMNH CDFA INHS CAS ILL ILLS UDCC EMEC CSUC MU COLO SEMC CCH MO KANU UKIC NCSU UCRC SEINET MEM TEX TAMU BPBM

  6. Project management • Steering Committee of 10 PIs + Project Manager • Decision-making on overall project goals, directions, and progress • Full-time Project Manager at AMNH (Katja Seltmann) • Day-to-day project management, technical capability, data analysis, training of entomology partners, vetting and upload of authority files, centralized georeferencing • Full-time Project Coordinator at NYBG (Kim Watson) • Training of botany partners, barcoding of NYBG specimens, and label-data capture for all partner institutions

  7. Entomological Databasing

  8. Streamlined Interface for Rapid Data Entry • Taxon names • Locality data • Collection events • Specimen data • Host names

  9. Database Attributes / Database Benefits • Web enabled • Open-source software • Centralized data storage, backup, and management • Single-product management • Simplified user training • Centralized authority-file management • Centralized georeferencing • Data aggregation shifted to HUB and DiscoverLife.org

  10. Authority Files • Botanical: Tropicos database used across entire project • Entomological: published catalogs and unpublished lists from specialists • Objectives: present uniform, up-to-date taxonomy; reduce decision making by data-entry personnel; limit entry of new names by data-entry personnel

  11. Data Aggregation and Dissemination: leveraging DiscoverLife.org

  12. Approaches to Outreach • AMNH Short Course in Collection Databasing Fundamentals: train graduate students through participant-support funding; involve students from multiple graduate programs; provide fundamentals, including database options, data structures, unique specimen identification, specimen handling, georeferencing, research tools, data dissemination • Undergraduate Research Projects: REU projects joining project data to student research involvement • Community Outreach: http://research.amnh.org/pbi/heteropteraspeciespage/

  13. Rob Naczi New York Botanical Garden

  14. Botanical Specimen Imaging

  15. Insect Specimen Imaging • Image representative specimens for each species • Use existing imaging stations at partner institutions • About 30% of Hemiptera are already imaged • Expect to produce about 20,000 new images

  16. Use of OCR for Populating Botanical Records Workflow: • JPGs of specimen sheets batch-cropped to labels • labels saved as new set of JPGs, then exported to ABBYY FineReader 11 Corporate Edition • overnight, labels batch-processed through ABBYY • each OCR output file saved as individual text file tied to barcode no. • individual text files merged into Excel spreadsheet, in which data can be searched, grouped, and parsed • parsed fields pushed to database Challenges: • increasing accuracy of parsing • hand-written labels (now experimenting with outsourcing)
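The merge step in the OCR workflow (individual text files combined into one searchable table keyed by barcode) can be sketched roughly as follows. This is a minimal illustration, not the project's actual tooling: it assumes each OCR output is a UTF-8 `.txt` file whose file name is the barcode number, and it writes a CSV rather than an Excel file. The function name `merge_ocr_outputs` and the column names are hypothetical.

```python
import csv
from pathlib import Path


def merge_ocr_outputs(text_dir, csv_path):
    """Merge per-label OCR text files into one CSV, one row per barcode."""
    rows = []
    for txt in sorted(Path(text_dir).glob("*.txt")):
        # Assumption: the file name (without extension) is the barcode number.
        barcode = txt.stem
        # Collapse line breaks and runs of spaces so each label fits on one row.
        label_text = " ".join(txt.read_text(encoding="utf-8").split())
        rows.append({"barcode": barcode, "label_text": label_text})

    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["barcode", "label_text"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Keeping the raw label text in a single column leaves the parsing into fields as a separate step, which matches the order given in the workflow (merge first, then search, group, and parse).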

  17. Data Storage Issues Botany: • botanical images are valuable products of our digitization efforts, but also a challenge because of their storage demands • our concern is with long-term storage (archiving) of uncompressed, original images • have encouraged home institutions of our partners to step up, but some are unable or unwilling • our solution for now is storage on portable drives, but this is a tenuous fix and not reliable enough for truly archival storage Entomology: • no major issues

  18. Christiane Weirauch University of California Riverside

  19. Subcontract Management • Setup: 7 collaborating institutions, 27 subawards • Benefit: long-term data capture across >30 institutions • Issues: 1) Delays: administrative and accounting issues 2) Database selection: which one to use? 3) Training: onsite versus remote training? 4) Tracking productivity of subawards not using PBI database • Solutions/suggestions: 1) Streamlined administrative and accounting procedures 2) Encourage use of a default database; more discussion 3) Combination of onsite and remote training and monitoring 4) Regular contact with subawards

  20. Unique Specimen Identifiers (USIs) • Setup: Matrix codes (barcode scanner) and string of prefix and 8-digit number (human eye) encode the same unique identifier • Benefit: Tracking of specimens; connect images to records • Format: Prefix (8 characters): acronym and identifier: e.g., UCRC_ENT XXXXXXXX • Non-standard USIs: accepted in the database • Exceptions: collections that were previously databased without USIs (e.g., Aphidoidea, certain mirid taxa) AMNH Matrix-code labels
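The USI format on this slide (an 8-character prefix followed by an 8-digit number) lends itself to a simple validity check. The sketch below is illustrative only. It assumes the prefix uses uppercase letters, digits, and underscores and is separated from the number by a single space, as the example "UCRC_ENT XXXXXXXX" suggests. The function name `parse_usi` is hypothetical, and the digits in the usage line are made up.

```python
import re

# Assumption: 8-character prefix (uppercase letters, digits, underscore),
# one space, then exactly 8 digits, following "UCRC_ENT XXXXXXXX".
USI_PATTERN = re.compile(r"^(?P<prefix>[A-Z0-9_]{8}) (?P<number>\d{8})$")


def parse_usi(text):
    """Return (prefix, number) for a standard-format USI, or None otherwise."""
    match = USI_PATTERN.match(text.strip())
    if match is None:
        return None
    return match.group("prefix"), int(match.group("number"))


# Usage with an invented number: a standard USI parses, anything else gives None.
print(parse_usi("UCRC_ENT 00012345"))
print(parse_usi("AMNH 123"))
```

Because non-standard USIs are accepted in the database, a `None` result here would mark an identifier as non-standard rather than reject it.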

  21. Collection Staging: organizing, sorting, and identifying specimens in preparation for databasing • Importance: highest identification level and accuracy will yield most useful data for future applications • Priority: well-curated and well-identified collections • TTD: limited budget for staging by experts; very successful for, e.g., Miridae and Membracidae • Issue: routine staging more time-consuming than anticipated • Possible solution: budget for graduate students or postdocs to help with staging (and training/supervision of databasing crew)

  22. Tri-trophic concept: Hemiptera, plants, parasitoids • Capture of host data: new TTD records: 26% with host records (compared to 24% previously databased); added >800 new hosts • Challenges of integrating parasitoid data • Level of identification of parasitoids (undescribed species; accurate identification requires skilled personnel) • Level of host identification (e.g., “white fly”) • Incorporation of host information from secondary sources (e.g., taxonomic literature)? • On the right track; prioritize specimens with quality host records & integrate secondary host information

  23. Katja Seltmann The American Museum of Natural History

  24. Efficiency of Data Capture: Insects • Total as of October 17, 2012 = 198,409 • Includes Illinois, Texas, and Kansas • All 20 subcontracts are digitizing now • 53 contributors for ttd-tcn project Numbers from NHCR database (central database at AMNH – 11 subcontracts) • $20,000 in equipment costs • Time per specimen: average 3-3.5 min (range 1.2-6) • Cost per specimen: $0.93 (includes equipment) • Peak in July (more hours digitizing) • 65 collecting events on Christmas Day

  25. Efficiency of Data Capture: Plants • All but three institutions up and running • As of October 9, 2012: 102,651 images • 3 of 15 institutions not yet begun • 4 plant collections report: $30,482.51 in equipment costs; $0.73 per specimen image • The unmentioned curator volunteerism: 4-8 hrs/week depending on institution/taxon; ~19 hours a week total

  26. Training Methods: Insects (NHCR Database) • Curators also training (sexing specimens, database) • Online training via Skype • Digitizers clubhouse (building community) • Online manuals • Online videos • Remote training • Using the central database, can assess quality of data • Flag when new name is entered • Flag when more than 10 specimens entered in one min by one person • Flag when exact duplicate collecting events or localities (check training)
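Two of the quality flags listed on this slide (more than 10 specimens entered in one minute by one person, and exact duplicate collecting events) can be sketched as simple counts. This is a hedged illustration, not the NHCR database's implementation: the record layout, the function names, and the choice to count per calendar minute rather than over a sliding 60-second window are all assumptions.

```python
from collections import Counter
from datetime import datetime


def flag_fast_entry(entries, limit=10):
    """Return (person, minute) pairs with more than `limit` specimens entered.

    `entries` is an iterable of (person, timestamp) tuples, one per specimen.
    Assumption: entries are bucketed by calendar minute.
    """
    per_minute = Counter(
        (person, stamp.replace(second=0, microsecond=0))
        for person, stamp in entries
    )
    return sorted(key for key, count in per_minute.items() if count > limit)


def flag_duplicate_events(events):
    """Return collecting events that appear more than once, exactly as entered."""
    counts = Counter(events)
    return [event for event, count in counts.items() if count > 1]
```

A flagged pair is only a prompt to check the digitizer's training, as the slide notes; rapid entry can also be legitimate when many specimens share one collecting event.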

  27. Training Methods: Plants • Site visits to subcontract institutions • Kim Watson, Melissa Tulig • Install imaging equipment • Personal involvement

  28. Quality Assessment of Transformed Records (NHCR) Determination Note Language Completeness (A,B,B) ; (A,A,A) ; (A,C,B)

  29. Georeferencing: NHCR database • 130,000 specimen records • Canada: 14, 96 • USA: 1441, 8564 • Mexico: 32, 474 • Present total: 1487, 9134

  30. Georeferencing: NHCR database • GEOLocate (North America) • Discover Life validation • Centralized and controlled georeferencing (NYBG, AMNH) • Volunteer georeferencing

  31. Difficult data Issues: specimen relationships

  32. Difficult data Issues: means for curation?

  33. Summary and Predictions: • over 50,000 locality records from NHCR • will reach 1 million new specimen records for insects (harder to predict for plants at the moment) • less than $1 a specimen (inclusive) • Arthropod (NHCR) data concerns will become more central as other groups come online

  34. Thanks to National Science Foundation, co-PIs, and collaborators http://tcn.amnh.org
