
Arctos/TACC Collaboration Chris Jordan Texas Advanced Computing Center




Presentation Transcript


  1. Arctos/TACC Collaboration
     Chris Jordan, Texas Advanced Computing Center

  2. Arctos: A 15-year history
     • MVZ: 1995 - Hired Stan Blum to develop relational data model (following modeling by Assoc. Systematic Collections).
     • MVZ: 1997 - Hired John Wieczorek to implement model (desktop application) using Sybase and Versata. Partial implementation (e.g., no loans).
     • UAM: 1998-2000 - John W. migrated mammal data to Oracle, set up Versata.
     • UAM: 2002 - Dusty McDonald replaced Versata with ColdFusion, implemented full model (first web-based instance, aka Arctos).
     • MSB: 2003 - Joined Arctos at UAM (first multi-hosting instance).
     • MVZ and MCZ: 2005-2007 - Implemented separate instances of Arctos at Berkeley and Harvard (MVZ: first Postgres, then Oracle).
     • MVZ: 2009 - Moved hosting of data to Alaska (Virtual Private Database version).

  3. Major repositories using the Arctos database: (34 collections of specimens or observations, 1.3M records)

  4. TACC and TeraGrid
     • 10-year history of Research Cyberinfrastructure
     • Supercomputing, Visualization and Storage
     • Supported by NSF to provide research resources
     • TACC expansion of Data-focused support
     • 1 Petabyte dedicated online disk
     • 10 Petabytes offline archive
     • National network of replication resources

  5. Data Diversity at TACC
     • Image Collections (Natural History, Art, etc.)
     • Structured Data (Economics, Public Health)
     • BioMolecular Data (DNA, RNAseq, etc.)
     • Physical Sciences/Simulation Data
     • Geographic data (Climate, Disaster Preparedness)
     Integrated Infrastructure Supports Diverse Collections

  6. Arctos is… a versatile online collections management system
     • Cataloged Items (ID, attributes, parts, etc.; batch uploading, downloading, editing; encumbrances)
     • Localities & Collecting Events (mapping, media, history)
     • Transactions (loans, accessions, borrows, permits; email reminders)
     • Usage (publications, projects, sponsors, GenBank)
     • Curatorial (object tracking, parts, condition, relations, etc.)
     • Determination history (identification, georef, attributes)

  7. Breadth of Data in Arctos
     • Fish, amphibians, reptiles, mammals, birds and bird eggs/nests, plants, arthropods, fossils, molluscs
     • Specimens and observations
     • Media (images, audio)
     • Publications, fieldnotes
     Arctos is constantly evolving to incorporate new kinds of data, e.g.:
     • Better representation of non-publication documents (fieldnotes, correspondence)
     • Cultural collections (art, anthropology...)
     Nearly all that is known about an object (or observation) can be included in Arctos.

  8. Arctos/TACC Partnership
     • Arctos hosts web/database resources
     • TACC hosts media collections
     • Images, Recordings, etc.
     • Simple workflows for automated generation of thumbnails, JPG versions, MP3s, OCR
     • Replication policies automatically replicate to various storage locations
     • Images directly served from TACC to browsers
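
The derivative and replication workflow on this slide could be sketched roughly as follows. This is only an illustrative sketch, not TACC's actual tooling: the slide names the derivative types (thumbnails, JPG versions, MP3s, OCR) but not which inputs produce which outputs, so the file-type mapping, the naming scheme, the storage location labels, and the function names `plan_workflow` and `derivative_name` are all assumptions. The code only plans the work; it does not convert or copy any files.

```python
from pathlib import PurePosixPath

# ASSUMPTION: which original file types yield which derivatives.
# The slide lists the derivative kinds but not this mapping.
DERIVATIVES = {
    ".tif": ["thumbnail", "jpg", "ocr"],
    ".tiff": ["thumbnail", "jpg", "ocr"],
    ".dng": ["thumbnail", "jpg"],
    ".wav": ["mp3"],
}

# ASSUMPTION: naming scheme for derivative files (hypothetical).
SUFFIXES = {
    "thumbnail": "_thumb.jpg",
    "jpg": ".jpg",
    "mp3": ".mp3",
    "ocr": ".txt",
}

# ASSUMPTION: placeholder labels standing in for "various storage locations".
REPLICAS = ["online-disk", "offline-archive", "remote-replica"]


def derivative_name(path: PurePosixPath, kind: str) -> str:
    """Build the derivative's path next to the original file."""
    return str(path.with_name(path.stem + SUFFIXES[kind]))


def plan_workflow(original: str) -> dict:
    """Return the derivatives to generate and where to replicate the original."""
    path = PurePosixPath(original)
    kinds = DERIVATIVES.get(path.suffix.lower(), [])
    return {
        "original": str(path),
        "derivatives": {kind: derivative_name(path, kind) for kind in kinds},
        "replicate_to": list(REPLICAS),
    }


if __name__ == "__main__":
    for item in ["/media/herb/sheet001.TIF", "/media/audio/call.wav"]:
        print(plan_workflow(item))
```

Keeping the mapping and the replication targets in plain tables means a new media type or storage location would be a one-line change, which is one plausible reading of "simple workflows" on the slide.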

  9. Arctos/TACC History
     • Initial work with UAF Herbarium in 2008
     • Brought on MVZ Collections in 2009
     • Ongoing work on web audio, OCR
     • New collections from UAF, UNM, others
     • Currently >300,000 digital objects under management
     • Support >100,000 downloads of original scans each year

  10. Advantages for Collections
     • Lower cost and management overhead
     • Highly reliable, large-scale infrastructure
     • No scalability issues
     • Longer-term partnerships promote technical collaboration to add capabilities over time
     • Provides built-in “Data Management Plan”

  11. Long-Term Sustainability
     • TACC plan is to be a permanent research data resource
     • Arctos will evolve over time but the collections have permanent value
     • Infrastructure foundation is stable
     • Agency funding future is uncertain
     • Develop diverse funding sources and models to support robust, long-term operation

  12. Ongoing Efforts
     • Expansion of storage resources at TACC (~10PB online disk)
     • Greater engagement in data management activities
     • Working with BRC, ADBC awards and associated data
     • iPlant Data/Genetic resources - link to specimen records?

  13. Thanks for your Time
     • Steffi Ickert-Bond, UAF
     • Gordon Jarrell, UNM
     • Carla Cicero, MVZ
     • Michelle Koo, MVZ
     • Dusty McDonald, Arctos
