
Towards a Data Model for the Australian Microbial Resources Information Network (AMRiN)

Version: 0.03, 17/09/2010. Lynette Woodburn, Atlas of Living Australia.


Presentation Transcript


  1. Towards a Data Model for the Australian Microbial Resources Information Network (AMRiN) • Version: 0.03, 17/09/2010 • Lynette Woodburn, Atlas of Living Australia

  2. TIP • Each slide in this presentation comes with accompanying Notes. • You can’t see them if you display this presentation in ‘Slide Show’ mode. • To see the Notes, view the presentation in ‘Normal’ mode and expand the pane below the slide (the Notes pane) to see the extra text. • Only then will you have a chance of understanding all the crazy diagrams.

  3. Towards a data model for AMRiN • Requirement: a standard set of data fields for all micro-organisms, to support the sharing and integration of data through AMRiN, and to pre-configure BioloMICS • Options: choose an existing set, or develop something new • Recommendation: surprise!

  4. Requirements • Options • Recommendation

  5. AMRiN community AMRiN

  6. AMRiN AMRiN community

  7. AMRiN AMRiN community

  8. Requirements • Options: existing (CABRI, MCL) • Recommendation

  9. CABRI: Common Access to Biological Resources and Information • a European organization of partner collections who contribute data to searchable ‘catalogues’ covering: bacteria & archaea • fungi & yeasts • animal & human cell lines • plant cell lines • hybridomas • phages • plasmids • plant cell viruses • genomic libraries • http://www.cabri.org/

  10. CABRI’s sets of data elements (elements per set) • bacteria & archaea: 26 • fungi & yeasts: 23 • animal cell lines: 29 • plant cell lines: 17 • hybridomas: 15 • phages: 33 • plasmids: 30 • plant cell viruses: 12 • genomic libraries: 7 • Example elements: Isolated_from, Doubling_time, Morphology, Lysogenicity, Original_host_plant

  11. CABRI: Common Access to Biological Resources and Information • For each different kind of biological resource, CABRI defines nested sets of data elements: Mandatory, Recommended, Full

  12. CABRI : bacteria & archaea (nested sets: Mandatory, Recommended, Full) • Strain_number, Other_collection_numbers, Restrictions, Organism_type, Name, Infrasubspecific_names, Status, History, Conditions_for_growth, Form_of_supply, Serovar, Other_names, Isolated_from, Geographic_origin, Mutant, Genotype, Literature, Sexual_state, Pathogenicity, Enzyme_production, Metabolite_production, Applications, Catalogue_entry, Remarks, Price_code, Plasmids

  13. CABRI : fungi & yeasts (nested sets: Mandatory, Recommended, Full) • Strain_number, Other_collection_numbers, Name, Status, Organism_type, History, Restrictions, Form_of_supply, Conditions_for_growth, Misapplied_names, Race, Substrate, Geographic_origin, Literature, Applications, Mutant, Sexual_state, Price_code, Remarks, Pathogenicity, Metabolite_production, Enzyme_production, Genotype

  14. CABRI : animal & human cell lines (nested sets: Mandatory, Recommended, Full) • Accession_number, Cell_line_name, Brief_description, Description, Depositor, Bibliographic_references, Morphology, Culture_conditions, Viruses, Properties, Release_conditions, Hazard, Tumorigenicity, Karyology, Freezing_medium, Sterility, Validation_assays, Further_bibliography, Comments, Storage, Doubling_time, Mycoplasma, Fingerprint, Cytogenetics, Karyotype, Comments, Research_council_deposit, BIOMED_1, Passage_number, Species_validation

  15. CABRI’s sets of data elements • bacteria & archaea: 26 • fungi & yeasts: 23 • animal cell lines: 29 • plant cell lines: 17 • hybridomas: 15 • phages: 33 • plasmids: 30 • plant cell viruses: 12 • genomic libraries: 7 • Total: 192

  16. Sharing data about one kind of biological resource is easy eg. phages

  17. Sharing data about one kind of biological resource is easy eg. plasmids

  18. Sharing data about multiple kinds of biological resources is hard • Other_culture_collection_numbers • Other_collection_numbers

  19. What is the prospect of deriving a common model from CABRI for describing several different kinds of biological resources? • 133 distinct data elements, distributed across 9 sets: bacteria & archaea, fungi & yeasts, animal cell lines, plant cell lines, hybridomas, phages, plasmids, plant cell viruses, genomic libraries

  20. CABRI as a common model? • 92 elements are each found in only one set • only 41 elements are found in more than one set

  21. CABRI as a common model? • 27 data elements are found in two sets • 10 in three sets • 4 in four sets • No elements are found in more than four sets
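The counts quoted on slides 10, 15 and 19 to 21 can be cross-checked against each other with a little arithmetic. The sketch below uses only the figures given on those slides; the variable names are illustrative, and the pairing of counts with sets assumes the two lists on slide 10 are in the same order.

```python
# Elements per CABRI set, as listed on slides 10 and 15.
elements_per_set = {
    "bacteria & archaea": 26,
    "fungi & yeasts": 23,
    "animal cell lines": 29,
    "plant cell lines": 17,
    "hybridomas": 15,
    "phages": 33,
    "plasmids": 30,
    "plant cell viruses": 12,
    "genomic libraries": 7,
}

# Number of distinct elements, keyed by how many sets each appears in
# (slides 20 and 21).
distinct_by_set_count = {1: 92, 2: 27, 3: 10, 4: 4}

# Total element "slots" across all nine sets (slide 15 gives 192).
total_slots = sum(elements_per_set.values())

# Distinct elements overall (slide 19 gives 133).
distinct_elements = sum(distinct_by_set_count.values())

# Elements found in more than one set (slide 20 gives 41).
shared_elements = distinct_elements - distinct_by_set_count[1]

# Each element contributes one slot per set it appears in, so the
# distribution should reproduce the slot total.
slots_from_distribution = sum(k * n for k, n in distinct_by_set_count.items())

print(total_slots, distinct_elements, shared_elements, slots_from_distribution)
```

The figures reconcile: 192 slots, 133 distinct elements, 41 of them shared, and the distribution adds back up to 192.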

  22. Distribution of data elements across CABRI sets [chart: count of data elements for bacteria & archaea, fungi & yeasts, animal cell lines, hybridomas, phages, plant cell lines, plant cell viruses, plasmids, genomic libraries]

  23. CABRI data element ‘themes’ handling & distribution regulations Name / classification of item ID of item in collection care / maintenance characteristics item admin literature origin …. • bacteria & archaea • fungi & yeasts • animal cell lines • plant cell lines • hybridomas • phages • plasmids • plant cell viruses • genomic libraries

  24. CABRI : comparison of elements across sets • different names, same meaning (definition): Morphology, Morphology_and_growth • History, History_of_deposit • Accession_number, Strain_number • Bibliographic_references, Reference_paper, Literature, Reference, Further_bibliography • Restricted_distribution, Release_conditions, Restrictions, Distribution • …
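One way to picture the ‘different names, same meaning’ problem is as a synonym table that renames elements to a shared label before records are combined. The sketch below takes the synonym groups from this slide; the shared labels (`morphology`, `identifier` and so on), the function name and the sample record are hypothetical and are not CABRI or MCL names.

```python
# Synonym groups copied from the slide. The key chosen for each group is a
# hypothetical placeholder label, not an official CABRI or MCL element name.
SYNONYM_GROUPS = {
    "morphology": ["Morphology", "Morphology_and_growth"],
    "history": ["History", "History_of_deposit"],
    "identifier": ["Accession_number", "Strain_number"],
    "literature": [
        "Bibliographic_references",
        "Reference_paper",
        "Literature",
        "Reference",
        "Further_bibliography",
    ],
    "restrictions": [
        "Restricted_distribution",
        "Release_conditions",
        "Restrictions",
        "Distribution",
    ],
}

# Reverse lookup: original element name -> shared label.
CANONICAL = {
    name: label for label, names in SYNONYM_GROUPS.items() for name in names
}


def canonicalise(record):
    """Rename known synonyms to their shared label.

    Values are collected in lists because one record can carry two
    synonyms of the same label (for example Bibliographic_references
    and Further_bibliography). Unknown element names are kept as they are.
    """
    result = {}
    for name, value in record.items():
        result.setdefault(CANONICAL.get(name, name), []).append(value)
    return result


# Illustrative record with made-up values.
example = canonicalise({
    "Accession_number": "A1",
    "Bibliographic_references": "ref one",
    "Further_bibliography": "ref two",
    "Hazard": "none",
})
print(example)
```

A table like this only helps with renaming. It does nothing for elements that share a name but differ in meaning, which still need their definitions compared by hand.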

  25. CABRI : comparison of elements across sets • same name, different meanings: Brief_description, Type

  26. CABRI : comparison of data element sets • varying levels of scope

  27. CABRI : fitness for our purpose • 9 sets of data elements (but does not cover algae) • good for sharing information about one kind of organism • few elements common to several sets • hard to share information about more than one kind of organism • does not lend itself to the derivation of a common set • elements of ‘different names, same meaning’ • elements of ‘same name, different meanings’ • elements with meanings of varying scope • has international acceptance / presence (but no longer funded?)

  28. Requirements • Options: existing (CABRI, MCL) • Recommendation

  29. Microbiological Common Language (MCL) • a new data exchange standard for microbiological information • Research in Microbiology, 161(6), 439-445 • http://www.straininfo.net/projects/mcl • a pluggable framework, easily extended • has the same ancestor as CABRI (MINE) • underpins StrainInfo (www.straininfo.net): “a world-wide, virtual catalog integrating the information from BRC [Biological Resource Centres] catalogs with related information”

  30. CABRI compared with MCL • CABRI: partitioned by kind of biological resource • MCL: partitioned by workflow step

  31. The abstract model of Microbiological Common Language (MCL) • Strain • Deposit • Culture • Isolation • Sample • Medium • Publication • … • follows the logical flow from sampling to subsequent deposits

  32. mcl : Sample • sampleDate • sampleCollector • sampleCollectorInstitute • sampleCulture • sampleCultureStrainNumber • sampleDescription • sampleLocationDescription • sampleLocationCountry • sampleLocationPlace • sampleHabitat • sampleHabitatEnvoTerm • sampleAlt • sampleLat • sampleLong • comments

  33. mcl : Culture • id • history • isolationDate • isolator • isolatorInstitute • isolationMethod • oxygenRelationship • [growthTemperature]: minimalGrowthTemperature, optimalGrowthTemperature, maximalGrowthTemperature • hasSample • recommendMedium • speciesName • typeStrainOf • typeStrainOfSpecies • typeStrainOfGenus • strainNumber • otherStrainNumber • [otherStrainNumbers] • cultureLastUpdateDate • catalogURL • comments • publication: nomenclaturalPublication, environmentPublication, historyPublication, taxonomicPublication

  34. some Object Properties • Culture hasSample Sample • Culture recommendMedium Medium • Culture publication (nomenclaturalPublication, environmentPublication, historyPublication, taxonomicPublication) Publication
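To make the idea of Object Properties concrete, here is a minimal sketch in Python of a Culture that links to a Sample, a Medium and a Publication. It is not the MCL schema: only a handful of element names from slides 32 to 35 are included, the types and cardinalities are assumptions, `title` stands in for dc:title, and all values are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sample:
    # A few element names from the mcl : Sample slide; types are assumed.
    sampleDate: Optional[str] = None
    sampleLocationCountry: Optional[str] = None
    sampleHabitat: Optional[str] = None


@dataclass
class Medium:
    mediumName: str
    mediumNumber: Optional[str] = None


@dataclass
class Publication:
    title: str  # stands in for dc:title


@dataclass
class Culture:
    strainNumber: str
    speciesName: Optional[str] = None
    # Object properties: links to other parts of the model.
    hasSample: Optional[Sample] = None
    recommendMedium: List[Medium] = field(default_factory=list)
    publication: List[Publication] = field(default_factory=list)


# Illustrative instance; every value here is hypothetical.
culture = Culture(
    strainNumber="XYZ 0001",
    speciesName="Example species",
    hasSample=Sample(sampleLocationCountry="Australia", sampleHabitat="soil"),
    recommendMedium=[Medium(mediumName="Example agar")],
    publication=[Publication(title="An example paper")],
)
print(culture.hasSample.sampleLocationCountry)
```

Because the links are between workflow objects rather than organism types, the same structure can describe a bacterium, a fungus or an alga without a separate element set for each.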

  35. mcl : Medium • mediumName • mediumNumber • mediumURL • mediumDescription • comments • mcl : Publication • dcterms:bibliographicCitation • dc:title • dc:creator • prism:publicationName • prism:volume • prism:number • prism:startingPage • prism:pageRange • dcterms:issued

  36. MCL : fitness for our purpose • MCL offers a broadly-applicable suite of data elements: data elements are grouped according to workflow steps, not organism type; applicable to algae and cyanobacteria; the Strain concept supports the logical linking of related cultures • the model is modular and easily extensible: model cohesion is achieved through Object Properties; links easily with genomic standards (see StrainInfo) • born and raised in Europe (StrainInfo), but now going global: Asian biorepositories network is considering adoption; we’re invited to contribute to ongoing development • primarily devised (custom-built) as a data exchange standard

  37. Requirements • Options • Recommendation

  38. Recommendation : dip a toe into the water • MCL, custom-built for describing microbiological data, deserves consideration • Proposal: undertake a pilot, involving a small group of AMRiN participants, to assess the suitability of MCL for AMRiN’s purpose.

  39. AMRiN community AMRiN

  40. AMRiN participants’ input • map local elements to MCL elements • identify local elements to be kept ‘private’ • identify other local elements to be shared; provide English definitions to enable reconciliation with other participants’ elements • Note: some MCL elements may not have a local equivalent
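The three kinds of participant input could be recorded in something as simple as a lookup table. The sketch below is one hypothetical way to do that: the local field names (`collection_no`, `freezer_slot` and so on) and the function name are invented for illustration, and only `strainNumber` and `speciesName` are MCL element names taken from the earlier slides.

```python
from typing import Dict, List, Optional, Set, Tuple


def classify_local_elements(
    local_to_mcl: Dict[str, Optional[str]],
    private: Set[str],
) -> Tuple[Dict[str, str], List[str], List[str]]:
    """Split a participant's local elements into the three groups on the slide.

    local_to_mcl maps each local element name to an MCL element name,
    or to None when no MCL equivalent has been identified.
    """
    # Local elements with an MCL equivalent that are not being kept private.
    mapped = {
        local: mcl
        for local, mcl in local_to_mcl.items()
        if mcl is not None and local not in private
    }
    # Local elements the participant wants to keep 'private'.
    kept_private = sorted(local for local in local_to_mcl if local in private)
    # Other local elements to be shared; these need an English definition.
    shared_unmapped = sorted(
        local
        for local, mcl in local_to_mcl.items()
        if mcl is None and local not in private
    )
    return mapped, kept_private, shared_unmapped


# Hypothetical participant input.
local_to_mcl = {
    "collection_no": "strainNumber",
    "organism": "speciesName",
    "freezer_slot": None,
    "host_plant": None,
}
mapped, kept_private, shared_unmapped = classify_local_elements(
    local_to_mcl, private={"freezer_slot"}
)
print(mapped, kept_private, shared_unmapped)
```

The MCL elements that never appear as a value in `mapped` are the ones with no local equivalent for that participant.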

  41. Pilot assessment • Coverage? How much orange overlaps purple? • What additional common elements exist amongst the set to be shared? How much purple overlaps purple? • Other assessment criteria?
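The two overlap questions can be expressed as simple set calculations. The sketch below assumes that coverage means the fraction of MCL elements for which a participant has a local equivalent, and that ‘common additional elements’ means shared non-MCL elements reported by at least two participants; both readings, and all names and values in the example, are assumptions rather than part of the proposal.

```python
from collections import Counter
from typing import Dict, Iterable, List


def mcl_coverage(mcl_elements: Iterable[str], mapped_mcl: Iterable[str]) -> float:
    """Fraction of the MCL elements that have a local equivalent."""
    mcl = set(mcl_elements)
    return len(mcl & set(mapped_mcl)) / len(mcl)


def common_extra_elements(
    shared_by_participant: Dict[str, Iterable[str]],
    min_participants: int = 2,
) -> List[str]:
    """Non-MCL elements offered for sharing by several participants.

    Names are compared literally, so elements should already have been
    reconciled through their English definitions.
    """
    counts = Counter(
        element
        for elements in shared_by_participant.values()
        for element in set(elements)
    )
    return sorted(e for e, n in counts.items() if n >= min_participants)


# Hypothetical pilot data.
coverage = mcl_coverage(
    ["strainNumber", "speciesName", "isolationDate", "catalogURL"],
    ["strainNumber", "speciesName"],
)
common = common_extra_elements({
    "collection A": ["host_plant", "toxin_profile"],
    "collection B": ["host_plant"],
    "collection C": ["pigment"],
})
print(coverage, common)
```

With the made-up data above, half of the listed MCL elements are covered and `host_plant` is the only extra element common to more than one collection.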

  42. Pulling the pieces together Please consider the foregoing proposal. Does it seem reasonable to you? Do you think there’s a better way?
