
Infrastructure for Peer-Based Knowledge Sharing

Peter Mork, University of Washington, Seattle. 21-Sep-14.


Presentation Transcript


  1. Infrastructure for Peer-Based Knowledge Sharing • Peter Mork, University of Washington, Seattle • 21-Sep-14

  2. Motivating Example • Microarray Experiment • Information from public databases • ?? • ICAT Experiment

  3. Outline • Integration Systems • From Data to Knowledge (Metadata) • Metadata Management • From Local to Peer • Evaluation • Declarative vs. Descriptive Mappings • Complete vs. Minimal Configurations • Conclusions

  4. Outline • Integration Systems • From Data to Knowledge (Metadata) • Metadata Management • From Local to Peer • Evaluation • Declarative vs. Descriptive Mappings • Complete vs. Minimal Configurations • Conclusions

  5. Overview of Integration Systems • + Schema • + Mappings • + Annotations • Source API

  6. Sources: OMIM, HUGO, Swiss-Prot, GO, GeneClinics, LocusLink, Entrez, GEO • Mediated Schema: Entity, Sequenceable Entity, Structured Vocabulary, Experiment, Phenotype, Gene, Nucleotide Sequence, Microarray Experiment, Protein

  7. BioMediator • Mediated class: Phenotype • Sources: OMIM, GeneClinics, Entrez • Mapping annotations: (Maintenance: Push, Limited Journal Pull; Validation: Internal; Creation: Human) • (Maintenance: Push, Yearly Expert Review; Validation: External; Creation: Human) • (Maintenance: Push; Validation: None; Creation: Human, Algorithm)

  8. Demo • Start with 6 Proteins and 6 Sequences • Find simple correspondences • Find biologically relevant clusters

  9. Outline • Integration Systems • From Data to Knowledge (Metadata) • Metadata Management • From Local to Peer • Evaluation • Declarative vs. Descriptive Mappings • Complete vs. Minimal Configurations • Conclusions

  10. Necessary Metadata • Class Hierarchy: Concepts (e.g., Protein, Gene) • Property Hierarchy: Relationships (e.g., codes-for, causes) • Mappings: Source schema → Mediated schema • Mapping Annotations: Information about maintenance and authority
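
To make the four kinds of metadata on this slide concrete, here is a minimal Python sketch of how they might be held together. It is not BioMediator's actual data model: every class, field, and function name below is hypothetical, the parent links in the example hierarchy are assumed for illustration (the diagram's edges are not recoverable from the transcript), and "ExampleSource.protein_record" and "related-to" are invented placeholders. The annotation fields reuse the Maintenance / Validation / Creation vocabulary from slide 7.

```python
from dataclasses import dataclass, field


@dataclass
class MappingAnnotation:
    """Maintenance and authority information attached to one mapping."""
    maintenance: str
    validation: str
    creation: str


@dataclass
class Mapping:
    """Links one source-schema element to one mediated-schema class."""
    source_element: str
    mediated_class: str
    annotation: MappingAnnotation


@dataclass
class Metadata:
    class_parent: dict = field(default_factory=dict)     # class -> parent class
    property_parent: dict = field(default_factory=dict)  # property -> parent property
    mappings: list = field(default_factory=list)

    def ancestors(self, cls):
        """Walk the class hierarchy upward from cls, nearest parent first.

        Assumes the hierarchy is acyclic.
        """
        found = []
        while cls in self.class_parent:
            cls = self.class_parent[cls]
            found.append(cls)
        return found

    def sources_for(self, cls):
        """Source elements mapped to cls or to any class below it."""
        return [
            m.source_element
            for m in self.mappings
            if m.mediated_class == cls or cls in self.ancestors(m.mediated_class)
        ]


# Illustrative values only: the parent links and the mapping are assumptions.
metadata = Metadata(
    class_parent={
        "Gene": "Sequenceable Entity",
        "Protein": "Sequenceable Entity",
        "Sequenceable Entity": "Entity",
    },
    property_parent={"codes-for": "related-to", "causes": "related-to"},
    mappings=[
        Mapping(
            "ExampleSource.protein_record",
            "Protein",
            MappingAnnotation("Push", "None", "Human, Algorithm"),
        ),
    ],
)
```

With this layout, a query against a general class such as "Entity" can find the sources mapped to any of its subclasses via `metadata.sources_for("Entity")`, which is the reason a class hierarchy is listed alongside the mappings.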

  11. Schema 1, Schema 2, Schema 3 • Classes: Entity, Sequenceable Entity, Structured Vocabulary, Experiment, Phenotype, Gene, Nucleotide Sequence, Microarray Experiment, Protein • Sources: OMIM, HUGO, Swiss-Prot, GO, GeneClinics, LocusLink, Entrez, GEO

  12. Options for Metadata Mgmt

  13. Centralized Metadata Mgmt • Mediated classes: Entity, Sequenceable Entity, Phenotype, Gene, Nucleotide Sequence, Protein • Sources: GeneClinics, OMIM, Entrez, LocusLink

  14. Declarative Peer Metadata Mgmt • GeneClinics: Phenotype, Gene, Protein • OMIM: Record • Entrez: Protein, Nucleotide Seq. • LocusLink: Phenotype, Gene, Protein • Mappings: Q1, Q2, Q3 • Mapping labels: Equivalent; Gene [?] Record

  15. Descriptive Peer Metadata Mgmt • OMIM_Record = Phenotype ⊔ Gene • Domain(AssociatedWith) = NucleotideSequence ⊔ Gene • Diagram labels: Superset, ⊓
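
The two descriptive mappings on this slide define a class as a union (⊔) of other classes. As a rough illustration of what a reasoner can do with such definitions, the Python sketch below expands each defined class into its atomic disjuncts and tests subsumption by set inclusion. This is a deliberately simplified stand-in, not an OWL reasoner: it handles only acyclic union definitions, treats undefined class names as unrelated atoms, and the function names `expand` and `subsumed_by` are hypothetical.

```python
# Union definitions copied from the slide; keys are the defined classes.
UNIONS = {
    "OMIM_Record": {"Phenotype", "Gene"},
    "Domain(AssociatedWith)": {"NucleotideSequence", "Gene"},
}


def expand(cls, unions):
    """Return the set of atomic classes whose union equals cls.

    A class with no definition is treated as atomic.
    Assumes the definitions contain no cycles.
    """
    if cls not in unions:
        return {cls}
    atoms = set()
    for part in unions[cls]:
        atoms |= expand(part, unions)
    return atoms


def subsumed_by(sub, sup, unions):
    """True if every atomic disjunct of sub is also a disjunct of sup.

    This check is sound for pure union definitions but far from complete
    for a full description logic.
    """
    return expand(sub, unions) <= expand(sup, unions)
```

For example, `subsumed_by("Gene", "OMIM_Record", UNIONS)` holds because Gene is one of the disjuncts of OMIM_Record, while the reverse direction does not, since an OMIM record may instead be a Phenotype.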

  16. Outline • Integration Systems • From Data to Knowledge (Metadata) • Metadata Management • From Local to Peer • Evaluation • Declarative vs. Descriptive Mappings • Complete vs. Minimal Configurations • Conclusions

  17. Experimental Setup • Centralized BioMediator = Gold Standard • Mapping Languages: PPL (Declarative), OWL (Descriptive) • Peer Architectures: Complete, Minimal
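
The transcript does not define the two peer architectures, so the Python sketch below rests on an assumption: "complete" is read as one mapping between every pair of peers, and "minimal" as just enough mappings to keep all peers connected (here, a simple chain). Mappings are also assumed to be usable in both directions. The peer names come from slide 14; the function names are hypothetical. Under those assumptions, the sketch compares how many mappings each configuration needs and which peers a query can reach by following mappings transitively.

```python
from collections import deque
from itertools import combinations

PEERS = ["GeneClinics", "OMIM", "Entrez", "LocusLink"]


def complete_config(peers):
    """Assumed reading of 'complete': one mapping per pair of peers."""
    return list(combinations(peers, 2))


def minimal_config(peers):
    """Assumed reading of 'minimal': a chain that just connects all peers."""
    return list(zip(peers, peers[1:]))


def reachable(start, mappings):
    """Peers reachable from start by following mappings transitively.

    Each mapping is treated as bidirectional (an assumption).
    """
    neighbours = {}
    for a, b in mappings:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        peer = queue.popleft()
        for other in neighbours.get(peer, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen
```

With four peers, the complete configuration needs six mappings and the chain needs three, yet `reachable("GeneClinics", ...)` returns all four peers in both cases. Reachability alone does not say whether a query succeeds, which depends on what each mapping can express.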

  18. Complete Configuration

  19. Minimal Configuration

  20. Results: # of Successful Queries

  21. Outline • Integration Systems • From Data to Knowledge (Metadata) • Metadata Management • From Local to Peer • Evaluation • Declarative vs. Descriptive Mappings • Complete vs. Minimal Configurations • Conclusions

  22. Conclusions • More sources accessible • More power per mapping • Additional ‘redundant’ mappings provide little benefit • Less work maintaining mappings • Hidden cost: Logical mappings are harder to write correctly and may interact in unforeseen ways

  23. Acknowledgements • Funding: NLM training grant T15LM07442, NHGRI grant R01HG02288 • BioMediator Team • Advisors: Alon Halevy, Peter Tarczy-Hornoch • Wendy Kramer (grant administrator)
