

Requirements for Semantic Biobanks. André Q ANDRADE a,b, Markus KREUZTHALER b, Janna HASTINGS d,e, Maria KRESTYANINOVA f,g, Stefan SCHULZ b,c. a School of Information Science, Federal University of Minas Gerais, Brazil


Presentation Transcript


  1. Requirements for Semantic Biobanks
     André Q ANDRADE a,b, Markus KREUZTHALER b, Janna HASTINGS d,e, Maria KRESTYANINOVA f,g, Stefan SCHULZ b,c
     a School of Information Science, Federal University of Minas Gerais, Brazil
     b Medical University of Graz, Austria
     c University Medical Center Freiburg, Germany
     d European Bioinformatics Institute, Hinxton, UK
     e University of Geneva, Switzerland
     f Helsinki University, Finland
     g Uniquer, Lausanne, Switzerland

  2. Semantic Biobanks
     • Semantic interoperability: systems exchange data + meaning
     • Formal ontologies provide unambiguous descriptions of what is universally true for all objects of a certain type
     • An increasing number of biomedical vocabularies are ontology-based (OBO Foundry, SNOMED CT…)
     • Blood and tissue sampling for research
     • Samples from several biobanks are needed to retrieve data for a specific research question
     • Comprehensive annotations with lab data and clinical data
     [Figure labels: Model of Meaning; Data]

  3. (Generalized) Biomedical Retrieval Scenario
     • Retrieval:
       • Distribution of heterogeneous resources of interest
       • Most retrieval scenarios are recall-oriented
       • Resources used by multiple researchers around the world for multiple purposes
     • Effective retrieval depends on querying resource metadata
       • Provenance information
       • Content-based semantic annotations (structured vocabulary)
       • Access regulations
     Does this sound familiar?
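"Recall-oriented" can be read as: missing a relevant resource costs more than returning a few irrelevant ones. The following is a minimal sketch of how recall and precision could be computed for one query result. The function name and the sample identifiers are hypothetical and do not come from the presentation.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for two collections of resource identifiers."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Convention assumed here: an empty denominator yields 1.0.
    recall = len(hits) / len(relevant) if relevant else 1.0
    precision = len(hits) / len(retrieved) if retrieved else 1.0
    return recall, precision


# Hypothetical identifiers, for illustration only.
relevant = {"s1", "s2", "s3", "s4"}
retrieved = {"s1", "s2", "s3", "s4", "s9", "s10"}
recall, precision = recall_precision(retrieved, relevant)
```

In this made-up result every relevant sample is found, at the price of two irrelevant hits, which is the trade-off a recall-oriented scenario accepts.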

  4. Analogy

  5. Analogy
     • Global bibliographic database
     • Resources: publications from different publishers
     • Annotations:
       • Bibliographic data
       • Abstract
       • Semantic representation (MeSH) of paper content
     • Local access conditions to the full resource apply

  6. Analogy
     • Global bibliographic database
     • Resources: publications from different publishers
     • Annotations:
       • Bibliographic data
       • Abstract
       • Semantic representation (MeSH) of paper content
     • Local access conditions to the full resource apply
     Biobank “Broker”
     • Global biobank sample database
     • Resources: biological specimens (blood, tissue,…)
     • Annotations:
       • Sample information (staining etc…)
       • Semantic representation of both lab and selected patient related information (information models / ontologies)
     • Local access conditions to the full resource apply

  7. Data resources for biobanking
     • Sample related information:
       • Type of sample
       • Preparation of sample
       • Time
       • Storage information
       • Physical location
       • Associated information, lab data, genotype,…
     • Donor related information:
       • Demographic data
       • Phenotype data
       • Time indexed clinical data (EHR extracts)
     • Increment of relevant donor related information after samples are taken
     [Timeline figure: 1960 to 2010, in decades]
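The two groups of information on this slide can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, field names and example values are hypothetical, and the presentation does not prescribe any particular schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class ClinicalEvent:
    code: str  # hypothetical terminology code or label
    on: date   # time index of the EHR entry


@dataclass
class Donor:
    donor_id: str
    birth_year: int  # demographic data
    sex: str
    events: List[ClinicalEvent] = field(default_factory=list)

    def events_after(self, day: date) -> List[ClinicalEvent]:
        """Clinical data recorded after a given day, e.g. after sampling."""
        return [e for e in self.events if e.on > day]


@dataclass
class Sample:
    sample_id: str
    sample_type: str
    preparation: str
    taken_on: date
    storage: str
    location: str
    donor_id: str


# Made-up records, for illustration only.
donor = Donor("d1", 1950, "F")
sample = Sample("s1", "gastric mucosa", "frozen", date(2001, 5, 3),
                "-80 C freezer", "biobank A, rack 12", "d1")
# Donor information keeps growing after the sample was taken.
donor.events.append(ClinicalEvent("cancer of stomach", date(2009, 2, 1)))
later = donor.events_after(sample.taken_on)
```

Keeping clinical events as a separate, time-indexed list reflects the last bullet: relevant donor data may arrive years after the sample itself.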

  8. Centralized broker for biobanking information
     [Diagram: several Biobank + EHR pairs; the remaining layout is not recoverable from the transcript]

  9. Centralized broker for biobanking information
     [Diagram: several Biobank + EHR pairs; the remaining layout is not recoverable from the transcript]

  10. Centralized broker for biobanking information
     [Diagram: several Biobank + EHR pairs; the remaining layout is not recoverable from the transcript]
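One possible reading of the centralized broker, following the bibliographic analogy: the broker indexes sample metadata only, while the specimens and their access conditions stay with the local biobanks. The sketch below is an assumption about that architecture, not the design presented by the authors. All class names, method names, biobank names and annotations are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class CatalogEntry:
    biobank: str
    sample_id: str
    annotations: FrozenSet[str]


class Broker:
    """Central index of sample metadata; the specimens stay in the biobanks."""

    def __init__(self) -> None:
        self._entries: List[CatalogEntry] = []
        self._access: Dict[str, str] = {}

    def register(self, biobank: str, access_policy: str,
                 samples: Dict[str, Set[str]]) -> None:
        """Record a biobank's local access policy and index its samples."""
        self._access[biobank] = access_policy
        for sample_id, annotations in samples.items():
            self._entries.append(
                CatalogEntry(biobank, sample_id, frozenset(annotations)))

    def search(self, required: Set[str]) -> List[Tuple[str, str, str]]:
        """Return (biobank, sample id, access policy) for matching samples."""
        return [(e.biobank, e.sample_id, self._access[e.biobank])
                for e in self._entries if required <= e.annotations]


# Made-up biobanks and annotations, for illustration only.
broker = Broker()
broker.register("biobank_a", "open to academic use",
                {"a1": {"blood"}, "a2": {"tissue", "stomach"}})
broker.register("biobank_b", "ethics approval required",
                {"b1": {"tissue", "stomach"}})
hits = broker.search({"tissue", "stomach"})
```

Each hit carries the policy of the biobank that holds the specimen, mirroring "local access conditions to the full resource apply".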

  11. Language for semantic annotations of biobank data
     • Formal ontologies
       • Precise, logical descriptions of annotations and queries
       • High expressiveness through compositionality
       • OWL DL: Semantic Web standard for description logics; allows axioms to be formulated about what is universally true of all instances of a kind
     • Specific components
       • Ground axioms provided by an upper level ontology (BioTop)
       • Set of disjoint upper level categories and relations, together with related constraints
       • Ontological description of domain: SNOMED CT, OBO Foundry…
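The idea that an axiom holds for all instances of a kind can be illustrated with the simplest case, subclass statements. The sketch below computes subsumption over asserted subclass links by transitive closure. It is far weaker than OWL DL reasoning and is not how a description logic reasoner works internally. The class names are hypothetical and are not taken from BioTop, SNOMED CT or the OBO Foundry.

```python
# Asserted "every instance of X is also an instance of Y" links (hypothetical).
SUBCLASS_OF = {
    "GastricMucosaSample": {"TissueSample"},
    "TissueSample": {"Sample"},
    "BloodSample": {"Sample"},
}


def ancestors(cls):
    """All classes that subsume cls, following subclass links transitively."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(cls, other):
    """True if every instance of cls must also be an instance of other."""
    return cls == other or other in ancestors(cls)
```

With these links, a query for `Sample` would also cover anything annotated as `GastricMucosaSample`, although nobody stated that link directly.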

  12. BioTop categories and example axiom

  13. Description logics representation and retrieval
     • “retrieve all gastric mucosa samples from before 2003 of patients who had cancer of stomach after 2008”
     • Representation language: OWL DL
     • Editor: Protégé 4.2
     • Reasoner: HermiT retrieves
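To make the example query concrete, here is a plain Python filter over in-memory records. It only mirrors the logic of the query; it does not use OWL DL, Protégé or HermiT, and it does no ontological reasoning. The records are made up, and reading "before 2003" as a sampling year below 2003 and "after 2008" as a diagnosis year above 2008 is an assumption.

```python
from datetime import date

# Made-up records, for illustration only.
samples = [
    {"id": "s1", "type": "gastric mucosa", "taken": date(2001, 6, 1), "donor": "d1"},
    {"id": "s2", "type": "gastric mucosa", "taken": date(2005, 3, 9), "donor": "d1"},
    {"id": "s3", "type": "gastric mucosa", "taken": date(2000, 1, 20), "donor": "d2"},
    {"id": "s4", "type": "blood", "taken": date(1999, 4, 2), "donor": "d1"},
]
diagnoses = [
    {"donor": "d1", "condition": "cancer of stomach", "on": date(2009, 11, 5)},
    {"donor": "d2", "condition": "cancer of stomach", "on": date(2006, 2, 14)},
]


def query(samples, diagnoses):
    """Gastric mucosa samples from before 2003 whose donor had stomach cancer after 2008."""
    donors = {d["donor"] for d in diagnoses
              if d["condition"] == "cancer of stomach" and d["on"].year > 2008}
    return [s["id"] for s in samples
            if s["type"] == "gastric mucosa"
            and s["taken"].year < 2003
            and s["donor"] in donors]


result = query(samples, diagnoses)
```

Only `s1` qualifies in this data: `s2` was taken too late, `s3` belongs to a donor diagnosed in 2006, and `s4` is not gastric mucosa. An ontology-based version would additionally match samples annotated with more specific terms, which string comparison cannot do.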

  14. Requirements
     • Formal representations
     • Ontological representation of information models and terminologies
     • Ontological representation of data about specimens
     • Joint, universally used clinical terminology
     • Expressive and stable upper level ontologies (+ ontological relations)
     • Scope and granularity of EHR extract of interest for biobank related queries
     • Specification of structure and function of central repository
     • Steps for information translation from legacy systems
     • Mappings
     • Interfaces
     • Update policies

  15. Challenges
     • Prototypical status of DL reasoners and editors
     • Performance problems with expressive ontologies
     • Modularization of large clinical terminologies in response to data and query under scrutiny
     • Organization of
       • Central repository
       • Local mappings / translations
       • Logistics (samples)
     • Privacy and IP issues
     • Business model

  16. Thanks
     Andrade et al.: Requirements for Semantic Biobanks
     • CAPES (Brazil) – Programa de Doutorado no País com Estágio no Exterior
     • FP7 – NoE SemanticHealthNet
