Ontrez

Ontrez. Clement Jonquet – Nigam Shah {jonquet,nigam}@stanford.edu. Speech overview: Ontrez general idea (“provide a service that will enable users to locate biomedical data resources related to their search for particular ontology terms”); functional specification and conceptual levels.

Presentation Transcript


  1. Ontrez Clement Jonquet – Nigam Shah {jonquet,nigam}@stanford.edu

  2. Speech overview • Ontrez general idea • “provide a service that will enable users to locate biomedical data resources related to their search for particular ontology terms” • Functional specification and conceptual levels • Ontrez example • GEO dataset • Architecture and implementation • UML class diagrams • Ontrez position within NCBO project • Relation with OBD@Berkeley and OBD@Stanford • Next steps & conclusion

  3. Classic Ontrez use case A user searches for information and content related to a specific disease… • Go to BioPortal to search ontologies… • The given disease name matches several ontology terms… • For each of these terms, the user gets a link to all the resource elements (data sets, clinical trials, articles) annotated with this term.

  4. Ontrez challenge • The number of biomedical resource elements (e.g., experiments and data) in the public domain is exploding • Element = a collection of observations resulting from a biomedical experiment (experimental data sets, records of disease associations of gene products in mutation databases, entries of clinical-trial descriptions, etc.) • Resource = a collection of elements (GEO, PubMed, or other public repositories) • Researchers need tools that enable them to find all the resource elements relevant to their area of study • The problem now is locating the ‘elements’ that matter to a user. • The key challenge is to annotate (or tag) the various resource elements to identify the biomedical concepts to which they relate • Annotation = an assertion declaring a relationship between a biomedical resource element and a term in an ontology • Term = a concept found in a specific ontology

  5. Creation of the annotation database (Ontrez proposal) • Retrieve the metadata from data resources (A) • Annotate/tag them with ontology terms using the library of ontologies in BioPortal (B) • Store the result in an annotation database
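The annotation step on this slide can be illustrated with a minimal sketch. It assumes a tiny hand-written dictionary and plain whole-word string matching; the term IDs, the sample element, and the `annotate` helper are hypothetical, and Ontrez itself relies on a concept recognition tool rather than this regular-expression matching.

```python
import re

# Hypothetical dictionary: term ID -> term name (IDs are illustrative only).
DICTIONARY = {
    "T1": "melanoma",
    "T2": "breast cancer",
}

def annotate(element_id, metadata, dictionary):
    """Return (element_id, term_id, context) tuples for every dictionary
    term found as a whole word or phrase in any metadata field."""
    annotations = []
    for context, text in metadata.items():
        for term_id, name in dictionary.items():
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                annotations.append((element_id, term_id, context))
    return annotations

# Hypothetical element metadata, keyed by the context (field) it came from.
element = {
    "title": "Melanoma progression",
    "description": "Expression profiling of breast cancer and melanoma cell lines.",
}
rows = annotate("E1", element, DICTIONARY)
```

Each resulting row records which element was tagged, with which term, and in which metadata field, which is the information an annotation database would store.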

  6. Query of the annotation database (Ontrez proposal) • User queries are formulated as a set of terms (1) • Use of the BioPortal index to convert the query to ontology terms • Use the subsumption relations in the ontologies and the mappings in BioPortal to expand the query • Query the annotation tables with the expanded set of terms (2) • The user receives the result (3) in terms of references to the original data sources.
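A rough sketch of the query path described on this slide, assuming the ontology is available as a simple parent-to-children map and the annotation table as a list of pairs. The term names, element IDs, and function names are hypothetical, and the BioPortal mappings mentioned on the slide are not modelled here.

```python
from collections import deque

# Hypothetical is-a hierarchy: parent term ID -> direct child term IDs.
CHILDREN = {
    "cancer": ["melanoma", "carcinoma"],
    "melanoma": ["uveal_melanoma"],
}

# Hypothetical annotation table rows: (element ID, term ID).
ANNOTATIONS = [
    ("GDS_A", "melanoma"),
    ("GDS_B", "uveal_melanoma"),
    ("GDS_C", "carcinoma"),
    ("GDS_D", "diabetes"),
]

def expand(terms, children):
    """Expand the query terms with all terms they subsume (breadth-first)."""
    seen = set(terms)
    queue = deque(terms)
    while queue:
        term = queue.popleft()
        for child in children.get(term, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

def query(terms, children, annotations):
    """Return the sorted IDs of elements annotated with any expanded term."""
    expanded = expand(terms, children)
    return sorted({element for element, term in annotations if term in expanded})

melanoma_hits = query(["melanoma"], CHILDREN, ANNOTATIONS)
cancer_hits = query(["cancer"], CHILDREN, ANNOTATIONS)
```

With this toy data, a query for the more general term returns a superset of the elements returned for the more specific one, which is the intended effect of expanding along subsumption relations.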

  7. Ontrez conceptual levels

  8. Ontrez functional specification • For a given resource, being able to: • automatically access and update resource information • access and keep locally the set of elements of this resource • automatically update the local copy if necessary • extract the structure of elements of this resource • For a given element, being able to: • extract (according to the structure) the metadata • annotate each part of the metadata with a dictionary • For a given set of ontologies, being able to: • construct a dictionary • For a given annotation, being able to: • process the transitive closure for a given set of ontologies • link back to the original resource element • For a given term, being able to: • get the resource elements annotated with this term • semantically expand to a larger relevant set of terms
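One possible reading of "process the transitive closure" for an annotation is to propagate each direct annotation to every ancestor of its term at indexing time. The sketch below follows that reading; it is an assumption, not something the slides spell out, and the `PARENTS` map and function names are hypothetical.

```python
# Hypothetical is-a hierarchy: term ID -> direct parent term IDs.
PARENTS = {
    "uveal_melanoma": ["melanoma"],
    "melanoma": ["cancer"],
}

def ancestors(term, parents):
    """Collect every term reachable from `term` by following parent links."""
    found = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def close_annotations(direct, parents):
    """Add an (element, ancestor) pair for each ancestor of each annotated term."""
    closed = set(direct)
    for element, term in direct:
        for ancestor in ancestors(term, parents):
            closed.add((element, ancestor))
    return closed

closed = close_annotations([("E1", "uveal_melanoma")], PARENTS)
```

Precomputing the closure trades storage for query speed: a lookup on a general term then needs no hierarchy traversal at query time.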

  9. Ontrez example (1/2) Annotation of a GEO element

  10. Ontrez example (2/2) Return of a GEO element

  11. Ontrez in the new BioPortal prototype • Example of resource available (name and description) • Number of resource elements annotated with this ontology term • Ontology term search by a user • URL link to the original element • Context in which an element has been annotated • ID of an element

  12. Ontrez v.0 architecture (1/3) Concept Level

  13. Ontrez v.0 architecture (2/3) Resource Level

  14. Ontrez v.0 architecture (3/3) Index Level

  15. Ontrez implementation • Construction of the index: Java • Storage of resource elements, dictionaries, and … annotations: MySQL • Connection to ontologies for dictionary construction and semantic query expansion: • Web services for BioPortal; • JDBC connection to a MySQL DB for UMLS

  16. Concept recognition tools • University of Michigan mgrep tool • National Center for Integrative Biomedical Informatics (NCIBI) • Has a very high degree of accuracy (over 95%) in recognizing disease names • Rong Xu’s • BMIR PhD candidate • MetaMap Transfer (MMTX) • We have not yet conducted an evaluation • Nipun Bathia’s project • BS student at Stanford

  17. First resources processed (Ontrez v0 and v1) • 5 resources for the moment (not complete) • To be completed and done with versioning, automatic updates, etc.

  18. Open Biomedical Data (OBD) NCBO Core 2 • “a database resource (…) • “that will allow expert scientists both to archive experimental data and to use the OBO ontologies and terminologies to create appropriate annotations” • CREATION • “for storing, visualizing, and analyzing the ontology-based annotations that are linked to primary experimental data” • USE

  19. Different annotation sets in OBD • OBD@Stanford: • “disease oriented approach” • automatically generated annotations of text meta-data • OBD@Berkeley: • “genotype-phenotype pairs approach” • manually generated and curator based annotations/assertions

  20. Interaction with and integration into BioPortal? • Neither side really needs to care about how the other's DBs are produced • Manually, with curators etc. on the Berkeley side • Using an NLP tool on text meta-data on the Stanford side • We have several annotation databases (Ontrez/OBD/external ones) for which we need to specify: • Annotation table structure, i.e., [ elementLocalID | termID ] • Interaction with BioPortal to: • get the term IDs; • query the DB and integrate the results in the UI. Web services API .jar ? ? [entryID | elementLocalID | termID | itemKey | dictionaryVer]
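The annotation table named on this slide can be sketched as follows. The column names come from the slide; the column types, the use of an in-memory SQLite database as a stand-in for MySQL, the sample rows, the interpretation of itemKey as the metadata field, and the `elements_for_terms` helper are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL store (assumption)
conn.execute(
    """CREATE TABLE annotation (
        entryID        INTEGER PRIMARY KEY,
        elementLocalID TEXT NOT NULL,
        termID         TEXT NOT NULL,
        itemKey        TEXT,  -- assumed: metadata field where the term was found
        dictionaryVer  TEXT   -- assumed: version of the dictionary used
    )"""
)

# Hypothetical sample rows.
sample_rows = [
    ("GDS_A", "T1", "title", "v1"),
    ("GDS_A", "T2", "description", "v1"),
    ("GDS_B", "T1", "description", "v1"),
]
conn.executemany(
    "INSERT INTO annotation (elementLocalID, termID, itemKey, dictionaryVer) "
    "VALUES (?, ?, ?, ?)",
    sample_rows,
)

def elements_for_terms(conn, term_ids):
    """Return the distinct element IDs annotated with any of the given term IDs."""
    term_ids = list(term_ids)
    if not term_ids:
        return []
    placeholders = ",".join("?" for _ in term_ids)
    sql = (
        "SELECT DISTINCT elementLocalID FROM annotation "
        f"WHERE termID IN ({placeholders}) ORDER BY elementLocalID"
    )
    return [row[0] for row in conn.execute(sql, term_ids)]

t1_elements = elements_for_terms(conn, ["T1"])
```

Because the lookup only depends on the [ elementLocalID | termID ] pair, any annotation database exposing those two columns could answer the same query regardless of how its annotations were produced.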

  21. Next steps & conclusion • Finish processing the resources and implement versioning and update mechanisms • First evaluation/comparison • Research and implementation of semantic query expansion • New resources to be added • Formalization with Berkeley folks of OBD@ncbo • Results integration within BioPortal • Collaboration on what to do with annotations

  22. Thank you Any questions?

  23. Who am I? Clement, 27, French • Postdoc at NCBO since September 2007 at Stanford University • PhD in Informatics • University Montpellier 2 (FR) • Thesis • Multi-Agent Systems • Service oriented computing • Grid • Nothing to do with biomedical ontologies but… • www.stanford.edu/~jonquet
