
User Centered and Ontology Based Information Retrieval System for Life Sciences

This system utilizes ontologies to improve information retrieval by expanding queries, measuring document/query adequacy, and providing a better overall grasp of results. User preferences and interactions are taken into account.


Presentation Transcript


  1. User Centered and Ontology Based Information Retrieval System for Life Sciences
     Sylvie Ranwez, Vincent Ranwez, Mohameth-François Sy, Jacky Montmain, Michel Crampes
     LGI2P Research Center / Ecole des Mines d'Alès, France
     ISEM – CNRS / Montpellier II University, France

  2. Overview
     • Context and objectives
     • Ontology based information retrieval
     • Relevance calculus between a document index and a query
       • Similarity between two concepts
       • Relevance of a document with respect to a concept
       • Relevance of a document with respect to a query
     • Results visualization
     • Conclusion and perspectives
     User Centered and Ontology Based Information Retrieval System for Life Sciences – S. Ranwez

  3. Context: usual information retrieval engine
     Boolean search
     • Results are easy to understand
     • Exact term matching
     • Number of results
     • Rough measurement: "match" or "does not match"
     • Limited interaction
     • Aggregating operators (and, or…) are not used
     • Hard to grasp, even with clustering

  4. Context: information retrieval based on a concept hierarchy
     • Boolean search + specialization
     • Extends the query
     • Number of results
     • Results are difficult to understand
       • Which concepts are taken into account?
       • Which ones have been added?
     • No relevance assessment
     • Loss of the original query context

  5. Context: information retrieval using ontologies
     Query: Organelle organization (GO_0006996) ? Cardiac muscle fiber development (GO_0048739)
     Number of retrieved genes:
     Operator   GenBank   Ensembl
     AND        0         0
     OR         13        15

  6. Objectives
     Make better use of ontologies during the information retrieval process (indexing/query matching)
     • Expand the query if necessary
     • Measure document/query adequacy by identifying added concepts
     Help the user grasp the overall results
     • Explain why a document has been selected
     • Give an overall view of the results
     • If a selected document is not relevant, identify why, so that the query can be reformulated appropriately
     Take user preferences into account
     • Favor interaction and an iterative querying process

  7. Overview
     • Context and objectives
     • Ontology based information retrieval
     • Relevance calculus between a document index and a query
       • Similarity between two concepts
       • Relevance of a document with respect to a concept
       • Relevance of a document with respect to a query
     • Results visualization
     • Conclusion and perspectives

  8. Ontology based information retrieval
     Query: Organelle organization (GO_0006996) ? Cardiac muscle fiber development (GO_0048739)
     Use hyponyms and hypernyms to avoid silences (relevant documents that are missed)
     • Mixes documents that match the query more or less closely
     • The selection may be difficult to understand
     Related concepts: Mitochondrion organisation (GO_0007005), Muscle fiber development (GO_0048747)
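As a rough illustration of the expansion step on this slide, the sketch below attaches direct hypernyms and hyponyms to each query concept. The function name `expand_query`, the `PARENTS` dictionary and the restriction to direct neighbours are assumptions made for the example; the two is-a links are the ones the slide suggests, and nothing here is taken from the OBIRS implementation.

```python
def expand_query(query_concepts, parents):
    """Attach direct hypernyms and hyponyms to each query concept.

    `parents` maps a concept to the list of its direct hypernyms
    (hypothetical toy representation of an is-a hierarchy).
    """
    # Invert the parent links to obtain the direct hyponyms of each concept.
    children = {}
    for child, hypernyms in parents.items():
        for hypernym in hypernyms:
            children.setdefault(hypernym, set()).add(child)

    return {
        concept: {
            "hypernyms": set(parents.get(concept, ())),
            "hyponyms": set(children.get(concept, set())),
        }
        for concept in query_concepts
    }


# Toy fragment: only the two is-a links suggested by the slide.
PARENTS = {
    "GO_0007005": ["GO_0006996"],  # mitochondrion organisation is-a organelle organization
    "GO_0048739": ["GO_0048747"],  # cardiac muscle fiber development is-a muscle fiber development
}

expanded = expand_query(["GO_0006996", "GO_0048739"], PARENTS)
```

Keeping the added concepts grouped under the query concept that introduced them is one way to preserve the information needed later to explain why a document was selected.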

  9. Overview
     • Context and objectives
     • Ontology based information retrieval
     • Relevance calculus between a document index and a query
       • Similarity between two concepts
       • Relevance of a document with respect to a concept
       • Relevance of a document with respect to a query
     • Results visualization
     • Conclusion and perspectives

  10. Relevance calculus between a document index and a query
      [Diagram labels: INTERFACE, Query Q, Domain ontology, Semantic map, Selected documents]

  11. Relevance calculus between a document index and a query
      Three-level relevance calculus
      • Similarity between two concepts: a concept from the document index and a concept from the query
      • Relevance of a document (i.e. the set of its indexing concepts) with respect to a concept from the query
      • Relevance of a document with respect to a query: fuzzy aggregation of relevance measures
      Advantages
      • Ranking of documents with respect to their relevance
      • Detailed explanation of the document selection

  12. Relevance calculus between a document index and a query
      Three-level relevance calculus
      • Similarity between two concepts: a concept from the document index and a concept from the query
      • Relevance of a document (i.e. the set of its indexing concepts) with respect to a concept from the query
      • Relevance of a document with respect to a query: fuzzy aggregation of relevance measures
      Similarity between two concepts
      • Several similarity measures have been proposed in the literature; the one used here is easy to understand (% of mutual hyponyms)
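The slide only describes the concept similarity as a percentage of mutual hyponyms and does not give the formula. The sketch below is one possible reading, assuming the ratio of shared hyponyms to all hyponyms of either concept, with each concept counted among its own hyponyms. The names `hyponym_closure`, `similarity` and the toy hierarchy `CHILDREN` are hypothetical.

```python
def hyponym_closure(concept, children):
    """Return the concept together with all of its descendants (hyponyms)."""
    seen = {concept}
    stack = [concept]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def similarity(c1, c2, children):
    """Share of mutual hyponyms: |H(c1) & H(c2)| / |H(c1) | H(c2)|.

    Assumed formula; the slide only says "% of mutual hyponyms".
    """
    h1 = hyponym_closure(c1, children)
    h2 = hyponym_closure(c2, children)
    return len(h1 & h2) / len(h1 | h2)


# Hypothetical toy hierarchy: concept -> direct hyponyms.
CHILDREN = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
}

sim_same = similarity("a", "a", CHILDREN)            # identical concepts
sim_siblings = similarity("a", "b", CHILDREN)        # no shared hyponyms
sim_parent_child = similarity("root", "a", CHILDREN)  # 3 shared out of 6
```

Under this reading, a concept and one of its ancestors always share some hyponyms, while unrelated branches score zero.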

  13. Relevance calculus between a document index and a query
      Three-level relevance calculus
      • Similarity between two concepts: a concept from the document index and a concept from the query
      • Relevance of a document (i.e. the set of its indexing concepts) with respect to a concept from the query
      • Relevance of a document with respect to a query: fuzzy aggregation of relevance measures
      Relevance of a document with respect to a concept
      • Best relevance between the indexing concepts of document D and a query concept Qt
      • May be generalized by weighting the concepts Di (using evidence codes in the Gene Ontology, for example)
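The second level can be sketched as a maximum over the document's indexing concepts. In the sketch below, `relevance_to_concept`, the `sim` callable and the multiplicative use of `weights` are assumptions for illustration: the slide says the concepts Di may be weighted but does not say how.

```python
def relevance_to_concept(doc_concepts, query_concept, sim, weights=None):
    """Best (optionally weighted) similarity between a document's
    indexing concepts and one query concept.

    `sim(d, q)` is any concept similarity in [0, 1]; `weights` maps an
    indexing concept to a factor in [0, 1] (assumed multiplicative).
    """
    if not doc_concepts:
        return 0.0
    weights = weights or {}
    return max(weights.get(d, 1.0) * sim(d, query_concept) for d in doc_concepts)


# Hypothetical similarity values between two indexing concepts and a query concept.
SIM_TABLE = {("d1", "q"): 0.2, ("d2", "q"): 0.8}


def toy_sim(d, q):
    return SIM_TABLE.get((d, q), 0.0)


unweighted = relevance_to_concept(["d1", "d2"], "q", toy_sim)
weighted = relevance_to_concept(["d1", "d2"], "q", toy_sim, weights={"d2": 0.5})
```

Because the score is computed from the document towards each query concept, it is not symmetric, and the per-concept values remain available to explain the final ranking.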

  14. Relevance calculus between a document index and a query
      Three-level relevance calculus
      • Similarity between two concepts: a concept from the document index and a concept from the query
      • Relevance of a document (i.e. the set of its indexing concepts) with respect to a concept from the query
      • Relevance of a document with respect to a query: fuzzy aggregation of relevance measures
      Relevance of a document with respect to a query
      • Combine individual relevance scores to estimate the overall relevance of the document
      • Take user preferences into account: decision theory
      • Yager operator (with parameter q)
        • q = 1: arithmetic mean; q → 0: geometric mean
        • q = -1: harmonic mean; q → +∞: max (OR generalization)
        • q → -∞: min (AND generalization)
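The special cases listed on this slide are those of the power mean, (1/n · Σ xᵢ^q)^(1/q), so the aggregation can be sketched as below. The function name `aggregate`, the handling of zero scores and the assumption that scores lie in [0, 1] are choices made for this example, not details from the presentation.

```python
import math


def aggregate(scores, q):
    """Power-mean aggregation of per-concept relevance scores in [0, 1].

    q = 1 gives the arithmetic mean, q -> 0 the geometric mean,
    q = -1 the harmonic mean, q -> +inf the max, q -> -inf the min.
    """
    if not scores:
        raise ValueError("at least one score is required")
    if q == math.inf:
        return max(scores)
    if q == -math.inf:
        return min(scores)
    # For q <= 0 a zero score drives the mean to zero (limit value).
    if q <= 0 and any(s == 0 for s in scores):
        return 0.0
    if q == 0:
        return math.exp(sum(math.log(s) for s in scores) / len(scores))
    return (sum(s ** q for s in scores) / len(scores)) ** (1.0 / q)


scores = [0.2, 0.8]
arithmetic = aggregate(scores, 1)
geometric = aggregate(scores, 0)
harmonic = aggregate(scores, -1)
strict_or = aggregate(scores, math.inf)
strict_and = aggregate(scores, -math.inf)
```

Moving q towards negative values penalizes documents that miss any query concept (AND-like behaviour), while large positive values reward a single strong match (OR-like behaviour), which is how a user preference can be expressed with one parameter.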

  15. Overview
      • Context and objectives
      • Ontology based information retrieval
      • Relevance calculus between a document index and a query
        • Similarity between two concepts
        • Relevance of a document with respect to a concept
        • Relevance of a document with respect to a query
      • Results visualization
      • Conclusion and perspectives

  16. Visualization
      Query: Organelle organization (GO_0006996) ? Cardiac muscle fiber development (GO_0048739)
      • A document may be selected even if its index contains no terms of the query
      Explain the selection to the user: pictograms
      • Each concept of the query is associated with a bar:
        • Its height is proportional to its relevance
        • Its color indicates one of three cases:
          • indexes the document
          • specializes (is a hyponym of)
          • generalizes (is a hypernym of)
      Related concepts: Mitochondrion organisation (GO_0007005), Muscle fiber development (GO_0048747)

  17. Visualization
      Pictograms are displayed on a semantic map
      • Their physical distance to the query is proportional to their relevance score
      • Visualization and navigation: fit human cognitive limits (lens, number of results, relevance threshold…) and help the user (selection of concepts for the query…)

  18. Overview
      • Context and objectives
      • Ontology based information retrieval
      • Relevance calculus between a document index and a query
        • Similarity between two concepts
        • Relevance of a document with respect to a concept
        • Relevance of a document with respect to a query
      • Results visualization
      • Conclusion and perspectives

  19. Conclusion and perspectives
      Results
      • Find more documents (avoid silences)
      • Improve relevance: document ranking
      • Explain the relevance calculus (diagnosis)
      • Visualize overall results
      • Interaction with the list of retrieved documents: customize user preferences
      • Iterative improvement of the query
      Perspectives
      • Improve CHI (computer-human interaction)
        • Suggest query reformulation
          • From the user's selection of documents (weighting + complement)
          • Highlight discriminating query terms
      • Test several semantic distance measures on different benchmarks (TREC, MuchMore…)
      • Improve visualization
        • Filter the displayed results using sub-ontology extraction
        • Propose a view of the results that highlights clusters
      • Propose an online version

  20. User Centered and Ontology Based Information Retrieval System for Life Sciences
      sylvie.ranwez@mines-ales.fr
      vincent.ranwez@univ-montp2.fr
      mohameth.sy@mines-ales.fr
      jacky.montmain@mines-ales.fr
      michel.crampes@mines-ales.fr

  21. Visualization
      OBIRS online: http://www.ontotoolkit.mines-ales.fr/ObirsClient/

  22. Ontology based information retrieval
      [Diagram labels: Query, Documents, REFORMULATION, Analyze/Indexation, Indexation, Domain ontology, Match, Documents' index, Query Index, Relevant documents]

  23. Visualization

  24. Foreword: feedback on the relevance of results
      • Relevance learned from annotations
      • Can pool the indexing of several databases
      • Tagging possible with personal, hierarchical keywords
      but
      • The list of results may be long
      • No justification of relevance
      • Filters exist, but they are not semantic
      • No overall view

  25. Relevance calculus of a document with respect to a query
      Distance measures between sets of concepts exist
      • Between indexings made with GO, for example
      However, the relevance measure of a document with respect to a query
      • Must not be symmetric
      • Must make it possible to detail the score of each query term

  26. Ensembl Gene IDs per query concept
      • GO:0006996: ENSG00000115204, ENSG00000115204, ENSG00000025708, ENSG00000025708, ENSG00000025708, ENSG00000025708, ENSG00000151729, ENSG00000078142, ENSG00000139112, ENSG00000170296
      • GO:0048739: ENSG00000154639, ENSG00000197616, ENSG00000197616, ENSG00000133454, ENSG00000133392, ENSG00000173991

  27. Ontology: example taken from MeSH
      [Diagram: concept hierarchy with nodes Nervous System Diseases; Neurologic Manifestations; Pathological Conditions, Signs and Symptoms; Central Nervous System Diseases; Brain Diseases; Sign and Symptoms; Headache Disorder; Pain; Headache Disorder, Primary; …; Headache; Migraine = Migraine Disorder; Migraine Disorder with Aura; Migraine Disorder without Aura]
