
Exploiting semantic technologies to build an application ontology

Presentation Transcript


  1. Exploiting semantic technologies to build an application ontology
     James Malone PhD, Helen Parkinson PhD, Tomasz Adamusiak PhD, MD

  2. Overview
     • Motivation
     • Our use cases
     • Annotating HTP experimental data
     • Integrating clinical data
     • Methodology for creating the ontology
     • Semi-automated mapping and manual curation
     • Current ontology usage
     • Future use
     Exploiting semantic technologies to build an application ontology malone@ebi.ac.uk

  3. Our Use Cases
     • Query support (e.g., query for 'cancer' and also get 'leukemia')
     • Over-representation analysis in groups of samples (analogous to the use of GO terms in over-representation analysis in groups of genes)
     • Data visualisation – e.g., presenting the user with an ontology tree of what is in the database
     • Data integration by ontology terms – e.g., we assume that 'kidney' in independent studies means roughly the same thing, so we can count how many kidney samples we have in the database
     • Intelligent template generation for different experiment types in submission or data presentation
     • Summary-level data
     • Nonsense detection – e.g., telling us that something marked as cancer cannot be marked as healthy
     Exploiting semantic technologies to build an application ontology malone@ebi.ac.uk 17.11.2014
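The query-support use case on slide 3 can be illustrated with a minimal sketch. It assumes a toy is-a hierarchy; the `PARENTS` table, the sample ids and the function names are hypothetical and are not taken from EFO or ArrayExpress.

```python
from collections import deque

# Toy is-a hierarchy (child -> parents). Illustrative only, not EFO content.
PARENTS = {
    "leukemia": ["cancer"],
    "acute myeloid leukemia": ["leukemia"],
    "cancer": ["disease"],
    "diabetes": ["disease"],
}


def descendants(term, parents=PARENTS):
    """Return the term plus every term that is transitively a subclass of it."""
    # Invert the child -> parents table into parent -> children.
    children = {}
    for child, term_parents in parents.items():
        for parent in term_parents:
            children.setdefault(parent, []).append(child)

    # Breadth-first walk down the hierarchy.
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def expand_query(term, annotations):
    """Return ids of samples annotated with the term or any of its descendants."""
    wanted = descendants(term)
    return sorted(s for s, label in annotations.items() if label in wanted)


# Hypothetical sample annotations.
SAMPLES = {
    "s1": "leukemia",
    "s2": "diabetes",
    "s3": "acute myeloid leukemia",
    "s4": "cancer",
}

if __name__ == "__main__":
    print(expand_query("cancer", SAMPLES))
```

Counting the ids returned for a term gives the kind of per-term sample count mentioned under data integration.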

  4. Scope of Experimental Factor Ontology (EFO)
     Modelling all of the experimental factors that are currently present in the ArrayExpress repository
     Experimental factors are variable aspects of an experiment design which can be used to describe an experiment
     Scope is primarily determined by data currently held in ArrayExpress
     clinical conditions level (e.g. disease), developmental stage, sample level, species level

  5. ‘Experimental Factors’
     Developing an Experimental Factor Ontology

  6. Annotating High Throughput Data
     [Figure labels: Public/Private; Public Only; Genes in Expts; ATLAS; Experiment queries, > 200 species; Gene level queries, 9 species; Acquire; Re-annotate; Summarize; 246,000 assays; Ranked gene/condition queries]
     • Text mining at data acquisition
     • Ontology driven queries
     • Data mining

  7. Integrating Clinical Data
     • Use cases include:
     • Homologizing clinical data for study designs (e.g. GWAS)
     • …

  8. Building the Experimental Factor Ontology
     Position of EFO in the ‘bigger picture’
     • Key is orthogonal coverage, reuse of existing resources and shared frameworks
     [Figure labels: Chemical Entities of Biological Interest (ChEBI); Relation Ontology; Cell Type Ontology; Text mining; Various Species Anatomy Ontologies; Anatomy Reference Ontology; Disease Ontology; EFO]

  9. Semi-automated mapping of text to ontology
     • Following an evaluation by Tim Rayner, we selected the Double Metaphone algorithm
     • Match the values in our database to ontology class labels and definitions
     • Also map EFO to other ontologies, so that EFO: cancer = NCI: cancer, DO: cancer, etc.
     • Sanity-check the mappings before adding them to the ontology
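The matching step on slide 9 can be sketched as follows. This is not Double Metaphone: `crude_phonetic_key` is a deliberately simple stand-in (keep the first letter, drop later vowels, collapse doubled letters), used only to show how phonetic keys let free-text values meet ontology labels. All names and example values are hypothetical.

```python
import re


def crude_phonetic_key(text):
    """Rough phonetic key per word. A stand-in, NOT Double Metaphone."""
    keys = []
    for word in re.findall(r"[a-z]+", text.lower()):
        # Keep the first letter and drop vowels (and 'y') from the rest.
        key = word[0] + re.sub(r"[aeiouy]", "", word[1:])
        # Collapse runs of the same letter, e.g. "tt" -> "t".
        key = re.sub(r"(.)\1+", r"\1", key)
        keys.append(key)
    return " ".join(keys)


def map_values(values, labels):
    """Map each free-text value to the ontology label sharing its key, else None."""
    index = {crude_phonetic_key(label): label for label in labels}
    return {value: index.get(crude_phonetic_key(value)) for value in values}


if __name__ == "__main__":
    labels = ["leukemia", "kidney", "cervical adenocarcinoma"]
    values = ["Leukaemia", "kidny", "liver"]
    for value, label in map_values(values, labels).items():
        print(f"{value!r} -> {label!r}")
```

Values that map to `None`, and matches that look doubtful, are the candidates for the manual sanity check the slide describes.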

  10. [Figure labels: User; Repositories; Bioontologies; Querying Mechanism; Search Engines; Mapping using Agent Technology; Component 1: MAS architecture; Component 2: ontology mappings; Component 3: ontology discovery]

  11. What does agent technology buy us?
     • Annotation consistency
        • EFO_1001214 is now inconsistent because DO_15654 has new parent
     • Richer mappings (hence annotations)
        • EFO_1000156 can have new mappings because new cancer class found in MIT ontology
     • New potentially relevant ontologies
        • New ontology found relating to molecular + pathways
     • Semantic web compatible (i.e. can be deployed as a standards-compliant service)
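The annotation-consistency point on slide 11 could be checked along these lines. The sketch assumes that the parents of each mapped external class were recorded when the mapping was made; the function name, the data layout and the `EXT_*` identifiers are hypothetical placeholders, and the transcript does not describe how the agents actually do this.

```python
def stale_mappings(mappings, snapshot_parents, current_parents):
    """Flag EFO ids whose mapped external class has changed parents.

    mappings:         EFO id -> external class id
    snapshot_parents: external class id -> parents recorded at mapping time
    current_parents:  external class id -> parents in the latest release
    """
    flagged = {}
    for efo_id, ext_id in mappings.items():
        before = set(snapshot_parents.get(ext_id, ()))
        after = set(current_parents.get(ext_id, ()))
        if before != after:
            flagged[efo_id] = {
                "external": ext_id,
                "added": sorted(after - before),
                "removed": sorted(before - after),
            }
    return flagged


if __name__ == "__main__":
    # The EFO and DO ids come from the slide; every EXT_* id is a placeholder.
    mappings = {"EFO_1001214": "DO_15654", "EFO_1000156": "EXT_0001"}
    snapshot = {"DO_15654": ["EXT_PARENT_A"], "EXT_0001": ["EXT_PARENT_B"]}
    current = {
        "DO_15654": ["EXT_PARENT_A", "EXT_PARENT_C"],
        "EXT_0001": ["EXT_PARENT_B"],
    }
    print(stale_mappings(mappings, snapshot, current))
```

Flagged entries would then go to a curator rather than being changed automatically.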

  12. EFO Axes
     [Figure labels: process; material; material property; information; site]

  13. Process

  14. Information

  15. Material

  16. Material Property

  17. Using the ontology: Querying
     Public repository of gene expression data
     Multiple sources – direct submissions, external databases
     >200 species
     8400 experiments, 246,000 assays

  18. Using the ontology: Atlas Querying

  19. Using the ontology: Exposing data via external resources
     NCBO BioPortal

  20. Using the Ontology: Detecting Nonsense: Enforcing correctness
     species (human)
     organism part (cervix)
     cell line (HeLa)
     cell type (epithelial)
     disease (cervical adenocarcinoma)

  21. Using the Ontology: Detecting Nonsense: Enforcing correctness
     species (human)
     organism part (hair follicle)
     disease (cardiovascular disease)
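A rule-based check in the spirit of slides 20 and 21 might look like the sketch below. It reads slide 21 as an implausible combination, which is an assumption, since the transcript does not say so. The `DISEASE_LOCATIONS` and `DISJOINT` tables and the `states` key are hypothetical hand-written rules, not axioms taken from EFO.

```python
# Hypothetical rule table: organism parts in which a disease is expected.
DISEASE_LOCATIONS = {
    "cervical adenocarcinoma": {"cervix"},
    "cardiovascular disease": {"heart", "blood vessel"},
}

# Hypothetical pairs of labels that must not appear on the same sample.
DISJOINT = [("cancer", "healthy")]


def check_annotation(sample):
    """Return a list of human-readable problems; an empty list means no rule fired."""
    problems = []

    # Rule 1: the disease must be plausible for the annotated organism part.
    disease = sample.get("disease")
    part = sample.get("organism part")
    allowed = DISEASE_LOCATIONS.get(disease)
    if allowed is not None and part is not None and part not in allowed:
        problems.append(f"{disease!r} is not expected in {part!r}")

    # Rule 2: disjoint labels must not co-occur on one sample.
    labels = set(sample.get("states", ()))
    for first, second in DISJOINT:
        if first in labels and second in labels:
            problems.append(f"{first!r} and {second!r} are disjoint")

    return problems


if __name__ == "__main__":
    slide_20 = {
        "species": "human",
        "organism part": "cervix",
        "cell line": "HeLa",
        "cell type": "epithelial",
        "disease": "cervical adenocarcinoma",
    }
    slide_21 = {
        "species": "human",
        "organism part": "hair follicle",
        "disease": "cardiovascular disease",
    }
    print(check_annotation(slide_20))
    print(check_annotation(slide_21))
```

In an ontology-backed system such constraints would more likely be expressed as class restrictions and checked by a reasoner; the lookup tables here only illustrate the idea.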

  22. Using Ontology: Integrating Clinical data for Study Design

  23. Future Work for EFO
     • Mapping in external ids on request – SNOMED CT, FMA, ChEBI, BRENDA Tissue Ontology, etc.
     • API development for serving external ids from AE
     • Working with external ontologies to produce cross products
     • Extensions for clinical data capture (Gen2Phen, ENGAGE)
     • Extensions for mouse model of human disease queries
     • Addressing ‘temporal dimension’
     • Addition of units
     • Improving query implementation in ArrayExpress Atlas – GUI changes
     • Addition of synonyms
     • Semantic clustering of experiments

  24. Conclusion
     • Ontology development for text mining, annotation, query
     • Built with our needs in mind; however, it covers a wide range of experimental variables across a wide range of technologies, and is extensible and open source
     • Xref’d to existing ontology resources when possible
     • Text mining works, reduces the workload
     • 1.0 is released on April 1st 2009
     • 0.10 version currently available in OLS and NCBO BioPortal
     • http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=EFO
     • http://www.ebi.ac.uk/microarray-srv/efo/

  25. Acknowledgments
     Ontology creation: James Malone, Helen Parkinson, Tomasz Adamusiak, Ele Holloway
     Mapping tools and text mining evaluation: Tim Rayner, Holly Zheng
     External Specialist Review: Trish Whetzel, Jonathan Bard
     AE Team: Anna Farne, Ele Holloway, Margus Lukk, Eleanor Williams, Tony Burdet, Alvis Brazma, Misha Kapushesky
     EBI Rebholz Group (Whatizit text mining tool)
     EC (Gen2Phen, FELICS, MUGEN, EMERALD, ENGAGE, SLING), EMBL, NIH
