Primary Immunodeficiency Disease (PID) PhenomeR
(An integrated web-based ontology resource towards establishment of a PID e-clinical decision support system)

Subazini Thankaswamy Kosalai and Sujatha Mohan(1)
(1) Research Unit for Immunoinformatics, RIKEN Research Center for Allergy and Immunology, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan.

[Figure: workflow flowchart. Boxes, in order of appearance:
• RAPID, IDR and Literature
• Phenotype annotation tool
• Collected PID phenotype terms
• Mapped terms using standard sources: Human Disease (DOID); Human Phenotype Ontology (HPO); Online Mendelian Inheritance in Man - Metathesaurus source processing (OMIM-MTHU); Symptom Ontology (SYMP); Systematized Nomenclature of Medicine Clinical Terms (SNOMEDCT); The Unified Medical Language System - Concept Unique Identifiers (UMLS_CUI)
• Is mapped? (Yes / No)
• PID quality check by logic-based assessment method: conservativity principle, consistency principle, locality principle
• PID quality check by semi-automated method
• OWL, RDF files generation
• Phenotype ontology database
• PID Phenotype KnowledgeBase
• Search and query interface - "PhenomeR"]

ABSTRACT
The main challenge for in silico genotype-phenotype correlation in any genetic disease is to standardize the phenotype ontology terms and the genotype data. We previously developed and established a molecular disease database named RAPID, the Resource of Asian Primary Immunodeficiency Diseases (PID) (http://rapid.rcai.riken.jp), a web-based informatics platform that enables PID experts to easily mine collected genomic, transcriptomic, and proteomic data on PID-causing genes. As of February 2013, RAPID comprises 265 PIDs and 243 genes, of which 233 genes are reported with over 5000 unique disease-causing mutations annotated from about 1800 PubMed citations.
We hereby introduce a newly developed PID ontology browser, "PhenomeR" (http://rapid.rcai.riken.jp/ontology/v1.0/phenomer.php), for systematic integration and analysis of PID phenotype data together with the genotype data taken from RAPID. It currently holds 1438 PID phenotype terms that are mapped and standardized using a logic-based assessment approach. The terms are represented in Web Ontology Language (OWL) and Resource Description Framework (RDF) formats using semantic web technology, which eases data exchange and validation as well as the interpretation of PID phenotype-genotype correlation by various computational approaches.

The main motivation for developing PhenomeR is to assist researchers and clinicians in identifying reported and novel PID-causing genes, and in determining which genes are involved in PID, through the reported disease-causing mutations and their respective observed symptoms. In essence, PID PhenomeR serves as an active integrated platform for PID phenotype data: the generated semantic framework is implemented in an integrated knowledge-base query interface, i.e. a SPARQL Protocol and RDF Query Language (SPARQL) endpoint, towards establishing a well-informed PID e-clinical decision support system.
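As a rough illustration of how a client might talk to a SPARQL endpoint of the kind described above, the sketch below builds a SPARQL 1.1 Protocol GET request and reads subjects out of a result in the standard SPARQL JSON results format. The endpoint URL is a placeholder (the poster does not give the endpoint address), and the function names `build_sparql_request` and `subjects_from_results` are hypothetical helpers introduced only for this example. The query itself, listing distinct subjects, mirrors the "all distinct subjects" query shown among the poster's figures.

```python
from urllib.parse import urlencode
from urllib.request import Request

# ASSUMPTION: placeholder address. The poster does not state the URL of the
# PhenomeR SPARQL endpoint, so this must be replaced before real use.
ENDPOINT = "http://example.org/phenomer/sparql"

# Lists distinct subjects in the ontology; LIMIT keeps the example small.
DISTINCT_SUBJECTS = "SELECT DISTINCT ?s WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def subjects_from_results(results: dict) -> list:
    """Extract the ?s values from a SPARQL JSON results document."""
    return [row["s"]["value"] for row in results["results"]["bindings"]]


# The request is only constructed here, not sent.
request = build_sparql_request(ENDPOINT, DISTINCT_SUBJECTS)
```

To run the query against a real endpoint, the request could be passed to `urllib.request.urlopen` and the response body parsed with `json.load` before calling `subjects_from_results`.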
Overview of PID-PhenomeR
[Figure: workflow in three stages, (A) DATA COLLECTION, (B) DATA STANDARDIZATION and (C) DATA STORAGE & RETRIEVAL, with Yes/No decision branches and QUERY / RESPONSE labels.]

PID-PhenomeR features
• Presents a user-friendly web-based interface for accessing, querying, browsing and analyzing PID phenotype terms
• Integrates semantically standardized phenotype vocabularies from RAPID, along with PIDs, genes and disease-causing mutations, into a relational ontology for inference of genotype-phenotype correlation
• Provides PID phenotype data in several standardized downloadable formats (OWL, RDF and Excel) for easy sharing and data exchange with other interested research groups
• Displays the phenotype terms in a tree structure using the NCBO widget
• Provides an integrated knowledge-base query interface - SPARQL Protocol and RDF Query Language (SPARQL)
• Promotes a network of active, open, community-driven semantic web technology

[Figure panels: RAPID home page; RDF and OWL formats viewed in Link Data and Protégé; Statistics; PID PhenomeR download option; RDF file generated using OWL Syntax Converter; PID PhenomeR advanced search options]

Masuya, H., Y. Makita, et al. (2011). "The RIKEN integrated database of mammals." Nucleic Acids Res. 39:D861-70.

Successful outcome and challenges
PhenomeR aims to build hierarchical ontology class structures and entities for all observed PID phenotypic terms. These can then be used through the integrated knowledge-base query interface, SPARQL Protocol and RDF Query Language (SPARQL), for screening and for implementing algorithms that compile data from multiple sources into statistically significant datasets with greater sensitivity, specificity and degree of confidence, towards a well-informed clinical decision support system. Mapping the unmapped terms in PhenomeR is a challenging task, since some of them are not available in any of the databases.
This ongoing effort will soon implement a systematic, integrated approach for mapping all of these unmapped new terms, towards an open community-driven semantic web (SW) technology.

PhenomeR enables easy access to, and searching, querying and analysis of, PID phenotype terms associated with genes, diseases and mutations.

[Figure panels: Reported list of genes; Reported list of mutation data; Search result of phenotype terms beginning with 'Recurrent'; Mutation analysis of STK4 gene; Multiple terms search output; Hyperlinked PubMed reference citation; Search result of PID phenotype terms with category 'Cardiovascular'; PID PhenomeR download option - OWL format; Search result of PID phenotype terms with semantic type 'Acquired Abnormality']

CONCLUSION
Overall, this kind of analysis should bridge the gap between genotype and phenotype, thereby improving phenotype-based genetic analysis of PID genes. Moreover, it should help clinicians confirm early PID diagnosis and implement proper therapeutic interventions. We sincerely believe that the structured data format presented in RPO should help biomedical researchers carry out further computational analysis and assist clinicians in identification of diagnosed PID.

Publications - PID project
PID PhenomeR project in NCBO BioPortal: http://bioportal.bioontology.org/projects/171
Subazini Thankaswamy Kosalai and Sujatha Mohan. PID PhenomeR - An integrated platform for developing phenotype ontology structures for primary immunodeficiency diseases (Database, Oxford University Press - In communication)

Acknowledgements
The authors acknowledge RIKEN for providing the necessary computing resources; the research team at the Institute of Bioinformatics (IOB), Bangalore, India, for their collaboration in developing RAPID; and the alumni of our lab as well as all PID physicians involved in the PID Japan project for their valuable input and suggestions.
Collaboration and funding
The PID project was initiated by the IOB and the Immunogenomics research group at the Research Center for Allergy and Immunology (RCAI), RIKEN Yokohama Institute, Japan. It was funded by the Asia S&T Strategic Cooperation Promotion Program, Special Coordination Funds for Promoting Science and Technology, MEXT, Japan.

Contact: sujatha@rcai.riken.jp

[Figure panels: All distinct subjects from the RPO ontology queried using SPARQL; Registration form for submitting new PID terms; RPO summary page in NCBO BioPortal (http://bioportal.bioontology.org/ontologies/3114); PID PhenomeR database schema; Home page; Term C3 deficiency viewed using Protégé 4.1 OntoGraf; Primary information page of STK4 gene in RAPID; Master list of PID phenotype terms, associated features and relationships in Excel format; Search result of phenotype term; Term hierarchy visualization using NCBO widget from NCI Thesaurus]