PATO & Phenotypes: From model organisms to clinical medicine. Suzanna Lewis September 4th, 2008 Signs, Symptoms and Findings Workshop First Steps Toward an Ontology of Clinical Phenotypes.
Describing phenotypes using ontologies will aid in the identification of models of disease & candidate causative genes • GWAS: Genome-Wide Association Studies • Any study of genetic variation across the entire human genome that is designed to identify genetic associations with observable traits (such as blood pressure or weight), or the presence or absence of a disease or condition. • Given an identified gene, then what?
Animal disease models • Humans: Mutant Gene → Mutant or missing Protein → Mutant Phenotype (disease) • Animal models: Mutant Gene → Mutant or missing Protein → Mutant Phenotype (disease model)
Phenotype data mining = text searching? • Text-based phenotype resources: • OMIM (NCBI) • DECIPHER (Sanger) • HGMD (Cardiff) • Disease-specific databases • MODs • PubMed
Information retrieval from text-based resources (OMIM) is not straightforward. (Thanks to: M. Ashburner)
Even if we can find what we are looking for in one organism, how can we associate that with phenotypes observed in different organisms? Methods to link phenotypic descriptions of human diseases to animal models currently don’t exist.
Goal: Turn text-based phenotypes into ontology-based computable annotations • Define a model for representing phenotypes
[Figure panels: SHH-/+ • SHH-/- • shh-/+ • shh-/-]
Phenotype (clinical sign) = entity + attribute • P1 = eye + hypoteloric • P2 = midface + hypoplastic • P3 = kidney + hypertrophied • Entities from ZFIN: eye, midface, kidney + Attributes from PATO: hypoteloric, hypoplastic, hypertrophied
Phenotype (clinical sign) = entity + attribute • Entity: Anatomical ontology, Cell & tissue ontology, Developmental ontology, Gene Ontology (biological process, molecular function, cellular component) + Attribute: PATO (phenotype and trait ontology)
Phenotype (clinical sign) = entity + attribute • P1 = eye + hypoteloric • P2 = midface + hypoplastic • P3 = kidney + hypertrophied • Syndrome (disease) = P1 + P2 + P3 (package) = holoprosencephaly
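The entity + attribute model on this slide can be sketched in a few lines of Python. This is only an illustration: the class names `Phenotype` and `Syndrome` and the helper `shared_phenotypes` are hypothetical, and plain strings stand in for real ontology term identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    """One phenotype (clinical sign) = entity + attribute."""
    entity: str     # what is described, e.g. an anatomy term
    attribute: str  # how it differs, e.g. a PATO-style quality


@dataclass(frozen=True)
class Syndrome:
    """A syndrome treated as a named package of phenotypes."""
    name: str
    phenotypes: frozenset


def shared_phenotypes(a: Syndrome, b: Syndrome) -> frozenset:
    """Phenotypes two descriptions have in common (exact term match only)."""
    return a.phenotypes & b.phenotypes


# The three phenotypes from the slide, with strings as placeholders for terms.
p1 = Phenotype("eye", "hypoteloric")
p2 = Phenotype("midface", "hypoplastic")
p3 = Phenotype("kidney", "hypertrophied")
hpe = Syndrome("holoprosencephaly", frozenset({p1, p2, p3}))

# A made-up partial description, used only to show the comparison.
partial = Syndrome("hypothetical model", frozenset({p1}))
print(shared_phenotypes(hpe, partial))
```

Because each phenotype is a pair of terms rather than free text, two descriptions can be compared by set operations instead of string searching. Exact matching is the simplest possible comparison; using the ontology hierarchy to match related terms is not shown here.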
Phenotype annotation model (diagram labels) • Genetic • Environment • Evidence • Qualifier • Assertion • Source • Entity • Quality • relationship • Attribution • Properties • Units • Who makes the assertion; when; what organization
OBD and annotations • Source: Dev Biol 2005 Jul 15;283(2):357-72, “Sonic hedgehog is required for cardiac outflow tract and neural crest cell development” • Annotation (subj, relation, obj): Shh-, absence of aorta • Shh+ participates in heart development • [Diagram labels: bio-entity; information entity (represents); experiment/investigation; investigator (observation, publish/create); agent, human/computer (read, communicate); direct annotation; community/expert annotation; local db (submit/consume, influences); query/meta-analysis; multiple schemas; computational representation]
Goal: Turn text-based phenotypes into ontology-based computable annotations • Define a model for representing phenotypes • Develop and extend requisite ontologies • For the entities being described: anatomies, processes, …
Building a suite of orthogonal interoperable reference (evidence based) ontologies in the biomedical domain. It is critical that ontologies are developed cooperatively so that their classification strategies augment one another. Truth springs from arguments amongst friends. (David Hume)
Requisite ontologies • An ontology of qualities (PATO) • Organism specific anatomies • A controlled vocabulary of homologous and analogous anatomical structures (Uberon) • Gene Ontology • Cell Types
Goal: Turn text-based phenotypes into ontology-based computable annotations • Define a model for representing phenotypes • Develop and extend requisite ontologies • For the entities being described: anatomies, processes, … • Develop an intuitive annotation environment for rigorously capturing phenotypes (“semantic authoring”)
Phenote: Simple software for annotating using ontologies • Provide tool for ontology-based annotation • Standardized model to record annotations for increased compatibility of data between disparate communities. • Simple & intuitive user interface • (especially for users that don’t know/care about what an ontology is) • Easy-to-configure for different user-communities • Pluggable architecture for external applications to interface/embed in application • Provide interfaces with external SOAP and REST services for streamlined workflow (OBD, NCBI, EBI, etc). • www.phenote.org
Ontologies can be loaded from various resources in OWL and OBO format: • BioPortal • External site • Local file • CVS
Post-composition: Join together 2 (or more) terms for specificity: • Apoptosis of neuron in skin (GO, CL, FMA) • S-phase of colon cancer cell (GO, CL) • Aster of human spermatocyte (GO, FMA) • Combine terms from different ontologies • Increase “information content” of an annotation • Refining terms on-the-spot • Pre-composed: Have decomposed definitions of ~2/3 of MP terms available to incorporate mouse data
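One way to picture post-composition is as a base term refined by (relation, term) pairs, nested as deeply as needed. The sketch below is an assumption-laden illustration, not Phenote's data model: `Term`, `Composed`, `render`, and `ontologies` are hypothetical names, the relation names `occurs_in` and `part_of` are assumed, and labels are used in place of real term identifiers.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Term:
    ontology: str  # source ontology prefix, e.g. "GO", "CL", "FMA"
    label: str     # human-readable label standing in for a term ID


@dataclass(frozen=True)
class Composed:
    """A base term (genus) refined by (relation, filler) pairs."""
    genus: object                               # Term or Composed
    differentia: Tuple[Tuple[str, object], ...] = ()


def ontologies(expr) -> set:
    """Collect every ontology that contributes a term to the expression."""
    if isinstance(expr, Term):
        return {expr.ontology}
    found = ontologies(expr.genus)
    for _, filler in expr.differentia:
        found |= ontologies(filler)
    return found


def render(expr) -> str:
    """Write the expression as text, nesting refinements in parentheses."""
    if isinstance(expr, Term):
        return f"{expr.ontology}:{expr.label}"
    parts = [render(expr.genus)]
    parts += [f"{rel}({render(filler)})" for rel, filler in expr.differentia]
    return " ^ ".join(parts)


# "Apoptosis of neuron in skin (GO, CL, FMA)" from the slide.
neuron_in_skin = Composed(Term("CL", "neuron"),
                          (("part_of", Term("FMA", "skin")),))
apoptosis = Composed(Term("GO", "apoptosis"),
                     (("occurs_in", neuron_in_skin),))

print(render(apoptosis))
print(sorted(ontologies(apoptosis)))
```

The composed expression is built at annotation time from three existing terms, so none of the three source ontologies needs a dedicated "apoptosis of neuron in skin" term.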
Retrieve data from NCBI: OMIM, PUBMED, … (SOAP plug-in)
Goal: Turn text-based phenotypes into ontology-based computable annotations • Define a model for representing phenotypes • Develop and extend requisite ontologies • For the entities being described: anatomies, processes, … • Develop an intuitive annotation environment for rigorously capturing phenotypes (“semantic authoring”) • Develop a set of guidelines for biocurators • Annotate mutant phenotypes (OMIM and models)
General Annotation Standards • Remarkable normality • Absence • Relative qualities (what does “small” mean?) • Rates/frequencies • does it inhere in the heart or a process? • Homeotic transformation • Phenotypes specific to a stage or temporal duration
Testing the methodology • Annotated 11 gene-linked human diseases described in OMIM, and their homologs in zebrafish and fruitfly. • ATP2A1, BRODY MYOPATHY • EPB41, ELLIPTOCYTOSIS • EXT2, MULTIPLE EXOSTOSES • EYA1, EYES ABSENT • FECH, PROTOPORPHYRIA • PAX2, RENAL-COLOBOMA SYNDROME • SHH, HOLOPROSENCEPHALY • SOX9, CAMPOMELIC DYSPLASIA • SOX10, PERIPHERAL DEMYELINATING NEUROPATHY • TNNT2, FAMILIAL HYPERTROPHIC CARDIOMYOPATHY • TTN, MUSCULAR DYSTROPHY
Goal: Turn text-based phenotypes into ontology-based computable annotations • Define a model for representing phenotypes • Develop and extend requisite ontologies • For the entities being described: anatomies, processes, … • Develop an intuitive annotation environment for rigorously capturing phenotypes (“semantic authoring”) • Develop a set of guidelines for biocurators • Annotate mutant phenotypes (OMIM and models) • Collect & store annotations in a common resource (OBD) and make these broadly available
4355 genes and genotypes in OBD • 17782 entity-quality annotations in OBD
OBD model: Requirements • Generic • We can’t define a rigid schema for all of biomedicine • Let the domain ontologies do the modeling of the domain • Expressive • Use cases vary from simple ‘tagging’ to complex descriptions of biological phenomena • Formal semantics • Amenable to logical reasoning • First Order Logic and/or OWL 1.1 • Standards-compatible • Can be integrated with the semantic web
OBD Model: overview • Graph-based: nodes and links • Nodes: Classes, instances, relations • Links: Relation instances • Connect subject and object via relation plus additional properties • Annotations: Posited links with attribution / evidence • Expressivity equivalent to RDF and OWL • Links correspond to axioms and facts in OWL • Attributed links: • Named graphs • Reification • N-ary relation pattern • Supports construction of complex descriptions through graph model
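The node/link/annotation split above can be sketched as a small in-memory graph. This is a minimal illustration under stated assumptions, not the OBD schema or API: the names `Link`, `Annotation`, and `Graph` are hypothetical, the relation name `has_phenotype` is assumed, and attribution is attached by wrapping the link in a separate record (a reification-style pattern).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Link:
    """A relation instance connecting a subject node to an object node."""
    subject: str
    relation: str
    obj: str


@dataclass(frozen=True)
class Annotation:
    """A posited link plus who or what stands behind it."""
    link: Link
    source: str                        # e.g. a publication reference
    asserted_by: Optional[str] = None  # curator or organization, if known
    evidence: Optional[str] = None     # evidence label, if recorded


class Graph:
    def __init__(self):
        self.nodes = set()     # classes, instances, and relations
        self.links = set()     # plain links without attribution
        self.annotations = []  # attributed (posited) links

    def add_link(self, subject, relation, obj):
        link = Link(subject, relation, obj)
        self.nodes.update((subject, relation, obj))
        self.links.add(link)
        return link

    def annotate(self, subject, relation, obj, source, **attribution):
        ann = Annotation(Link(subject, relation, obj), source, **attribution)
        self.nodes.update((subject, relation, obj))
        self.annotations.append(ann)
        return ann

    def annotations_about(self, subject):
        return [a for a in self.annotations if a.link.subject == subject]


g = Graph()
g.add_link("Shh+", "participates_in", "heart development")
g.annotate("Shh-", "has_phenotype", "absence of aorta",
           source="Dev Biol 2005 Jul 15;283(2):357-72")

for ann in g.annotations_about("Shh-"):
    print(ann.link.relation, ann.link.obj, "from", ann.source)
```

Keeping the attribution on a wrapper record rather than on the link itself means the same subject, relation, and object can be posited by several sources without duplicating nodes.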
Example of Annotation in OBD • Post-composition of complex anatomical entity descriptions • Post-composition of phenotype classes (PATO EQ formalism)
OBD Architecture • Two stacks • Semantic web stack • Built using Sesame triplestore + OWLIM • Future iterations: Science-commons Virtuoso • OBD-SQL stack • Current focus • Traditional enterprise architecture • Plugs into Semantic Web stack via D2RQ
OBD-SQL Stack • Alpha version of API implemented • Test clients access via SOAP • Phenote currently accesses via org.obo model & JDBC • Wraps org.obo model and OBD schema • Share relational abstraction layer • org.obo wraps OWLAPI • Phenote currently connects via JDBC connectivity in org.obo