
Suzanna Lewis GO Consortium & National Center for Biomedical Ontology geneontology/

GO, NCBO, Phenotypes, & the OBO Foundry. January 29th, 2007. Ontologies for Biomedical Investigations, La Jolla Institute for Allergy and Immunology. Suzanna Lewis, GO Consortium & National Center for Biomedical Ontology. http://www.geneontology.org/ http://www.bioontology.org/


Presentation Transcript


  1. GO, NCBO, Phenotypes, & the OBO Foundry • January 29th, 2007 • Ontologies for Biomedical Investigations • La Jolla Institute for Allergy and Immunology • Suzanna Lewis, GO Consortium & National Center for Biomedical Ontology • http://www.geneontology.org/ • http://www.bioontology.org/

  2. Outline • Perspective on the challenge of our mutual investment of time and effort on standards, formalisms, and representation • GO case study retrospective • NCBO today • Phenotype case study • OBO-Foundry

  3. The Scientific Method • A body of techniques for investigating phenomena and acquiring new knowledge, as well as for correcting and integrating previous knowledge. It is based on observable, empirical, measurable evidence, and subject to rules of reasoning. • Isaac Newton (1687, 1713, 1726). "Rules for the study of natural philosophy", Philosophiae Naturalis Principia Mathematica, Book 3, The System of the World. Third edition, the 4 rules as reprinted on pages 794-796 of I. Bernard Cohen and Anne Whitman's 1999 translation, University of California Press ISBN 0-520-08817-4, 974 pages.

  4. Today’s data is in electronic form • Rules of reasoning are an intrinsic element of scientific investigation • In our current era, data reside in electronic form • Building and using computable ontologies will support rules of reasoning on our data • And thereby support research in the computer age.

  5. Necessary character of a computational environment for biological research • Sustainable • There must be a mechanism for maintaining the environment (one that costs less than the initial investment). • Adaptable • It must work for the complete spectrum of data types, from genomics to clinical trials • It must continually adapt to new knowledge and new technologies • Interoperable • We need the capability of easily integrating data from a variety of sources. • Evolvable • Mechanisms must be put in place to respond to the needs of the biomedical research community. These needs provide the primary selection pressure on the evolution of the technology.

  6. 10 Factors in achieving goals (credit to John Glaser) • Clarity of Vision/Goal • Political Landscape needs to support the goal • Decision-Process needs to be responsive and efficient • Message has to be brought to the community • Accountability (i.e. no vaporware) • Feasible within available resources • Providing incentives for adoption • Tactics must change with the adoption curve • Sustaining effort over multiple years • Being satisfied with highly imperfect, but pragmatic solutions

  7. Criteria for success, & signs of failure • Measurable evidence of improved productivity and efficiency (time saved vs. number of users for given output) • Evidence of learning from experience (e.g., sustained improvements in content volume and quality) • Evidence of discoverability: enables positive outputs that were unanticipated • Continual, iterative process that occurs at each stage of development and growth, fully integrated into the lifecycle • Evidence of community acceptance, in that more data is being provided continuously • Contains sufficient information to support reproducibility of results • Negotiation of meaning has occurred • Stymied by barriers regarding IP and credit attribution • High relative cost for maintenance, support, and boosting interest • Problems that fall between the cracks • The criteria for success listed above are not being met

  8. GO case study • "For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled." Richard Feynman (1986)

  9. Three fundamental dichotomies • types vs. instances • continuants vs. occurrents • dependent vs. independent

  10. For example, in the GO’s 3 ontologies (molecular function, biological process, cellular component): molecules, cell components, and organisms are independent continuants which have functions (these are dependent continuants), and these functions may be realized as an occurrent process when “functioning”. [Diagram labels: occurrent, dependent continuant, independent continuant]

  11. Specific Aims of the GO 2006 • We will maintain comprehensive, logically rigorous and biologically accurate ontologies. • We will comprehensively annotate 9 reference genomes in as complete detail as possible. • We will support annotation across all organisms. • We will provide our annotations and tools to the research community.

  12. Weaving and untangling the GO • Missing relations • is_a completeness • Adding new relations within a single GO ontology • Adding “regulates” to BP • Distinguishing different part_of relations • Adding relations between GO axes • Linking between MF & BP & CC • Adding relations between GO & other ontologies • GO+Cell • GO+anatomy • GO+ChEBI

  13. Implicit ontologies within the GO: • cysteine biosynthesis (ChEBI) • myoblast fusion (Cell Type Ontology) • hydrogen ion transporter activity (ChEBI) • snoRNA catabolism (Sequence Ontology) • wing disc pattern formation (Drosophila anatomy) • epidermal cell differentiation (Cell Type Ontology) • regulation of flower development (Plant anatomy) • interleukin-18 receptor complex (not yet in OBO) • B-cell differentiation (Cell Type Ontology)

  14. Relations to Other Ontologies • [Diagram: is_a hierarchies shown side by side] • CL: blood cell, lymphocyte, B-cell • GO: cell differentiation, lymphocyte differentiation, B-cell activation, B-cell differentiation

  15. CELL Ontology
      [Term]
      id: CL:0000236
      name: B-cell
      is_a: CL:0000542 ! lymphocyte
      develops_from: CL:0000231 ! B-lymphoblast

      Augmented GO
      [Term]
      id: GO:0030183
      name: B-cell differentiation
      is_a: GO:0042113 ! B-cell activation
      is_a: GO:0030098 ! lymphocyte differentiation
      intersection_of: is_a GO:0030154 ! cell differentiation
      intersection_of: has_participant CL:0000236 ! B-cell
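
  The two stanzas on this slide can be read mechanically, which is the point of the cross-product definition. The following is a minimal sketch, not a full OBO parser: it assumes one "tag: value" pair per line and the "relation target" layout of intersection_of exactly as shown on the slide. The function names parse_stanzas and cross_product are illustrative and not part of any GO tool.

```python
from collections import defaultdict

# The stanzas exactly as they appear on the slide.
OBO_TEXT = """\
[Term]
id: CL:0000236
name: B-cell
is_a: CL:0000542 ! lymphocyte
develops_from: CL:0000231 ! B-lymphoblast

[Term]
id: GO:0030183
name: B-cell differentiation
is_a: GO:0042113 ! B-cell activation
is_a: GO:0030098 ! lymphocyte differentiation
intersection_of: is_a GO:0030154 ! cell differentiation
intersection_of: has_participant CL:0000236 ! B-cell
"""


def parse_stanzas(text):
    """Return {term_id: {tag: [values]}}, dropping trailing '! comment' parts."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[Term]":
            current = defaultdict(list)
            continue
        if not line or current is None:
            continue
        tag, _, value = line.partition(": ")
        value = value.split(" ! ")[0].strip()
        current[tag].append(value)
        if tag == "id":
            terms[value] = current
    return terms


def cross_product(term):
    """Split each intersection_of value into a (relation, target_id) pair."""
    return [tuple(value.split(" ", 1)) for value in term["intersection_of"]]


terms = parse_stanzas(OBO_TEXT)
pairs = cross_product(terms["GO:0030183"])
for relation, target in pairs:
    # Targets outside the loaded stanzas are reported as not loaded.
    name = terms.get(target, {}).get("name", ["(not loaded)"])[0]
    print(relation, target, name)
```

  Read this way, B-cell differentiation is a cell differentiation that has_participant a B-cell, so the GO term points at the Cell Ontology term instead of restating it.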

  16. Correlation of mRNA decay rates with (GO) function Genome Research 13:1863-1872, 2003 Decay Rates of Human mRNAs: Correlation With Functional Characteristics and Sequence Attributes. E. Yang, E. van Nimwegen, M. Zavolan, N. Rajewsky, M. Schroeder, M Magnasco and JE Darnell, Jr

  17. How GO measures up • Measurable evidence of improved productivity and efficiency • Researchers simply use the GO, as judged by publications • Evidence of learning from experience • Formalism of the GO continues to improve • Evidence of discoverability - enables positive outputs that were unanticipated • Primary use of GO is cluster analysis of microarray expression data • Continual, iterative process that occurs at each stage of development and growth, fully integrated into the lifecycle • Quarterly updates of software • Evidence of community acceptance, that more data is being provided continuously • Number of species continues to increase

  18. The National Center for Biomedical Ontology BioPortal Phenotype Annotation

  19. NCBO’s 7 Cores • Core 1: Computer science • Core 2: Bioinformatics • Core 3: Driving biological projects • Core 4: Infrastructure • Core 5: Education and Training • Core 6: Dissemination • Core 7: Administration

  20. Who NCBO is • Stanford: Tools for ontology alignment, indexing, and management (Cores 1, 4–7: Mark Musen) • Lawrence Berkeley Lab: Tools to use ontologies for data annotation (Cores 2, 5–7: Suzanna Lewis) • Mayo Clinic: Tools for access to large controlled terminologies (Core 1: Chris Chute) • Victoria: Tools for ontology and data visualization (Cores 1 and 2: Margaret-Anne Storey) • University at Buffalo: Dissemination of best practices for ontology engineering (Core 6: Barry Smith)

  21. NCBO Driving Biological Projects • Trial Bank: UCSF, Ida Sim • FlyBase: Cambridge, Michael Ashburner • ZFIN: Oregon, Monte Westerfield

  22. BioPortal • Indexes, searches and visualizes terms in ontologies in library • Uses LexGrid (Mayo) • Contains ontologies that their editors have released to BioPortal

  23. The BioPortal Needs You! • We need, and beg and plead for, your feedback • http://www.bioontology.org/ncbo/faces/index.xhtml • For example: Providing URIs for all ontologies and/or ontology content? • Tomorrow depends on you, no request is too mundane.

  24. Phenotype work in progress

  25. Animal disease models • Animal models: Mutant Gene → Mutant or missing Protein → Mutant Phenotype

  26. Animal disease models • Humans: Mutant Gene → Mutant or missing Protein → Mutant Phenotype (disease) • Animal models: Mutant Gene → Mutant or missing Protein → Mutant Phenotype (disease model)

  27. Animal disease models • Humans: Mutant Gene → Mutant or missing Protein → Mutant Phenotype (disease) • Animal models: Mutant Gene → Mutant or missing Protein → Mutant Phenotype (disease model)

  28. Animal disease models • Humans: Mutant Gene → Mutant or missing Protein → Mutant Phenotype (disease) • Animal models: Mutant Gene → Mutant or missing Protein → Mutant Phenotype (disease model)

  29. SHH-/+ SHH-/- shh-/+ shh-/-

  30. Phenotype (clinical sign) = entity + quality

  31. Phenotype (clinical sign) = entity + quality • P1 = eye + hypoteloric

  32. Phenotype (clinical sign) = entity + quality • P1 = eye + hypoteloric • P2 = midface + hypoplastic

  33. Phenotype (clinical sign) = entity + quality • P1 = eye + hypoteloric • P2 = midface + hypoplastic • P3 = kidney + hypertrophied

  34. Phenotype (clinical sign) = entity + quality • P1 = eye + hypoteloric • P2 = midface + hypoplastic • P3 = kidney + hypertrophied • ZFIN (entity): eye, midface, kidney + PATO (quality): hypoteloric, hypoplastic, hypertrophied

  35. Phenotype (clinical sign) = entity + quality • Entity: Anatomical ontology, Cell & tissue ontology, Developmental ontology, Gene ontology (biological process, cellular component) • Quality: PATO (phenotype and trait ontology)

  36. Phenotype (clinical sign) = entity + quality • P1 = eye + hypoteloric • P2 = midface + hypoplastic • P3 = kidney + hypertrophied • Syndrome = P1 + P2 + P3 • (disease) = holoprosencephaly
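
  The "entity + quality" formula and the syndrome sum on these slides map naturally onto a small data structure. The sketch below is one possible reading, not the representation used by any of the tools mentioned: the class name Phenotype, the SYNDROMES table, and both helper functions are assumptions made for illustration, and plain labels stand in for real ontology term IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    entity: str   # label of an anatomy term (ZFIN anatomy on the slide)
    quality: str  # label of a PATO quality


# The three clinical signs from the slides.
P1 = Phenotype("eye", "hypoteloric")
P2 = Phenotype("midface", "hypoplastic")
P3 = Phenotype("kidney", "hypertrophied")

# Syndrome = P1 + P2 + P3, modelled here as a set of phenotypes.
SYNDROMES = {"holoprosencephaly": frozenset({P1, P2, P3})}


def matching_syndromes(observed):
    """Return names of syndromes whose phenotypes were all observed."""
    observed = set(observed)
    return [name for name, signs in SYNDROMES.items() if signs <= observed]


def shared_phenotypes(a, b):
    """Return the phenotypes two annotation sets have in common."""
    return set(a) & set(b)


# Hypothetical observation for an animal model, for illustration only.
model_observation = [P1, P2]
print(matching_syndromes([P1, P2, P3]))
print(shared_phenotypes(model_observation, SYNDROMES["holoprosencephaly"]))
```

  Because both halves of a phenotype are ontology terms, comparing a human disease with an animal model reduces to comparing sets of (entity, quality) pairs.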

  37. Human holoprosencephaly • Zebrafish shh • Zebrafish oep

  38. TIGR, December 6-7, 2002

  39. PaTO upper level • Unifying goal: Integrating data • within and across domains (e.g. different taxa) • across levels of granularity • across different perspectives • Requires • Rigorous formal definitions in both ontologies and annotation schemas

  40. Top level PaTO division: spatial vs temporal (note: some nodes omitted for brevity) • Quality • Quality of a continuant: a quality which inheres in a continuant • nodes shown: physical quality, cellular quality, morphology, color, density, shape, size, structure • Quality of an occurrent: a quality which inheres in a process or spatiotemporal region • nodes shown: duration, arrested, premature, delayed

  41. Top level PaTO division: Granularity • Monadic quality of a continuant … • Physical quality: a quality that exists through action of continuants at the physical level of organisation • color: green, pink, yellow • temperature: hot, cold • mass: large mass, small mass • Cellular quality: a quality that exists at the cellular level of organisation … • nucleate quality: anucleate, binucleate • ploidy: diploid, haploid, aneuploid • potency: multipotent, totipotent, oligopotent

  42. Monadic vs. relational quality of a continuant … • Monadic quality of a C: a quality of a C that inheres solely in the bearer and does not require another entity • Physical quality • Cellular quality • morphology: shape, size, structure • Relational quality of a C: a quality of a C that requires another entity apart from its bearer to exist … • Sensitivity (to) • Displacement (with) • Connectedness (to)

  43. Relational qualities involving the environment • “drought sensitivity” [TO:0000029] • Directed towards an additional entity type • Q = PATO:sensitivity • E2 = EO:drought • Def: a sensitivity which is directed towards drought [inheres_in organism] • OBO needs a good environment ontology
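
  A relational quality adds a second entity (E2) to the entity + quality pair. As a rough sketch of how the drought sensitivity example could be composed, the code below rebuilds the slide's definition string from its parts. The class EQ, its field names, and the definition method are hypothetical; only the identifiers PATO:sensitivity and EO:drought and the wording of the definition come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EQ:
    entity: str                    # bearer that the quality inheres in
    quality: str                   # PATO quality (Q on the slide)
    entity2: Optional[str] = None  # E2: what a relational quality is directed towards

    def is_relational(self):
        """A quality is relational here when it names a second entity."""
        return self.entity2 is not None

    def definition(self):
        """Compose a textual definition in the style shown on the slide."""
        text = f"a {self.quality.split(':')[-1]}"
        if self.is_relational():
            text += f" which is directed towards {self.entity2.split(':')[-1]}"
        return f"{text} [inheres_in {self.entity}]"


drought_sensitivity = EQ(
    entity="organism",
    quality="PATO:sensitivity",
    entity2="EO:drought",
)
print(drought_sensitivity.definition())
```

  A monadic quality is the same record with entity2 left empty, so both kinds of annotation can share one schema.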

  44. What is Phenote? • A tool for annotating phenotypes • Curator reads about a phenotype in the literature related to taxonomy or genotype • Curator enters genotype (or taxonomy) • Curator enters genetic context (optional) • Curator searches/enters Entity (e.g. Anatomy) • Curator searches/enters PATO attribute/value
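
  The curation steps listed above amount to filling in one record per phenotype. The sketch below shows what such a record and a completeness check might look like. It is not Phenote's actual data model: the class Annotation, its field names, and missing_fields are assumptions for illustration, and the example values are labels taken from other slides in this talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    subject: str                            # genotype (or taxon) entered by the curator
    entity: str                             # entity term, e.g. from an anatomy ontology
    quality: str                            # PATO attribute/value
    genetic_context: Optional[str] = None   # optional per the slide


def missing_fields(annotation):
    """List required fields the curator has not filled in yet."""
    required = ("subject", "entity", "quality")
    return [name for name in required if not getattr(annotation, name)]


record = Annotation(subject="shh-/-", entity="eye", quality="hypoteloric")
print(missing_fields(record))

incomplete = Annotation(subject="", entity="eye", quality="")
print(missing_fields(incomplete))
```

  Keeping genetic context optional mirrors the slide, while subject, entity, and quality are treated as the minimum needed for a usable annotation.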

  45. Phenote

  46. ZFIN integration Also Phenote

  47. Anatomy • Cell • Chemical • Drug • Disease • Environmental context • . . . • Qualifier • Unit • GO - biological process • GO - molecular function • GO - cellular component • Other ontologies…

  48. OBO-Foundry
