GO and OBO: an introduction
What is the Gene Ontology? • What is OBO? • OBO-Edit demo & practical Jane Lomax EMBL-EBI
Gene Ontology • Built for a very specific purpose: “annotation of genes and proteins in genomic and protein databases” • Applicable to all species
Evolution of GO • Original GO created in 2000 • Three databases involved: • FlyBase (Drosophila) • MGI (Mouse) • SGD (S. cerevisiae) • Used immediately
Evolution of GO • Later databases: • TAIR (Arabidopsis) • TIGR (microbes including prokaryotes) • SWISS-PROT (several thousand species inc. human) • PSU (P. falciparum) • Recent additions • ZFIN (zebrafish) • PAMGO (plant pathogens)
Evolution of GO • GO development traditionally annotation-driven • development directed by use • Terms added as new species annotated • Terms added on an as-needed basis
Evolution of GO • Developed by an international consortium of biologists and computer scientists • members from individual databases • central office at EBI • Development involves collaboration with domain experts from different biological fields • also formal ontologists
Evolution of GO • Resulted in ‘organic’ structure, little formality • Ontological formality added subsequently • philosophical and logical
Growth of GO
How does GO work? What information might we want to capture about a gene product? • What does the gene product do? • Where and when does it act? • Why does it perform these activities?
GO structure • GO terms divided into three parts: • cellular component • molecular function • biological process
Cellular Component • where a gene product acts
Cellular Component • Enzyme complexes in the component ontology refer to places, not activities.
Molecular Function • activities or “jobs” of a gene product glucose-6-phosphate isomerase activity
Molecular Function insulin binding insulin receptor activity
Molecular Function drug transporter activity
Molecular Function • A gene product may have several functions; a function term refers to a single reaction or activity, not a gene product. • Sets of functions make up a biological process.
Biological Process a commonly recognized series of events cell division
Biological Process transcription
Biological Process regulation of gluconeogenesis
Biological Process limb development
Biological Process courtship behavior
Ontology Structure • Terms are linked by two relationships • is-a • part-of
Ontology Structure [Figure: example graph in which the terms cell, membrane, chloroplast, mitochondrial membrane and chloroplast membrane are linked by is-a and part-of relationships]
Ontology Structure • Ontologies are structured as a hierarchical directed acyclic graph (DAG) • Terms can have more than one parent and zero, one or more children
Ontology Structure • Directed Acyclic Graph (DAG): multiple parentage allowed [Figure: example graph of the terms cell, membrane, chloroplast, mitochondrial membrane and chloroplast membrane]
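The DAG idea can be made concrete with a small sketch. The Python below is only an illustration: the term names come from the slide's example graph, but the individual is_a and part_of links are assumed here rather than transcribed from the diagram, and the helper names (`build_parents`, `ancestors`, `is_acyclic`) are hypothetical.

```python
from collections import defaultdict

# Illustrative edges as (child, relationship, parent). The exact links are an
# assumption made for this sketch, not a transcription of the slide's figure.
EDGES = [
    ("membrane", "part_of", "cell"),
    ("chloroplast", "part_of", "cell"),
    ("mitochondrial membrane", "is_a", "membrane"),
    ("chloroplast membrane", "is_a", "membrane"),
    ("chloroplast membrane", "part_of", "chloroplast"),
]


def build_parents(edges):
    """Map each child term to the set of (relationship, parent) pairs."""
    parents = defaultdict(set)
    for child, rel, parent in edges:
        parents[child].add((rel, parent))
    return parents


def ancestors(term, parents):
    """Collect every term reachable by following parent links upwards."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _rel, parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_acyclic(parents):
    """A graph is acyclic if no term is its own ancestor."""
    return all(term not in ancestors(term, parents) for term in list(parents))


PARENTS = build_parents(EDGES)
print(sorted(PARENTS["chloroplast membrane"]))             # two direct parents
print(sorted(ancestors("chloroplast membrane", PARENTS)))  # all ancestors
print(is_acyclic(PARENTS))
```

In this sketch "chloroplast membrane" has two direct parents, one through each relationship type, which is exactly what a strict tree would not allow.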
Open Biomedical Ontologies (OBO) • GO is a member of OBO • An umbrella project for grouping different ontologies in biological/medical field • a repository for ontologies with defined set of standards • Available from a single source: http://obo.sourceforge.net/
Why do we need OBO? • GO covers only a small area of biology: • molecular function of a protein • biological function of a protein • cellular location of a protein
Why do we need OBO? • Lots of other aspects that also need to be captured, e.g.: • phenotype • anatomy • genomic • taxonomy
Why do we need OBO? • Many groups develop their own ontologies • e.g. plant ontology, anatomies for specific organisms • No standardisation of ontologies with respect to: • format • scope • relationships • No way of knowing whether such ontologies already exist • No mechanism of distribution for other groups
Why do we need OBO? • Creating ontologies takes a lot of work • Makes sense to reuse existing ontologies where possible • Improves data integration where small set of ontologies used • Allows ontologies to be made available from a single place
Why do we need OBO? • Ultimate aim: a complete set of integrated ontologies completely covering the biomedical domain
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint
OBO requirements: open • Ontologies can be used by anyone without any constraints, except: • original authors are acknowledged • cannot be edited and then released under same name
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint • Be in a common shared syntax
OBO requirements: syntax • Usually the OBO format, same as primary GO format • and adaptations of OBO format • Also accept OWL (Web Ontology Language) format • Allows the same tools to be applied, facilitating shared software implementations
Anatomy of an OBO term
• id: GO:0006094 (unique ID)
• name: gluconeogenesis (term name)
• namespace: process (ontology)
• def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html] (definition)
• exact_synonym: glucose biosynthesis (synonym)
• xref_analog: MetaCyc:GLUCONEO-PWY (database ref)
• is_a: GO:0006006 (parentage)
• is_a: GO:0006092 (parentage)
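Because each line of the term is a simple "tag: value" pair, a stanza like the one above can be read with very little code. The following Python is a minimal sketch, not the parser used by OBO-Edit or any other tool: the function name `parse_stanza` is hypothetical, and it assumes one tag per line, "!" for comment lines and a bracketed header such as "[Term]" that can be skipped.

```python
# The gluconeogenesis term from the slide, as plain "tag: value" lines.
STANZA = """\
id: GO:0006094
name: gluconeogenesis
namespace: process
def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html]
exact_synonym: glucose biosynthesis
xref_analog: MetaCyc:GLUCONEO-PWY
is_a: GO:0006006
is_a: GO:0006092
"""


def parse_stanza(text):
    """Return a dict mapping each tag to the list of its values.

    Values are kept in a list because tags such as is_a can repeat.
    Assumption for this sketch: one "tag: value" pair per line.
    """
    term = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comment lines and stanza headers like "[Term]".
        if not line or line.startswith("!") or line.startswith("["):
            continue
        # Split on the first ": " only, so IDs like GO:0006094 stay intact.
        tag, _, value = line.partition(": ")
        term.setdefault(tag, []).append(value)
    return term


term = parse_stanza(STANZA)
print(term["id"][0], term["name"][0])
print(term["is_a"])  # both parents are kept
```

Keeping every value in a list is a deliberate choice here: the two is_a lines are how the format expresses multiple parentage.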
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint • Be in a common shared syntax • Not overlap with other ontologies in OBO
OBO requirements: overlapping • Ontologies can (and should) overlap partially, but large overlap should be avoided • Idea is that terms from different ontologies can be combined to form new terms • Striving for accepted standards rather than competition
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint • Be in a common shared syntax • Not overlap with other ontologies in OBO • Have a unique identifier space
OBO requirements: id space • So, for example, the GO identifier space is “GO” • No other OBO ontology could use this id space • Prevents problems where multiple ontologies are used together
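The value of a unique id space shows up when terms from several ontologies are loaded together. The Python below is a rough sketch of such a check, assuming identifiers of the form PREFIX:LOCAL as in GO:0006094; the function names and every ontology name and identifier other than the two GO IDs are hypothetical and exist only for illustration.

```python
def id_space(term_id):
    """Return the prefix before the first colon, e.g. "GO" for GO:0006094."""
    prefix, sep, local = term_id.partition(":")
    if not sep or not prefix or not local:
        raise ValueError(f"not a PREFIX:LOCAL identifier: {term_id!r}")
    return prefix


def find_id_space_clashes(ontologies):
    """Report id spaces that are used by more than one ontology.

    ontologies maps an ontology name to an iterable of term IDs.
    Returns {prefix: sorted list of ontology names sharing that prefix}.
    """
    users = {}
    for name, term_ids in ontologies.items():
        for term_id in term_ids:
            users.setdefault(id_space(term_id), set()).add(name)
    return {p: sorted(names) for p, names in users.items() if len(names) > 1}


# Hypothetical input: only the two GO identifiers come from the slides.
EXAMPLE = {
    "gene_ontology": ["GO:0006094", "GO:0006006"],
    "hypothetical_anatomy": ["HYP:0000001"],
    "hypothetical_clash": ["GO:9999999"],
}

print(find_id_space_clashes(EXAMPLE))
```

With the invented data above, the check flags "GO" as being claimed by two ontologies, which is the situation the requirement is meant to rule out.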
OBO requirements To be part of OBO, ontologies must: • Be open, can be used by all without any constraint • Be in a common shared syntax • Not overlap with other ontologies in OBO • Have a unique identifier space • Include text definitions of their terms
OBO requirements • In addition, OBO includes an ontology of relationships • all ontologies should use these definitions of relationships • For example • part_of • develops_from • regulates
What’s available • demo: http://obo.sourceforge.net/
Editing ontologies • GO is edited using OBO-Edit • stand-alone Java application • available for all platforms • browse, create or edit any ontology in OBO format
OBO-Edit demo • Browsing ontologies • loading ontologies (including loading multiple ontologies) • graph viewer • reasoner/single relationship views • searching/filtering/rendering • help • Creating/editing ontologies • creating a new ontology • adding terms • copying/moving/deleting terms • adding definitions, dbxrefs etc • verification plugin • saving ontologies