ASPB Plant Biology, June 29, 2008, Merida. Gene Ontology and Functional Annotation. Donghui Li. TAIR literature statistics. Outline. Functional annotation. Controlled vocabularies: GO and PO. Functional annotation at TAIR. Community annotation.
ASPB Plant Biology, June 29, 2008, Merida Gene Ontology and Functional Annotation Donghui Li
Outline
• Functional annotation
• Controlled vocabularies: GO and PO
• Functional annotation at TAIR
• Community annotation
Functional annotation is defined as the process of collecting information about a gene’s biological identity:
• molecular function (protein kinase)
• biological roles (protein phosphorylation)
• subcellular localization (cytoplasm)
• aliases
• mutant phenotype
• expression domain
What is an annotation?
An annotation is a statement that a gene product …
• … has a particular molecular function
• … is involved in a particular biological process
• … is located within a certain cellular component
• … as determined by a particular method
• … as described in a particular reference
Adapted from Harold J Drabkin, The Jackson Laboratory
Smith et al. (2006) determined by an enzyme assay that Abc2 has protein kinase activity, is involved in the process of protein phosphorylation, and is located in the cytoplasm.
• Reference: Smith et al. (2006)
• Evidence code: determined by an enzyme assay
• Gene product: Abc2
• Controlled vocabularies: protein kinase activity, protein phosphorylation, cytoplasm
Adapted from Harold J Drabkin, The Jackson Laboratory
Controlled vocabulary (CV)
• Non-controlled vocabulary
  • same name, different concept
  • different name, same concept
• Controlled vocabulary
  • A standardized, restricted set of defined terms designed to reduce ambiguity in describing a concept
Same name, different concept
"germination" can mean any of:
• seed germination
• pollen germination
• spore germination
Different name, same concept
• glucose biosynthesis
• glucose synthesis
• glucose formation
• glucose anabolism
• gluconeogenesis: noncarbohydrate precursors (pyruvate, amino acids and glycerol) → glucose
• protein formation: translation = protein biosynthesis
• phytochromobilin synthase activity = phytochromobilin:ferredoxin oxidoreductase activity
  (3Z)-phytochromobilin + oxidized ferredoxin = biliverdin IXα + reduced ferredoxin (EC:1.3.7.4)
Cross-species, cross-database comparison is problematic without CV
• translation
• protein biosynthesis
• phytochromobilin synthase activity
• phytochromobilin:ferredoxin oxidoreductase activity
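The synonym problem above can be illustrated with a small lookup that maps free-text labels onto one controlled term. This is only a sketch: the `SYNONYMS` table, the choice of which name is treated as the controlled term, and the helper names `normalize` and `same_concept` are assumptions made for illustration, not TAIR's or GO's actual data or API.

```python
# Hypothetical synonym table built from the examples on the slides.
# Which name serves as the controlled term here is an illustrative assumption.
SYNONYMS = {
    "translation": ["protein biosynthesis"],
    "phytochromobilin:ferredoxin oxidoreductase activity": [
        "phytochromobilin synthase activity",
    ],
}


def build_lookup(synonyms):
    """Map every known name (lower-cased) to its controlled term."""
    lookup = {}
    for term, names in synonyms.items():
        for name in [term, *names]:
            lookup[name.lower()] = term
    return lookup


LOOKUP = build_lookup(SYNONYMS)


def normalize(label):
    """Return the controlled term for a free-text label, or None if unknown."""
    return LOOKUP.get(label.strip().lower())


def same_concept(label_a, label_b):
    """True only if both labels resolve to the same controlled term."""
    term_a, term_b = normalize(label_a), normalize(label_b)
    return term_a is not None and term_a == term_b


print(same_concept("translation", "Protein biosynthesis"))
print(normalize("germination"))
```

Without the shared table, two databases using "translation" and "protein biosynthesis" would appear to describe different things; with it, both resolve to one term and can be compared directly.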
Cross-species, cross-database comparison is problematic without CV
• germination: seed germination, pollen germination, spore germination
[Figure labels: pollen, spore]
Controlled vocabularies used by TAIR
• GO: The Gene Ontology, Gene Ontology Consortium
• PO: The Plant Ontology, Plant Ontology Consortium
Gene Ontology
• molecular function: catalytic / binding activities (kinase activity, DNA binding activity, transcription factor)
• biological process: biological goal or objective (signal transduction, mitosis, purine metabolism)
• cellular component: location or complex (nucleus, ribosome, proteasome)
Ontology structure: directed acyclic graph (DAG)
[Figure: a child term linked to parent 1 and parent 2]
• DAG: each child may have one or more parents
Ontology structure: directed acyclic graph (DAG)
[Figure: protein complex, organelle, mitochondrion, fatty acid beta-oxidation multienzyme complex]
Ontology structure: term-term relationships
• mitochondrion is-a organelle
• fatty acid beta-oxidation multienzyme complex is-a protein complex
• fatty acid beta-oxidation multienzyme complex part-of mitochondrion
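The DAG structure and the is-a / part-of relationships from the slide example can be sketched as a small edge list with an upward traversal. The edge list is a reading of the flattened slide figure and should be treated as an assumption; the helper names `build_parents` and `ancestors` are hypothetical and not part of any GO tool.

```python
from collections import defaultdict

# (child, relationship, parent) edges, as read from the slide figure (assumed).
EDGES = [
    ("mitochondrion", "is-a", "organelle"),
    ("fatty acid beta-oxidation multienzyme complex", "is-a", "protein complex"),
    ("fatty acid beta-oxidation multienzyme complex", "part-of", "mitochondrion"),
]


def build_parents(edges):
    """Index edges by child so each term lists its (relationship, parent) pairs."""
    parents = defaultdict(list)
    for child, relationship, parent in edges:
        parents[child].append((relationship, parent))
    return parents


def ancestors(term, parents):
    """Collect every term reachable by following parent edges upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


PARENTS = build_parents(EDGES)
COMPLEX = "fatty acid beta-oxidation multienzyme complex"

# The complex has two direct parents, which is what makes this a DAG
# rather than a tree.
print(PARENTS[COMPLEX])
print(sorted(ancestors(COMPLEX, PARENTS)))
```

Because a child may have more than one parent, a gene annotated to the complex is implicitly associated with both the protein complex branch and the organelle branch.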
Gene ontology browser: AmiGO
http://www.geneontology.org
http://amigo.geneontology.org
Plant Ontology
• Plant structure: morphological and anatomical structures (stamen, petal, guard cell)
• Growth and developmental stages: whole plant growth stages and plant structure developmental stages (seedling growth, rosette growth, leaf development stages, embryo development stages)
How are annotations made?
• reference: The Plant Journal (2006) 47:701
• gene: AT5G27620
• term: GO:0004672 protein kinase activity
• evidence: kinase assay
[Figure: the gene, term, and evidence are linked as an association]
Evidence codes
Experimental evidence codes
• EXP - Inferred from Experiment
• IMP - Inferred from Mutant Phenotype
• IDA - Inferred from Direct Assay
• IGI - Inferred from Genetic Interaction
• IPI - Inferred from Physical Interaction
• IEP - Inferred from Expression Pattern
Computational analysis evidence codes
• ISS - Inferred from Sequence or structural Similarity
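An annotation as described on the slides (gene, term, evidence, reference) can be sketched as a small record type that checks its evidence code against the list above. The `Annotation` class and its field names are hypothetical, and recording the kinase assay as `IDA` is an assumption made for this example; the slide only says "kinase assay".

```python
from dataclasses import dataclass

# Evidence codes listed on the slide.
EVIDENCE_CODES = {
    "EXP": "Inferred from Experiment",
    "IMP": "Inferred from Mutant Phenotype",
    "IDA": "Inferred from Direct Assay",
    "IGI": "Inferred from Genetic Interaction",
    "IPI": "Inferred from Physical Interaction",
    "IEP": "Inferred from Expression Pattern",
    "ISS": "Inferred from Sequence or structural Similarity",
}
EXPERIMENTAL_CODES = {"EXP", "IMP", "IDA", "IGI", "IPI", "IEP"}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record: one gene-term association with its support."""
    gene: str
    term_id: str
    term_name: str
    evidence_code: str
    reference: str

    def __post_init__(self):
        # Reject codes that are not in the controlled list.
        if self.evidence_code not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {self.evidence_code!r}")

    @property
    def is_experimental(self):
        return self.evidence_code in EXPERIMENTAL_CODES


# Values taken from the slide; "IDA" for the kinase assay is an assumption.
example = Annotation(
    gene="AT5G27620",
    term_id="GO:0004672",
    term_name="protein kinase activity",
    evidence_code="IDA",
    reference="The Plant Journal (2006) 47:701",
)
print(example.is_experimental)
```

Keeping the evidence code as a checked field makes it straightforward to separate experimentally supported annotations from computational ones.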
Functional annotation of Arabidopsis genome using GO
[Chart, May 2008; categories: Known, Unknown, Known (EXP), Unannotated]
TAIR - Plant Physiology collaboration
• Author submits annotation after the paper is accepted
• Web-based interface
• AGI locus identifier (At1g01040)
• Gene function annotation linked to loci with method
• Will expand to include other journals (Plant Cell ...)
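Since submissions are keyed on the AGI locus identifier, a submission tool would plausibly check the identifier's format first. The sketch below assumes the pattern "At", a chromosome code (1 to 5, or C or M for the chloroplast and mitochondrial genomes), "g", and five digits; the pattern and the `parse_agi` helper are illustrative assumptions, not the validation used by the TAIR form.

```python
import re

# Assumed AGI locus format: AT<chromosome>G<five digits>, case-insensitive.
AGI_PATTERN = re.compile(r"^AT([1-5CM])G(\d{5})$", re.IGNORECASE)


def parse_agi(locus):
    """Return (chromosome, number) for a well-formed AGI locus, else None."""
    match = AGI_PATTERN.match(locus.strip())
    if match is None:
        return None
    return match.group(1).upper(), int(match.group(2))


print(parse_agi("At1g01040"))
print(parse_agi("AT5G27620"))
print(parse_agi("At9g01040"))
```

A check like this only confirms the shape of the identifier; whether the locus actually exists would still have to be confirmed against the database.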
Functional annotation submission form curator@arabidopsis.org