GO Further
GO annotations • Where do the links between genes and GO terms come from?
GO annotations • Contributing databases: • Berkeley Drosophila Genome Project (BDGP) • dictyBase (Dictyostelium discoideum) • FlyBase (Drosophila melanogaster) • GeneDB (Schizosaccharomyces pombe, Plasmodium falciparum, Leishmania major and Trypanosoma brucei) • UniProt Knowledgebase (Swiss-Prot/TrEMBL/PIR-PSD) and InterPro databases • Gramene (grains, including rice, Oryza) • Mouse Genome Database (MGD) and Gene Expression Database (GXD) (Mus musculus) • Rat Genome Database (RGD) (Rattus norvegicus) • Reactome • Saccharomyces Genome Database (SGD) (Saccharomyces cerevisiae) • The Arabidopsis Information Resource (TAIR) (Arabidopsis thaliana) • The Institute for Genomic Research (TIGR): databases on several bacterial species • WormBase (Caenorhabditis elegans) • Zebrafish Information Network (ZFIN) (Danio rerio)
Species coverage • All major eukaryotic model organism species • Human via GOA group at UniProt • Several bacterial and parasite species through TIGR and GeneDB at Sanger • many more in pipeline
Anatomy of a GO annotation • Three key parts: • gene name/id • GO term(s) • evidence for association
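The three parts can be pictured as a small record. The following is a minimal Python sketch: the class name, field names and the gene id "EXAMPLE:0001" are made up for illustration and are not the format of any GO annotation file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GOAnnotation:
    """One gene-to-GO-term link (illustrative structure, not an official format)."""
    gene_id: str    # gene name or database id
    go_id: str      # GO term identifier, e.g. "GO:0006633"
    evidence: str   # evidence code supporting the association, e.g. "IDA"


# Hypothetical example: "EXAMPLE:0001" is a made-up gene id.
ann = GOAnnotation(gene_id="EXAMPLE:0001", go_id="GO:0006633", evidence="IDA")
print(ann)
```

Making the record immutable reflects the idea that an annotation is a statement of fact backed by evidence; changing any of the three parts gives a different annotation.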
Example annotation • Breast cancer type 1 susceptibility protein gene in humans
Types of GO annotation • Electronic annotation • Manual annotation
Manual annotation • Created by scientific curators • High quality • Small number
Manual annotation • In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein…these kinases have been implicated in early stages of wound response…
Electronic annotation • Annotation derived without human validation • Mappings files, e.g. interpro2go, ec2go • BLAST search ‘hits’ • Lower ‘quality’ than annotations with experimental evidence codes
Mappings files • Fatty acid biosynthesis (Swiss-Prot keyword) → GO: fatty acid biosynthesis (GO:0006633) • EC:6.4.1.2 (EC number) → GO: acetyl-CoA carboxylase activity (GO:0003989) • IPR000438: Acetyl-CoA carboxylase carboxyl transferase beta subunit (InterPro entry) → GO: acetyl-CoA carboxylase activity (GO:0003989)
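A mappings file works like a lookup table from an external identifier to GO terms. The sketch below is a rough Python illustration, assuming each source on the slide pairs with the GO term listed in the same position. The dictionary keys (including the "SP_KW:" prefix), the function name and the gene id are hypothetical; it does not parse the real interpro2go or ec2go files.

```python
# Illustrative lookup table built from the three examples on the slide.
# Key prefixes are assumptions made for this sketch.
MAPPINGS = {
    "SP_KW:Fatty acid biosynthesis": ["GO:0006633"],  # Swiss-Prot keyword
    "EC:6.4.1.2": ["GO:0003989"],                     # EC number
    "InterPro:IPR000438": ["GO:0003989"],             # InterPro entry
}


def electronic_annotations(gene_id, external_ids):
    """Return sorted, de-duplicated (gene_id, go_id, "IEA") tuples."""
    go_ids = set()
    for ext_id in external_ids:
        # Identifiers with no mapping contribute nothing.
        go_ids.update(MAPPINGS.get(ext_id, []))
    return [(gene_id, go_id, "IEA") for go_id in sorted(go_ids)]


# "EXAMPLE:0001" is a made-up gene id carrying two of the identifiers above.
result = electronic_annotations("EXAMPLE:0001", ["EC:6.4.1.2", "InterPro:IPR000438"])
print(result)
```

Both identifiers point to the same GO term, so the gene receives a single annotation, tagged IEA because no curator checked it.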
Evidence types • ISS: Inferred from Sequence or Structural Similarity • IDA: Inferred from Direct Assay • IPI: Inferred from Physical Interaction • IMP: Inferred from Mutant Phenotype • IGI: Inferred from Genetic Interaction • IEP: Inferred from Expression Pattern • TAS: Traceable Author Statement • NAS: Non-traceable Author Statement • IC: Inferred by Curator • ND: No Data available • IEA: Inferred from Electronic Annotation
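The evidence code is what separates manual from electronic annotations: in the list above, IEA is the only code assigned without a curator. A minimal Python sketch of that split follows; the function name and the sample annotations are hypothetical.

```python
# Evidence codes as listed on the slide.
EVIDENCE_CODES = {
    "ISS": "Inferred from Sequence or Structural Similarity",
    "IDA": "Inferred from Direct Assay",
    "IPI": "Inferred from Physical Interaction",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IEP": "Inferred from Expression Pattern",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ND": "No Data available",
    "IEA": "Inferred from Electronic Annotation",
}


def is_manual(code):
    """True for every listed code except IEA; unknown codes are rejected."""
    if code not in EVIDENCE_CODES:
        raise ValueError(f"unknown evidence code: {code}")
    return code != "IEA"


# Made-up annotations as (gene_id, go_id, evidence) tuples.
annotations = [
    ("EXAMPLE:0001", "GO:0006633", "IDA"),
    ("EXAMPLE:0001", "GO:0003989", "IEA"),
]
manual_only = [a for a in annotations if is_manual(a[2])]
print(manual_only)
```

Filtering on the evidence code like this is, in spirit, what a "manual only" view in a GO browser does.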
GO terms • Where do GO terms come from? • most GO terms are added by the GO editorial office at EBI • new terms are usually only added when they are asked for by annotators • GO editors work with experts to make major ontology developments • metabolism • pathogenesis • cell cycle
GO stats • almost 20,000 GO terms • 10452 biological_process • 1687 cellular_component • 7393 molecular_function
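As a quick check, the three per-ontology counts can be summed to see what "almost 20,000" means. This is a small Python sketch using only the numbers on the slide.

```python
# Term counts per ontology, as given on the slide.
term_counts = {
    "biological_process": 10452,
    "cellular_component": 1687,
    "molecular_function": 7393,
}

total_terms = sum(term_counts.values())
print(total_terms)  # 19532
```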
No GO Areas • GO covers ‘normal’ functions and processes • No pathological processes • No experimental conditions • NO evolutionary relationships • NO gene products • NOT a system of nomenclature
Open Biomedical Ontologies (OBO) • A repository for well-structured controlled vocabularies for shared use across different biological and medical domains: http://obo.sourceforge.net/
Open Biomedical Ontologies (OBO) • Requirements for inclusion: http://obo.sourceforge.net/crit.html
Annotation exercise • We have provided a Nature paper (PMID: 14961121) for you to annotate with GO terms • This will help you to understand how the information is extracted from papers and GO terms are applied by the curators • It will also give you the opportunity to use another GO browser developed at EBI: QuickGO
Annotation exercise • The gene you are annotating is VG5Q • To make it easier we’ve highlighted some of the most relevant passages in the text • Use the GO browser QuickGO to look for the most appropriate GO terms: • http://www.ebi.ac.uk/ego/
Annotation exercise • In QuickGO, you search for the GO terms by name http://www.ebi.ac.uk/ego/
Annotation exercise • Remember, as well as the GO term, you also need to assign an evidence code • to remind you, we’ve included a list of the evidence codes at the back of the paper
Annotation exercise • To see how your annotations compared to those done by the GO curator, search QuickGO for Q8N302 • This is the UniProt id for the gene VG5Q • Click ‘show only manual’ and this will show you the annotations the curator made