
GO Further


nayef

Presentation Transcript


  1. GO Further

  2. GO annotations • Where do the links between genes and GO terms come from?

  3. GO annotations • Contributing databases: • Berkeley Drosophila Genome Project (BDGP) • dictyBase (Dictyostelium discoideum) • FlyBase (Drosophila melanogaster) • GeneDB (Schizosaccharomyces pombe, Plasmodium falciparum, Leishmania major and Trypanosoma brucei) • UniProt Knowledgebase (Swiss-Prot/TrEMBL/PIR-PSD) and InterPro databases • Gramene (grains, including rice, Oryza) • Mouse Genome Database (MGD) and Gene Expression Database (GXD) (Mus musculus) • Rat Genome Database (RGD) (Rattus norvegicus) • Reactome • Saccharomyces Genome Database (SGD) (Saccharomyces cerevisiae) • The Arabidopsis Information Resource (TAIR) (Arabidopsis thaliana) • The Institute for Genomic Research (TIGR): databases on several bacterial species • WormBase (Caenorhabditis elegans) • Zebrafish Information Network (ZFIN) (Danio rerio)

  4. Species coverage • All major eukaryotic model organism species • Human via the GOA group at UniProt • Several bacterial and parasite species through TIGR and GeneDB at Sanger • Many more in the pipeline

  5. Annotation coverage

  6. Anatomy of a GO annotation • Three key parts: • gene name/id • GO term(s) • evidence for association
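
The three key parts listed on slide 6 can be pictured as a small record type. This is only a minimal sketch: the class name, the field names and the example gene identifier are hypothetical and are not taken from any GO file format or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GOAnnotation:
    """One gene-to-term link, holding the three parts named on the slide."""
    gene_id: str        # gene name or database identifier
    go_id: str          # GO term identifier, e.g. "GO:0003989"
    evidence_code: str  # evidence for the association, e.g. "IDA" or "IEA"


# Hypothetical record for illustration only; it is not a real curated annotation.
example = GOAnnotation(
    gene_id="EXAMPLE_GENE_1",
    go_id="GO:0003989",
    evidence_code="IEA",
)
print(example)
```

A gene with several GO terms would simply have several such records, one per term and evidence code.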

  7. Example annotation • Breast cancer type 1 susceptibility protein gene in humans

  8. Types of GO annotation • Electronic annotation • Manual annotation

  9. Manual annotation • Created by scientific curators • High quality • Small number

  10. Manual annotation • "In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein…these kinases have been implicated in early stages of wound response…"

  11. Manual annotation

  12. Electronic annotation • Annotation derived without human validation • Mappings files, e.g. interpro2go, ec2go • BLAST search ‘hits’ • Lower ‘quality’ than annotations backed by experimental evidence codes

  13. Mappings files • Fatty acid biosynthesis (Swiss-Prot keyword) → GO: fatty acid biosynthesis (GO:0006633) • EC:6.4.1.2 (EC number) → GO: acetyl-CoA carboxylase activity (GO:0003989) • IPR000438: Acetyl-CoA carboxylase carboxyl transferase beta subunit (InterPro entry) → GO: acetyl-CoA carboxylase activity (GO:0003989)
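
To make the idea of a mappings file concrete, here is a minimal parsing sketch. It assumes lines shaped like `external_id [description] > GO:term name ; GO:id`, with `!` marking comment lines, which is how files such as interpro2go and ec2go are commonly laid out; check a real file before relying on it. The sample lines are built from the identifiers on slide 13, paired in the order they are listed there, and `parse_mappings` is a hypothetical helper name.

```python
from collections import defaultdict

# Illustrative lines only, written in the assumed layout described above.
SAMPLE = """\
! sample mappings for illustration
EC:6.4.1.2 > GO:acetyl-CoA carboxylase activity ; GO:0003989
InterPro:IPR000438 Acetyl-CoA carboxylase carboxyl transferase beta subunit > GO:acetyl-CoA carboxylase activity ; GO:0003989
"""


def parse_mappings(text):
    """Return {external_id: [(go_id, go_term_name), ...]} from mapping lines."""
    mappings = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and comments
        source, sep, target = line.partition(" > ")
        if not sep:
            continue  # not a mapping line in the assumed layout
        external_id = source.split()[0]  # drop the optional description
        go_name, _, go_id = target.rpartition(" ; ")
        if go_name.startswith("GO:"):
            go_name = go_name[3:]  # strip the "GO:" prefix from the term name
        mappings[external_id].append((go_id.strip(), go_name.strip()))
    return dict(mappings)


for external_id, terms in parse_mappings(SAMPLE).items():
    print(external_id, terms)
```

Any gene product carrying one of the external identifiers can then be given the mapped GO terms automatically, which is what makes these annotations electronic rather than manual.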

  14. Evidence types • ISS: Inferred from Sequence/Structural Similarity • IDA: Inferred from Direct Assay • IPI: Inferred from Physical Interaction • IMP: Inferred from Mutant Phenotype • IGI: Inferred from Genetic Interaction • IEP: Inferred from Expression Pattern • TAS: Traceable Author Statement • NAS: Non-traceable Author Statement • IC: Inferred by Curator • ND: No Data available • IEA: Inferred from Electronic Annotation
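
The evidence codes on slide 14 can be kept in a lookup table, which also gives a simple way to separate manual from electronic annotations. This is a sketch under one stated assumption: following slides 8 to 12, IEA is treated as the only electronic code and every other listed code as manually assigned. The function name `is_manual` is hypothetical.

```python
# Evidence codes exactly as listed on the slide.
EVIDENCE_CODES = {
    "ISS": "Inferred from Sequence/Structural Similarity",
    "IDA": "Inferred from Direct Assay",
    "IPI": "Inferred from Physical Interaction",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IEP": "Inferred from Expression Pattern",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ND": "No Data available",
    "IEA": "Inferred from Electronic Annotation",
}


def is_manual(code):
    """Return True for curator-assigned codes (assumption: all except IEA)."""
    if code not in EVIDENCE_CODES:
        raise ValueError(f"unknown evidence code: {code}")
    return code != "IEA"


manual_codes = [code for code in EVIDENCE_CODES if is_manual(code)]
print(manual_codes)
```

Filtering a list of annotations with `is_manual` mirrors, in spirit, what a "show only manual" option in a GO browser does.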

  15. GO terms • Where do GO terms come from? • Most GO terms are added by the GO editorial office at EBI • New terms are usually only added when they are asked for by annotators • GO editors work with experts to make major ontology developments: metabolism, pathogenesis, cell cycle

  16. GO stats • almost 20,000 GO terms • 10,452 biological_process • 1,687 cellular_component • 7,393 molecular_function
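
As a quick check of the "almost 20,000" figure, the three per-ontology counts from slide 16 can be added up. The variable names below are illustrative only.

```python
# Term counts per ontology, as given on the slide.
term_counts = {
    "biological_process": 10452,
    "cellular_component": 1687,
    "molecular_function": 7393,
}

total_terms = sum(term_counts.values())
print(total_terms)  # 19532
```

The total of 19,532 terms is consistent with the slide's rounded figure.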

  17. Growth of GO

  18. No GO Areas • GO covers ‘normal’ functions and processes • No pathological processes • No experimental conditions • No evolutionary relationships • No gene products • Not a system of nomenclature

  19. Open Biomedical Ontologies (OBO) • A repository for well-structured controlled vocabularies for shared use across different biological and medical domains: http://obo.sourceforge.net/

  20. Open Biomedical Ontologies (OBO) • Requirements for inclusion: http://obo.sourceforge.net/crit.html

  21. AmiGO exercise

  22. Annotation exercise • We have provided a Nature paper (PMID: 14961121) for you to annotate with GO terms • This will help you to understand how the information is extracted from papers and GO terms are applied by the curators • It will also give you the opportunity to use another GO browser developed at EBI: QuickGO

  23. Annotation exercise • The gene you are annotating is VG5Q • To make it easier we’ve highlighted some of the most relevant passages in the text • Use the GO browser QuickGO to look for the most appropriate GO terms: • http://www.ebi.ac.uk/ego/

  24. Annotation exercise • In QuickGO, you search for the GO terms by name http://www.ebi.ac.uk/ego/

  25. Annotation exercise • Remember, as well as the GO term, you also need to assign an evidence code • To remind you, we’ve included a list of the evidence codes at the back of the paper

  26. Annotation exercise • To see how your annotations compare with those made by the GO curator, search QuickGO for Q8N302 • This is the UniProt accession for the product of the VG5Q gene • Click ‘show only manual’ to see the annotations the curator made
