
Introduction to GO Annotation

Eurie Hong (SGD), Michelle Gwinn (TIGR), Tanya Berardini (TAIR), Karen Pilcher (DictyBase), Russell Collins (FlyBase), Carol Bastiani (Wormbase), Doug Howe (ZFIN), Stacia Engel (SGD).

Presentation Transcript


  1. Introduction to GO Annotation Eurie Hong (SGD), Michelle Gwinn (TIGR), Tanya Berardini (TAIR), Karen Pilcher (DictyBase), Russell Collins (FlyBase), Carol Bastiani (Wormbase), Doug Howe (ZFIN), Stacia Engel (SGD)

  2. What is a GO annotation? • Gene (protein-coding gene, functional RNA) • GO Term • Evidence code: IMP, IGI, IPI, ISS, IDA, IEP, TAS, NAS, ND, RCA, IC • References • With/From: supporting evidence for certain evidence codes • Qualifiers: NOT, contributes_to, colocalizes_with
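The parts of an annotation listed on this slide can be pictured as a single record. The following is a minimal sketch, not a format used by any database: the class name `GOAnnotation` and its field names are assumptions made for illustration, while the evidence codes and qualifiers are the ones listed on the slide. The example values are taken from the YakA case discussed later in this deck.

```python
from dataclasses import dataclass, field
from typing import List

# Evidence codes and qualifiers exactly as listed on the slide.
EVIDENCE_CODES = {"IMP", "IGI", "IPI", "ISS", "IDA", "IEP",
                  "TAS", "NAS", "ND", "RCA", "IC"}
QUALIFIERS = {"NOT", "contributes_to", "colocalizes_with"}


@dataclass
class GOAnnotation:
    """Hypothetical record type; field names are illustrative only."""
    gene: str            # protein-coding gene or functional RNA
    go_id: str           # GO term identifier
    go_name: str         # GO term name
    evidence_code: str   # one of EVIDENCE_CODES
    reference: str       # source of the evidence, e.g. a PMID
    with_from: str = ""  # supporting evidence for certain evidence codes
    qualifiers: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Reject values that are not on the slide's lists.
        if self.evidence_code not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {self.evidence_code}")
        unknown = set(self.qualifiers) - QUALIFIERS
        if unknown:
            raise ValueError(f"unknown qualifiers: {sorted(unknown)}")


# Values from the YakA example in this deck (PMID: 9584128).
yaka = GOAnnotation(
    gene="YakA",
    go_id="GO:0004672",
    go_name="protein kinase activity",
    evidence_code="IDA",
    reference="PMID:9584128",
)
print(yaka)
```

Validating the evidence code and qualifiers at construction time is one possible design choice; it simply keeps a typo from producing a record that looks valid.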

  3. What is an annotation? • Strategies for identifying literature to annotate • Identifying the correct annotation • Molecular Function • Biological Process • Cellular Component • Extent of annotation for a single gene product • Strategies for annotating a genome

  4. Which type of literature is appropriate for annotation? • Papers with experimental evidence for GO process, function or component annotation • Mutant phenotype descriptions • Enzymatic activity assays • Localization studies • Papers describing phylogenetic studies for GO function annotation (ISS) • Reviews • (Textbooks) • (Meeting abstracts)

  5. Strategies for reading a paper for annotation • Abstract • Results/Figures • Materials and Methods • Discussion

  6. Which granularity of GO term is appropriate for annotation? • Molecular Function • Souza et al. (1998) YakA, a protein kinase required for the transition from growth to development in Dictyostelium. PMID: 9584128

  7. Background • YakA was identified as a developmental mutant • YakA is an ortholog of the yeast Yak1p • The protein kinase domain of YakA is similar to both serine/threonine kinases and tyrosine kinases PMID: 9584128

  8. YakA belongs to the DYRK family YakA is a member of the DYRK family of protein kinases (dual-specificity tyrosine-regulated kinase)

  9. The Experiment • Assay for YakA protein kinase activity • YakA + γ32P-ATP + MBP (substrate) • Look for presence of 32P in substrate in the presence of YakA PMID: 9584128

  10. The Result PMID: 9584128

  11. GO Term for Annotation: protein kinase activity ; GO:0004672 • Definition: Catalysis of the transfer of a phosphate group, usually from ATP, to a protein substrate. • MBP (myelin basic protein) is a generic substrate • Kinase specificity was not determined; for example, no phospho-tyrosine antibodies were used

  12. Searching for Terms in DAG-Edit. Search for term names that contain: • kinase: 359 results • protein kinase: 60 results • protein kinase activity: 20 results
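The narrowing effect of a longer search string can be illustrated with a plain substring match over term names. This is only a sketch of the idea, not how DAG-Edit implements its search: the function `search_terms` is hypothetical, and the term list holds just the three kinase terms named in this deck, so the counts it prints are far smaller than the result counts on the slide.

```python
# The three kinase-related terms mentioned in this presentation.
TERMS = {
    "GO:0004672": "protein kinase activity",
    "GO:0004674": "protein serine/threonine kinase activity",
    "GO:0004713": "protein tyrosine kinase activity",
}


def search_terms(query, terms=TERMS):
    """Return (id, name) pairs whose name contains the query (case-insensitive)."""
    q = query.lower()
    return sorted(
        (go_id, name) for go_id, name in terms.items() if q in name.lower()
    )


# A more specific query matches fewer term names.
for query in ("kinase", "protein kinase", "protein kinase activity"):
    hits = search_terms(query)
    print(f"{query!r}: {len(hits)} result(s)")
    for go_id, name in hits:
        print(f"  {name} ; {go_id}")
```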

  13. Search Output in DAG-Edit

  14. Sibling Terms in DAG-Edit

  15. Child Terms in DAG-Edit

  16. Parent Terms in AmiGO

  17. Evidence Code • The evidence code for the protein kinase activity term is IDA (Inferred from Direct Assay) • Although endogenous substrates were not tested, the authors clearly showed kinase activity with a direct assay

  18. Granular Terms Using ISS (Inferred from Sequence or Structural Similarity) • protein serine/threonine kinase activity ; GO:0004674 • protein tyrosine kinase activity ; GO:0004713

  19. How is Biological Process different from Molecular Function? • Molecular Function: “Elemental activities, such as catalysis or binding, describing the actions of a gene product at the molecular level. A given gene product may exhibit one or more molecular functions.” Molecular Function is about the protein: the activities that a protein specifically and directly does. • Biological Process: “A phenomenon marked by changes that lead to a particular result, mediated by one or more gene products.” Biological Process is about the organism: what the organism uses those activities for. • For example: a hammer hammers nails… and builds houses. Rho1 has GTPase activity… and the organism uses that activity for gastrulation, axon guidance, germ cell migration, etc.

  20. Important points: • The process is a migration of germ (pole) cells. • It is the movement of cells from one side of the epithelium to the other. • It is one step in a three-step process.

  21. Is a new term needed?

  22. A new term might be appropriate because it would describe a discrete, separable process, thus providing additional useful information to the user. New terms would also permit linking two similar processes that are currently separate in GO but are connected in the literature. • cell migration (is a) transepithelial cell migration (is a) pole cell transepithelial migration (is a) cellular extravasation • cell migration (is a) germ cell migration (is a) pole cell migration (part of) pole cell transepithelial migration
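How a new term can link two branches becomes visible when the relationships are written as edges and the ancestors of the new term are collected. The sketch below rests on an assumption: the slide's hierarchy is flattened in this transcript, so the nesting used here (which term is the parent of which) is one plausible reading, not a confirmed copy of the slide or of GO itself. The function `ancestors` is a hypothetical helper.

```python
# child -> list of (relationship, parent).
# ASSUMPTION: nesting is inferred from the flattened slide text.
PARENTS = {
    "transepithelial cell migration": [("is_a", "cell migration")],
    "cellular extravasation": [("is_a", "transepithelial cell migration")],
    "germ cell migration": [("is_a", "cell migration")],
    "pole cell migration": [("is_a", "germ cell migration")],
    # The proposed new term has a parent in each branch.
    "pole cell transepithelial migration": [
        ("is_a", "transepithelial cell migration"),
        ("part_of", "pole cell migration"),
    ],
}


def ancestors(term, parents=PARENTS):
    """Collect every term reachable by following is_a / part_of edges upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# The new term reaches both the transepithelial and the germ cell branches.
for name in sorted(ancestors("pole cell transepithelial migration")):
    print(name)
```

Under this reading, an annotation to the new term is also found by anyone browsing from either "transepithelial cell migration" or "germ cell migration", which is the linking effect the slide describes.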

  23. Annotating to the Cellular Component Ontology Carol Bastiani, Caltech

  24. Experiment: Immunolocalization of LIN-10 with a LIN-10 antibody.

  25. Localization of LIN-10 by Immunofluorescence: Vulval epithelial cells can be distinguished from ventral cord neurons by their larger size and the presence of stained cell junctions (red).

  26. Figure 7.   LIN-10 is expressed in neurons. (A-C) Wild-type, late L3 hermaphrodite stained with anti-LIN-10 antibodies (green). LIN-10 is present in ventral cord processes (A, *), lateral neural cell bodies and processes (A and B, arrowheads), and dorsal cord processes

  27. Search MGI GO Browser for neuron:

  28. Choosing the evidence code:

  29. Further subcellular localization of LIN-10: In neural cell bodies, a small amount of LIN-10 appears diffusely throughout the cytoplasm, whereas the majority of LIN-10 is concentrated in discrete perinuclear structures (Figure 7, D and E), similar to perinuclear structures observed in vulval epithelial cells. To determine whether these perinuclear structures correspond to Golgi, we used ST-GFP as a marker for the trans-cisterna of the Golgi (Jamora et al., 1997). We expressed ST-GFP in transgenic worms using a heat shock promoter and examined the subcellular localization of LIN-10 and ST-GFP using anti-LIN-10 and anti-GFP antibodies. In single neurons expressing both endogenous LIN-10 and transgenic ST-GFP, the subcellular pattern of LIN-10 staining is similar to that of ST-GFP staining. Deconvolution of images obtained in double-staining experiments revealed that LIN-10 staining is closely associated with ST-GFP staining (Figure 7, F-I), but LIN-10 staining is consistently offset (by 0.2-0.5 µm) from ST-GFP staining. These results indicate that LIN-10 is localized in the trans-cisterna of the Golgi or is localized in a compartment closely associated with the trans-cisterna, such as the trans-Golgi network.

  30. LIN-10 is localized to: 1) Cytoplasm 2) Within or in association with a part of the Golgi apparatus/ in close association with the trans-cisterna or trans-Golgi network

  31. 1) Annotate to cytoplasm:

  32. LIN-10 is localized to: 1) Cytoplasm 2) Within or in association with a part of the Golgi apparatus/ in close association with the trans-cisterna or trans-Golgi network

  33. 2) Annotate to Golgi apparatus, evidence code IDA:

  34. Qualifier to use “when the resolution of the assay is not accurate enough to say that the gene product is a bona fide component member”:

  35. Strategies for annotation of a genome • How to get a complete set of GO annotations • Updating GO annotations • Representative approaches

  36. Strategies for annotation of a genome: How to get a complete set of GO annotations • Complete a first pass • For all 3 aspects (MF, BP, CC) • For all genes that get GO annotations: proteins, RNAs, pseudogenes; NOT centromeres, telomeres, LTRs, retrotransposons, ARSs • Unknowns are allowed

  37. Strategies for annotation of a genome: Updating the complete set of GO annotations • Second pass • Replace unknowns • Update where IEA was used • Info with “better” evidence code, if available • Update where other db’s are referenced • Primary literature is preferred
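The second-pass criteria on this slide can be read as a filter over existing annotations. The sketch below shows one way to flag records for review and makes several assumptions that are not in the slides: "unknown" is modelled as evidence code ND, a reference starting with `PMID:` is used as a rough proxy for primary literature, and the gene names and the non-PMID reference string are invented placeholders rather than real data.

```python
def second_pass_reasons(annotation):
    """Return the reasons (possibly none) to revisit one annotation record."""
    reasons = []
    if annotation["evidence"] == "ND":
        # ASSUMPTION: unknowns are recorded with the ND evidence code.
        reasons.append("replace unknown")
    if annotation["evidence"] == "IEA":
        reasons.append("look for a better evidence code")
    if not annotation["reference"].startswith("PMID:"):
        # ASSUMPTION: a PMID stands in for "primary literature".
        reasons.append("prefer primary literature over another database")
    return reasons


# Hypothetical records; gene names and the OTHER_DB reference are placeholders.
annotations = [
    {"gene": "geneA", "evidence": "IDA", "reference": "PMID:9584128"},
    {"gene": "geneB", "evidence": "IEA", "reference": "OTHER_DB:example"},
    {"gene": "geneC", "evidence": "ND", "reference": "OTHER_DB:example"},
]

for record in annotations:
    reasons = second_pass_reasons(record)
    if reasons:
        print(record["gene"], "->", "; ".join(reasons))
```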

  38. Strategies for annotation of a genome: Updating GO annotations - ongoing • GO annotations will never be “done” • Part of normal curation process • More specific information • Better evidence code • Replace obsolete terms • “Last reviewed” date

  39. Strategies for annotation of a genome Updating GO annotations - ongoing
