Manually curated and computationally predicted GO annotations at the Saccharomyces Genome Database
Eurie L. Hong, Ph.D.
Department of Genetics • Stanford University School of Medicine
http://www.yeastgenome.org/
Scientific community
Data from high-throughput experiments
Data from traditional experiments
Integrated data
Analysis tools
Genome sequence
CHS6/YJL099W Locus Summary Page
  Nomenclature
  Summary of published data
  Links to SGD tools and other databases
  Curated data from published literature
  Sequence information
  Data from high-throughput experiments
  Links to other databases
Accessing the data via files
  ftp://ftp.yeastgenome.org/yeast/
Status of GO Annotations at SGD
All protein and RNA gene products have been annotated with GO terms
All GO annotations are manually curated from literature (no IEA)
Genes without published characterization data (from Genome Snapshot, 8/23/2006):
  Molecular Function: 2112 genes (33.6% of all genes)
  Biological Process: 1448 genes (23.0% of all genes)
  Cellular Component: 864 genes (13.7% of all genes)
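As a quick arithmetic check on the Genome Snapshot figures, each count and its percentage imply a total number of genes. The sketch below back-calculates that total for each aspect. The function name `implied_total` is introduced here for illustration only. Because the percentages are rounded to one decimal place, the three totals agree only approximately (all come out at roughly 6,300 genes).

```python
# Counts and percentages as quoted on the slide (Genome Snapshot, 8/23/2006).
unknowns = {
    "Molecular Function": (2112, 33.6),
    "Biological Process": (1448, 23.0),
    "Cellular Component": (864, 13.7),
}


def implied_total(count, percent):
    """Back-calculate the total gene count implied by a count and its percentage."""
    return count / (percent / 100.0)


# Illustrative helper output: one implied total per GO aspect.
implied = {aspect: implied_total(c, p) for aspect, (c, p) in unknowns.items()}

for aspect, total in implied.items():
    print(f"{aspect}: about {total:.0f} genes implied")
```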
Sources of Computationally Predicted GO Annotations
  Integrated analysis of multiple datasets (source: publications, external databases)
  InterPro domain matches in S. cerevisiae proteins (source: GOA project)
CHS6/YJL099W GO Annotation Page
  Core GO Annotations
  GO Annotations from Large Scale Experiments
  Computationally Predicted GO Annotations
Changes to GO Term Finder
  Current functionality
  Specify background set
  Refine the annotations used, by annotation source or evidence code
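To illustrate why the background set and the evidence-code filter matter, here is a minimal sketch of a term-enrichment calculation using a hypergeometric test, a test commonly used for this kind of analysis. It is not SGD's implementation. The gene names, term names, and the functions `genes_by_term` and `enrichment` are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of at least k annotated genes in a query of n genes,
    when K of the N background genes carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def genes_by_term(annotations, background, excluded_codes=frozenset()):
    """Group background genes by GO term, skipping excluded evidence codes."""
    by_term = {}
    for gene, term, code in annotations:
        if gene in background and code not in excluded_codes:
            by_term.setdefault(term, set()).add(gene)
    return by_term


def enrichment(query, background, annotations, excluded_codes=frozenset()):
    """Return {term: p-value} for terms with at least one query gene."""
    background = set(background)
    query = set(query) & background
    results = {}
    for term, genes in genes_by_term(annotations, background, excluded_codes).items():
        k = len(genes & query)
        if k:
            results[term] = hypergeom_pvalue(k, len(query), len(genes), len(background))
    return results


# Hypothetical toy data: (gene, GO term, evidence code).
annotations = [
    ("G1", "GO:A", "IDA"), ("G2", "GO:A", "IEA"), ("G3", "GO:A", "IMP"),
    ("G4", "GO:B", "IDA"), ("G5", "GO:A", "IEA"), ("G6", "GO:B", "TAS"),
]
background = {"G1", "G2", "G3", "G4", "G5", "G6"}
query = {"G1", "G3"}

result_all = enrichment(query, background, annotations)
result_manual = enrichment(query, background, annotations, excluded_codes={"IEA"})
print(result_all, result_manual)
```

In this toy data, excluding IEA annotations shrinks the set of background genes carrying `GO:A`, which makes the same query look more strongly enriched. The choice of background and annotation source changes the statistic, not just the display.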
Improving GO Annotations
Computationally predicted GO annotations
Manually curated GO annotations
  Computational predictions may indicate publications that were overlooked
  Review inconsistencies between computationally predicted and manually curated GO annotations to improve mappings and manually curated annotations
  Review inconsistencies between computationally predicted and manually curated GO annotations to improve the ontology
Additional Annotations Using InterPro2GO
Information added to genes with no published characterization data (from gene_association.goa_uniprot, 7/2006)
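One way to find such genes in a gene association file can be sketched as follows. This assumes the standard tab-delimited gene association layout (symbol in column 3, GO ID in column 5, evidence code in column 7, aspect in column 9, comment lines starting with `!`). It also assumes that the ND evidence code marks a gene with no characterization data for that aspect. The sample records, gene symbols, and GO IDs are placeholders, not real annotations, and the function names are hypothetical.

```python
from collections import defaultdict


def make_row(symbol, go_id, evidence, aspect):
    """Build one placeholder 15-column gene association line."""
    cols = ["DB", "ID_" + symbol, symbol, "", go_id, "REF:0", evidence, "",
            aspect, "", "", "gene", "taxon:4932", "20060701", "SRC"]
    return "\t".join(cols)


# Hypothetical sample; IDs are placeholders and do not refer to real GO terms.
SAMPLE = "\n".join([
    "!comment line, skipped by the parser",
    make_row("GENE1", "GO:0000001", "IDA", "C"),
    make_row("GENE1", "GO:0000002", "IEA", "C"),
    make_row("GENE2", "GO:0000003", "ND", "F"),
    make_row("GENE2", "GO:0000004", "IEA", "F"),
    make_row("GENE3", "GO:0000005", "IEA", "P"),
])


def parse_gaf(lines):
    """Yield the fields needed here from each non-comment line."""
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        yield {"gene": cols[2], "go_id": cols[4],
               "evidence": cols[6], "aspect": cols[8]}


def predicted_for_uncharacterized(records):
    """Return IEA terms for (gene, aspect) pairs lacking any manual annotation.
    Assumption: ND counts as 'no characterization data', not as manual evidence."""
    manual = set()
    predicted = defaultdict(set)
    for rec in records:
        key = (rec["gene"], rec["aspect"])
        if rec["evidence"] == "IEA":
            predicted[key].add(rec["go_id"])
        elif rec["evidence"] != "ND":
            manual.add(key)
    return {k: sorted(v) for k, v in predicted.items() if k not in manual}


added = predicted_for_uncharacterized(parse_gaf(SAMPLE.splitlines()))
print(added)
```

In the sample, GENE1 already has a manual Cellular Component annotation, so its prediction is not reported. GENE2 and GENE3 have only ND or IEA records for the aspects shown, so their predicted terms would count as added information.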
Preliminary Comparison: Cellular Component Annotations
5946 IEA; 9059 IC+IDA+IEP+IGI+IMP+IPI+ISS+NAS+RCA+TAS
[Pie chart. Categories: InterPro2GO annotation matches curated annotation; InterPro2GO annotation is ancestor of curated annotation; InterPro2GO annotation for an unknown; shared parent is root term; shared parent is child of root term; other shared parent term; other. Percentages in extraction order, not reliably paired with categories: 2%, 43%, 38%, 18%, 4%, 18%, 15%.]
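The comparison categories can be illustrated with a small sketch that classifies a predicted term against a curated term in a toy ontology. The ontology, term names, and the `classify` function are hypothetical. Two details are one interpretation of the slide's categories rather than the method actually used: reading "shared parent" as the deepest common ancestor, and mapping the "prediction is more specific than the curated term" case to "other".

```python
from collections import Counter

# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "ROOT": set(),
    "A": {"ROOT"}, "B": {"ROOT"},
    "A1": {"A"}, "A2": {"A"}, "B1": {"B"},
    "A1a": {"A1"}, "A1b": {"A1"},
}


def ancestors(term):
    """All strict ancestors of a term (parents, grandparents, ...)."""
    seen = set()
    stack = list(PARENTS[term])
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS[t])
    return seen


def depth(term):
    """Longest path length from the term up to the root."""
    return 0 if not PARENTS[term] else 1 + max(depth(p) for p in PARENTS[term])


def classify(predicted, curated):
    """Assign a predicted/curated pair to one comparison category."""
    if curated is None:
        return "prediction for an unknown"
    if predicted == curated:
        return "matches curated"
    if predicted in ancestors(curated):
        return "ancestor of curated"
    if curated in ancestors(predicted):
        return "other"  # assumption: prediction more specific than curated
    shared = ancestors(predicted) & ancestors(curated)
    deepest = max(shared, key=depth)
    if deepest == "ROOT":
        return "shared parent is root"
    if "ROOT" in PARENTS[deepest]:
        return "shared parent is child of root"
    return "other shared parent"


pairs = [("A1a", "A1a"), ("A", "A1a"), ("A1", "B1"), ("A1", "A2"),
         ("A1a", "A1b"), ("A1", None), ("A1a", "A1")]
tally = Counter(classify(p, c) for p, c in pairs)
print(tally)
```

Tallying the categories over all predicted/curated pairs for a gene set would give the kind of breakdown a pie chart like this summarizes.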
Summary
Currently, all GO annotations for S. cerevisiae gene products are manually curated from literature
SGD will incorporate computationally predicted GO annotations that will provide additional information about a gene product’s role in biology
Computationally predicted GO annotations will be used to refine and improve manually curated GO annotations at SGD