Insight into GO and GOA

Angelica Tulipano, INFN Bari CNR; Giulia De Sario, ITB Bari CNR; Andreas Gisel, ITB Bari CNR. EMBRACE Workshop on ‘Applied Gene Ontology’, Bari, Italy, 7-9 Nov 2007.


Presentation Transcript


  1. Insight into GO and GOA. Angelica Tulipano, INFN Bari CNR; Giulia De Sario, ITB Bari CNR; Andreas Gisel, ITB Bari CNR. EMBRACE Workshop on ‘Applied Gene Ontology’, Bari, Italy, 7-9 Nov 2007.

  2. GODB. The GO Database, which comprises both ontology and annotation data, is built from the flat files available on the GO website and can be downloaded in MySQL or RDF/XML format. It is distributed in four builds:
     • termdb: ontologies, definitions and mappings to other databases
     • assocdb: the above, plus associations to gene products
     • seqdb: the above, plus protein sequences for some of the gene products
     • seqdblite: the above, with IEA associations stripped out (this is the version that drives AmiGO)

     Release GO_200709 contains 3.3 million gene products from more than 100,000 organisms, 25,000 GO terms and 14.5 million associations.

  3. GO tree - path
     • Total terms: 24955
     • Terms without children: 14974 (60%)
     • Terms with children: 10019 (40%)

  4. GO tree - path
     • Total terms: 24955
     • Terms without children: 14974
     • Terms with children: 10019
     • Number of distinct paths: 261034
     • Average number of paths per end term: 17
     • Maximum number of paths per end term: 851

  5. GO tree - path
     • Total terms: 24955
     • Terms without children: 14974
     • Terms with children: 10019
     • Average path length: 10
     • Maximum path length: 18
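The path statistics on slides 4 and 5 can in principle be reproduced by enumerating every path from the root to each end term (a term without children). The sketch below is a minimal illustration on a hypothetical toy DAG: the term names and the `PARENTS` mapping are invented and are not GO data. It counts path length in terms rather than edges, which is an assumption, since the slides do not say which convention was used.

```python
from functools import lru_cache

# Hypothetical toy ontology (child -> list of parents); NOT real GO terms.
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "c": ["a", "b"],
    "d": ["c"],
    "e": ["c", "a"],
}


@lru_cache(maxsize=None)
def path_lengths(term):
    """Lengths (counted in terms) of every path from the root down to `term`."""
    parents = PARENTS[term]
    if not parents:
        return (1,)  # the root itself is a path of one term
    # Each path to a parent is extended by one term to reach `term`.
    return tuple(n + 1 for p in parents for n in path_lengths(p))


def end_terms():
    """Terms that never appear as a parent, i.e. terms without children."""
    with_children = {p for ps in PARENTS.values() for p in ps}
    return sorted(t for t in PARENTS if t not in with_children)


def path_stats():
    """Summary statistics analogous to the ones listed on the slides."""
    lengths = {t: path_lengths(t) for t in end_terms()}
    all_lengths = [n for ls in lengths.values() for n in ls]
    return {
        "end_terms": len(lengths),
        "paths": len(all_lengths),
        "max_paths_per_end_term": max(len(ls) for ls in lengths.values()),
        "avg_path_length": sum(all_lengths) / len(all_lengths),
        "max_path_length": max(all_lengths),
    }
```

Memoising `path_lengths` matters on a real ontology: because terms can have several parents, the number of paths grows much faster than the number of terms, as the figure of 261034 paths for 14974 end terms suggests.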

  6. GO tree - path. The GO is very wide and holds a large body of knowledge that can be associated with gene products; however, its paths are quite shallow.

  7. Gene product description (BCL2_HUMAN). 3 million gene products (UniProt) are described by 47636 descriptions.

  8. Gene product description (BCL2_HUMAN). GODB version go_09_07.

     | Gene products per description | Descriptions | Gene products |
     |---|---|---|
     | 1 | 25119 | 25119 |
     | 2-10 | 13548 | 52731 |
     | 11-50 | 4499 | 105762 |
     | 51-100 | 1296 | 93586 |
     | 101-500 | 2118 | 492847 |
     | 501-1000 | 545 | 377029 |
     | 1000-77069 | 431 | 1876746 |

     3 million gene products (UniProt) are described by 47636 descriptions.

  9. GOA

     | Evidence description | Code | Type |
     |---|---|---|
     | Inferred by Curator | IC | experimental |
     | Inferred from Direct Assay | IDA | experimental |
     | Inferred from Electronic Annotation | IEA | computational |
     | Inferred from Expression Pattern | IEP | experimental |
     | Inferred from Genetic Interaction | IGI | experimental |
     | Inferred from Mutant Phenotype | IMP | experimental |
     | Inferred from Physical Interaction | IPI | experimental |
     | Inferred from Sequence or Structural Similarity | ISS | computational |
     | Non-traceable Author Statement | NAS | experimental |
     | No biological Data available | ND | -- |
     | Inferred from Reviewed Computational Analysis | RCA | computational |
     | Traceable Author Statement | TAS | experimental |
     | Not Recorded | NR | -- |

  10. GOA

      | Evidence description | Code | Type | Associations | Share |
      |---|---|---|---|---|
      | Inferred by Curator | IC | experimental | 388 | |
      | Inferred from Direct Assay | IDA | experimental | 10888 | |
      | Inferred from Electronic Annotation | IEA | computational | 15419002 | 99.5% |
      | Inferred from Expression Pattern | IEP | experimental | 392 | |
      | Inferred from Genetic Interaction | IGI | experimental | 145 | |
      | Inferred from Mutant Phenotype | IMP | experimental | 1646 | |
      | Inferred from Physical Interaction | IPI | experimental | 7517 | |
      | Inferred from Sequence or Structural Similarity | ISS | computational | 16759 | |
      | Non-traceable Author Statement | NAS | experimental | 10811 | |
      | No biological Data available | ND | -- | 3386 | |
      | Inferred from Reviewed Computational Analysis | RCA | computational | 107 | |
      | Traceable Author Statement | TAS | experimental | 19463 | |
      | Not Recorded | NR | -- | 1185 | |

      total associations 13402670 15491689 100%
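The 99.5% share of IEA annotations on slide 10 follows directly from the per-code counts. The sketch below recomputes it from the numbers on the slide; the names `COUNTS` and `evidence_shares` are illustrative only and are not part of any GO tooling.

```python
# Association counts per evidence code, copied from slide 10.
COUNTS = {
    "IC": 388, "IDA": 10888, "IEA": 15419002, "IEP": 392, "IGI": 145,
    "IMP": 1646, "IPI": 7517, "ISS": 16759, "NAS": 10811, "ND": 3386,
    "RCA": 107, "TAS": 19463, "NR": 1185,
}


def evidence_shares(counts):
    """Return the total number of associations and each code's fraction of it."""
    total = sum(counts.values())
    return total, {code: n / total for code, n in counts.items()}


total, share = evidence_shares(COUNTS)

# Everything that is not an electronic (IEA) annotation.
non_iea = total - COUNTS["IEA"]
```

The per-code counts sum to 15491689, the larger of the two totals printed on the slide, and all non-IEA codes together account for 72687 associations.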

  11. GOA p-value. p(term) is the number of gene products associated with a term or any of its children, divided by the total number of associations between GO terms and gene products. The smaller p(term) is, the higher the information content and the more detailed the description.

      | name | level | term-type | count | p(term) |
      |---|---|---|---|---|
      | molecular_function | 1 | molecular_function | 3471081 | 0.477303 |
      | biological_process | 1 | biological_process | 2243629 | 0.308518 |
      | cellular_component | 1 | cellular_component | 1864423 | 0.256374 |
      | hormone activity | 4 | molecular_function | 4504 | 0.000619338 |
      | gliogenesis | 4 | biological_process | 95 | 1.30633e-05 |
      | cell fate specification | 5 | biological_process | 204 | 2.80517e-05 |
      | angiogenesis | 7 | biological_process | 363 | 4.99156e-05 |
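The relationship between p(term) and information content on slide 11 can be made concrete with a short sketch. The slide does not give a formula for information content; the code below assumes the common definition IC = -log2 p(term), and the function names are illustrative. The p(term) values are copied from the slide.

```python
import math


def p_term(count, total_associations):
    """p(term) as defined on the slide: count for the term (including its
    children) divided by the total number of associations."""
    return count / total_associations


def information_content(p):
    """Assumed definition (not stated on the slide): IC = -log2 p(term)."""
    return -math.log2(p)


# p(term) values copied from slide 11.
P_TERM = {
    "molecular_function": 0.477303,
    "hormone activity": 0.000619338,
    "cell fate specification": 2.80517e-05,
    "gliogenesis": 1.30633e-05,
}

# Terms ordered from least to most informative under the assumed definition.
ranked = sorted(P_TERM, key=lambda name: information_content(P_TERM[name]))
```

Under this definition a root term such as molecular_function carries close to one bit of information, while a rarely used term such as gliogenesis carries the most among the four listed here.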

  12. GOA delta p-value

  13. GOA p-value. One would expect a linear increase of the information content along a path. Annotations and the choice of GO terms should be re-evaluated in light of such studies.
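One way to check the expectation of a linear increase is to look at the step-to-step change in information content along a single path: if the increase is linear, every step adds about the same amount. The sketch below assumes IC = -log2 p(term), which the slides do not state, and uses hypothetical p(term) values rather than GOA data; `ic_steps` and `is_roughly_linear` are illustrative names.

```python
import math


def ic_steps(p_values):
    """Change in information content between consecutive terms on a path.

    `p_values` lists p(term) from the root down to the end term.
    Assumes IC = -log2 p(term), which is not stated on the slides.
    """
    ic = [-math.log2(p) for p in p_values]
    return [b - a for a, b in zip(ic, ic[1:])]


def is_roughly_linear(steps, tolerance=0.5):
    """True if all steps differ by at most `tolerance` bits (arbitrary threshold)."""
    return max(steps) - min(steps) <= tolerance


# Hypothetical paths, root first; these are NOT real GOA values.
even_path = [1.0, 0.5, 0.25, 0.125]   # each step halves p(term)
uneven_path = [1.0, 0.5, 0.4, 0.001]  # one small step, then a large jump
```

A path like `uneven_path`, where one step adds almost nothing and the next adds several bits, is the kind of case where the annotation or the choice of term may deserve a second look.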

  14. GO - GOA. Important knowledge for a better understanding of biological data. There is an urgent need to collect and incorporate existing information, especially from non-model organisms. Thanks!
