Annotation/ontology consistency checking
J. Deegan, C. Mungall. Cambridge, 2009
Checking file
[Term]
id: GO:0007595
name: lactation
relationship: only_in_taxon NCBITaxon:40674 ! Mammalia

[Term]
id: GO:0048827
name: phyllome development
relationship: only_in_taxon NCBITaxon:33090 ! Viridiplantae

624 rules so far.
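As an illustration of how stanzas like these could be read into a rule table, here is a minimal Python sketch. It is not the checking script mentioned in these slides; the function name `parse_taxon_rules` and the simplified line handling are assumptions made for the example, and only the two stanzas shown above are used as input.

```python
# Two stanzas copied from the checking file shown on the slide.
OBO_TEXT = """\
[Term]
id: GO:0007595
name: lactation
relationship: only_in_taxon NCBITaxon:40674 ! Mammalia

[Term]
id: GO:0048827
name: phyllome development
relationship: only_in_taxon NCBITaxon:33090 ! Viridiplantae
"""


def parse_taxon_rules(text):
    """Return {GO id: [(relation, taxon id), ...]} from [Term] stanzas.

    Hypothetical helper: it only understands the tags used above and is
    not a full OBO parser.
    """
    rules = {}
    current_id = None
    for raw in text.splitlines():
        # Drop the trailing "! comment" that OBO lines may carry.
        line = raw.split("!", 1)[0].strip()
        if line == "[Term]":
            current_id = None  # a new stanza starts
        elif line.startswith("id:"):
            current_id = line[len("id:"):].strip()
        elif line.startswith("relationship:") and current_id:
            parts = line[len("relationship:"):].split()
            if len(parts) == 2:
                relation, taxon = parts
                rules.setdefault(current_id, []).append((relation, taxon))
    return rules


rules = parse_taxon_rules(OBO_TEXT)
for go_id, constraints in rules.items():
    print(go_id, constraints)
```

A real checking file may use further tags or relations; a dedicated OBO parser would be the safer choice for production use.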
Chris’s MOOSE checking script:
Script run and reporting can be automated.
5660 inconsistencies found:
• Confusing ontology structure leading to annotation inconsistency.
• Some pipeline problems.
• Some taxon check changes needed.
Unclear ontology terms: virus annotations to descendants of ‘cellular’ terms
Unclear ontology definitions improved:
[i] cognition
--- [i] sensory perception <– bacterial annotations
Old def: The series of events required for an organism to receive a sensory stimulus, convert it to a molecular signal, and recognize and characterize the signal.
New def: The series of events required for an organism to receive a sensory stimulus, convert it to a molecular signal in the nervous system, and recognize and characterize the signal.
Mammal annotations to (plant) senescence
“A pre-programmed process associated with the dismantling of an anatomical structure and an overall decline in metabolism. This may include the breakdown of organelles, membranes and other cellular components.”
• Definition improved.
• Cell aging terms also need work.
IEA pipeline problems
• e.g. Drosophila annotation to “photosynthesis, light reaction”.
• InterPro mappings are correct but…
• False positives in domain matching.
• Pipeline will be corrected.
Annotation inconsistencies corrected:
• Bacterial ISS annotation to “mitochondrion”.
• Bacterial IEA annotation to “nucleus”.
• Non-mammal ISS annotation to “in utero embryonic development”.
• Non-mammal ISS annotation to “ovarian follicle development”.
• Non-mammal ISS annotation to “lactation”.
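A check that would flag cases such as a non-mammal annotation to “lactation” can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual checking script: the `ANCESTORS` table is a tiny hand-written lineage, the gene names `GENE_A` and `GENE_B` are hypothetical, and the sketch does not propagate rules to descendant terms in the ontology, which a real check would need to do.

```python
# Rule taken from the checking file: lactation only_in_taxon Mammalia.
RULES = {"GO:0007595": "NCBITaxon:40674"}

# Toy lineage table (assumption for this example): each taxon maps to a
# set of its ancestor taxa. A real check would load the NCBI taxonomy.
ANCESTORS = {
    "NCBITaxon:9606": {"NCBITaxon:40674", "NCBITaxon:33208"},  # human: Mammalia, Metazoa
    "NCBITaxon:7227": {"NCBITaxon:33208"},  # Drosophila melanogaster: Metazoa
}


def violates(go_id, taxon):
    """True if the term is restricted to a taxon that does not contain `taxon`."""
    required = RULES.get(go_id)
    if required is None:
        return False  # no only_in_taxon rule for this term
    return taxon != required and required not in ANCESTORS.get(taxon, set())


# Hypothetical annotations: (gene, GO term, taxon, evidence code).
annotations = [
    ("GENE_A", "GO:0007595", "NCBITaxon:9606", "ISS"),
    ("GENE_B", "GO:0007595", "NCBITaxon:7227", "ISS"),
]

inconsistent = [a for a in annotations if violates(a[1], a[2])]
for gene, go_id, taxon, evidence in inconsistent:
    print(f"{gene}: {evidence} annotation to {go_id} conflicts with taxon {taxon}")
```

Only the fly annotation is reported, because the human taxon sits inside Mammalia in the toy lineage.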
Exception:
• “photosynthetic electron transport in photosystem II” should not normally be used for viruses.
• Viral genes psbA and psbD from bacteriophage S-PM2 encode the D1 and D2 core components of the photosynthetic reaction center PSII (photosystem II).
Fixes documented at
• http://gocwiki.geneontology.org/index.php/Taxon_Main_Page
Questions:
• Should we email check results once a month?
• Some problems cannot immediately be fixed. Should the data be released anyway?
Exceptions
• Should we make rules that are *generally* applicable?
• There will be correct exceptions that do not pass the rules.
• How do we document these?
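One possible way to record such exceptions, offered only as a sketch and not as something these slides propose, is a small list of documented cases that the report consults before flagging an annotation. The class `TaxonException` and the functions below are hypothetical names; the single entry uses the bacteriophage S-PM2 case described earlier, keyed by term and taxon names rather than identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonException:
    """A documented, biologically correct annotation that fails a taxon rule."""
    term: str
    taxon: str
    genes: frozenset
    reason: str


# The one exception described in the slides; names are used as keys
# because the slides give no identifiers for this case.
EXCEPTIONS = [
    TaxonException(
        term="photosynthetic electron transport in photosystem II",
        taxon="bacteriophage S-PM2",
        genes=frozenset({"psbA", "psbD"}),
        reason="Encode the D1 and D2 core components of the PSII reaction center.",
    ),
]


def is_documented_exception(term, taxon, gene):
    """True if (term, taxon, gene) matches a recorded exception."""
    return any(
        e.term == term and e.taxon == taxon and gene in e.genes
        for e in EXCEPTIONS
    )


def filter_report(flagged):
    """Split flagged (gene, term, taxon) tuples into problems and known exceptions."""
    problems, known = [], []
    for gene, term, taxon in flagged:
        target = known if is_documented_exception(term, taxon, gene) else problems
        target.append((gene, term, taxon))
    return problems, known


# "geneX" is a made-up gene used to show a case with no recorded exception.
flagged = [
    ("psbA", "photosynthetic electron transport in photosystem II", "bacteriophage S-PM2"),
    ("geneX", "photosynthetic electron transport in photosystem II", "bacteriophage S-PM2"),
]
problems, known = filter_report(flagged)
print("still to review:", problems)
print("documented exceptions:", known)
```

Keeping the reason next to each entry means the rule itself can stay generally applicable while the justified exceptions remain visible in one place.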