Importing GO terms from UniProt to a PGDB
Markus Krummenacker
Bioinformatics Research Group
SRI International
kr@ai.sri.com
GO in EcoCyc: Introduction
• GO (http://geneontology.org) is widely used to annotate gene products with functions, processes, and cellular locations
• Manual curation of GO annotations in EcoCyc:
UniProtKB GO annotations
• The GO consortium hosts the UniProtKB annotations file
• The file is large (several GB), so grep it for the E. coli taxon ID first
• The import code maps UniProtKB IDs to EcoCyc gene products via the DBLINKs of the products
• Most imported GO annotations have computational evidence
• Computational-evidence annotations get their timestamps refreshed on import, because they expire after one year
• A computational-evidence annotation is suppressed if it is redundant with an existing experimental-evidence annotation
• A computational-evidence annotation is pruned if a more specific annotation of the same kind exists (several dozen cases)
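The suppression and pruning rules above can be sketched in Python. This is only an illustration of the logic, not the EcoCyc import code: the `Annot` record, the `parents` map, and the placeholder GO IDs are hypothetical, the evidence-code set is a subset of the GO experimental codes, and "same kind" is read here as "for the same gene product".

```python
from dataclasses import dataclass

# A subset of the GO experimental evidence codes (assumption: enough for a sketch).
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


@dataclass(frozen=True)
class Annot:
    """Hypothetical minimal annotation record."""
    product: str   # gene product ID
    go_id: str     # GO term ID
    evidence: str  # GO evidence code, e.g. "IEA" or "IDA"


def is_experimental(a):
    return a.evidence in EXPERIMENTAL


def ancestors(go_id, parents):
    """Return all terms reachable from go_id by following parent links."""
    seen, stack = set(), [go_id]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def filter_computational(annots, parents):
    """Keep experimental annotations; drop a computational one if it is
    redundant with an experimental annotation to the same term, or if the
    same product already has a more specific (descendant) term."""
    kept = []
    for a in annots:
        if is_experimental(a):
            kept.append(a)
            continue
        others = [b for b in annots if b.product == a.product and b is not a]
        redundant = any(b.go_id == a.go_id and is_experimental(b) for b in others)
        less_specific = any(a.go_id in ancestors(b.go_id, parents) for b in others)
        if not (redundant or less_specific):
            kept.append(a)
    return kept


# Toy hierarchy and annotations; the IDs are placeholders, not real GO terms.
parents = {"GO:SPECIFIC": ["GO:MID"], "GO:MID": ["GO:GENERAL"]}
annots = [
    Annot("P1", "GO:GENERAL", "IEA"),   # pruned: GO:SPECIFIC is more specific
    Annot("P1", "GO:SPECIFIC", "IEA"),  # kept
    Annot("P1", "GO:OTHER", "IEA"),     # suppressed: same term has IDA evidence
    Annot("P1", "GO:OTHER", "IDA"),     # kept: experimental
]
result = filter_computational(annots, parents)
```

In a real PGDB the parent links would come from the GO class hierarchy stored in the database rather than from a hand-written dictionary.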
EcoliWiki – EcoCyc collaboration
• Collaboration with Jim Hu / EcoliWiki
• Workflow:
  • GO → UniProtKB → EcoCyc
  • EcoCyc exports a GO annotations file
  • EcoCyc GO annotations → EcoliWiki
  • Merging of EcoCyc annotations with additional EcoliWiki annotations
  • EcoliWiki → GO consortium: deposit the file for E. coli
  • Annotations are absorbed into UniProtKB
  • Repeat in half a year
Open Issues
• Round-trip problem of deleted annotations
  • An EcoCyc curator deletes an annotation because it is wrong
  • EcoliWiki should detect this. The protocol is not clear yet.
  • For now: the UniProtKB import into EcoCyc checks the history logs, to prevent adding an annotation that was deleted in the past
• No EcoCyc support yet for some qualifiers:
  • NOT
  • contributes_to
• No easy user interface yet for annotation import
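The interim history-log check for the round-trip problem might look like the following Python sketch. The slides do not describe how EcoCyc stores its history logs, so the log is modeled here as a plain list of (gene product, GO ID) pairs; the function name and all IDs are hypothetical.

```python
def drop_previously_deleted(candidates, deleted_log):
    """Return the candidate (product, go_id) pairs that were never deleted.

    candidates:  pairs proposed by the UniProtKB import
    deleted_log: pairs a curator removed in the past (assumed log format)
    """
    deleted = set(deleted_log)
    return [pair for pair in candidates if pair not in deleted]


# Placeholder IDs for illustration only.
deleted_log = [("P1", "GO:WRONG")]
candidates = [("P1", "GO:WRONG"), ("P1", "GO:FINE"), ("P2", "GO:WRONG")]
to_import = drop_previously_deleted(candidates, deleted_log)
```

The deletion is keyed on the product and term together, so the same GO term can still be imported for a different gene product.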
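Do it Yourself
• Disclaimer: this has never been tried outside of EcoCyc
• Prepare the input file (using grep). UniProtKB DBLINKs need to exist on the gene products.
• Then evaluate in Lisp:

    ;; Read the prepared GOA file and attach the GO terms to monomers
    (add-go-terms-to-monomers
      (incorporate-ecocyc-go-terms-from-GOAFF-file
        :filename "…../gene_association.goa"
        :db-type 'UNIPROT))
    ;; Save the KB
    (save-kb)
    ;; Prune unnecessary GO terms on every frame that can carry GO annotations
    (loop for p in (all-frames-that-could-contain-go-annots)
          do (prune-unnecessary-go-terms p :destructively-prune! t))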
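The input-preparation step can also be done in Python instead of grep, which avoids matching the taxon ID by accident in another column. The sketch below assumes the file is in GAF 2.x format, where the taxon is tab-separated column 13 and comment lines start with "!". The taxon ID 83333 (E. coli K-12) is an assumption, so check which ID your file uses; the demo rows and the helper `_row` are made up.

```python
from typing import Iterable, Iterator

TAXON_COLUMN = 12  # zero-based index of GAF column 13 ("Taxon")


def filter_gaf_by_taxon(lines: Iterable[str], taxon_id: str) -> Iterator[str]:
    """Yield the GAF annotation lines whose taxon column contains taxon_id."""
    wanted = f"taxon:{taxon_id}"
    for line in lines:
        if line.startswith("!"):  # header or comment line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= TAXON_COLUMN:  # malformed or truncated row
            continue
        # The column may hold several taxa separated by "|".
        if wanted in fields[TAXON_COLUMN].split("|"):
            yield line


def _row(obj_id: str, taxon: str) -> str:
    """Build a made-up 15-column GAF row for the demo."""
    fields = ["UniProtKB", obj_id, "sym", "", "GO:0005737", "REF:0", "IEA",
              "", "C", "", "", "protein", taxon, "20100101", "UniProt"]
    return "\t".join(fields) + "\n"


demo = [
    "!gaf-version: 2.0\n",
    _row("X00001", "taxon:83333"),  # assumed E. coli K-12 taxon ID
    _row("X00002", "taxon:9606"),
    _row("X00003", "taxon:83333|taxon:9606"),
]
kept = list(filter_gaf_by_taxon(demo, "83333"))
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly, so a file of several GB is streamed rather than loaded into memory.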