IMG clusters – the hidden features
Sean Hooper, Genome Biology Program, JGI
Background
• Clusters work behind the scenes in IMG
• Used for data compression, annotation assistance, and grouping of similar functions
• Necessary for large datasets, e.g. metagenomics
Example
• Search for a gene annotated as putative or hypothetical
• Study the often overlooked clusters of genes in IMG
COG Pfam IMG
COG growth: 720 COGs in 1997 (Tatusov et al. 1997); 4873 COGs in 2003
Nodes = IMG genes
Edges = in same cluster
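The graph definition above (genes as nodes, shared cluster membership as edges) can be sketched in a few lines of Python. This is only an illustration: the function name `cluster_graph`, the gene identifiers, and the cluster identifiers are hypothetical and do not come from IMG.

```python
from collections import defaultdict
from itertools import combinations


def cluster_graph(memberships):
    """Build an undirected graph from (gene, cluster) pairs.

    Nodes are genes; an edge joins two genes that share at least one cluster.
    """
    members = defaultdict(set)
    for gene, cluster in memberships:
        members[cluster].add(gene)

    nodes = {gene for gene, _ in memberships}
    edges = set()
    for genes in members.values():
        # Sorting gives each undirected edge a single canonical (a, b) form.
        for a, b in combinations(sorted(genes), 2):
            edges.add((a, b))
    return nodes, edges


# Hypothetical gene and cluster identifiers, for illustration only.
memberships = [
    ("gene_A", "cluster_1"),
    ("gene_B", "cluster_1"),
    ("gene_C", "cluster_1"),
    ("gene_C", "cluster_2"),
    ("gene_D", "cluster_2"),
    ("gene_E", "cluster_3"),
]
nodes, edges = cluster_graph(memberships)
```

A gene that belongs to several clusters (here `gene_C`) links otherwise separate groups, while a gene alone in its cluster (`gene_E`) stays an isolated node.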
Conclusions
• Provide fast access to related proteins
• Ease analysis and annotation (but cannot replace experimental work)
• Reveal substructures in function and phylogeny
Acknowledgements
• Chalmers, Sweden: D Dalevi
• Genome Biology: K Mavrommatis, IJ Anderson, NC Kyrpides, A Pati
• IMG crew: K Palaniappan, E Szeto, VK Markowitz
COAL demo
• Cluster overview of Archaea
• Spectral bipartitioning
• Integrate metadata (phenotype, phylogeny)
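The slides name spectral bipartitioning without saying how COAL applies it. As a generic sketch of the technique, not the COAL implementation, the code below splits a connected graph in two by the sign of the Fiedler vector (the eigenvector of the second-smallest eigenvalue of the graph Laplacian). The function name `fiedler_bipartition`, the shifted power iteration used to find the vector, and the toy graph are all assumptions made for illustration.

```python
def fiedler_bipartition(n, edges, iterations=500):
    """Split nodes 0..n-1 of a connected graph by the sign of the Fiedler vector.

    Uses power iteration on (shift*I - L), where L = D - A is the graph
    Laplacian, while projecting out the constant vector each step.
    """
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    degree = [len(neigh) for neigh in adj]
    shift = 2 * max(degree)  # upper bound on the largest Laplacian eigenvalue

    x = [float(i) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        mean = sum(x) / n
        x = [v - mean for v in x]  # remove the constant (eigenvalue 0) component
        # y = (shift*I - L) x, computed row by row
        y = [(shift - degree[i]) * x[i] + sum(x[j] for j in adj[i])
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]

    mean = sum(x) / n
    x = [v - mean for v in x]
    left = {i for i in range(n) if x[i] < 0}
    right = set(range(n)) - left
    return left, right


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
left, right = fiedler_bipartition(6, toy_edges)
```

On this toy graph the two triangles end up in separate groups, because cutting the single bridging edge is the cheapest way to divide the graph. For real cluster graphs a sparse eigensolver from a numerical library would be the more practical choice.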