Phylogenetic prediction of gene function

Daniel Barker
Centre for Evolution, Genes and Genomics, School of Biology, University of St Andrews
http://biology.st-andrews.ac.uk/cegg
Correlations in gene gain/loss

Presence (1) or absence (0) of four genes across eight species (sp. i to sp. viii):

Gene A   1 0 1 0 1 0 1 0
Gene B   1 0 1 0 1 0 1 0
Gene C   1 1 1 1 0 0 0 0
Gene D   1 1 1 1 0 0 0 0
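As a minimal sketch of an across-species comparison of these profiles, the Python below counts the species in which two genes disagree (the Hamming distance). The choice of Hamming distance and the function names are assumptions made for illustration; the slide does not specify a similarity measure.

```python
from itertools import combinations

# Presence (1) / absence (0) profiles copied from the slide.
profiles = {
    "Gene A": [1, 0, 1, 0, 1, 0, 1, 0],
    "Gene B": [1, 0, 1, 0, 1, 0, 1, 0],
    "Gene C": [1, 1, 1, 1, 0, 0, 0, 0],
    "Gene D": [1, 1, 1, 1, 0, 0, 0, 0],
}


def hamming_distance(p, q):
    """Count the species in which one gene is present and the other absent."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same species")
    return sum(a != b for a, b in zip(p, q))


def pairwise_distances(profiles):
    """Hamming distance for every unordered pair of genes."""
    return {
        (g, h): hamming_distance(profiles[g], profiles[h])
        for g, h in combinations(profiles, 2)
    }


distances = pairwise_distances(profiles)
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, d)
```

Under this measure the pairs {Gene A, Gene B} and {Gene C, Gene D} are both perfect matches (distance 0), because the comparison ignores how the species are related to one another.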
Example

Figure: phylogenetic tree of 15 species, with presence (1) or absence (0) of the genes ORC3, L42B, CIN4 and L9A shown at each tip. Species: Caenorhabditis elegans, Drosophila melanogaster, Cryptococcus neoformans, Aspergillus nidulans, Magnaporthe grisea, Neurospora crassa, Fusarium graminearum, Schizosaccharomyces pombe, Candida albicans, Saccharomyces kluyveri, S. castellii, S. bayanus, S. mikatae, S. paradoxus, S. cerevisiae. Scale bar: 0.1 changes per nucleotide.

The ‘traditional’ across-species method of phylogenetic profiles (Pellegrini et al. 1999, PNAS 96: 4285–4288) returns a false positive functional link for the protein pair {CIN4, ORC3} and a false negative for the pair {L9A, L42B}. The tree-based, maximum likelihood phylogenetic method returns the ‘correct’ result for both pairs (Barker & Pagel 2005, PLoS Comp Biol 1: 24–31).
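To illustrate why taking the tree into account can change the verdict, the sketch below counts the minimum number of gain/loss events a presence/absence pattern needs on a tree, using Fitch parsimony. This is a deliberately simpler stand-in, not the maximum likelihood method of Barker & Pagel (2005). The balanced eight-tip tree, the tip names `t1` to `t8` and the two toy patterns (the alternating and the blocked profile from the gain/loss slide) are all assumptions made for illustration.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one binary character (Fitch parsimony).

    tree   -- nested 2-tuples of tip names, e.g. (("t1", "t2"), "t3")
    states -- dict mapping each tip name to 0 or 1
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):  # a tip: its state set is the observed state
            return {states[node]}
        left = visit(node[0])
        right = visit(node[1])
        common = left & right
        if common:  # children agree on at least one state: no change needed
            return common
        changes += 1  # children disagree: one change on a descendant branch
        return left | right

    visit(tree)
    return changes


# Hypothetical balanced tree; tips are listed in profile column order.
tree = ((("t1", "t2"), ("t3", "t4")), (("t5", "t6"), ("t7", "t8")))
tips = ["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8"]

alternating = dict(zip(tips, [1, 0, 1, 0, 1, 0, 1, 0]))  # toy pattern
blocked = dict(zip(tips, [1, 1, 1, 1, 0, 0, 0, 0]))      # toy pattern

print("alternating:", fitch_changes(tree, alternating))
print("blocked:", fitch_changes(tree, blocked))
```

On this assumed tree the alternating pattern requires four separate changes, whereas the blocked pattern requires only one. Two genes that share the blocked pattern therefore agree on a single evolutionary event, which is much weaker evidence of correlated gain/loss than two genes that share four independent events, even though an across-species comparison scores both pairs as identical.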
Validation

Figure: % specificity (y-axis) plotted against p-value cut-off (x-axis).

Test data are based on known yeast protein complexes in the Comprehensive Yeast Genome Database (http://mips.gsf.de/genre/proj/yeast).
Acknowledgements

University of Reading: Mark Pagel, Andrew Meade
Wellcome Trust Sanger Institute: Valerie Wood
(UCL): Antonio Cavallo

Funding:
BBSRC
EPSRC on behalf of Research Councils UK