Maximal Extraction of Biological Information from Genetic Interaction Data
Greg Carter, Galitski Lab, Institute for Systems Biology (Seattle)
Genetic Interaction
Pairwise perturbation: two genes combine to affect a phenotype (Hereford & Hartwell 1974).
Measure a phenotype for 4 strains:
• Wild-type reference genotype
• Perturbation of gene A
• Perturbation of gene B
• Double perturbation of A and B
• Perturbations can be loss-of-function, gain-of-function, dominant-negative, etc.
• The interaction depends on the phenotype measured.
Invasion Assay
[Figure: invasion assay, pre-wash and post-wash, for strains WT, flo11, sfl1, and flo11 sfl1]
Genetic interaction example: flo11 and sfl1 for yeast invasion. ~2000 interactions measured (Drees et al., 2005).
Classification of Interactions
45 possible phenotype inequalities, classified into 9 rules (Drees et al., 2005):
WT=A=B=AB, WT=A<B=AB, A=B=WT<AB, A<B<WT=AB, AB<A<WT=B, WT=A=AB<B, WT=A=AB<B, A<B<WT<AB, etc.
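As an illustration of how four measured phenotypes map to one of these inequalities, here is a minimal Python sketch. The function name `phenotype_inequality` and the `tol` parameter are hypothetical and not from the talk; a real analysis would decide equality from measurement error rather than a fixed tolerance.

```python
def phenotype_inequality(wt, a, b, ab, tol=0.0):
    """Return an inequality string such as 'A<B<WT=AB' for four phenotype values.

    tol is an assumed tolerance: neighbouring values (after sorting) that
    differ by at most tol are treated as equal.
    """
    order = ["WT", "A", "B", "AB"]
    values = {"WT": wt, "A": a, "B": b, "AB": ab}

    # Rank the strains from lowest to highest phenotype value.
    ranked = sorted(order, key=lambda name: values[name])

    # Walk up the ranking and merge neighbours that are within tol.
    groups = [[ranked[0]]]
    for name in ranked[1:]:
        if values[name] - values[groups[-1][-1]] <= tol:
            groups[-1].append(name)
        else:
            groups.append([name])

    # Inside a group of equal strains, use a fixed label order for readability.
    groups = [sorted(g, key=order.index) for g in groups]
    return "<".join("=".join(g) for g in groups)


print(phenotype_inequality(1.0, 0.2, 0.5, 1.0))
```

With these illustrative values the two single perturbations reduce the phenotype and the double perturbation restores it to the wild-type level, which corresponds to the inequality A<B<WT=AB listed above.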
Distribution of Rules
Yeast invasion network: 2000 interactions among 130 genes
Extracting Biological Statements
Statistical associations of a gene interacting with a function.
PhenotypeGenetics plug-in for Cytoscape (www.cytoscape.org)
Classification Problem
Can the 45 phenotype inequalities be classified in a more informative way? How many rules should there be? How should the interactions be distributed among them?
WT=A=B=AB, WT=A<B=AB, A=B<WT<AB, A<B<WT=AB, AB<A<WT=B, WT=A=AB<B, WT=A=AB<B, A<B<WT<AB, etc.
Context-dependent Complexity
Requirements for a complexity metric Y:
• Adding a gene with random interactions adds no information
• Duplicating a gene adds no information
• Should depend on
  (i) the information content of each gene's interactions, and
  (ii) the information content of all gene-gene relationships.
• General requirements for biological information (see poster).
Context-dependent Complexity
$Y = \sum_{\text{pairs } ij} K_i \, m_{ij} \, (1 - m_{ij})$
• $K_i$ is the information of node i,
• $m_{ij}$ is the mutual information between i and j,
• $0 \le m_{ij} \le 1$ and $0 \le Y \le 1$
Example: Shannon mutual information
$m_{ij} = \sum_{a,b} p_{ij}(a,b) \log\left( \frac{p_{ij}(a,b)}{p_i(a)\, p_j(b)} \right)$, where a runs over the values of i and b over the values of j
Applied to (see poster):
• Sets of bit strings (sequences)
• Network architecture
• Dynamic Boolean networks
• Genetic interaction networks…
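The formula above can be sketched in Python as follows. Several choices here are assumptions made for illustration, not details given in the talk: each gene is represented by a list of interaction-rule labels over a shared set of partners, K_i is taken to be the Shannon entropy of that list normalized by the log of the label alphabet size, m_ij is the Shannon mutual information divided by the larger of the two entropies so that it lies in [0, 1], and the sum over ordered pairs is divided by N(N - 1).

```python
from collections import Counter
from itertools import permutations
from math import log2


def entropy(labels):
    """Shannon entropy (bits) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def mutual_information(x, y):
    """Sum over (a, b) of p(a,b) * log2(p(a,b) / (p(a) * p(b)))."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def normalized_mi(x, y):
    """Assumed normalization: divide by the larger entropy so 0 <= m <= 1."""
    h = max(entropy(x), entropy(y))
    return mutual_information(x, y) / h if h > 0 else 0.0


def complexity(profiles):
    """Y = sum over ordered pairs ij of K_i * m_ij * (1 - m_ij), averaged.

    profiles maps a gene name to its list of rule labels (equal lengths).
    The 1 / (N (N - 1)) prefactor is an assumption that keeps Y in [0, 1].
    """
    genes = list(profiles)
    alphabet = {label for p in profiles.values() for label in p}
    max_h = log2(len(alphabet)) if len(alphabet) > 1 else 0.0
    k = {g: entropy(profiles[g]) / max_h if max_h > 0 else 0.0 for g in genes}
    total = 0.0
    for i, j in permutations(genes, 2):
        m = normalized_mi(profiles[i], profiles[j])
        total += k[i] * m * (1 - m)
    n = len(genes)
    return total / (n * (n - 1)) if n > 1 else 0.0


# Toy profiles (hypothetical labels, not data from the talk).
duplicate = {"g1": list("AABB"), "g2": list("AABB")}    # m = 1
independent = {"g1": list("AABB"), "g2": list("ABAB")}  # m = 0
partial = {"g1": list("AABB"), "g2": list("AABA")}      # 0 < m < 1
print(complexity(duplicate), complexity(independent), complexity(partial))
```

Because each term contains the factor m_ij (1 - m_ij), a duplicated gene (m_ij = 1) and a gene with unrelated interactions (m_ij = 0) both contribute nothing, which is how this form meets the first two requirements for the metric. Only partially related pairs raise Y.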
Genetic Interaction Networks
• Invasion network of Drees et al., Genome Biology 2005: 130 genes, 2000 interactions
• MMS fitness network of St Onge et al., Nature Genetics 2007: 26 genes, 325 interactions
Determined networks of maximum complexity Y.
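One way to picture the search for a maximally complex network is as an exhaustive scan over groupings of the observed inequalities into rules, scoring each grouping and keeping the best. The sketch below assumes that framing; the slides do not state how candidate networks were enumerated, and `set_partitions`, `best_partition`, and the toy score are hypothetical names used only for illustration.

```python
def set_partitions(items):
    """Yield every partition of items into non-empty groups."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put the first item into each existing group in turn ...
        for idx in range(len(partition)):
            yield partition[:idx] + [[first] + partition[idx]] + partition[idx + 1:]
        # ... or give it a group of its own.
        yield [[first]] + partition


def best_partition(items, score):
    """Return the partition with the highest score (first one wins ties)."""
    return max(set_partitions(items), key=score)


# Toy usage: four inequality labels and a placeholder score that simply
# prefers two rules. A real score would relabel the network under the
# partition and compute its complexity Y.
labels = ["WT=A=B=AB", "WT=A<B=AB", "A<B<WT=AB", "A<B<WT<AB"]
best = best_partition(labels, score=lambda p: -abs(len(p) - 2))
print(len(list(set_partitions(labels))), best)
```

The number of partitions grows quickly with the number of items: 10 items already give 115,975 partitions. That is of the same order as the 115k candidate MMS fitness networks quoted in the talk, although the slides do not confirm that the candidates were generated this way.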
Complexity and Biological Information
Number of biological statements is correlated with Y.
115k possible MMS fitness networks, r = 0.80
Genetic Interaction Networks Maximally complex MMS fitness network
Genetic Interaction Networks
Biological statements from the maximally complex MMS fitness network
St Onge et al., Figure 5d
Conclusion and Future Work
For a given data set, maximizing Y facilitates unsupervised, maximal information extraction by balancing over-generalized and over-specific classification schemes.
Network-based methods are needed to interpret the maximally complex interaction rules.
Interpretations will depend on the system and are specific to the phenotype measured and the perturbations performed.
See poster for more details.
Thanks to: Becky Drees, Alex Rives, Marisa Raymond, Iliana Avila-Campillo, Paul Shannon, James Taylor, Susanne Prinz, Vesteinn Thorsson, Tim Galitski, Matti Nykter, Nathan Price, Ilya Shmulevich, David Galas