Bayesian integration of biological prior knowledge into the reconstruction of gene regulatory networks
Dirk Husmeier, Adriano V. Werhli
Learning Bayesian networks from data and prior knowledge
Bayesian networks
• Marriage between graph theory and probability theory.
• Directed acyclic graph (DAG) representing conditional independence relations.
• It is possible to score a network in light of the data.
• We can infer how well a particular network explains the observed data.
[Figure: example graph with nodes A to F joined by directed edges; labels "NODES" and "EDGES"]
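The acyclicity requirement in the DAG bullet above can be checked mechanically. The following is a minimal sketch, assuming the network is stored as a 0/1 adjacency matrix in which `adj[i, j] == 1` means an edge from node i to node j; the function name `is_dag` and this encoding are illustrative choices, not taken from the slides.

```python
import numpy as np

def is_dag(adj) -> bool:
    """Return True if the directed graph has no directed cycle.

    adj[i, j] == 1 is read as an edge i -> j (an assumed convention).
    Nodes without incoming edges are peeled off repeatedly; if at some
    point every remaining node still has a parent, a cycle exists.
    """
    adj = np.asarray(adj, dtype=int)
    remaining = list(range(adj.shape[0]))
    while remaining:
        sub = adj[np.ix_(remaining, remaining)]
        # Column sums count incoming edges from the remaining nodes.
        sources = [remaining[k] for k in range(len(remaining))
                   if sub[:, k].sum() == 0]
        if not sources:
            return False
        remaining = [n for n in remaining if n not in sources]
    return True

chain = [[0, 1, 0],
         [0, 0, 1],
         [0, 0, 0]]   # 0 -> 1 -> 2
cycle = [[0, 1, 0],
         [0, 0, 1],
         [1, 0, 0]]   # 0 -> 1 -> 2 -> 0
print(is_dag(chain), is_dag(cycle))
```

A structure-learning procedure would typically run such a check after every proposed edge addition or reversal.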
Bayesian networks versus causal networks
[Figure: the true causal graph over nodes A, B and C, compared with the case where node A is unknown]
Bayesian networks versus causal networks
[Figure: alternative directed graphs over nodes A, B and C, and the corresponding undirected graph]
• Equivalence classes: networks with the same scores.
• Equivalent networks cannot be distinguished in light of the data.
• We can only learn the undirected graph.
• Unless… we use interventions or prior knowledge.
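The equivalence-class idea can be illustrated with the standard graphical criterion: two DAGs are Markov equivalent exactly when they share the same skeleton (undirected edges) and the same v-structures (a -> c <- b with a and b not adjacent). The sketch below applies that criterion to four three-node graphs; the specific edge orientations are assumptions chosen for illustration, since the slide's figure is not recoverable.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of a directed edge list."""
    return {frozenset(e) for e in edges}

def v_structures(edges):
    """Set of (a, c, b) with a -> c <- b where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for a, c in edges:
        parents.setdefault(c, set()).add(a)
    found = set()
    for c, ps in parents.items():
        for a, b in combinations(sorted(ps), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, c, b))
    return found

def markov_equivalent(e1, e2):
    """Same skeleton and same v-structures."""
    return skeleton(e1) == skeleton(e2) and v_structures(e1) == v_structures(e2)

fork     = [("A", "B"), ("A", "C")]   # B <- A -> C
chain_1  = [("B", "A"), ("A", "C")]   # B -> A -> C
chain_2  = [("C", "A"), ("A", "B")]   # C -> A -> B
collider = [("B", "A"), ("C", "A")]   # B -> A <- C

print(markov_equivalent(fork, chain_1))   # same class
print(markov_equivalent(fork, collider))  # different class
```

The fork and the two chains fall into one class, so observational data alone cannot orient their edges; the collider is the only three-node structure here whose directions are identifiable.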
Learning Bayesian networks from data
P(M|D) = P(D|M) P(M) / Z, where M is the network structure, D is the data and Z is a normalisation constant.
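For a small, explicitly listed set of candidate networks, the posterior above can be computed directly. The sketch below assumes the log marginal likelihoods log P(D|M) and log priors log P(M) are already available as arrays; the numbers are made up for illustration. With realistic numbers of genes the set of DAGs is far too large to enumerate, which is why sampling is used instead.

```python
import numpy as np

def posterior_over_models(log_lik, log_prior):
    """Normalise P(D|M) P(M) over an explicit list of models.

    Works in log space so that very small likelihoods do not underflow.
    """
    log_joint = np.asarray(log_lik, dtype=float) + np.asarray(log_prior, dtype=float)
    log_z = np.logaddexp.reduce(log_joint)   # log Z = log sum_M P(D|M) P(M)
    return np.exp(log_joint - log_z)

# Hypothetical scores for three candidate networks, uniform prior.
log_lik = [-10.0, -11.0, -12.0]
log_prior = np.log(np.full(3, 1.0 / 3.0))
post = posterior_over_models(log_lik, log_prior)
print(post)
```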
Biological prior knowledge matrix
Each entry indicates some knowledge about the relationship between genes i and j.
Define the energy of a graph G.
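The slide's formula for the energy did not survive extraction. A minimal sketch follows, assuming the definition used in the authors' published work on this method: prior entries B[i, j] lie in [0, 1], with 0.5 meaning "no prior knowledge", and the energy is the summed absolute difference between B and the graph's adjacency matrix G. Treat both the definition and the example values as assumptions.

```python
import numpy as np

def energy(B, G):
    """E(G) = sum_ij |B_ij - G_ij| (assumed definition, see lead-in).

    B: prior knowledge matrix with entries in [0, 1].
    G: 0/1 adjacency matrix, G[i, j] == 1 meaning an edge i -> j.
    """
    B = np.asarray(B, dtype=float)
    G = np.asarray(G, dtype=float)
    return float(np.abs(B - G).sum())

# Two genes: the prior supports the edge 0 -> 1 and is neutral about 1 -> 0.
# Diagonal entries are set to 0 because self-loops are not allowed.
B_example = [[0.0, 0.9],
             [0.5, 0.0]]
G_with_edge = [[0, 1],
               [0, 0]]
G_empty = [[0, 0],
           [0, 0]]
print(energy(B_example, G_with_edge), energy(B_example, G_empty))
```

A graph that agrees with the prior knowledge has a lower energy than one that contradicts it, which is the property the prior distribution builds on.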
Energy of a network Prior distribution over networks
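The equation on this slide is missing from the transcript. A plausible reconstruction, assuming an energy function E(G) that measures the disagreement between a graph G and the prior knowledge matrix, is a Gibbs distribution with an inverse-temperature parameter β (written as b elsewhere in this transcript). The symbols below are assumptions in that sense.

```latex
% Assumed form of the prior over network structures G.
% beta >= 0 is the trade-off parameter; Z(beta) is the partition function.
\[
  P(G \mid \beta) = \frac{e^{-\beta E(G)}}{Z(\beta)},
  \qquad
  Z(\beta) = \sum_{G'} e^{-\beta E(G')} .
\]
% beta = 0 gives a uniform prior (the prior knowledge is ignored);
% large beta concentrates the prior on graphs with low energy.
```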
Sample networks and hyperparameters from the posterior distribution
• Capture intrinsic inference uncertainty
• Learn the trade-off parameters automatically
P(M|D) = P(D|M) P(M) / Z
Energy of a network Rewriting the energy
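The rewritten form is not preserved in the transcript. One rewriting that follows directly from an energy of the form E(G) = Σ |B_ij − G_ij| with B_ij in [0, 1] and binary G_ij (an assumed definition) is a sum of per-node terms that depend only on each node's parent set π_n:

```latex
% Assumes E(G) = \sum_{i,j} |B_{ij} - G_{ij}|, B_{ij} \in [0,1], G_{ij} \in \{0,1\},
% and B_{nn} = 0 (no self-loops). \pi_n denotes the parent set of node n.
\[
  E(G) = \sum_{n=1}^{N} E(n, \pi_n),
  \qquad
  E(n, \pi_n) = \sum_{i \in \pi_n} \bigl(1 - B_{in}\bigr)
              + \sum_{i \notin \pi_n} B_{in} .
\]
% If the acyclicity constraint is ignored, the sum over graphs in the
% partition function factorises over nodes:
\[
  Z(\beta) \approx \prod_{n=1}^{N} \sum_{\pi_n} e^{-\beta E(n, \pi_n)} .
\]
```

Each parent i of node n contributes |B_in − 1| = 1 − B_in and each non-parent contributes |B_in − 0| = B_in, which is where the two sums come from. The factorised partition function is an approximation because it also counts cyclic graphs.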
Sample networks and hyperparameters from the posterior distribution
• Metropolis-Hastings scheme
• Proposal probabilities
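The acceptance rule of a generic Metropolis-Hastings step can be sketched as follows. The function and argument names are illustrative; `log_q_forward` is the log probability of proposing the new state from the old one and `log_q_backward` the reverse, which is where the proposal probabilities enter.

```python
import math
import random

def acceptance_probability(log_post_new, log_post_old,
                           log_q_forward=0.0, log_q_backward=0.0):
    """min(1, [P(new) / P(old)] * [q(old | new) / q(new | old)]) in log space."""
    log_ratio = (log_post_new - log_post_old) + (log_q_backward - log_q_forward)
    return math.exp(min(0.0, log_ratio))

def mh_accept(log_post_new, log_post_old,
              log_q_forward=0.0, log_q_backward=0.0, rng=random):
    """Draw the accept/reject decision for one proposed move."""
    a = acceptance_probability(log_post_new, log_post_old,
                               log_q_forward, log_q_backward)
    return rng.random() < a

# An uphill move is always accepted; a downhill move only sometimes.
print(acceptance_probability(-1.0, -2.0))
print(acceptance_probability(-2.0, -1.0))
```

Only posterior ratios appear, so the normalisation constant of the posterior never has to be computed.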
MCMC with one prior
Sample the graph and the parameter b. Split each step into two moves to improve the acceptance rate:
• Sample the graph with b fixed.
• Sample b with the graph fixed.
[Figure labels: BGe, BDe]
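The two alternating moves can be sketched on a three-node toy problem, small enough to enumerate every DAG and compute the prior's partition function exactly. Everything specific here is an assumption made for illustration: the energy definition, the uniform hyperprior on b over [0, beta_max], the proposal distributions, and above all `toy_log_lik`, which is a made-up stand-in for a real marginal likelihood such as BGe or BDe.

```python
import itertools
import math
import random
import numpy as np

N = 3
OFF_DIAG = [(i, j) for i in range(N) for j in range(N) if i != j]

def is_dag(G):
    # Acyclic iff no power G^k, k = 1..N, has a closed walk (nonzero trace).
    P = np.eye(N, dtype=int)
    for _ in range(N):
        P = P @ G
        if np.trace(P) > 0:
            return False
    return True

def all_dags():
    dags = []
    for bits in itertools.product([0, 1], repeat=len(OFF_DIAG)):
        G = np.zeros((N, N), dtype=int)
        for (i, j), b in zip(OFF_DIAG, bits):
            G[i, j] = b
        if is_dag(G):
            dags.append(G)
    return dags

DAGS = all_dags()  # 25 DAGs on three nodes

def energy(G, B):
    # Assumed energy: summed absolute difference between prior matrix and graph.
    return float(sum(abs(B[i, j] - G[i, j]) for i, j in OFF_DIAG))

def log_prior(G, B, beta):
    # Exact log P(G | beta); feasible only because DAGS is tiny.
    log_z = np.logaddexp.reduce([-beta * energy(H, B) for H in DAGS])
    return -beta * energy(G, B) - log_z

def toy_log_lik(G, G_ref):
    # Hypothetical placeholder likelihood: penalise edges differing from G_ref.
    return -2.0 * float(np.abs(G - G_ref).sum())

def run_mcmc(B, G_ref, n_iter=500, beta_max=30.0, step=3.0, seed=0):
    rng = random.Random(seed)
    G, beta = DAGS[0], beta_max / 2.0
    samples = []
    for _ in range(n_iter):
        # Move 1: sample the graph with beta fixed; Z(beta) cancels in the ratio.
        G_new = rng.choice(DAGS)  # uniform, hence symmetric, proposal
        log_r = (toy_log_lik(G_new, G_ref) - toy_log_lik(G, G_ref)
                 - beta * (energy(G_new, B) - energy(G, B)))
        if rng.random() < math.exp(min(0.0, log_r)):
            G = G_new
        # Move 2: sample beta with the graph fixed; the likelihood cancels.
        beta_new = beta + rng.uniform(-step, step)
        if 0.0 <= beta_new <= beta_max:  # uniform hyperprior on [0, beta_max]
            log_r = log_prior(G, B, beta_new) - log_prior(G, B, beta)
            if rng.random() < math.exp(min(0.0, log_r)):
                beta = beta_new
        samples.append((G.copy(), beta))
    return samples

B_demo = np.full((N, N), 0.5)
B_demo[0, 1] = 0.9                    # prior knowledge favours the edge 0 -> 1
G_ref = np.zeros((N, N), dtype=int)
G_ref[0, 1] = 1
samples = run_mcmc(B_demo, G_ref)
print(np.mean([b for _, b in samples]))
```

Splitting the update keeps each acceptance ratio simple: the graph move never needs the partition function, and the b move never needs the likelihood. For realistic network sizes the partition function cannot be enumerated like this and has to be approximated.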
MCMC with two priors
Sample the graph and the parameters b1 and b2. Split each step into three moves to improve the acceptance rate:
• Sample the graph with b1 and b2 fixed.
• Sample b1 with the graph and b2 fixed.
• Sample b2 with the graph and b1 fixed.
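With two sources of prior knowledge, a natural extension of a Gibbs-type prior is to give each source its own energy and its own trade-off parameter. The form below is a sketch under that assumption; E_1 and E_2 denote the energies computed from the two prior knowledge matrices, and β_1, β_2 correspond to b1 and b2 in the transcript.

```latex
% Assumed two-source prior over network structures G.
\[
  P(G \mid \beta_1, \beta_2)
  = \frac{\exp\bigl(-[\beta_1 E_1(G) + \beta_2 E_2(G)]\bigr)}{Z(\beta_1, \beta_2)},
  \qquad
  Z(\beta_1, \beta_2) = \sum_{G'} \exp\bigl(-[\beta_1 E_1(G') + \beta_2 E_2(G')]\bigr).
\]
% Graph move (beta_1, beta_2 fixed): Z cancels in the acceptance ratio.
% beta_k move (graph and the other beta fixed): the likelihood cancels,
% leaving the ratio P(G | beta_k', .) / P(G | beta_k, .).
```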
Bayesian networks with biological prior knowledge
• Biological prior knowledge: information about the interactions between the nodes.
• We use two distinct sources of biological prior knowledge.
• Each source of biological prior knowledge is associated with its own trade-off parameter: b1 and b2.
• The trade-off parameter indicates how much biological prior information is used.
• The trade-off parameters are inferred. They are not set by the user!
Bayesian networks with two sources of prior
[Diagram: Data, Source 1 (b1) and Source 2 (b2) feed into BNs + MCMC; output: recovered networks and trade-off parameters]
Evaluation
• Can the method automatically evaluate how useful the different sources of prior knowledge are?
• Do we get an improvement in the regulatory network reconstruction?
• Is this improvement optimal?
Raf regulatory network
From Sachs et al., Science 2005
Evaluation: Raf signalling pathway
• Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune system cells
• Deregulation → carcinogenesis
• Extensively studied in the literature → gold standard network
Flow cytometry data and KEGG
• Intracellular multicolour flow cytometry.
• Measured protein concentrations.
• 11 proteins: 1200 concentration profiles.
• We sample 5 separate subsets with 100 concentration profiles each.
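The subsampling step can be sketched as follows, reading "separate" as non-overlapping, which is an interpretation rather than something the slide states. The data array is random placeholder data with the stated shape (1200 profiles, 11 proteins), not the real measurements, and the function name is hypothetical.

```python
import numpy as np

def sample_disjoint_subsets(data, n_subsets=5, subset_size=100, seed=0):
    """Draw n_subsets non-overlapping row subsets of the given size."""
    rng = np.random.default_rng(seed)
    # Shuffle row indices once, then cut consecutive blocks so no row repeats.
    idx = rng.permutation(data.shape[0])[: n_subsets * subset_size]
    return [data[idx[k * subset_size:(k + 1) * subset_size]]
            for k in range(n_subsets)]

# Placeholder standing in for 1200 concentration profiles of 11 proteins.
data = np.random.default_rng(1).normal(size=(1200, 11))
subsets = sample_disjoint_subsets(data)
print([s.shape for s in subsets])
```

Running the network reconstruction on each subset separately gives five independent estimates, which makes it possible to report a spread rather than a single score.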
Microarray example
• Spellman et al. (1998): cell cycle, 73 samples
• Tu et al. (2005): metabolic cycle, 36 samples
[Figure: two panels with axes "Genes" and "time"]
Flow cytometry data and KEGG
http://www.genome.jp/kegg/
KEGG PATHWAYS are a collection of manually drawn pathway maps representing our knowledge of molecular interactions and reaction networks.
The data and the priors + KEGG + Random